pepke@loligo (Eric Pepke) (05/04/89)
First of all, THINK C is great. It is a wonderful development environment, the compiler is fast, the inline assembler works well, and the code is adequate for most purposes. However, there is a major design flaw regarding the use of arrays larger than 32K in version 3.0. In one of my programs I need to use an array of 32*32*32 short integers, which is 64K long. The segment loader problem doesn't bother me. I am perfectly happy allocating the memory, passing a pointer into a function, and declaring it as I want it in the function. So, I tried something like AddHist(theHist, r, g, b) short theHist[32][32][32]; which caused a compiler error, in spite of the fact that the function does not have to define the array at all. So, I thought that somebody had just put the test in the wrong place in the compiler, and I tried AddHist(theHist, r, g, b) short theHist[][32][32]; which is perfectly legal C. It compiled without a complaint. Fortunately, I am paranoid, so I checked to see what code it generated. The code in this case was correct, but it is generally incorrect when a dimension of the array is other than a power of two. Consider these two functions: short Access7(array, a, b) short array[7][7]; short a, b; { register short result; result = array[a][b]; } Access4(array, a, b) short array[4][4]; short a, b; { register short result; result = array[a][b]; } Here is the code produced: ACCESS7 +0000 00196856 LINK A6,#$0000 | 4E56 0000 +0004 0019685A MOVE.L D7,-(A7) | 2F07 +0006 0019685C MOVE.W $000C(A6),D0 | 302E 000C +000A 00196860 MULS.W #$000E,D0 | C1FC 000E +000E 00196864 ADD.L $0008(A6),D0 | D0AE 0008 +0012 00196868 MOVE.W $000E(A6),D1 | 322E 000E +0016 0019686C EXT.L D1 | 48C1 +0018 0019686E ADD.L D1,D1 | D281 +001A 00196870 ADD.L D1,D0 | D081 +001C 00196872 MOVEA.L D0,A0 | 2040 +001E 00196874 MOVE.W (A0),D7 | 3E10 +0020 00196876 MOVE.L (A7)+,D7 | 2E1F +0022 00196878 UNLK A6 | 4E5E +0024 0019687A RTS | 4E75 ACCESS4 +0000 00196884 LINK A6,#$0000 | 4E56 0000 +0004 00196888 MOVE.L D7,-(A7) | 2F07 +0006 0019688A MOVE.W $000C(A6),D0 | 302E 000C +000A 0019688E EXT.L D0 | 48C0 +000C 00196890 ASL.L #$3,D0 | E780 +000E 00196892 ADD.L $0008(A6),D0 | D0AE 0008 +0012 00196896 MOVE.W $000E(A6),D1 | 322E 000E +0016 0019689A EXT.L D1 | 48C1 +0018 0019689C ADD.L D1,D1 | D281 +001A 0019689E ADD.L D1,D0 | D081 +001C 001968A0 MOVEA.L D0,A0 | 2040 +001E 001968A2 MOVE.W (A0),D7 | 3E10 +0020 001968A4 MOVE.L (A7)+,D7 | 2E1F +0022 001968A6 UNLK A6 | 4E5E +0024 001968A8 RTS | 4E75 The techniques in both functions generalize to higher order and larger arrays. ACCESS4 does an arithmetic shift on a longword, so it can be expected to work when the array is larger than 32K. ACCESS7 only does a single word multiply, so it can be expected to fail on large arrays. It is also a *signed* multiply, which makes little sense. The code from MPW C is much more reasonable. It does two unsigned word multiplies, one on the high word and one on the low word of the cumulative address, and adds the two together. It would be *trivial* for the THINK people to fix the code generation. If they are worried about efficiency, they could install one more check box into the options dialog. It would be almost as trivial to fix the parser so that large arrays could explicitly be declared as function parameters. As it is, THINK C is useless for my application. Just because I am currently using a quantization that is a power of two does not mean that I always will, and I cannot really trust future releases of THINK C to generate correct code even for the power of two case. Eric Pepke ARPA: pepke@gw.scri.fsu.edu Supercomputer Computations Research Institute MFENET: pepke@fsu Florida State University SPAN: pepke@scri Tallahassee, FL 32306-4052 BITNET: pepke@fsu Disclaimer: My employers seldom even LISTEN to my opinions. Meta-disclaimer: Any society that needs disclaimers has too many lawyers.
lippin@skippy.berkeley.edu (The Apathist) (05/06/89)
Recently pepke@gw.scri.fsu.edu (Eric Pepke) wrote: [...] ]In one of my programs I need to use an array of 32*32*32 short integers, ]which is 64K long. The segment loader problem doesn't bother me. I am ]perfectly happy allocating the memory, passing a pointer into a function, ]and declaring it as I want it in the function. So, I tried something like ] ]AddHist(theHist, r, g, b) ]short theHist[32][32][32]; ] ]which caused a compiler error, in spite of the fact that the function does ]not have to define the array at all. Looks like they blew that one. I'm surprised it hasn't come up before. ]short Access7(array, a, b) ]short array[7][7]; ]short a, b; ]{ ] register short result; ] result = array[a][b]; ]} [more code deleted] ]The techniques in both functions generalize to higher order and larger ]arrays. ACCESS4 does an arithmetic shift on a longword, so it can be ]expected to work when the array is larger than 32K. ACCESS7 only does a ]single word multiply, so it can be expected to fail on large arrays. It is ]also a *signed* multiply, which makes little sense. The code generated for Access7 looks correct to me. The key point is that the MULS.W instruction produces a long result. Since the code needs to multiply a, a *signed* word, by 14, MULS.W is the most straightforward way to get the right result. Problems with 32K will only arise if a>=32K (which it can't be) or if the row size>=32K, in which case I presume (or at least hope) different code is generated. --Tom Lippincott lippin@math.berkeley.edu "Foo, you are nothing but a charlatan!" --Adventure
pepke@loligo.cc.fsu.edu (Eric Pepke) (05/07/89)
In article <24093@agate.BERKELEY.EDU> lippin@math.berkeley.edu writes: >The code generated for Access7 looks correct to me. The key point is >that the MULS.W instruction produces a long result. Since the code >needs to multiply a, a *signed* word, by 14, MULS.W is the most >straightforward way to get the right result. Problems with 32K will >only arise if a>=32K (which it can't be) or if the row size>=32K, in >which case I presume (or at least hope) different code is generated. The same code is produced if I define access7 as short Access7(array, a, b) short array[][7]; This is perfectly legal C, and it makes no assumptions on how big the array is. If you have enough memory, a call to Access7(40000, 2) should be perfectly reasonable, but it will fail. Eric Pepke ARPA: pepke@gw.scri.fsu.edu Supercomputer Computations Research Institute MFENET: pepke@fsu Florida State University SPAN: pepke@scri Tallahassee, FL 32306-4052 BITNET: pepke@fsu Disclaimer: My employers seldom even LISTEN to my opinions. Meta-disclaimer: Any society that needs disclaimers has too many lawyers.
pepke@loligo.cc.fsu.edu (Eric Pepke) (05/08/89)
In article <696@loligo.cc.fsu.edu> I write: >This is perfectly legal C, and it makes no assumptions on how big the >array is. If you have enough memory, a call to Access7(40000, 2) should >be perfectly reasonable, but it will fail. This statement of mine is nonsense, of course.Such a call is only reasonable in a C where int is defined as bigger than 16 bits. When you use Lightspeed C, as you would have to declare a and b as long integers. When you do that, you get what looks like correct code. Also, my statement that Lightspeed C would generate bad code in some cases where a syntax check would fail is also wrong. The multiplier for the first subscript is determined by dimensions 2 through N, and cases where the multiplier would be greater than 32767 would be rejected first by the test for arrays with size greater than 32767. What is will do is reject certain array declarations needlessly, as in the case of hist[32][32][32]; Eric Pepke ARPA: pepke@gw.scri.fsu.edu Supercomputer Computations Research Institute MFENET: pepke@fsu Florida State University SPAN: pepke@scri Tallahassee, FL 32306-4052 BITNET: pepke@fsu Disclaimer: My employers seldom even LISTEN to my opinions. Meta-disclaimer: Any society that needs disclaimers has too many lawyers.
beard@ux3.lbl.gov (Patrick C Beard) (05/11/89)
In article <698@loligo.cc.fsu.edu> pepke@loligo.UUCP (Eric Pepke) writes: >In article <696@loligo.cc.fsu.edu> I write: >>This is perfectly legal C, and it makes no assumptions on how big the >>array is. If you have enough memory, a call to Access7(40000, 2) should >>be perfectly reasonable, but it will fail. > >This statement of mine is nonsense, of course.Such a call is only reasonable >in a C where int is defined as bigger than 16 bits. When you use Lightspeed >C, as you would have to declare a and b as long integers. When you do that, >you get what looks like correct code. > >Also, my statement that Lightspeed C would generate bad code in some cases >where a syntax check would fail is also wrong. The multiplier for the first >subscript is determined by dimensions 2 through N, and cases where the >multiplier would be greater than 32767 would be rejected first by the >test for arrays with size greater than 32767. > >What is will do is reject certain array declarations needlessly, as in the >case of hist[32][32][32]; > >Eric Pepke ARPA: pepke@gw.scri.fsu.edu >Supercomputer Computations Research Institute MFENET: pepke@fsu >Florida State University SPAN: pepke@scri >Tallahassee, FL 32306-4052 BITNET: pepke@fsu > >Disclaimer: My employers seldom even LISTEN to my opinions. >Meta-disclaimer: Any society that needs disclaimers has too many lawyers. I'm sorry, but I just couldn't let this discussion go by without making a comment. The problem you are discussing, passing static or automatic arrays, such as foo[40][50] to subroutines with a declaration such as foo[][50] is doomed to fail not because of a failing in THINK C (formerly LightSpeed C), but because of an inherent flaw (feature?) in the C programming language. Namely, the duality between arrays and pointers makes passing arrays of dimension >= 2 impossible. The reason is that the compiler assumes that a declaration of foo[][50] really means *foo[50], an array of pointers. The array foo[40][50] is really just one long chunk of memory, and an index into it, such as foo[y][x] is really just foo+40*y+x. The called subroutine has no knowledge of the inner dimensions of the array, and so. has to resort to foo[y][x] meaning *(*(foo+y)+x), or *(foo[y]+x). The solution to all of this, is to always pass around arrays of pointers, not true two-dimensional arrays. Then you can have arrays of dimensions of as large as you like. I hope this sheds some light on this discussion. __________________ Patrick Beard PCBeard@lbl.gov BSI, Berkeley, CA ___________________
lippin@ronzoni.berkeley.edu (The Apathist) (05/13/89)
Recently beard@ux3.lbl.gov (Patrick C Beard) wrote: ] I'm sorry, but I just couldn't let this discussion go by without ] making a comment. The problem you are discussing, passing static or ] automatic arrays, such as foo[40][50] to subroutines with a ] declaration such as foo[][50] is doomed to fail not because of a ] failing in THINK C (formerly LightSpeed C), but because of an inherent ] flaw (feature?) in the C programming language. Namely, the duality ] between arrays and pointers makes passing arrays of dimension >= 2 ] impossible. The reason is that the compiler assumes that a ] declaration of foo[][50] really means *foo[50], an array of pointers. ] The array foo[40][50] is really just one long chunk of memory, and an ] index into it, such as foo[y][x] is really just foo+40*y+x. The ] called subroutine has no knowledge of the inner dimensions of the ] array, and so. has to resort to foo[y][x] meaning *(*(foo+y)+x), or ] *(foo[y]+x). This isn't really a problem: First, a declaration of foo[][50] is treated (in most ways) as (*foo)[50], not *foo[50]. That is, as a pointer to an array rather than as an array of pointers. Second, to index into an array foo[40][50], one calculates &foo[y][x] as &foo[0][0]+50*y+x. The result is that addressing into an array depends on all dimensions but the first, and this is recognized in C syntax, by allowing the first index of a parameter array to be omitted. Note that this has little bearing on (one of) the original complaints, that LSC does not allow formal parameter declarations of the form foo[32][32][32]. While I haven't verified that this is the case, I have found that Sun cc, Gnu cc, and lint all agree with my belief that this is a legal declaration. --Tom Lippincott lippin@math.berkeley.edu "`The problem's all inside your head,' she said to me. `The program's easy if it's done recursively. I'd like to help you in your struggle for a B -- There must be fifty ways to write your program. Fifty ways to write your program.'"
pepke@loligo.cc.fsu.edu (Eric Pepke) (05/17/89)
In article <2595@helios.ee.lbl.gov> beard@ux3.lbl.gov (Patrick C Beard) writes: >I'm sorry, but I just couldn't let this discussion go by without making >a comment. The problem you are discussing, passing static or automatic arrays, >such as foo[40][50] to subroutines with a declaration such as foo[][50] is >doomed to fail not because of a failing in THINK C (formerly LightSpeed C), >but because of an inherent flaw (feature?) in the C programming language. >Namely, the duality between arrays and pointers makes passing arrays of >dimension >= 2 impossible. The reason is that the compiler assumes that >a declaration of foo[][50] really means *foo[50], an array of pointers. >The array foo[40][50] is really just one long chunk of memory, and an >index into it, such as foo[y][x] is really just foo+40*y+x. The called >subroutine has no knowledge of the inner dimensions of the array, and so. >has to resort to foo[y][x] meaning *(*(foo+y)+x), or *(foo[y]+x). Not true. I just tried compiling ArrayRef(foo, a, b, c) int foo[][50][50]; int a, b, c; { register int x; x = foo[a][b][c]; } PointerRef(foo, a, b, c) int *foo[50][50]; int a, b, c; { register int x; x = foo[a][b][c]; } in Lightspeed/THINK C. They generate different code. The code is correct in each case. ArrayRef indexes one long 3-D array, and PointerRef gets a pointer to some integers from a 2-D array of pointers and then adds to the pointer to get the address. So, it although the function bodies look the same, the compiler does the intuitively correct thing based on the declaration. By the way, I have gotten the application working using Lightspeed C, and it does include a big 3-D array which works fine. Thank you once again, Rich and company, for a wonderful development system. Had I not realized my my mistake and switched from MPW C I would probably still be waiting for it to compile. This was the first time I have used the Palette manager, and the speed of your C made exploring its behavior a lot less painful. Eric Pepke ARPA: pepke@gw.scri.fsu.edu Supercomputer Computations Research Institute MFENET: pepke@fsu Florida State University SPAN: pepke@scri Tallahassee, FL 32306-4052 BITNET: pepke@fsu Disclaimer: My employers seldom even LISTEN to my opinions. Meta-disclaimer: Any society that needs disclaimers has too many lawyers.