braner@batcomputer.tn.cornell.edu (braner) (04/17/88)
I found what I consider a bug in Laser C. It's been claimed that
Laser C did away with ALL the 16-bit limits. Well, not quite, at least
not in the case of arrays of structures:

[The file test.c:]

    struct mystruct {               /* structure 6 bytes long */
        int i;
        struct mystruct *ptr;
    };
    typedef struct mystruct MYSTRUCT;

    static MYSTRUCT a[12000];

    func()
    {
        register MYSTRUCT *p, *q;
        register unsigned int n;    /* 16 bits */
        register unsigned long m;   /* 32 bits */

        q = &a[12000];
        p = a;
        p += 12000;     /* should add 6*12000 to the address in p */
        m = n = 12000;  /* 12000 is within the range of int */
        p += n;         /* should do the same */
        p += m;         /* same again */
        p = &a[n];      /* same but through the 'index' notation */
    }

[compile the above code]

    dis test.o          [run the Megamax disassembler]

[comments added later: dis.ttp does NOT include the source code lines
in the output :-( ]

    .text
    _func:
        LINK    A6, #0x0
        MOVEM.L D6/D7/A4/A5, -(A7)
    ; q = &a[12000];
        LEA.L   0x11996.L, A0   ; correct: adds 6*12000
        MOVEA.L A0, A4          ; (inefficient: needless copying)
    ; p = a;
        LEA.L   0x56.L, A0
        MOVEA.L A0, A5          ; (inefficient again)
    ; p += 12000;
        ADDA.L  #0x11940, A5    ; correct
    ; m = n = 12000;
        MOVE.W  #0x2ee0, D6     ; first assign to m (efficient!)
        MOVE.W  D6, D7          ; copy to n
        AND.L   #0xffff, D6     ; correct: extend m to long
    ; p += n;
        MOVE.W  D7, D0
        MULU.W  #0x6, D0        ; n*sizeof(), the result is LONG
        AND.L   #0xffff, D0     ; kill the top word:
        ADDA.L  D0, A5          ; *** WRONG!!! ***
    ; p += m;
        MOVE.L  D6, D0          ; (unnecessary copying)
        MOVE.L  D0, -(A7)
        MOVE.L  #0x6, D0
        MOVE.L  D0, -(A7)       ; m may be > 64K, so:
        JSR     __ulmul.L       ; do the m*sizeof() the hard way...
        MOVE.L  (A7)+, D0
        ADDA.L  D0, A5          ; this time correct (but slow)
    ; p = &a[n];
        MOVE.W  D7, D0
        AND.L   #0xffff, D0     ; extend n to long
        MOVE.L  D0, -(A7)
        MOVE.L  #0x6, D0
        MOVE.L  D0, -(A7)
        JSR     __lmul.L
        MOVE.L  (A7)+, D0
        LEA.L   0x52.L, A0      ; (could use A5 directly)
        ADD.L   A0, D0
        MOVEA.L D0, A5          ; correct (but slow)
        MOVEM.L (A7)+, D6/D7/A4/A5
        UNLK    A6
        RTS
    .data

In my view the code for "p += n;" should have looked like:

        MOVE.W  D7, D0
        EXT.L   D0
        MULU.L  #0x6L, D0       ; (this is a fictitious 68000 op)
        ADDA.L  D0, A5

i.e., either it should be:

        MOVE.L  D7, D0
        EXT.L   D0
        MOVE.L  D0, -(A7)
        MOVE.L  #6, -(A7)       ; no extra copying
        JSR     __ulmul.L
        ADDA.L  (A7)+, A5       ; no extra copying

or, since we know that n (D7) is an int and that MULU takes WORD args
and results in a LONG:

        MOVE.W  D7, D0
        MULU    #6, D0          ; much faster
        ADDA.L  D0, A5

Conclusion: Laser C does not use the fact that the result of a 68000
MULU operation is a LONG. Worse, it truncates to an int (WORD) the
offset that is to be added to a pointer (LONG), just because the offset
was previously calculated from an int. Note that the latter mistake was
NOT made in calculating the address of a[n] at the end.

Other tests showed that if the array a[] is smaller than 64K then the
calculation of &a[n] is done with a MULU rather than a call to
__ulmul(). That's a nice touch (much faster). But the reference must be
to the array itself. Assign "p=a;" and then do "p[i]" and it's back to
the slow call to __ulmul(), even if there are no large arrays in your
program anywhere! (This means that if an array is passed to a function
and the function uses it, through an index or otherwise, bye-bye
speed...)

Moral: Reference to large (>64K) arrays THROUGH AN INDEX (as in a[i])
works OK (albeit slowly, even if i is an int), but to avoid a bug in
Laser C, DO NOT ADD an int X to a pointer-to-Y if X*sizeof(Y) may be
> 64K. Also, for speed's sake, avoid indexing on a pointer!

Note: All this only holds for arrays of structures.
In the case of arrays of ints, longs, floats or doubles Laser C knows
to calculate the offset via a quick ASL.L. Thank goodness.

- Moshe Braner
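To put numbers on the "p += n" case, here is a minimal sketch in present-day C that mimics the two offset calculations from the listing. It is an illustration only, not Laser C output: the function names, the ELEM_SIZE macro and the use of <stdint.h> fixed-width types are all assumptions made for this sketch (a 1988 compiler would not have that header).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed element size: the 6-byte struct from test.c
   (2-byte int plus 4-byte pointer). */
#define ELEM_SIZE 6u

/* Hypothetical helper: the offset "p += n" ought to use.
   A 16-bit count times the element size, kept as a full 32-bit value,
   which is what MULU already leaves in the register. */
static uint32_t offset_full(uint16_t n)
{
    return (uint32_t)n * ELEM_SIZE;
}

/* Hypothetical helper: what the listing actually does.
   Same product, then the AND.L #0xffff discards the top word. */
static uint32_t offset_truncated(uint16_t n)
{
    return ((uint32_t)n * ELEM_SIZE) & 0xFFFFu;
}

/* Self-check with n = 12000, the value used in test.c. */
static void check_12000(void)
{
    assert(offset_full(12000) == 72000ul);     /* 0x11940 */
    assert(offset_truncated(12000) == 6464ul); /* 0x1940  */
}
```

With n = 12000 the full offset is 72000 bytes (0x11940, matching the ADDA.L constant in the listing), while the truncated one is 6464 bytes (0x1940). Since 6464 = 1077*6 + 2, the pointer ends up two bytes into element 1077 rather than just past the end of the array. For 6-byte elements the two results agree up to n = 10922 (65532 bytes) and diverge from n = 10923 on. Because the listing shows "p += m" with a long m compiling correctly, widening the offset in the source, as in "p += (unsigned long)n;", would presumably sidestep the truncation; that is an inference from the listing and has not been tried against Laser C.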