braner@batcomputer.tn.cornell.edu (braner) (04/17/88)
[]
I found what I consider a bug in Laser C. It's been claimed that
Laser C did away with ALL the 16-bit limits. Well, not quite, at
least not in the case of arrays of structures:
[The file test.c:]
struct mystruct { /* structure 6 bytes long */
int i;
struct mystruct *ptr;
};
typedef struct mystruct MYSTRUCT;
static MYSTRUCT a[12000];
func()
{
register MYSTRUCT *p, *q;
register unsigned int n; /* 16 bits */
register unsigned long m; /* 32 bits */
q = &a[12000];
p = a;
p += 12000; /* should add 6*12000 to the address in p */
m = n = 12000; /* 12000 is within the range of int */
p += n; /* should do the same */
p += m; /* same again */
p = &a[n]; /* same but through the 'index' notation */
}
[compile the above code]
dis test.o [run the Megamax disassembler]
[comments added later: dis.ttp does NOT include
the source code lines in the output :-( ]
.text
_func:
LINK A6, #0x0
MOVEM.L D6/D7/A4/A5, -(A7)
; q = &a[12000];
LEA.L 0x11996.L, A0 ; correct: adds 6*12000
MOVEA.L A0, A4 ; (inefficient: needless copying)
; p = a;
LEA.L 0x56.L, A0
MOVEA.L A0, A5 ; (inefficient again)
; p += 12000;
ADDA.L #0x11940, A5 ; correct
; m = n = 12000;
MOVE.W #0x2ee0, D6 ; first assign to m (efficient!)
MOVE.W D6, D7 ; copy to n
AND.L #0xffff, D6 ; correct: extend m to long
; p += n;
MOVE.W D7, D0
MULU.W #0x6, D0 ; n*sizeof(), the result is LONG
AND.L #0xffff, D0 ; *** WRONG!!! *** kills the top word
ADDA.L D0, A5 ; so a truncated offset gets added
; p += m;
MOVE.L D6, D0 ; (unnecessary copying)
MOVE.L D0, -(A7)
MOVE.L #0x6, D0
MOVE.L D0, -(A7) ; m may be > 64K, so:
JSR __ulmul.L ; do the m*sizeof() the hard way...
MOVE.L (A7)+, D0
ADDA.L D0, A5 ; this time correct (but slow)
; p = &a[n];
MOVE.W D7, D0
AND.L #0xffff, D0 ; extend n to long
MOVE.L D0, -(A7)
MOVE.L #0x6, D0
MOVE.L D0, -(A7)
JSR __lmul.L
MOVE.L (A7)+, D0
LEA.L 0x52.L, A0 ; (could use A5 directly)
ADD.L A0, D0
MOVEA.L D0, A5 ; correct (but slow)
MOVEM.L (A7)+, D6/D7/A4/A5
UNLK A6
RTS
.data
In my view the code for "p += n;" should have looked like:
MOVE.W D7, D0
AND.L #0xffff, D0 ; zero-extend (n is unsigned, so not EXT.L)
MULU.L #0x6, D0 ; (a fictitious op: the 68000 has no long MULU)
ADDA.L D0, A5
i.e., it should be either:
MOVE.W D7, D0
AND.L #0xffff, D0 ; zero-extend (n is unsigned, so not EXT.L)
MOVE.L D0, -(A7)
MOVE.L #6, -(A7) ; no extra copying
JSR __ulmul.L
ADDA.L (A7)+, A5 ; no extra copying
or, since we know that n (D7) is an unsigned int and that
MULU takes WORD args and yields a LONG result:
MOVE.W D7, D0
MULU #6, D0 ; much faster
ADDA.L D0, A5
Conclusion: Laser C does not use the fact that the result of a 68000
MULU operation is a LONG. Worse, it truncates to an int (WORD) the
offset that is to be added to a pointer (LONG), just because the offset
was previously calculated from an int. Note that the latter mistake was
NOT made in calculating the address of a[n] at the end.
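
The arithmetic behind that conclusion can be checked with a small C sketch. It is only a model of the three instructions in the listing, not Laser C's actual code generator, and the function names (mulu_w, offset_as_compiled, offset_expected) are made up for illustration:

```c
#include <stdint.h>

/* Model of the 68000 MULU.W: two unsigned 16-bit operands,
   full unsigned 32-bit product. */
uint32_t mulu_w(uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}

/* Byte offset as the listing computes it for "p += n":
   MULU.W followed by AND.L #0xffff, which drops the top word. */
uint32_t offset_as_compiled(uint16_t n, uint16_t size)
{
    return mulu_w(n, size) & 0xffffUL;
}

/* Byte offset with the whole 32-bit product kept. */
uint32_t offset_expected(uint16_t n, uint16_t size)
{
    return mulu_w(n, size);
}
```

With n = 12000 and a 6-byte structure, offset_expected gives 72000 (0x11940) while offset_as_compiled gives 6464 (0x1940), so the pointer ends up exactly 65536 bytes short of where it should be.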
Other tests showed that if the array a[] is smaller than 64K then the
calculation of &a[n] is done with a MULU rather than a call to __ulmul().
That's a nice touch (much faster). But the reference must be to the
array itself. Assign "p=a;" and then do "p[i]" and it's back to the
slow call to __ulmul(), even if there are no large arrays in your
program anywhere! (This means that if an array is passed to a function
and the function uses it, through an index or otherwise, bye-bye speed...)
Moral: Reference to large (>64K) arrays THROUGH AN INDEX (as in a[i])
works OK (albeit slowly, even if i is an int), but to avoid a bug in
Laser C, DO NOT ADD an int X to a pointer-to-Y if X*sizeof(Y) may be >= 64K.
Also, for speed's sake, avoid indexing on a pointer!
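
One possible workaround, sketched in C: widen the count to a long before adding it to the pointer, so that the addition looks like the "p += m;" case in the listing, which came out correct. That Laser C really takes the __ulmul path for this form is an assumption drawn from that case and has not been checked against the compiler; the helper name advance() is hypothetical.

```c
struct mystruct {
    int i;
    struct mystruct *ptr;
};
typedef struct mystruct MYSTRUCT;

MYSTRUCT a[12000];

/* Advance p by n elements. The count is widened to unsigned long
   first, so the compiler has to scale a 32-bit value instead of a
   16-bit one (assumed to avoid the truncation seen for "p += n"). */
MYSTRUCT *advance(MYSTRUCT *p, unsigned int n)
{
    unsigned long m = n;    /* explicit widening, like m in test.c */
    return p + m;
}
```

A call such as advance(a, 12000) should then yield the same address as &a[12000], at the price of the slower long multiply.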
Note: All this only holds for arrays of structures. In the case of arrays
of ints, longs, floats or doubles Laser C knows to calculate the offset
via a quick ASL.L. Thank goodness.
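
Why a shift is enough for those types: when the element size is a power of two, the scaling is just a left shift of the full-width index. A minimal sketch for an array of 4-byte longs (the size is hard-coded here as an assumption matching the 68000, and the function names are illustrative):

```c
/* Byte offset of element i in an array of 4-byte longs,
   computed the way ASL.L #2 would do it. */
unsigned long offset_by_shift(unsigned long i)
{
    return i << 2;
}

/* The same offset computed with an ordinary multiply. */
unsigned long offset_by_multiply(unsigned long i)
{
    return i * 4UL;
}
```

Both give 80000 for i = 20000, an offset above 64K, with no multiply routine involved.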
- Moshe Braner