stuart@rennet.cs.wisc.edu (Stuart Friedberg) (12/03/88)
I have a program to compute convex hulls using short, rational arithmetic, which I have been using as a specialized benchmark to look at (1) integer arithmetic and comparison performance, (2) sequential access to large arrays, and (3) paging behavior. I have encountered an anomaly on the SPARC architecture (Sun-4/110 running SunOS 4.0) that does not appear on Vaxen or 68K-based machines, and I was hoping someone could explain it to me.

I statically allocate a large array, either 2 Meg or 4 Meg bytes in size. Elements of the array are struct { short x, y; }. I run the benchmark with a parameter specifying how much of the array to use. Access to the array is basically sequential.

When the LARGER array is used, ALL runs of the benchmark for ALL values of the parameter are roughly ten percent FASTER. This is both repeatable and statistically significant. But the unused portion of the array is never accessed!

I examined the assembler produced by the Sun-4 compiler for the two cases, and the two versions differ only in the values of the constants assembled. There are overflow checks against indices of 500,000 and 1,000,000, and address offsets of 2,000,000 and 4,000,000. That is the sole difference. The same instructions seem to be generated in both cases (at least the assembler mnemonics are the same), and I assume (perhaps incorrectly) that the alignment of the code is the same in both cases. The statically allocated array is loaded as an assembler .common.

Can anyone suggest what might be going on here, or what I should check in more detail? I expected the larger array either to have no effect at all or to produce slightly slower runs. Instead, I have been seeing a significant speedup, even when I don't refer to the extra memory. I am not (yet) familiar enough with the SPARC and the Sun-4/110 memory system to have any good idea what is going on.

Stu Friedberg    stuart@cs.wisc.edu