[comp.arch] TLB problem

amull@Morgan.COM (Andrew P. Mullhaupt) (06/30/90)

I have been made aware of an interesting problem in the behavior
of an application on a new RISC machine. The thing has a 4-way set
associative cache, but in some cases involving large arrays the
application suffers a real choking performance drag due to address
collisions with fewer than 4 competing addresses. We suspect that
the TLB hashing scheme is giving us worst case behavior.
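
To make the suspicion concrete, here is a rough sketch of the arithmetic, assuming the simplest possible indexing: the set is chosen from the address bits just above the block offset. The function names, the 4 KB page size and the 64 sets are all hypothetical; the real machine's hash is exactly what we do not know.

```c
#include <stddef.h>

/* Hypothetical model of a set-associative structure (cache or TLB):
   the set is taken from the address bits just above the block offset.
   block_bytes and num_sets are assumed values, not this machine's. */
size_t set_index(size_t addr, size_t block_bytes, size_t num_sets)
{
    return (addr / block_bytes) % num_sets;
}

/* Count how many of the n addresses fall in the same set as addrs[0]
   (addrs[0] itself is included in the count). */
size_t same_set_count(const size_t *addrs, size_t n,
                      size_t block_bytes, size_t num_sets)
{
    size_t i, hits = 0;

    if (n == 0)
        return 0;
    for (i = 0; i < n; i++)
        if (set_index(addrs[i], block_bytes, num_sets) ==
            set_index(addrs[0], block_bytes, num_sets))
            hits++;
    return hits;
}
```

Under these assumed numbers, three arrays whose bases sit 256 KB apart (offsets 0, 262144 and 524288) all map to set 0, so a structure with fewer than three ways per set would thrash on them. Shifting the second and third bases by one and two pages respectively puts each in a different set.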

1. Can anyone give a sure-fire diagnostic?

2. What to do?  Jiggering the middle bits of the address seems
unpleasant. 

It used to be that the big project in getting the performance that
the box was capable of was more a matter of getting the operating
system out of your way. (I should point out that these are numerical
analytic jihad-style applications...) Now, many systems put assistance
for the operating system on the silicon. Thus, you cannot get it out
of your way. Would it be possible for system designers to put out a
style guide for high level applications so we don't trip over all 
the creases in their cache architecture? It's kind of depressing to
the UNIX people here to get less performance out of these 30 MHz
RISC units than I get out of my little old 25 MHz i486. (I haven't
found the crease in its cache yet, but I don't harbor illusions,
either...)

Of course, this problem arose in the old wine / new skin fashion:
the application is being ported, and this was not a performance
problem on the previous hardware. So the application would not
necessarily have been written in accordance with the correct style
for this machine, but at least we wouldn't have been as puzzled by
this one if we had known what hot spots to avoid.

Later,
Andrew Mullhaupt