[comp.arch] Cache Measurement

lindsay@gandalf.cs.cmu.edu (Donald Lindsay) (03/11/91)

In article <688@spim.mips.COM> mash@mips.com (John Mashey) writes:
	[concerning benchmark results of a 68040 with and without 
	 external cache]
>4) Let's consider some reasons why that might be:
>	a) ... using page-mode DRAM ..
>	it costs you some to switch between pages.
>	b) Perhaps there is something in the 68040-interface+external
>	secondary cache control that has a higher penalty than one
>	would expect.  I assume the secondary cache is writeback (?).
>	Maybe the design requires flushing dirty data back to DRAM
>	before initiating the refill?  Maybe there are extra cycles
>	for synchronizing everything?
>Anyway, maybe somebody who actually knows can post a few details,
>since the rest of us are just guessing.

It would be really nice if mere users, such as moi, could actually
get measurements from the systems they buy.

In particular, there's no reason why cache control logic couldn't
keep some software-controlled counters. The SPUR chip had 16 such
counters, with some 5 modes. (They had to have "off", for instance,
so that they could avoid measuring the null loop.)
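
To make the "off" mode concrete, here is a toy software model of a gated
counter bank. It is only a sketch: the event names and the single
enable flag are assumptions made for illustration, not SPUR's actual
counters or modes.

```python
from enum import Enum


class Event(Enum):
    # Hypothetical event kinds; the real SPUR event set is not given here.
    READ_HIT = "read_hit"
    READ_MISS = "read_miss"
    WRITE_HIT = "write_hit"
    WRITE_MISS = "write_miss"


class CounterBank:
    """Toy model: one enable flag gates a set of event counters."""

    def __init__(self):
        self.enabled = False  # the "off" mode: events are ignored
        self.counts = {event: 0 for event in Event}

    def observe(self, event):
        # Called once per cache event; counts only while enabled.
        if self.enabled:
            self.counts[event] += 1


def measure(bank, events):
    """Count only the events inside the region of interest."""
    bank.enabled = True
    for event in events:
        bank.observe(event)
    bank.enabled = False
    return dict(bank.counts)
```

Anything observed while the bank is disabled (loop setup, the null
loop itself) leaves the counts untouched, which is the point of
having an "off" mode.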

The cost of the above is small, as long as cache control data paths
have to exist anyway. We are, after all, in the era when chip
designers privately boast about the number of ALUs in their next
design.

So, what would we want measured?


-- 
Don		D.C.Lindsay .. temporarily at Carnegie Mellon Robotics

sgolson@pyrite.East.Sun.COM (Steve Golson) (03/21/91)

In article <12316@pt.cs.cmu.edu> lindsay@gandalf.cs.cmu.edu (Donald Lindsay) writes:
>It would be really nice if mere users, such as moi, could actually
>get measurements from the systems they buy.
>
>In particular, there's no reason why cache control logic couldn't
>keep some software-controlled counters. The SPUR chip had 16 such
>counters, with some 5 modes.
>
>So, what would we want measured?

I worked on a single-chip cache memory for the 80386. We included
three statistics counters that could be software programmed to count
a variety of events:

        Total hits          Write hits          Read hits
        Total misses        Write misses        Read misses
        Total flushes       Write flushes       Read flushes
        Total wait states   Write wait states   Read wait states
        Total accesses      Pipelined accesses  Non-piped accesses
        Elapsed time

Separate counts for instructions and data were not included. Since
this is a modular cache and memory controller, code and data could be
segregated into separate memory banks, and statistics for each could
be gathered from their respective cache controllers.
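
As a rough sketch of what one might do with such counts once read
out, the following computes a hit ratio and average wait states per
access for each bank and for both together. The function names, the
dictionary layout, and the sample numbers are all made up for
illustration, and it assumes that hits plus misses covers every
access, which may not hold for a real controller.

```python
def hit_ratio(hits, misses):
    """Fraction of accesses that hit; assumes accesses = hits + misses."""
    accesses = hits + misses
    return hits / accesses if accesses else 0.0


def wait_states_per_access(wait_states, accesses):
    """Average number of wait states added per access."""
    return wait_states / accesses if accesses else 0.0


def combine(banks):
    """Sum same-named counters across several cache controllers."""
    total = {}
    for counters in banks:
        for name, value in counters.items():
            total[name] = total.get(name, 0) + value
    return total


# Made-up readings from a code bank and a data bank.
code_bank = {"hits": 900, "misses": 100, "wait_states": 250}
data_bank = {"hits": 700, "misses": 300, "wait_states": 750}

both = combine([code_bank, data_bank])
overall = hit_ratio(both["hits"], both["misses"])
```

With only three counters per controller, each run has to pick which
three events to count, so a fuller set of statistics would take
several runs of the same workload.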

For more info see "A 2Kbyte Fully-Associative Cache Memory with
On-Chip DRAM Control" by Scott Griffith and Steve Golson, Proc. 1989 CICC.

Steve Golson -- Trilobyte Systems -- Carlisle MA -- sgolson@east.sun.com
       (consultant for, but not employed by, Sun Microsystems)
"As the people here grow colder, I turn to my computer..." -- Kate Bush