angel@umigw.MIAMI.EDU (angel li) (10/24/89)
Does anyone know the performance difference of 386 boards with a cache against boards without a cache, both running Unix? I would like to know whether paying for the cache is worth the money.
--
Angel Li
University of Miami/RSMAS
Internet: angel@flipper.miami.edu
UUCP: ncar!umigw!angel
pb@idca.tds.PHILIPS.nl (Peter Brouwer) (10/25/89)
In article <919@umigw.MIAMI.EDU> angel@flipper.miami.edu (angel li) writes:
>Does anyone know the performance difference of 386 boards with a cache
>against boards without a cache, both running Unix? I would like to
>know whether paying for the cache is worth the money.

This depends on the size/working set of your applications you use. Most caches are 64k = 16 pages. So if you have large applications with a working set ( number of pages it used during execution ) the cache isn't a great help.
--
Peter Brouwer,                # Philips Telecommunications and Data Systems,
NET  : pb@idca.tds.philips.nl # Department SSP-P9000 Building V2,
UUCP : ....!mcvax!philapd!pb  # P.O.Box 245, 7300AE Apeldoorn, The Netherlands.
PHONE:ext [+31] [0]55 432523, # Never underestimate the power of human stupidity
akcs.larry@nstar.UUCP (Larry Snyder) (10/25/89)
>Does anyone know the performance difference of 386 boards with a cache
>against boards without a cache, both running Unix? I would like to
>know whether paying for the cache is worth the money.

I know that boards with a cache in several installations have had problems
dhinds@portia.Stanford.EDU (David Hinds) (10/26/89)
In article <416@ssp2.idca.tds.philips.nl>, pb@idca.tds.PHILIPS.nl (Peter Brouwer) writes:
> This depends on the size/working set of your applications you use.
> Most caches are 64k = 16 pages. So if you have large applications with
> a working set ( number of pages it used during execution ) the cache isn't
> a great help.

This is not really true. A RAM cache constantly turns over to reflect the current memory usage of whatever is running on a system. It stores recently used instructions and data. It is most effective at speeding up fairly small loops (a few K), or code which accesses the same data repeatedly. Fortunately, this represents the local activity of almost all programs, almost all of the time.

I don't have any concrete data, but I've seen hit rates of 90-95% quoted for 80386 systems with 64K caches running typical real applications. Context switching should not degrade performance much, because the time scale over which the cache works is much, much shorter than a task's time slice. The cache turns over fast enough that it recovers almost immediately from a context switch.

RAM caches have been standard equipment on mainframes for decades. In fact, 64K is quite large as caches go; the 80486 chip has a cache of something like 256 bytes, but I wouldn't be surprised if this had hit rates of 80-90%, even with very large programs.

-David Hinds
dhinds@portia.stanford.edu
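The point about small loops can be illustrated with a toy simulation. This is only a sketch under stated assumptions: it models a direct-mapped cache with 4-byte lines (simpler than any real 386 cache controller), and the names `hit_rate` and `loop_trace` are made up for the illustration. It compares a loop that fits in a 64K cache with a sequential sweep over twice the cache size.

```python
def hit_rate(cache_bytes, line_bytes, addresses):
    """Simulate a direct-mapped cache and return the fraction of hits."""
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines            # one stored block number per cache line
    hits = 0
    total = 0
    for addr in addresses:
        block = addr // line_bytes       # memory block the address falls in
        index = block % num_lines        # the single line that block may occupy
        if tags[index] == block:
            hits += 1
        else:
            tags[index] = block          # miss: replace whatever was there
        total += 1
    return hits / total if total else 0.0


def loop_trace(loop_bytes, passes, step=4):
    """Yield the addresses touched by sweeping `loop_bytes` of memory `passes` times."""
    for _ in range(passes):
        for addr in range(0, loop_bytes, step):
            yield addr


if __name__ == "__main__":
    cache = 64 * 1024
    print("4K loop:   ", hit_rate(cache, 4, loop_trace(4 * 1024, 10)))
    print("128K sweep:", hit_rate(cache, 4, loop_trace(128 * 1024, 10)))
```

In this model the 4K loop misses only on its first pass, while the 128K sequential sweep never hits, because each block is evicted before it is reused. Real programs fall between these two extremes, which is why measured hit rates matter more than a toy like this.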
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (10/26/89)
In article <416@ssp2.idca.tds.philips.nl>, pb@idca.tds.PHILIPS.nl (Peter Brouwer) writes:
| This depends on the size/working set of your applications you use.
| Most caches are 64k = 16 pages. So if you have large applications with
| a working set ( number of pages it used during execution ) the cache isn't
| a great help.

Micro caches don't work in 4k pages, so what has this to do with anything? I suspect you're thinking of mainframe cache which may work in larger chunks. 4, 16, and 32 byte cache chunks are mentioned by manufacturers; I think someone used 64 bytes, but I haven't got the name.

If anyone has an Intel cache controller spec sheet handy, please contribute.
--
bill davidsen (davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon
dhinds@portia.Stanford.EDU (David Hinds) (10/26/89)
In article <1480@crdos1.crd.ge.COM>, davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) writes:
> If anyone has an Intel cache controller spec sheet handy, please
> contribute.

I don't have the specs, but as I understand it, the Intel chip is a "2-way set-associative" cache controller. Each entry in the cache has two parts: a physical-memory address to which it corresponds, and a copy of the data at that address (32 bits, I think). To speed up the check to see if something is in the cache, there are only two entries in the cache where any given physical address might reside. A 'set' of physical addresses is 'associated' with each pair of cache entries.

When the 80386 sends out a physical address, the cache controller figures out which two entries might contain the data (probably using the top bits of the address as an index into the cache), checks those two entries to see if either address matches the request, and either returns the data from the cache or polls main memory.

A 64K cache is built from eight 8K-by-8-bit static RAM chips, to be 64 bits wide, so that each cache entry has room for a 32-bit address and 32 bits of data. So it actually has 32K of data.

-David Hinds
dhinds@portia.stanford.edu
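The lookup described above can be sketched in a few lines. This is a toy model, not the Intel controller's actual behaviour: the class name `TwoWayCache` is hypothetical, the set is selected with the low-order bits of the word address (an assumption; the post guesses at the top bits), and least-recently-used replacement is also an assumption, since the post does not say how a victim entry is chosen.

```python
class TwoWayCache:
    """Toy 2-way set-associative cache holding one 32-bit word per entry."""

    def __init__(self, num_sets):
        self.num_sets = num_sets
        # Each set holds at most two (tag, data) entries, least recently used first.
        self.sets = [[] for _ in range(num_sets)]

    def read(self, addr, memory):
        """Return (data, hit) for a byte address; `memory` is a list of words."""
        word = addr // 4                  # which 32-bit word the address falls in
        index = word % self.num_sets      # ASSUMPTION: low bits select the set
        tag = word // self.num_sets       # remaining bits identify the word in that set
        ways = self.sets[index]
        for i, (stored_tag, data) in enumerate(ways):
            if stored_tag == tag:
                ways.append(ways.pop(i))  # hit: move to the most-recently-used slot
                return data, True
        data = memory[word]               # miss: fetch the word from main memory
        if len(ways) == 2:
            ways.pop(0)                   # ASSUMPTION: evict the least recently used entry
        ways.append((tag, data))
        return data, False


if __name__ == "__main__":
    memory = list(range(100, 164))        # 64 words of made-up contents
    cache = TwoWayCache(num_sets=4)
    for addr in (0, 0, 16, 32, 16):
        print(addr, cache.read(addr, memory))
```

Only the two entries of one set are ever compared against the requested address, which is the speed-up the post describes: the check costs two comparisons no matter how large the cache is.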
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (10/26/89)
In article <6101@portia.Stanford.EDU>, dhinds@portia.Stanford.EDU (David Hinds) writes:
| In fact, 64K is quite large as caches go; the 80486 chip has
| a cache of something like 256 bytes, but I wouldn't be surprised if
| this had hit rates of 80-90%, even with very large programs.

The 80486 that Intel sells has 8k.
--
bill davidsen (davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon
rcd@ico.isc.com (Dick Dunn) (10/27/89)
pb@idca.tds.PHILIPS.nl (Peter Brouwer) writes:
> ... angel@flipper.miami.edu (angel li) writes:
> >Does anyone know the performance difference of 386 boards with a cache
> >against boards without a cache, both running Unix?...

> This depends on the size/working set of your applications you use.

It depends on the working set, data and code both, although separate I and D caches help a lot even with large amounts of data, because the code is still likely to have good locality even if the data defeats the caching.

> Most caches are 64k = 16 pages. So if you have large applications with
> a working set ( number of pages it used during execution ) the cache isn't
> a great help.

No. (I've never seen a page-oriented cache--refill would be nasty!) The normal organization caches a small amount of memory (~4-8 bytes) per cell, so the relevant question is not at page granularity. It is rare for a computation-intensive program to have a large amount of active code at any given time--in fact, somewhat the opposite, because "computation intensive" often means a few small loops.

Also, remember that the 386 presents physical addresses, so cache flushes don't have to happen very often. (Don't confuse a cache flush with a TLB flush.)

Some informal experiments we've done suggest that a decent cache does a lot. For example, a cached 25-MHz machine is easily twice as fast as an uncached 16-MHz even though the processor is only about 50% faster.

Keep in mind that this is CPU speed. Look at your processing mix; if you're I/O bound, there are still secondary reasons that a cache can help, but it's not such a big deal.
--
Dick Dunn rcd@ico.isc.com uucp: {ncar,nbires}!ico!rcd (303)449-2870
...No DOS. UNIX.
dave@mobile.UUCP (David C. Rein) (10/27/89)
In article <919@umigw.MIAMI.EDU>, angel@umigw.MIAMI.EDU (angel li) writes:
> Does anyone know the performance difference of 386 boards with a cache
> against boards without a cache, both running Unix? I would like to
> [stuff deleted]
> Angel Li

Well, I have no performance specs on hand, but having the 386 get its stuff from 35 ns static RAM instead of 60-80 ns dynamic RAM tells you something right there. Also, on the IBM PS/2 Model 70 25 MHz machines, they found it important enough to give the top 128k to the BIOS, so that it could be cached instead of being read from ROM.

Dave Rein
UUCP:   ..!kodak!gizzmo!lazlo!mobile!dave
Domain: dcr0801@ultb.isc.rit.edu
"It just goes to show what you can do if you're a total psychotic" -- Woody Allen
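To put rough numbers on the SRAM-versus-DRAM point, the figures above can be combined with the 90-95% hit rates quoted elsewhere in the thread. The sketch below uses a deliberately simple weighted-average model, which is an assumption: it ignores wait-state granularity, bus overhead, and any extra penalty on a miss, and the function name `average_access_ns` is made up for the illustration.

```python
def average_access_ns(hit_rate, cache_ns, dram_ns):
    """Weighted-average access time for a simple hit/miss model.

    ASSUMPTION: a hit costs cache_ns, a miss costs dram_ns, nothing else.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ns + (1.0 - hit_rate) * dram_ns


if __name__ == "__main__":
    for hit_rate in (0.90, 0.95):
        for dram_ns in (60, 80):
            avg = average_access_ns(hit_rate, 35, dram_ns)
            print(f"hit rate {hit_rate:.0%}, DRAM {dram_ns} ns: {avg:.2f} ns average")
```

Under this model the average stays close to the 35 ns SRAM figure for any hit rate in that range, which is the intuition behind the post; how much of that shows up as real speed depends on the machine's memory design.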
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (10/28/89)
In article <1989Oct27.031800.4938@ico.isc.com>, rcd@ico.isc.com (Dick Dunn) writes:
| Some informal experiments we've done suggest that a decent cache does a lot.
| For example, a cached 25-MHz machine is easily twice as fast as an uncached
| 16-MHz even though the processor is only about 50% faster.

It really does depend on the machine, not just the speed. For instance, a machine with 2- or 4-way interleave would benefit less from a cache than one with no interleave, and one with wait states benefits more than one without. 16MHz is the point at which it is still possible to do 0w/s with more or less standard memory parts.

I have done some measurements on normal, interleaved, and 16 bit memory, and conclude that cache is a huge win as your memory gets slower, and that 64k will mask the effects of slow memory for many applications.

Note: I'm not disagreeing, just adding some clarifying information. I would not buy a 25/33 MHz machine w/o cache, because it adds so little to the price of the machine as a whole ($200-300).
--
bill davidsen (davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon
rick@pcrat.uucp (Rick Richardson) (10/30/89)
>In article <919@umigw.MIAMI.EDU>, angel@umigw.MIAMI.EDU (angel li) writes:
> Does anyone know the performance difference of 386 boards with a cache
> against boards without a cache, both running Unix? I would like to

Back in the bad old days of $30+ DRAMs, we bought a 16 MHz Mylex motherboard. This mother has a 64K cache and 120ns main memory. Running the Dhrystones (under 386/ix 1.0.4) on it gave 4950 'stones with the cache turned on (~0ws), and 3652 'stones with the cache turned off (1ws).

Your results may vary...
--
Rick Richardson | Looking for FAX software for UNIX/386 ??????   mention
PC Research,Inc.| WE'RE SHIPPING                                 your
uunet!pcrat!rick| Ask about FaxiX - UNIX Facsimile System (tm)   FAX #
(201) 389-8963  | Or JetRoff - troff postprocessor for the HP {Laser,Desk}Jet
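For anyone who wants the ratio rather than the raw scores, the arithmetic is short. The two figures are the ones reported above; the helper name `speedup` is just for illustration, and a single Dhrystone run on one board says nothing certain about other workloads.

```python
CACHE_ON = 4950    # Dhrystone score with the cache enabled (~0 wait states)
CACHE_OFF = 3652   # Dhrystone score with the cache disabled (1 wait state)


def speedup(fast, slow):
    """Ratio of the faster score to the slower one."""
    return fast / slow


if __name__ == "__main__":
    ratio = speedup(CACHE_ON, CACHE_OFF)
    print(f"speedup: {ratio:.2f}x, or about {100 * (ratio - 1):.0f}% more Dhrystones")
```

That works out to a bit over a third more Dhrystones with the cache on, for this board and this benchmark.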
pb@idca.tds.PHILIPS.nl (Peter Brouwer) (10/30/89)
In article <1480@crdos1.crd.ge.COM> davidsen@crdos1.UUCP (bill davidsen) writes:
>In article <416@ssp2.idca.tds.philips.nl>, pb@idca.tds.PHILIPS.nl (Peter Brouwer) writes:
>| This depends on the size/working set of your applications you use.
>| Most caches are 64k = 16 pages. So if you have large applications with
>| a working set ( number of pages it used during execution ) the cache isn't
>| a great help.
>
> Micro caches don't work in 4k pages, so what has this to do with
>anything? I suspect you're thinking of mainframe cache which may work
>in larger chunks. 4, 16, and 32 byte cache chunks are mentioned by
>manufacturers; I think someone used 64 bytes, but I haven't got the
>name.
>

I must admit I did not know this. From the discussion up till now, I understand that each cache entry contains a 32-bit address and 32 bits of data. Does this mean that the cache can contain 1K memory references? If this is the case, and you have an application which uses, for instance, a memory array > 1K and does lots of accesses in it, will that result in a low cache hit ratio? Is this a correct assumption?
--
Peter Brouwer,                # Philips Telecommunications and Data Systems,
NET  : pb@idca.tds.philips.nl # Department SSP-P9000 Building V2,
UUCP : ....!mcvax!philapd!pb  # P.O.Box 245, 7300AE Apeldoorn, The Netherlands.
PHONE:ext [+31] [0]55 432523, # Never underestimate the power of human stupidity
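The entry count can be worked out from the layout David Hinds describes earlier in the thread: eight 8K-by-8-bit static RAM chips, with each entry holding a 32-bit address and 32 bits of data. The sketch below takes that description at face value, which is an assumption; nobody in the thread has checked it against a spec sheet.

```python
CHIPS = 8                    # eight static RAM chips
WORDS_PER_CHIP = 8 * 1024    # each chip is 8K deep
BITS_PER_CHIP_WORD = 8       # and 8 bits wide
ADDRESS_BITS = 32            # address part of one cache entry (as described)
DATA_BITS = 32               # data part of one cache entry (as described)

total_bits = CHIPS * WORDS_PER_CHIP * BITS_PER_CHIP_WORD
total_bytes = total_bits // 8
entries = total_bits // (ADDRESS_BITS + DATA_BITS)
data_bytes = entries * DATA_BITS // 8

if __name__ == "__main__":
    print("total cache RAM:", total_bytes, "bytes")
    print("entries:        ", entries)
    print("data held:      ", data_bytes, "bytes")
```

On those numbers the cache would hold 8K word-sized entries (32K of data) rather than 1K. Whether a large array then gets a low hit ratio depends on the access pattern: repeated accesses to a region that fits in the cache still hit, while a sweep over an array much larger than the cache evicts entries before they are reused.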