[comp.unix.i386] ESDI caching disk controllers: Reprise

baxter@ics.uci.edu (Ira Baxter) (05/02/90)

In article <1424@ssbn.WLK.COM> bill@ssbn.WLK.COM (Bill Kennedy) writes:

> With a 768K cache memory on one card I routinely get 32% cache hits
> and with 4Mb on the one in ssbn I get 48-52% cache hits.

I have never understood this.  If a caching disk controller with 4Mb gives
you a high hit rate, why not put the 4Mb of RAM into your CPU, and
let UNIX use it as a cache?  The hit rate should be the same.
The win is that the OS should be able to divide the RAM between disk
cache and process space (I assume UNIX isn't brain-dead in this regard?).
Then you have the best of both worlds.

Or am I missing something?
--
Ira Baxter

bill@ssbn.WLK.COM (Bill Kennedy) (05/02/90)

In article <1736@serene.UUCP> rfarris@serene.UU.NET (Rick Farris) writes:
>In article <1424@ssbn.WLK.COM> I wrote:
>
>> With a 768K cache memory on one card I routinely get 32% cache hits
>> and with 4Mb on the one in ssbn I get 48-52% cache hits.

I need to make that clearer.  I managed to confuse Rick and confused
myself when I re-read it.  The CompuAdd utility only runs under DOS;
it won't work in VP/ix, though it might with its own driver.  I usually
boot native DOS to check the stats on the way back up from a full backup.
That's got to wreak havoc with the hit rate.

>That brings up an interesting point.  I think I remember learning
>that without a cache-hit rate of 90% or better, you're generally
>better off without a cache, due to the overhead involved in cache
>misses.  

I think that this is probably true of a memory cache and might be for
a disk cache; I'm not well versed in such things.  In a disk cache the
overhead for a miss is only a little greater than with a non-caching card,
but subsequent reads are a big win.  CompuAdd keeps "sets" of sectors
and dumps the least recently used stuff first.  It also reads ahead, betting
that if you wanted sector 2, you'll probably want sector 3 next.  Couple
that with the buffers that the kernel keeps and I'm not sure that there
is a good number for what hit rate is optimal.  From my universal sample
of one :-) the hard cache is the most visible performance improvement
I've seen on this old (1987) system.
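
A toy sketch of the LRU-plus-read-ahead idea, in C.  This is strictly
my own illustration -- the set size and all the names are invented, not
anything out of CompuAdd's firmware:

    /* Toy LRU cache of sector "sets" with one-set read-ahead. */
    #include <stdio.h>

    #define NSETS    8          /* cache capacity, in sets */
    #define SETSIZE  8          /* sectors per set         */

    static long set_tag[NSETS]; /* which disk set occupies each slot */
    static long set_age[NSETS]; /* bigger = more recently used       */
    static long tick = 0;

    /* Bring one set into the cache, evicting the least recently used. */
    static void load_set(long set)
    {
        int i, victim = 0;

        for (i = 0; i < NSETS; i++) {
            if (set_tag[i] == set) { set_age[i] = ++tick; return; }
            if (set_age[i] < set_age[victim]) victim = i;
        }
        printf("miss: set %ld read from disk\n", set);
        set_tag[victim] = set;      /* LRU slot gets the new set */
        set_age[victim] = ++tick;
    }

    /* A read touches its own set and pulls the next one in too,
     * betting that sector N+1 follows sector N.  */
    static void read_sector(long sector)
    {
        load_set(sector / SETSIZE);      /* demand fetch      */
        load_set(sector / SETSIZE + 1);  /* speculative fetch */
    }

    int main(void)
    {
        long s;
        int i;

        for (i = 0; i < NSETS; i++)
            set_tag[i] = -1;             /* start empty */
        for (s = 0; s < 32; s++)         /* sequential scan */
            read_sector(s);
        return 0;
    }

Run it and, after the first pair of misses, the only misses are the
read-aheads at set boundaries; the demand reads themselves always hit,
which is the whole game.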

>Yet Bill tells us that subjectively the machine seems faster.  Is
>this a special case?  Is the 90% number only effective with the much
>smaller differential in access speed between cache and core?  Does
>the larger spread between core and disk mean that a lower hit rate is
>effective?

Again, I'll speculate that the 90% number is for a main memory cache.
When you add up the wait states to dump to slow memory and reload from
slow memory it would appear that you're lots better off staying in
the cache.  I'm dubious that there is a "right number" (my quotes, not
Rick's) for a disk controller cache because of the varieties of
caching techniques (the DPT is visibly faster than the CompuAdd) and
configuration of kernel buffers.  I think my NBUFS is either 1024 or
2048, but you get my point.  My sar says I'm getting 90%+ hit rate
on the kernel buffers; that would actually aggravate the LRU scheme in
the disk controller.  Example: ps -ef is really fast when the system
is busy and very slow when it's near idle for a long time.

>And Bill, have you found out who *really* makes that card?  If
>CompuAdd can sell them for $500, somebody ought to have them for
>$300... :-)

No I haven't, nor have I found anyone on CompuAdd's payroll who really
understands how it works.  Their panacea is to suggest the most recent
firmware and BIOS EPROM's and in my case that was a retrograde performance
change.  You're also on thin ice with that board with ISC; it's not on
the known-to-work list.  I've got two of them and am not inclined to
change; I got what I wanted.  Another neighbor had to send his back when
it simply would *not* work with a CDC Wren >600Mb.  CompuAdd sells the
same drive that wouldn't work for him, new or old firmware/BIOS, didn't
matter.  The WD1007-SE2 works just dandy, so caveat emptor.
>
>Rick Farris   RF Engineering  POB M  Del Mar, CA  92014   voice (619) 259-6793
>rfarris@serene.uu.net      ...!uunet!serene!rfarris       serene.UUCP 259-7757

Sorry for the length; I don't know what the "right number" is and I doubt
there is one.  People I know who have tried both say that the DPT is a much
better performer, but I couldn't justify the additional $500; my throughput
suits me just fine.
-- 
Bill Kennedy  usenet      {texbell,att,cs.utexas.edu,sun!daver}!ssbn!bill
              internet    bill@ssbn.WLK.COM   or attmail!ssbn!bill

bill@ssbn.WLK.COM (Bill Kennedy) (05/02/90)

In article <263E613E.23840@paris.ics.uci.edu> baxter@ics.uci.edu (Ira Baxter) writes:
>In article <1424@ssbn.WLK.COM> I wrote:
>
>> With a 768K cache memory on one card I routinely get 32% cache hits
>> and with 4Mb on the one in ssbn I get 48-52% cache hits.

I'm sorry I didn't see this when I followed up the earlier post, I'd have
combined them.

>I have never understood this.  If a caching disk controller with 4Mb gives
>you a high hit rate, why not put the 4Mb of RAM into your CPU, and
>let UNIX use it as a cache?  The hit rate should be the same.

This is certainly true for reads and, depending on NAUTOUP, for writes.  My
system is on a UPS so I set NAUTOUP to two minutes so the buffers are not
flushed so frequently.
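
For anyone who hasn't met NAUTOUP: it is just an age threshold on dirty
buffers.  A toy sketch of the idea in C (my own illustration, not the
System V source):

    /* Age-based delayed-write flushing, the idea behind NAUTOUP. */
    #include <stdio.h>
    #include <time.h>

    #define NBUF    4
    #define NAUTOUP 120     /* seconds a dirty buffer may age (2 min) */

    struct buf {
        int    dirty;
        time_t dirtied;     /* when the buffer was last modified */
    };

    static struct buf bufs[NBUF];

    /* Called periodically by the kernel; writes out only buffers
     * that have been dirty longer than NAUTOUP.  */
    static void flush_aged(time_t now)
    {
        int i;

        for (i = 0; i < NBUF; i++) {
            if (bufs[i].dirty && now - bufs[i].dirtied >= NAUTOUP) {
                printf("buffer %d flushed to disk\n", i);
                bufs[i].dirty = 0;
            }
        }
    }

    int main(void)
    {
        time_t now = time(NULL);

        bufs[0].dirty = 1; bufs[0].dirtied = now - 200; /* old: goes out */
        bufs[1].dirty = 1; bufs[1].dirtied = now - 10;  /* young: stays  */
        flush_aged(now);
        return 0;
    }

Raising NAUTOUP just raises the age threshold, so a busy buffer can sit
dirty in memory longer -- safe enough on a UPS, risky without one.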

>The win is that the OS should be able to divide the RAM between disk
>cache and process space (I assume UNIX isn't brain-dead in this regard?).
>Then you have the best of both worlds.

I don't think it works this way, so put a question mark in if appropriate.
You specify the size of the kernel cache with the NBUFS tunable parameter;
I don't think the kernel sizes the cache up or down on the fly.  It probably
does something like that with the process space when it starts swapping,
but I think the disk cache is fixed at wherever you have it set.  In my
case, I had already ascertained the "optimal" number of buffers so I was
going for the time wasted waiting on the spindle/heads to get positioned.

>Or am I missing something?
>--
>Ira Baxter

Yes, Ira, I think you are.  I agree that things get fuzzy with regard to
reading: kernel buffers work at memory speed and the controller cache at
I/O speed.  Writing is a different story, though.  The logical write is
disconnected from the physical write.  The kernel dumps off its stuff at
I/O speed without regard to seek time or rotational latency and the
controller worries about the physical write.  The kernel buffers can't
help you a bit if you have to wait on the disk mechanism to get to the
right place, a caching controller can.  The other issue (in my case) was
cost.  SIMM's for the controller are a lot cheaper than column static
main memory.
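
The shape of the thing is easy to show in code.  A sketch, with invented
names (no real controller is this simple):

    /* Write-behind in miniature: the logical write completes at bus
     * speed; the physical write happens on the controller's own time. */
    #include <stdio.h>
    #include <string.h>

    #define SECTOR 512
    #define QLEN   16

    struct pending {
        long sector;
        char data[SECTOR];
    };

    static struct pending queue[QLEN];  /* controller cache RAM */
    static int qhead = 0, qcount = 0;   /* FIFO ring            */

    /* "Logical" write: copy into controller RAM and return at once.
     * The host never waits for the seek or rotational latency.  */
    static int host_write(long sector, const char *data)
    {
        int slot;

        if (qcount == QLEN)
            return -1;                   /* cache full: host must wait */
        slot = (qhead + qcount) % QLEN;
        queue[slot].sector = sector;
        memcpy(queue[slot].data, data, SECTOR);
        qcount++;
        return 0;                        /* done, as far as UNIX knows */
    }

    /* "Physical" write: drained later, when the heads get there
     * (or when a timeout forces it out).  */
    static void drain_one(void)
    {
        if (qcount == 0)
            return;
        printf("seek + write sector %ld\n", queue[qhead].sector);
        qhead = (qhead + 1) % QLEN;
        qcount--;
    }

    int main(void)
    {
        char block[SECTOR] = "data";

        host_write(100, block);          /* instant, from the host side */
        host_write(101, block);
        drain_one();                     /* the slow part, done later   */
        drain_one();
        return 0;
    }

All of the seek time and rotational latency lands in drain_one(), where
the host never sees it.
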
-- 
Bill Kennedy  usenet      {texbell,att,cs.utexas.edu,sun!daver}!ssbn!bill
              internet    bill@ssbn.WLK.COM   or attmail!ssbn!bill

richard@pegasus.com (Richard Foulk) (05/03/90)

>> With a 768K cache memory on one card I routinely get 32% cache hits
>> and with 4Mb on the one in ssbn I get 48-52% cache hits.
>
>I have never understood this.  If a caching disk controller with 4Mb gives
>you a high hit rate, why not put the 4Mb of RAM into your CPU, and
>let UNIX use it as a cache?  The hit rate should be the same.
>The win is that the OS should be able to divide the RAM between disk
>cache and process space (I assume UNIX isn't brain-dead in this regard?).
>Then you have the best of both worlds.

I have vague recollections of several different sets of research into
the disk cache versus more main-memory question using Unix.  Adding to
main memory almost always won out as I recall -- at least with proper
tuning.

The disk cache is certainly a win with MESS-DOS since it doesn't have
any idea what to do with more memory.  I think that's where these
boards are coming from.

-- 
Richard Foulk		richard@pegasus.com

lws@comm.wang.com (Lyle Seaman) (05/03/90)

bill@ssbn.WLK.COM (Bill Kennedy) writes:
...
>configuration of kernel buffers.  I think my NBUFS is either 1024 or
>2048, but you get my point.  My sar says I'm getting 90%+ hit rate
>on the kernel buffers; that would actually aggravate the LRU scheme in
>the disk controller.  Example: ps -ef is really fast when the system
>is busy and very slow when it's near idle for a long time.

This brings up an interesting question.  If both the kernel and the
controller are using the same algorithm (LRU), then there's going to
be a considerable duplication, isn't there?  Hmm, on the other hand,
the data in the controller cache will be that of the last controller
accesses, so they'll be misses on the kernel cache.  Problem is that 
when something is accessed from the controller cache, it goes into the
kernel cache, but something in the kernel cache gets thrown away,
which, by definition, was used more recently than what we just got
from the controller.  That seems to leave a hole in the algorithm.  If
only there was some way to modify the kernel cache code so it moves
buffers to the controller cache.
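
The duplication is easy to demonstrate with a small simulation (my toy,
with made-up sizes and access pattern):

    /* Stack two LRU caches and the recently-missed blocks end up in
     * both: kernel misses fall through to the controller, and both
     * levels then cache the same block.  */
    #include <stdio.h>

    #define KSIZE 8                     /* kernel buffers, in blocks */
    #define CSIZE 8                     /* controller cache, blocks  */

    static long ktag[KSIZE], kage[KSIZE];
    static long ctag[CSIZE], cage[CSIZE];
    static long tick = 0;

    /* Touch block b in an LRU array; 1 on hit, 0 on miss-and-load. */
    static int lru_touch(long *tag, long *age, int n, long b)
    {
        int i, victim = 0;

        for (i = 0; i < n; i++) {
            if (tag[i] == b) { age[i] = ++tick; return 1; }
            if (age[i] < age[victim]) victim = i;
        }
        tag[victim] = b;
        age[victim] = ++tick;
        return 0;
    }

    int main(void)
    {
        long b;
        int i, j, dup = 0;

        for (i = 0; i < KSIZE; i++) ktag[i] = -1;
        for (i = 0; i < CSIZE; i++) ctag[i] = -1;

        for (b = 0; b < 48; b++) {      /* cycle over 12 blocks: too */
            long blk = b % 12;          /* big for either cache      */
            if (!lru_touch(ktag, kage, KSIZE, blk))
                lru_touch(ctag, cage, CSIZE, blk);
        }
        for (i = 0; i < KSIZE; i++)
            for (j = 0; j < CSIZE; j++)
                if (ktag[i] != -1 && ktag[i] == ctag[j])
                    dup++;
        printf("%d of %d controller blocks also sit in the kernel\n",
               dup, CSIZE);
        return 0;
    }

On this (worst-case) pattern the two caches end up holding exactly the
same eight blocks -- the controller's megabytes buy nothing the kernel
buffers didn't already have.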

Also, consider paging.  When I page memory out, it goes directly to
the swap device, not through the kernel cache.  In this case, it
goes into the controller cache, which might subvert the algorithm.  If
you're using a caching controller, might it be worthwhile to reduce
your kernel cache so as to avoid paging?

I guess there isn't much point in putting a caching controller on a
machine that hasn't maxed its primary memory, eh?

-- 
Lyle                      Wang             lws@comm.wang.com
508 967 2322         Lowell, MA, USA       uunet!comm.wang.com!lws

rcd@ico.isc.com (Dick Dunn) (05/04/90)

baxter@ics.uci.edu (Ira Baxter) writes:
> I have never understood this.  If a caching disk controller with 4Mb gives
> you a high hit rate, why not put the 4Mb of RAM into your CPU, and
> let UNIX use it as a cache?  The hit rate should be the same.

But the caching controller can slurp up data while the CPU is busy doing
something else.  You go grab one (or a few) sectors from a track, and hand
them over to a process to start munching.  In the meantime, the controller
swallows the rest of the track and has it ready when you go back and ask
for the next chunk of data, whereas if you waited until the program
needed it before going to disk, the sector might already have passed
under the heads.

Note that you can't just transfer the track into main memory the way the
controller does, without a serious performance hit:  Bill K was talking
about getting ~ 50% cache hits, so that would mean you'd be bringing in
about 50% more data than you need.  (Remember that hard disk I/O is PIO,
not DMA.)  You can't really "track-cache" data into main memory cheaply.
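
To put rough numbers on it (assuming a garden-variety 10 Mbit/sec ESDI
drive with 34 sectors of 512 bytes per track; your geometry may differ):

    one track               = 34 * 512 bytes  = ~17K
    wasted data at 50% hits = ~8.5K per track
    PIO cost of the waste   = 8.5K / 2 bytes  = ~4300 extra 16-bit INs

Every track you speculatively haul into main memory costs the CPU
thousands of programmed-I/O transfers for data you may never touch;
the controller's on-board cache slurps up the same track without the
CPU lifting a finger.
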
-- 
Dick Dunn     rcd@ico.isc.com    uucp: {ncar,nbires}!ico!rcd     (303)449-2870
   ...Lately it occurs to me what a long, strange trip it's been.

gerry@zds-ux.UUCP (Gerry Gleason) (05/04/90)

In article <1990May3.050416.14124@pegasus.com> richard@pegasus.UUCP (Richard Foulk) writes:
>>> With a 768K cache memory on one card I routinely get 32% cache hits
>>> and with 4Mb on the one in ssbn I get 48-52% cache hits.

>The disk cache is certainly a win with MESS-DOS since it doesn't have
>any idea what to do with more memory.  I think that's where these
>boards are coming from.

Not quite; there are caching drivers that can be installed on top of your
regular drivers.  We ship such a thing with DOS, but don't ask me about
it since I haven't even used it.

Gerry Gleason

clewis@eci386.uucp (Chris Lewis) (05/05/90)

All things being equal, adding main memory and kernel buffers would be 
better than adding yet another layer of caching, *but*:

	- the controller is more likely to be better tuned to the geometry 
	  of the disk than bio and fio are and know how to optimize operations 
	  better.  (fio is the file level handler and bio is the buffered-disk
	  level handler in most UNIX kernels)
	- since the controller is a somewhat "simpler" environment (there 
	  ain't no uproc's etc. getting in the way ;-), adding improved 
	  algorithms is *easier*.  (look-ahead-cancel/defer, track-cache,
	  the elevator algorithm [sketched after this list], cache-locking,
	  cache partitioning, etc.)
	- you only have to tune the heck out of the controller once, rather
	  than having to retune (rewrite) the driver for each port.
	- Controller manufacturers usually know a heck of a lot more about
	  disks than UNIX porting people who have other problems to deal with.
	- If you put a UPS on the controller and disk, disk sync order
	  ain't that particularly important, so the controller can be 
	  considerably more free in operation ordering....  In fact,
	  given appropriate conditions, the controller may *never* have
	  to write the disk...  (oversimplified, requires UPS, and
	  pretty fail-safe controller/disk - Not that you want it
	  really to do this - the DPT controller does timeout and
	  forces writes after a moderate amount of time).  It's the
	  neatest thing to see a system panic, and the disk abruptly
	  gets very busy for another 30 seconds or so....
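
For the curious, the elevator mentioned above fits in a few lines.  A
sketch of the classic one-direction sweep (mine, not DPT's firmware):

    /* Elevator (SCAN): serve the nearest request ahead of the heads,
     * sweep until nothing is ahead, then reverse direction.  */
    #include <stdio.h>
    #include <stdlib.h>

    #define NREQ 6

    int main(void)
    {
        int cyl[NREQ]  = { 95, 10, 300, 40, 120, 11 }; /* pending */
        int done[NREQ] = { 0 };
        int head = 100, dir = 1;   /* head position, sweep direction */
        int served, i, best;

        for (served = 0; served < NREQ; ) {
            best = -1;             /* nearest undone request ahead   */
            for (i = 0; i < NREQ; i++)
                if (!done[i] && (cyl[i] - head) * dir >= 0 &&
                    (best < 0 ||
                     abs(cyl[i] - head) < abs(cyl[best] - head)))
                    best = i;
            if (best < 0) {        /* end of sweep: turn around      */
                dir = -dir;
                continue;
            }
            printf("seek %d -> %d\n", head, cyl[best]);
            head = cyl[best];
            done[best] = 1;
            served++;
        }
        return 0;
    }

Six requests, one reversal, no pathological back-and-forth seeking.
That's the whole trick, and it's a lot easier to wedge into controller
firmware once than into every port of every driver.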
    
And finally, since no UNIX kernel that I'm aware of resizes buffer cache 
dynamically, you end up with a lot more memory to put processes in rather
than trading off all over the place, and the kernel buffer size isn't 
that important anymore.

Mind you, there are some pretty awesome things you could do if you make
these controllers a bit smarter and get fio to know about them.
(eg: cancellable/deferrable next-n-blocks-in-a-file look-ahead instead
of consecutive physical block).

The DPT controllers are pretty amazing.  I've used DPT's ESDI and ST506
controllers on SCSI bus.  Talk about making NCR Towers scream!  (factors
of 30 upon occasion).  And I understand the AT-bus ones are just as good.
-- 
Chris Lewis, Elegant Communications Inc, {uunet!attcan,utzoo}!lsuc!eci386!clewis
Ferret mailing list: eci386!ferret-list, psroff mailing list: eci386!psroff-list

aland@infmx.UUCP (Dr. Scump) (05/05/90)

In article <1736@serene.UUCP> rfarris@serene.UU.NET (Rick Farris) writes:
>In article <1424@ssbn.WLK.COM> bill@ssbn.WLK.COM (Bill Kennedy) writes:
>
>> With a 768K cache memory on one card I routinely get 32% cache hits
>> and with 4Mb on the one in ssbn I get 48-52% cache hits.
>
>That brings up an interesting point.  I think I remember learning
>that without a cache-hit rate of 90% or better, you're generally
>better off without a cache, due to the overhead involved in cache
>misses.  
>
>Yet Bill tells us that subjectively the machine seems faster.  Is
>this a special case?  Is the 90% number only effective with the much
>smaller differential in access speed between cache and core?  Does
>the larger spread between core and disk mean that a lower hit rate is
>effective?  ...

That surprises me, but I'm no expert.  Keep in mind, however, that
some controllers do other things besides simple caching: readahead,
sorted writeback, DMA, adjustable bus speeds, etc.  The caching 
controllers I'm using (Consensys PowerStor) make the machines 
*scream*, even when simple cache hit percentages aren't that high.

>Rick Farris   RF Engineering  POB M  Del Mar, CA  92014   voice (619) 259-6793

--
Alan Denney  @  Informix Software, Inc.          "We're homeward bound
aland@informix.com  {pyramid|uunet}!infmx!aland   ('tis a damn fine sound!)
-----------------------------------------------   with a good ship, taut & free
 Disclaimer:  These opinions are mine alone.      We don't give a damn, 
 If I am caught or killed, the secretary          when we drink our rum
 will disavow any knowledge of my actions.        with the girls of old Maui."

gustwick@wf-aus.cactus.org (Bob Gustwick ) (05/07/90)

In article <1736@serene.UUCP> rfarris@serene.UU.NET (Rick Farris) writes:
>In article <1424@ssbn.WLK.COM> bill@ssbn.WLK.COM (Bill Kennedy) writes:
>
>> With a 768K cache memory on one card I routinely get 32% cache hits
>> and with 4Mb on the one in ssbn I get 48-52% cache hits.
>
>That brings up an interesting point.  I think I remember learning
>that without a cache-hit rate of 90% or better, you're generally
>better off without a cache, due to the overhead involved in cache
>misses.

this is really an implementation question.  a cache miss should be
a fast search thru some sort of table/tree/etc.  a 50/50 hit/miss
ratio should still provide a great performance increase.  why?
a search thru memory is orders of magnitude cheaper in wall
clock time than a disk seek, unless the cache is implemented very
poorly.
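
a quick expected-value check makes the point.  with made-up but sane
numbers -- say 28 ms average seek-plus-rotation for the disk, and about
1 ms to search the table and copy a cached sector:

    no cache :  every access            ~ 28 ms
    50% hits :  .5 * 1 + .5 * (1 + 28)  ~ 15 ms

even at 50/50 the average access time is nearly halved, because the
1 ms miss overhead is pocket change next to the seek you were going to
pay anyway.  that squares with bill's guess that the 90% rule of thumb
belongs to main-memory caches, where the hit/miss speed ratio is far
smaller.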

cpcahil@virtech.uucp (Conor P. Cahill) (05/08/90)

In article <1990May3.050416.14124@pegasus.com> richard@pegasus.UUCP (Richard Foulk) writes:
>the disk cache versus more main-memory question using Unix.  Adding to
>main memory almost always won out as I recall -- at least with proper
>tuning.
>
>The disk cache is certainly a win with MESS-DOS since it doesn't have
>any idea what to do with more memory.  I think that's where these
>boards are coming from.

The disk cache is also a win when you have already maxed out main
memory.  My 386 has 16 MB of memory and 2 1/2 MB of disk cache.


-- 
Conor P. Cahill            (703)430-9247        Virtual Technologies, Inc.,
uunet!virtech!cpcahil                           46030 Manekin Plaza, Suite 160
                                                Sterling, VA 22170