[comp.periphs.scsi] Controller Cache vs. Software Cache

neese@adaptx1.UUCP (06/16/91)

>You mention several interesting points with regard to drive arrays.  In order
>to take full advantage of a drive array you need to have the array working on
>multiple requests simultaneously.  SCSI-2 allows for command queuing and
>overlapped commands, just the kind of thing necessary to maximize performance
>from a disk array.  The problem as I see it is that most OS device drivers do
>not yet support SCSI-2 and hence would limit the potential effectiveness of
>an array.  Rather than change all the device drivers for all OS's, might it
>be easier to implement these features with a smart caching host adaptor?

SCSI-2 does make provisions for tag queuing, but the feature is an option.
Of all the SCSI-2 drives I have seen (from 4 manufacturers), only one of them
*correctly* supported tag queuing.  One other reports that it supports tag
queuing but in fact does not have it correctly implemented; the other two
do not support tag queuing and properly say so.  So watch it when buying
a SCSI-2 device.  If you need tag queuing, be sure to specify so.
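
For reference, a drive advertises the feature in byte 7 of its standard
INQUIRY data (the CmdQue bit).  Below is a minimal sketch of the check; the
helper name is mine, and it assumes you have already pulled the INQUIRY data
through whatever pass-through interface your host adapter provides (that part
is adapter-specific and not shown).  Remember that the bit only tells you the
drive *claims* the feature, which, as above, is no guarantee it is implemented
correctly.

#include <stdio.h>

#define INQ_FLAG_BYTE   7       /* SCSI-2 standard INQUIRY data, byte 7 */
#define INQ_CMDQUE      0x02    /* bit 1: tagged command queuing claimed */

int
claims_tag_queuing(const unsigned char *inq, int len)
{
    if (len <= INQ_FLAG_BYTE)
        return 0;               /* short INQUIRY data: assume no */
    return (inq[INQ_FLAG_BYTE] & INQ_CMDQUE) != 0;
}

int
main(void)
{
    /* fake 36-byte INQUIRY response, for illustration only */
    unsigned char inq[36] = { 0 };

    inq[INQ_FLAG_BYTE] = INQ_CMDQUE;
    printf("drive %s tagged queuing\n",
           claims_tag_queuing(inq, sizeof inq) ? "claims" : "does not claim");
    return 0;
}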

			Roy Neese
			Adaptec Senior SCSI Applications Engineer
			UUCP @  neese@adaptex
				uunet!cs.utexas.edu!utacfd!merch!adaptex!neese

neese@adaptx1.UUCP (06/18/91)

>/* Written  8:35 pm  Jun 15, 1991 by rat.UUCP!harv in adaptx1:comp.periphs.s */
>/* ---------- "Re: Controller Cache vs. Software C" ---------- */
>In article <633@zds-ux.UUCP> you write:
>>In article <30738@hydra.gatech.EDU> jt34@prism.gatech.EDU (THOMPSON,JOHN C) writes:
>>>How does the performance of a scsi host adaptor with built in caching hardware
>>>compare to the performance of a software caching program or OS caching?
>>>Which is faster? Is onboard drive caching even faster? Any definitive research
>>>on this subject? Is there a source on the net for a scsi peripheral benchmark
>>>program/source code? Thanks
>>
>>Logically, software caching must be faster (assuming reasonable
>>implementations in both cases), in the case of a cache hit because it
>>can just hand over the data rather than needing to perform an I/O
>>operation.  On the other hand, there is one type of hardware caching
>>that does make sense, read-ahead track buffering, but even this can
>>be handled to some extent in software, and it's better done in the
>>drive itself if you're going to do it at all (and some drives do).
>>
>>Now, this doesn't mean there aren't other reasons for wanting a cache
>>on the controller; for example, so you can implement special multi-drive
>>features such as mirroring, arrays, etc.  If the controller designer
>>does it right, they could introduce a simple caching controller and
>>later provide these advanced features as a firmware upgrade.
>>
>>Gerry Gleason
>
>Almost any time you have bottlenecks between processing nodes, installing some
>kind of cache can potentially speed up the application.  The question of
>whether a hardware cache is better than a software cache can depend on
>what application set you are running.  If you don't want the host cpu spending
>time performing caching while it could be executing an application, then a
>hardware cache makes sense.  Also, a hardware cache, if implemented with a
>suitable processor, can perform some interesting heuristics to achieve an
>impressively good hit rate.  The software cache definitely has the advantage
>of having the most direct route from cache to host, but it steals resources
>such as host memory and cpu cycles which are better used for running the
>host's applications.
>
>The best solution might be some kind of cache controller which can reside
>on the motherboard (maybe an IDE cache controller) with its own memory and
>support some high-performance method of moving data into host memory without
>dealing with the ISA or EISA busses.

Not quite.  In UNIX, the data comes from the controller into the kernel
buffer cache and the CPU copies it from kernel space to user space.  If the
data already resides in the kernel cache, many steps are saved.  A hardware
cache, upon a cache hit, saves only the time of getting the data from the
device; that is about all it saves.  But upon a cache miss, the controller
suffers an enormous penalty in command overhead.
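
To make the distinction concrete, here is a toy model of that hit/miss split.
Everything in it (block size, table size, the fake device_read()) is invented
for illustration and is not taken from any real kernel; the point is only that
a hit costs a memory copy while a miss pays the full device round trip.

#include <string.h>

#define BLKSIZE  1024
#define NCACHE   64

struct cblk {
    int  valid;
    long blkno;
    char data[BLKSIZE];
} cache[NCACHE];

/* stand-in for the real device: command setup, seek, transfer, interrupt */
static void
device_read(long blkno, char *buf)
{
    memset(buf, (int)(blkno & 0x7f), BLKSIZE);  /* pretend we read block blkno */
}

/* bread-style lookup: copy from the cache on a hit, go to the device on a miss */
void
cached_read(long blkno, char *user_buf)
{
    struct cblk *b = &cache[blkno % NCACHE];

    if (b->valid && b->blkno == blkno) {
        memcpy(user_buf, b->data, BLKSIZE);     /* hit: CPU copy only */
        return;
    }
    device_read(blkno, b->data);                /* miss: full I/O */
    b->valid = 1;
    b->blkno = blkno;
    memcpy(user_buf, b->data, BLKSIZE);         /* then the same copy */
}

int
main(void)
{
    char buf[BLKSIZE];

    cached_read(7, buf);        /* miss: goes to the device */
    cached_read(7, buf);        /* hit: memory copy only */
    return 0;
}
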
Most caching controllers obtain their best performance from write-back
caching, but this pushes the already volatile UNIX filesystem to new highs:
the controller tells the kernel a write is complete and then waits some
period of time before actually doing the write.  During this window, if your
system goes down for any reason, you will lose data.  To further complicate
things, some caching controllers sort the queued commands before sending them
to the device.  If the superblock gets updated before the actual data is
written and the system goes down, you have not only lost the data, but the
system may not know about it until you read the data again.
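
In rough outline, the write-back behavior I am describing looks something
like the sketch below.  It is purely illustrative (no particular controller's
firmware, and all the names and sizes are made up); it just shows why the
early acknowledgement opens a risk window and why sorting can put a
superblock update on the media ahead of the data it describes.

#include <string.h>
#include <stdlib.h>

#define BLKSIZE 1024
#define QDEPTH  32

struct wreq {
    long blkno;
    char data[BLKSIZE];
};

static struct wreq queue[QDEPTH];
static int nqueued;

/* stand-in for the physical write to the media */
static void
device_write(long blkno, const char *buf)
{
    (void)blkno; (void)buf;
}

/* what the host sees: "write complete" before the media is touched */
int
cached_write(long blkno, const char *buf)
{
    if (nqueued == QDEPTH)
        return -1;                      /* queue full; caller must wait */
    queue[nqueued].blkno = blkno;
    memcpy(queue[nqueued].data, buf, BLKSIZE);
    nqueued++;
    return 0;   /* acknowledged now -- the risk window opens here */
}

static int
by_blkno(const void *a, const void *b)
{
    const struct wreq *x = a, *y = b;
    return (x->blkno > y->blkno) - (x->blkno < y->blkno);
}

/* runs some time later; a crash before this point loses every queued write */
void
flush_queue(void)
{
    int i;

    /* sorting for seek order can put a superblock update on the media
       before the data blocks it describes */
    qsort(queue, nqueued, sizeof(queue[0]), by_blkno);
    for (i = 0; i < nqueued; i++)
        device_write(queue[i].blkno, queue[i].data);
    nqueued = 0;
}

int
main(void)
{
    char blk[BLKSIZE] = { 0 };

    cached_write(100, blk);     /* data block, queued first */
    cached_write(2, blk);       /* superblock-style metadata, queued second */
    flush_queue();              /* after sorting, block 2 hits the media first */
    return 0;
}
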
Now for read commands, the kernel (these days) will do read-ahead and caching
as well.  It is very difficult for the controller to have a significant
impact on read performance unless you are doing raw I/O, where the kernel
cache is bypassed; the command overhead penalty negates any real performance
gain for reads.  Of course, there are applications that use raw I/O
exclusively, but these will be moving to the block device in the near future
due to the problems associated with doing raw I/O in a demand-paged
environment.  The data area in user memory must be locked in core for the
duration of the I/O and cannot be swapped out.  In a multiuser application,
this can have devastating effects.
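
For what it's worth, the raw-versus-block distinction looks like this from
user code.  The device names below are examples only and vary from one UNIX
flavor to another; the point is that the character ("raw") node bypasses the
kernel buffer cache and requires the user buffer to stay locked in core for
the duration of the transfer, while the block node goes through the cache and
gets the kernel's read-ahead.

#include <fcntl.h>
#include <unistd.h>

#define BLKSIZE 1024

int
main(void)
{
    char buf[BLKSIZE];
    int raw, blk;

    raw = open("/dev/rsd0a", O_RDONLY);     /* raw: cache bypassed, buffer locked */
    blk = open("/dev/sd0a", O_RDONLY);      /* block: goes through the buffer cache */

    if (raw >= 0) {
        read(raw, buf, sizeof buf);         /* raw reads usually must be */
        close(raw);                         /* sector-aligned and sized  */
    }
    if (blk >= 0) {
        read(blk, buf, sizeof buf);         /* cached; kernel read-ahead applies */
        close(blk);
    }
    return 0;
}
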
I am not saying caching is not effective, but I do question the
cost/performance factors, as well as the data integrity.  I know I wouldn't
use a caching controller with write-back unless I had a good BPS.

			Roy Neese
			Adaptec Senior SCSI Applications Engineer
			UUCP @  neese@adaptex
				uunet!cs.utexas.edu!utacfd!merch!adaptex!neese