howardl@wb3ffv.ampr.org ( WB3FFV) (08/25/89)
From article <121@mdi386.UUCP>, by bruce@mdi386.UUCP (Bruce A. McIntyre):
> 
> The Caching disk controllers from Distributed Processing Technology would
> be an excellent choice for your application.  With one of these, the
> average access time of your drives would approach .5 ms (not 5, but .5)
> and with the cache being able to be set up to 4MB, it would turn that
> 300Mb drive into a real screamer.
> 
> They also offer an add on board for disk mirroring, and can support
> any variation of ESDI, SCSI or MFM drive.  Since the definition of the
> drive to the PC can be redefined from the actual physical drive config,
> with the controller doing the translation, you can use virtually any drive.
> 
> The controller also does the error correction and bad tracking...

Hello All,

I just purchased one of the DPT controllers for use on a 15 Mbps ESDI drive
here in the office.  I would be very interested to see if anybody else on
the net has been using the DPT controller, and under what configuration.
Prior to using the DPT product I had the Adaptec 2322B ESDI controller, but
needed a 15 Mbps (not 10) controller for the new 780 megabyte hard disk.  I
decided to give an intelligent hard disk controller a shot, and so far it
appears to be working very well.  At present I am using the basic
configuration of 512K of cache/buffer on the controller, but would like to
hear of others' experiences with the board.  I would also be interested to
see if anybody increased the onboard RAM, and how much improvement the
increased RAM gave you.  I believe the price for the controller is fair,
but the cost to add RAM adds up real quick.  So as the story goes, I
wouldn't buy the additional RAM unless the cost was justified...

-------------------------------------------------------------------------------
Internet  : howardl@wb3ffv.ampr.org  |  Howard D. Leadmon
UUCP      : wb3ffv!howardl           |  Fast Computer Service, Inc.
TELEX     : 152252474                |  P. O. Box 171
Telephone : (301)-335-2206           |  Chase, MD  21027-0171
cpcahil@virtech.UUCP (Conor P. Cahill) (08/26/89)
In article <1474@wb3ffv.ampr.org>, howardl@wb3ffv.ampr.org ( WB3FFV) writes:
> From article <121@mdi386.UUCP>, by bruce@mdi386.UUCP (Bruce A. McIntyre):
> >
> > The Caching disk controllers from Distributed Processing Technology would
> > be an excellent choice for your application.  With one of these, the
> > average access time of your drives would approach .5 ms (not 5, but .5)
> > and with the cache being able to be set up to 4MB, it would turn that
> > 300Mb drive into a real screamer.

You can add up to 12 meg to the DPT controller (totally separate from the
standard memory).  But you lose one of the slots for one of the daughter
cards (the first one).  I wasn't too happy about this since I have all
eight slots filled and would like to have a little breathing room.

> > They also offer an add on board for disk mirroring, and can support
> > any variation of ESDI, SCSI or MFM drive.  Since the definition of the
> > drive to the PC can be redefined from the actual physical drive config,
> > with the controller doing the translation, you can use virtually any drive.

Note that the disk will appear as an ST506 to the rest of the system, which
makes integration real easy.

> > The controller also does the error correction and bad tracking...
>
> I just purchased one of the DPT controllers for use on a 15 Mbps ESDI drive
> here in the office.  I would be very interested to see if anybody else on
> the net has been using the DPT controller, and under what configuration.

I have a DPT controller with 2.5 meg of cache memory and a 650 meg
(formatted) ESDI drive.

> Prior to using the DPT product I had the Adaptec 2322B ESDI controller,
> but needed a 15 Mbps (not 10) controller for the new 780 megabyte hard
> disk. [...] I would also be interested to see if anybody increased the
> onboard RAM, and how much improvement the increased RAM gave you.

I never ran the system without the 2 meg daughter card, so I can't tell you
what the difference would be.  I am very pleased with the disk system.  I
really appreciate the fact that uncompresses of 100/200k files immediately
return the prompt (the first few times I thought something was wrong, but I
easily got used to it).

+-----------------------------------------------------------------------+
| Conor P. Cahill        uunet!virtech!cpcahil          703-430-9247    |
| Virtual Technologies Inc.,  P. O. Box 876,  Sterling, VA 22170        |
+-----------------------------------------------------------------------+
brad@looking.on.ca (Brad Templeton) (08/26/89)
While cylinder or track caching is an eminently sensible idea that I
have been waiting to see for a long time, what is the point in the
controller or drive storing more than that?

Surely it makes more sense for the OS to do all other cache duties.
Why put the 512K in your drive when you can put it in your system and
bump your cache there?  Other than the CPU overhead of maintaining the
cache within the OS, I mean.  I would assume the benefit from having
the cache maintained by software that knows a bit about what's going
on would outweigh this.
-- 
Brad Templeton, Looking Glass Software Ltd.  --  Waterloo, Ontario 519/884-7473
cpcahil@virtech.UUCP (Conor P. Cahill) (08/26/89)
In article <4843@looking.on.ca>, brad@looking.on.ca (Brad Templeton) writes:
> Surely it makes more sense for the OS to do all other cache duties.
> Why put the 512K in your drive when you can put it in your system and
> bump your cache there?  Other than the CPU overhead of maintaining the
> cache within the OS, I mean.  I would assume the benefit from having
> the cache maintained by software that knows a bit about what's going
> on would outweigh this.

A big benefit is that the memory used for disk caching is not part of the
standard system memory; therefore my 33MHz 386 can have 16 meg of memory
and my disk controller can have 512K-12 meg of disk cache.

Another point would be raw devices, which are not buffered by the kernel
but are cached by the disk controller.

+-----------------------------------------------------------------------+
| Conor P. Cahill        uunet!virtech!cpcahil          703-430-9247    |
| Virtual Technologies Inc.,  P. O. Box 876,  Sterling, VA 22170        |
+-----------------------------------------------------------------------+
pcg@thor.cs.aber.ac.uk (Piercarlo Grandi) (08/27/89)
In article <1076@virtech.UUCP> cpcahil@virtech.UUCP (Conor P. Cahill) writes:

> In article <4843@looking.on.ca>, brad@looking.on.ca (Brad Templeton) writes:
> > Surely it makes more sense for the OS to do all other cache duties.

Yes, yes.  There are figures on this, by AJ Smith.  Some issue of TOCS.

> > Why put the 512K in your drive when you can put it in your system and
> > bump your cache there?
>
> Another point would be raw devices which are not buffered by the kernel
> but are cached by disk controller.

Please think again and again about this.  I am not entirely sure (let's try
understatement for a change :-]) that this is a good idea rather than a
problem.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
mustard@sdrc.UUCP (Sandy Mustard) (08/28/89)
See Summary Line. Inquiring minds want to know.
plocher%sally@Sun.COM (John Plocher) (08/30/89)
In article <4843@looking.on.ca>, brad@looking.on.ca (Brad Templeton) writes:
> Surely it makes more sense for the OS to do all other cache duties.
> Why put the 512K in your drive when you can put it in your system and
> bump your cache there?  Other than the CPU overhead of maintaining the

We have here the timeless tradeoff between software and hardware.

The old proven method of doing I/O is to use the main processor to shove
the bits all over the place.  e.g., MAC Video, Apple ][ floppy, IBM Serial
I/O, IBM hard disk, Sun Monochrome BW2 video, IBM VGA/EGA/CGA/MDA/Herc
graphics...  This works well for a minimal system, but the high performance
systems have all migrated to the "add CPUs/smarts to the I/O system" camp.
Examples here include Digiboard Com 8/i serial boards, TI/Intel graphics
chips, the Adaptec SCSI host adapter, and the above mentioned DPT hard disk
controller.

The DPT gives you several things besides the basic caching that an OS would
find hard to do simply:

  Host I/O works in parallel with controller/disk reads and writes
  Sector caching instead of track or cyl caching (finer granularity)
  8 sector automatic read ahead w/o missing a rotation
     - Data is returned to host as soon as available, not after all
       8 sectors have been read in
  Read ahead is pre-empted by a cache miss from host
  Ordered write back - Writes are cached and elevator sorted by the
     controller (a sketch follows this article)
     250ms latency between host "write" and start of dirty-sector write back
     All writes "succeed" within 0.5ms, actual write may delay ~250ms
     Burst write back if more than 50% of the cache is dirty
  Geometry spoofing (you tell the controller what the actual geometry is,
     and what your OS can handle (DOS can not access > 1024 cyls, ...), and
     the controller maps between them transparently)

In general, pushing all this off onto the controller is a win because it
simplifies the OS design and results in less main processor overhead to
handle I/O to the disk.

      -John Plocher
       A very happy owner of a DPT PM3011 RLL controller
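[To make the ordered write back above concrete, here is a minimal C sketch
of an elevator-sorted write-back queue using the 250ms delay and 50% burst
threshold from the list above.  Illustrative only -- this is not DPT's
firmware, and all names, structures, and policy details are invented.]

/*
 * Sketch of an elevator-sorted write-back queue.  Host writes "succeed"
 * immediately; dirty sectors are held, sorted by cylinder, and written
 * back after a delay -- or in a burst if the cache is half dirty.
 */
#include <stdio.h>
#include <stdlib.h>

#define CACHE_SLOTS   64
#define WB_DELAY_MS  250     /* latency before dirty-sector write back */
#define WB_BURST_PCT  50     /* burst threshold: % of cache dirty      */

struct dirty {
    long cyl;                /* physical cylinder of the sector  */
    long dirtied_ms;         /* time the host write "succeeded"  */
    struct dirty *next;      /* list kept sorted by cylinder     */
};

static struct dirty *queue;
static int ndirty;

/* Host write: insert in cylinder order and return at once. */
static void host_write(long cyl, long now_ms)
{
    struct dirty **pp = &queue;
    struct dirty *d = malloc(sizeof *d);

    d->cyl = cyl;
    d->dirtied_ms = now_ms;
    while (*pp && (*pp)->cyl < cyl)
        pp = &(*pp)->next;
    d->next = *pp;
    *pp = d;
    ndirty++;
}

/* One elevator sweep: write sectors in ascending cylinder order, but
 * only those past the write-back delay -- unless more than half the
 * cache is dirty, in which case write everything. */
static void writeback_sweep(long now_ms)
{
    int burst = ndirty * 100 > CACHE_SLOTS * WB_BURST_PCT;
    struct dirty **pp = &queue;

    while (*pp) {
        struct dirty *d = *pp;
        if (burst || now_ms - d->dirtied_ms >= WB_DELAY_MS) {
            printf("write cyl %ld\n", d->cyl);
            *pp = d->next;
            free(d);
            ndirty--;
        } else
            pp = &d->next;
    }
}

int main(void)
{
    host_write(70, 0);
    host_write(12, 0);
    host_write(33, 5);
    writeback_sweep(300);    /* all three go out, in 12, 33, 70 order */
    return 0;
}

[The point of the cylinder sort is that one sweep of the head covers every
pending write, and the host never waits for any of it.]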
clewis@eci386.UUCP (08/30/89)
In article <4843@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>While cylinder or track caching is an eminently sensible idea that I
>have been waiting to see for a long time, what is the point in the
>controller or drive storing more than that?
>
>Surely it makes more sense for the OS to do all other cache duties.
>Why put the 512K in your drive when you can put it in your system and
>bump your cache there?  Other than the CPU overhead of maintaining the
>cache within the OS, I mean.  I would assume the benefit from having
>the cache maintained by software that knows a bit about what's going
>on would outweigh this.

I've had quite a bit of exposure to the DPT caching disk controllers, so
I'll outline some of the interesting points.  Some of these pertain to DPT
generally, some only to the models I was playing with (ESDI and ST506 disk
interface versions with SCSI host interface), and some more generally.

1) Write-after caching: most systems do their swapping and/or paging raw.
   Thus they must *wait* for a write operation to complete before reusing
   the memory, e.g. an average of 28 ms with an ST506 drive.  With
   write-after, you can reuse memory in .5 ms no matter how slow your drive
   is (unless the cache really fills).  I installed one of these suckers on
   a Tower 600 with 4Mb running Oracle.  We were able to immediately double
   the number of users using Oracle (from 4 to 8 simultaneous actions, with
   considerably better response for all 8.  Oracle 4.1.4 is a pig!  So was
   the host adapter at the time - 3-6 ms to transfer 512 bytes!).  A look
   at the controller statistics showed that the system was swapping like
   mad, but virtually *no* physical disk I/O's actually occurred.  E.g.,
   blocks were being read back so fast that the controller never needed to
   write them out.  Of course, this can be similarly done by adding
   physical memory to the system; however, DPT memory is cheaper than
   Tower memory...

2) Host memory limitations - how does 16Mb of main memory almost
   exclusively for use by programs, plus 12Mb of buffer cache, strike you?
   (AT-style system limitations.)  Otherwise there's lots of tricky
   trade-offs.  On the other hand, when faced with lots of physical memory
   on the host, it makes far more sense to use it for program memory than
   a RAM swap disk.

3) If your kernel panics, the controller gets a chance to flush its
   buffers - handy, particularly if you make the kernel buffers small.  It
   was sort of scary to see, for the first time, a Tower 400 woof its
   cookies (so I'm not a perfect device driver writer ;-) and see the disk
   stay active for another 30 seconds...

4) If you have a power failure, having the cache on the controller is a
   bad idea, because the kernel does make some assumptions about the order
   in which I/O occurs.  With the models I was using it made economic
   sense to place a UPS only on the controller and disk subsystem.  I
   don't know whether this is possible on the AT versions, but on the AT
   versions it's cheaper to get a whole-system UPS.

5) DPT read-ahead can be cancelled by subsequent read requests.

6) The DPT's algorithms (e.g. replacement policy, lock regions, write-after
   delay times, dirty buffer high-water, cache allocation amongst multiple
   drives, etc.) can be tuned.  Most kernels cannot be, much.

7) Now we get into the hazy stuff - I'm convinced from the testing that I
   did with the DPT lashups I built, plus experience inside other kernels,
   that the DPT has far better caching than most UNIX kernels.
Generally speaking, except for look-ahead (which the DPT supports as well),
kernels take no special knowledge of the disk *other* than the inherent
efficiency of the file system layout (e.g. Fast File System structures) and
free list sorting (dump/mkfs/restore, anyone?).  For example, except for
free-list sorting and other mkfs-style tuning, fio.c and bio.c (the file
I/O and block I/O portions of the kernel) don't know diddly squat about the
real disk.  Whereas the DPT knows it intimately - sectors per track,
rotational latency, etc.  The DPT uses the elevator algorithm and
apparently a better LRU (page replacement) algorithm, has sector and
cylinder skewing, and so on.  Unfortunately, I no longer have a copy of the
report.

Further, most of the measurements I was making were reasonably
representative technical measures of performance, but don't give an
overall feel for performance.  However, one that I remember may be of
interest - kernel relinks on the Tower usually took close to 3 minutes.
With the DPT, it went to a little over 2 minutes.  Big hairy deal...
However, further examination of "time" results showed that the I/O
component *completely* disappeared.  Like wow.  Some other simple
benchmarks showed overall performance increases of up to a factor of 15!

The only way to make the DPT system work better would be to make some
major deals with fio.c/bio.c and a couple of minor mods to the DPT.  For
example, multiple lower-priority look-ahead threads based upon file block
ordering.  Explicitly cancellable I/O's or look-aheads.  More, but I
forget now.

The DPT also has some other niceties: automatic bad-block sparing, single
command format/bad blocking, statistics retrieval, and in my case,
compatibility with dumb SCSI controllers except for the additional
features - the NCR Tower SCSI driver has this neat "issue this chunk of
memory as a SCSI command and give me the result" ioctl.  Neat stuff, the
DPT.  [No, I don't work for, nor have I ever worked for DPT.  Hi Tom!]
-- 
Chris Lewis, R.H. Lathwell & Associates: Elegant Communications Inc.
UUCP: {uunet!mnetor, utcsri!utzoo}!lsuc!eci386!clewis
Phone: (416)-595-5425
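[Since the LRU replacement Chris mentions is the heart of any of these
caches, a minimal sketch in C may help.  Purely illustrative, with invented
names -- the DPT's actual policy is tunable and presumably cleverer.]

/*
 * Minimal LRU buffer-cache sketch.  A hit moves the buffer to the head
 * of the list; a miss recycles the tail, i.e. the least recently used.
 */
#include <stdio.h>

#define NBUF 4

struct buf {
    long block;                 /* disk block held, -1 if empty */
    struct buf *prev, *next;
};

static struct buf pool[NBUF];
static struct buf *head, *tail;

static void init(void)
{
    int i;
    for (i = 0; i < NBUF; i++) {
        pool[i].block = -1;
        pool[i].prev = i ? &pool[i-1] : 0;
        pool[i].next = i < NBUF-1 ? &pool[i+1] : 0;
    }
    head = &pool[0];
    tail = &pool[NBUF-1];
}

static void to_head(struct buf *b)
{
    if (b == head)
        return;
    b->prev->next = b->next;            /* unlink */
    if (b->next)
        b->next->prev = b->prev;
    else
        tail = b->prev;
    b->prev = 0;                        /* relink at head */
    b->next = head;
    head->prev = b;
    head = b;
}

static struct buf *getblk(long block)
{
    struct buf *b;
    for (b = head; b; b = b->next)
        if (b->block == block) {        /* cache hit */
            to_head(b);
            return b;
        }
    b = tail;                           /* miss: evict the LRU buffer */
    printf("miss: block %ld replaces %ld\n", block, b->block);
    b->block = block;                   /* (a real cache reads the disk here) */
    to_head(b);
    return b;
}

int main(void)
{
    long refs[] = { 1, 2, 3, 1, 4, 5 };
    int i;
    init();
    for (i = 0; i < 6; i++)
        getblk(refs[i]);
    return 0;
}

[A real kernel buffer cache (bio.c) adds hash chains to avoid the linear
search, plus locking, but the replacement idea is the same.]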
vjs@calcite.UUCP (Vernon Schryver) (08/31/89)
In article <123922@sun.Eng.Sun.COM>, plocher%sally@Sun.COM (John Plocher) writes:
> We have here the timeless tradeoff between software and hardware....
> This works well for a minimal system, but the high performance systems all
> have migrated to the "add CPUs/smarts to the I/O system" camp.  Examples
> here include Digiboard Com 8/i serial boards, TI/Intel graphics chips,
> Adaptec SCSI host adapter, and the above mentioned DPT hard disk controller.

Extrapolations from the PC-AT corner of the world to the universe are
hazardous.  As has been true for the >20 years of my experience, the
current trend could also be said to be in the dumb direction.  Both trends
are always present.  The UNIX workstations which are delivering ~1MByte/sec
TCP/ethernet have dumb hardware.  More than one UNIX vendor is unifying the
UNIX buffer pool and page cache.  Some people would say the high end of
graphics performance is recently showing a lot more dumb hardware/smart
software.  In the low end, replacing CPU code or a hardware raster-op
engine with a DSP is not really an increase in controller intelligence.
You don't need a CPU or even DMA for USARTs below T1 speeds, let alone
UARTs; all you need are reasonable FIFO's and/or reasonable care for
interrupt latency in the rest of the system.  Recall that the crazy UNIX
tty code is the descendant of smart hardware for doing line disciplines.

> ...[various advantages of doing disk caching in hardware]...
> Ordered write back - Writes are cached and elevator sorted by the controller

This could be considered a bug.  If you are shooting for file system
reliability, the file system must be able to specify the order of some
writes.  For example, in UNIX style file systems it is better to write the
data blocks before a new inode, and the new inode before the file directory
data blocks containing the new name.  Any other order produces more chaos
if a crash occurs in the middle of whatever sequence you choose.  Perhaps
considering the typical reliable database with "commit" operations would be
more convincing.  It is absolutely wrong for the controller to re-order the
writes of such a database manager.

> In general, pushing all this off onto the controller is a win because it
> simplifies the OS design and results in less main processor overhead to
> handle I/O to the disk.
>      -John Plocher

Putting more smarts in the controller is good, if the smarts cannot be used
for anything else (e.g. faster and longer-burst ECC, other strange error
recovery stuff, control loops to improve tracking, logic for lasers), if
there is no important information that the smarts cannot reach (e.g.
results of things like sync(2) or bdflush or binval()/btoss()/... in the
UNIX buffer handling), and it is not possible to use the smarts of the
operating system (e.g. no one with the time and source to improve the UNIX
buffer code).

I know no one I respect who is in "OS design".  The good people are in
"system design", which requires trying to put things where they will do the
most good for the entire system, without regard for artificial distinctions
derived from organization charts, such as the file system department and
the controller group and the application division.  Of course politics,
budgets, and schedules often warp implementations of good designs, but the
"right" place to put intelligence is not a simple matter of dogma.

Vernon Schryver
vjs@sgi.com
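[Vernon's ordering point can be made concrete.  Below is a minimal C sketch
of a commit done right: the data is forced to stable storage before the
record that points at it.  It assumes an fsync(2) that really reaches the
platter; the file names are invented.  A controller that elevator-sorts the
two writes behind the OS's back defeats exactly this guarantee.]

/*
 * Commit-ordering sketch: write the data, force it stable, and only
 * then write the commit record that makes the data "official".
 */
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

static void must(int ok, const char *what)
{
    if (!ok) { perror(what); exit(1); }
}

int main(void)
{
    const char data[] = "the transaction's data blocks";
    const char commit[] = "commit: data is valid\n";
    int fd;

    /* 1. Write the data itself... */
    fd = open("datafile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    must(fd >= 0, "open datafile");
    must(write(fd, data, sizeof data) == sizeof data, "write data");

    /* 2. ...force it to stable storage BEFORE the commit record... */
    must(fsync(fd) == 0, "fsync data");
    must(close(fd) == 0, "close data");

    /* 3. ...and only then write the record that points at it. */
    fd = open("logfile", O_WRONLY | O_CREAT | O_APPEND, 0644);
    must(fd >= 0, "open logfile");
    must(write(fd, commit, strlen(commit)) == (int)strlen(commit),
         "write commit");
    must(fsync(fd) == 0, "fsync commit");
    must(close(fd) == 0, "close log");

    puts("committed");
    return 0;
}

[If a crash hits between steps 2 and 3, the worst case is an orphaned data
file; if a cache reverses the two writes, the worst case is a commit record
pointing at garbage.]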
karl@ddsw1.MCS.COM (Karl Denninger) (09/02/89)
In article <4843@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>While cylinder or track caching is an eminently sensible idea that I
>have been waiting to see for a long time, what is the point in the
>controller or drive storing more than that?
>
>Surely it makes more sense for the OS to do all other cache duties.
>Why put the 512K in your drive when you can put it in your system and
>bump your cache there?

Well, there are lots of "plus"es, and one bad point.

First, the good:

o  The controller can be intimately familiar with the layout of the disk
   itself.  Since it usually is the device that formatted the disk, it can
   take into account sector skew, interleave, and other factors that are
   quite beyond the OS's control.  For instance, if the controller knows
   that it takes a sector's worth of time to switch heads, it can skew the
   cylinder format so that when going from head 1 to head 2, it doesn't
   "miss" a rotation.  Drivers usually aren't nearly this familiar with
   disk geometry.  (A sketch of this skew appears after this article.)

o  The controller can cache "raw" writes and reads, something that the
   Unix system does not do.  This means that swap & page operations can be
   nearly instantaneous in many cases, unless you completely fill the
   cache on the controller.  It can also re-order writes back to the disk
   itself as it sees fit, making it possible to highly optimize the write
   order (and thus decrease the physical transfer time).

o  The controller can, if desired, physically mirror two disks, reducing
   the probability of failure.

o  In many cases the data is read back from the cache before it is even
   physically written to the disk!  This is particularly true of page and
   swap areas.  I have seen a quantum leap in performance from this effect
   alone.

o  The on-board processor takes the load of the transfer, ordering, and
   cache search instead of your physical processor.  This gives your main
   processor more time to do work for the users.

o  You can retune for a minimal number of buffers, giving even more RAM to
   the users.  This reduces the page fault rate and thus further increases
   performance.

o  You can have more than 16MB in the system without trouble, and use
   nearly none of your "base" 16MB for disk cache.  Again, a potentially
   big win on loaded systems.

And the bad point:

o  If the power fails and you do not have a UPS, you can get royally
   screwed.  Since the RAM onboard is not NV, quite a bit of data can be
   lost if the power goes off.  Yuck.  So buy a UPS (if you can afford the
   cache controller, you can afford the UPS.  Trust me).

Some people have pointed out potential pitfalls in the setup when the
system crashes or the like, or with databases.  These DO NOT EXIST.  The
reason is simple -- the processor on the DPT board is independent from the
main processor.  If the system crashes, it crashes -- so what?  The DPT
board will finish writing everything in the cache before it stops!  The
same holds true for a DBMS application.  The only way you can lose is if
the power goes off, and that can be protected against for less than the
cache controller board costs!

In short, if you need the performance, and can afford them AND a UPS,
these cards are wonderful.  I highly recommend them for those people who
want near-mainframe disk performance.  We've checked these out, and they
win hands-down over simply "bumping" the cache size in the OS by an
equivalent amount.
Get a 2MB expansion cache board for Unix if you can afford it -- the 512K
is very, very nice, but the 2MB version has to be seen to be believed!

Disclaimer:  We handle the DPT boards, as well as systems and the standard
controllers/drives.
-- 
Karl Denninger (karl@ddsw1.MCS.COM, <well-connected>!ddsw1!karl)
Public Access Data Line: [+1 312 566-8911], Voice: [+1 312 566-8910]
Macro Computer Solutions, Inc.  "Quality Solutions at a Fair Price"
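[The head-switch skew in Karl's first point, sketched in C as promised.
The geometry numbers are invented; a real controller derives the skew from
its measured head-switch time.  This only shows the logical-to-physical
sector mapping.]

/*
 * Head-switch skew sketch: offset the logical start of each track by
 * enough sectors that, by the time a head switch completes, the next
 * logical sector is just arriving under the new head.
 */
#include <stdio.h>

#define SPT    36   /* sectors per track              */
#define HEADS   8   /* heads per cylinder             */
#define SKEW    2   /* sectors lost to a head switch  */

/* Physical sector at which logical sector 0 of a track starts. */
static int track_offset(int head)
{
    return (head * SKEW) % SPT;
}

static int phys_sector(int head, int logical)
{
    return (track_offset(head) + logical) % SPT;
}

int main(void)
{
    int h;

    /* Reading sequentially across a cylinder: the last sector of head h
     * and the first sector of head h+1 are SKEW sectors apart, so the
     * switch costs SKEW sector times instead of a full rotation. */
    for (h = 0; h < HEADS; h++)
        printf("head %d: logical sector 0 lives at physical sector %2d\n",
               h, phys_sector(h, 0));
    return 0;
}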
vjs@calcite.UUCP (Vernon Schryver) (09/03/89)
In article <1989Sep2.023511.24943@ddsw1.MCS.COM>, karl@ddsw1.MCS.COM (Karl Denninger) writes:
> ...
> o  The controller can be intimately familiar with the layout of the
>    disk itself....

All of the advantages of having a smart controller are in principle
available to the operating system.  In the dark past, before UNIX, file
systems paid attention to the geometry of the disk and even current head
positions when allocating blocks out of the bit map.  Some UNIX file
systems try to do a little of the same -- consider cylinder groups.
What's more, if a file system allocates the file optimally during the
write, reads get faster.  (See the ancient Berkeley Project Genie papers.)
This is not possible if the controller is too smart.  (Yes, I know SCSI
drives are often too smart, but they're just as uppity for this wonderful
controller.)

> o  The controller can cache "raw" writes and reads, something that the
>    Unix system does not do.

Not true.  If there were a law requiring raw I/O to not use the buffer
cache, then some UNIX kernels would be breaking it.

> o  The controller can, if desired, physically mirror two disks, reducing
>    the probability of failure.

Even better is to have the file system provide the correct rather than
blanket redundancy.

> o  In many cases the data is read back from the cache before it is even
>    physically written to the disk!

Just like any cache scheme, no matter what level.

> o  The on-board processor takes the load of the transfer, ordering, and
>    cache search instead of your physical processor [... giving more
>    cycles to users].

As others elsewhere continue to argue, given fixed and finite dollars
(i.e. price is an object), it is better to have two general purpose CPU's
than a mixture of specialized ones.  An expression of this fact is the
greater respect "symmetric multiprocessor UNIX'es" receive compared to
"master-slave" or other modes.  Since most machines are sometimes CPU
bound and not doing any I/O, a machine with two *86's working on user CPU
cycles will be faster than a machine with one *86 for users and one asleep
on the disk.

Imagine a dual processor, with a kernel process running the code now on
the DPT board and the old disk driver talking to this daemon.  Put all of
the RAM in a common pool, but let the DPT-emulator grab as much as it
wants.  The dual processor would cost about the same, would never be
slower, and would usually make the smart controller, single processor look
ridiculously slow.  (Obviously, the AT bus would be the wrong choice.)

> o  You can retune for a minimal number of buffers, giving even more RAM
>    to the users.  This reduces the page fault rate and thus further
>    increases performance.

Giving the same amount of RAM to the buffer cache instead of to the
controller would have the same effect.

> o  You can have more than 16MB in the system without trouble, and use
>    nearly none of your "base" 16MB for disk cache....

Agreed.  On PC-AT/386 clones, this controller sounds like a neat idea.
Such a machine is not a "real computer."  (Never mind the vast (for me)
amount of my own $ I've spent on this one.)

> o  If the power fails and you do not have a UPS, you can get royally
>    screwed.  Since the RAM onboard is not NV, quite a bit of data can
>    be lost if the power goes off....
>
> Some people have pointed out potential pitfalls in the setup when the
> system crashes or the like, or with databases.  These DO NOT EXIST.  The
> reason is simple -- the processor on the DPT board is independent from
> the main processor.  If the system crashes, it crashes -- so what?
> The DPT board will finish writing everything in the cache before it
> stops! ...

A purist might disagree with you.  "Commit" means "commit to stable
storage."  Does the DPT board have "proven correct" hardware and firmware?
Can it not crash like any other system?  However, being impure, I'll agree
if you say RAM+battery or RAM+capacitor+nonvolatile storage is stable
storage, whether it is made by IBM, Storage Tech., Legato, or DPT.

> Disclaimer:  We handle the DPT boards, as well as systems and the standard
> controllers/drives.
> Karl Denninger (karl@ddsw1.MCS.COM, <well-connected>!ddsw1!karl)

Oh.  Now I understand your enthusiasm.  I trust you also sell UPS's.

Vernon Schryver
vjs@calcite.uucp    ...{pyramid,sgi}!calcite!vjs
cpcahil@virtech.UUCP (Conor P. Cahill) (09/04/89)
In article <68@calcite.UUCP>, vjs@calcite.UUCP (Vernon Schryver) writes:
> In article <1989Sep2.023511.24943@ddsw1.MCS.COM>, karl@ddsw1.MCS.COM (Karl Denninger) writes:
> > o  The controller can be intimately familiar with the layout of the
> >    disk itself....
>
> All of the advantages of having a smart controller are in principle
> available to the operating system.

Yes, and they all steal cpu cycles from the user processes.  I paid a lot
for my heavy duty cpu, and I would rather have it managing my windows and
compiling my programs than figuring out how to talk to my disk.

> > o  The controller can cache "raw" writes and reads, something that the
> >    Unix system does not do.
>
> Not true.  If there were a law requiring raw I/O to not use the buffer
> cache, then some UNIX kernels would be breaking it.

Raw device support is not provided by the kernel, it is provided by the
device driver.  Sometimes it makes sense for the raw interface and the
block interface to both pass through the buffer cache, but for disk i/o
device drivers *should* not pass the raw i/o through the cache.  If they
did, fsck would have lots of trouble (especially when it told you to
reboot-no sync).

> > o  The controller can, if desired, physically mirror two disks,
> >    reducing the probability of failure.
>
> Even better is to have the file system provide the correct rather than
> blanket redundancy.

I'm not too sure what you mean by correct rather than blanket redundancy,
but the disk mirroring available will automatically fix/remap bad sectors
on both the master and/or mirror disk without any intervention on the part
of the user.  This is a very positive element for systems that will be
placed in service in some remote region and left running for extended
periods of time without any local interaction.

> As others elsewhere continue to argue, given fixed and finite dollars
> (i.e. price is an object), it is better to have two general purpose
> CPU's than a mixture of specialized ones.  An expression of this fact is
> the greater respect "symmetric multiprocessor UNIX'es" receive compared
> to "master-slave" or other modes.  Since most machines are sometimes CPU
> bound and not doing any I/O, a machine with two *86's working on user
> CPU cycles will be faster than a machine with one *86 for users and one
> asleep on the disk.

You are comparing apples and oranges.  The dual/multi processor
environments were always set up with multiples of the same processor.  The
master/slave setup was pretty bad (at least on the one such setup I am
familiar with) because only the master processor could perform most system
calls.

However, in the scenario where a *MUCH CHEAPER* processor can be used to
offload some of the system i/o requirements, this does not come into
question.  Most minis and mainframes use this mechanism to leave the cpu
doing what it does best.  In particular the DPT comes with a 68000
processor, which I *think* runs for about $8 each, and CAN handle all of
the requirements dealing with disk i/o.

This is the same argument that is applied to the intelligent serial card,
which for about $500-900 will boost the serial i/o support on your system
from maybe 2 19.2 lines to as much as 8 or 16.

+-----------------------------------------------------------------------+
| Conor P. Cahill        uunet!virtech!cpcahil          703-430-9247    |
| Virtual Technologies Inc.,  P. O. Box 876,  Sterling, VA 22170        |
+-----------------------------------------------------------------------+
karl@ddsw1.MCS.COM (Karl Denninger) (09/04/89)
In article <68@calcite.UUCP> vjs@calcite.UUCP (Vernon Schryver) writes:
>In article <1989Sep2.023511.24943@ddsw1.MCS.COM>, karl@ddsw1.MCS.COM (Karl Denninger) writes:
>> o  The controller can be intimately familiar with the layout of the
>>    disk itself....
>
>All of the advantages of having a smart controller are in principle
>available to the operating system.  In the dark past, before UNIX, file
>systems paid attention to the geometry of the disk and even current head
>positions when allocating blocks out of the bit map.  Some UNIX file
>systems try to do a little of the same -- consider cylinder groups.
>What's more, if a file system allocates the file optimally during the
>write, reads get faster.  (See the ancient Berkeley Project Genie papers.)
.....
>> o  The controller can cache "raw" writes and reads, something that the
>>    Unix system does not do.
>
>Not true.  If there were a law requiring raw I/O to not use the buffer
>cache, then some UNIX kernels would be breaking it.

So tell me what systems do your "smart optimization" based on disk
geometry and format layout.  Please name _one_ single '386 ISA Unix
system, or a derivative thereof, that even knows how to format (low level)
the drives.  Oh, I guess 386/ix does, but it is rather stupid(tm) about
it.  Even 386/ix, if you look at the installation notes, recommends that
you format from some other tool than the OS if you can.

As for raw I/O going through the buffer cache, please name one Unix, again
in the 80386 ISA environment, that does this.  Just one, please?  Then
find a Unix for me that has an excellent buffer cache management scheme.
Again, you may be looking for a while.  I haven't found one yet.  I've
checked 386/ix, Bell Tech, AT&T, and SCO ports.  All, in fact, either say
outright or allude to the fact that raw device access is unbuffered.  If a
standard is de facto, it is just as much a standard as if it was written
in stone.

>> o  The controller can, if desired, physically mirror two disks,
>>    reducing the probability of failure.
>
>Even better is to have the file system provide the correct rather than
>blanket redundancy.

Huh?  How can you provide redundancy against a head crash EXCEPT by
mirroring the disk?  Explain your scheme please.  No fair cheating and
using disk "striping" -- 8 disk drives for each pack (one per bit!).
We're talking about hardware that you can hang on a '386 ISA bus, not some
esoteric $2M system here!  The one system I saw do this in the driver
interface hit performance by > 50% over the base.  Yuck.  The DPT
performance hit is a few percent.

>> o  In many cases the data is read back from the cache before it is even
>>    physically written to the disk!
>
>Just like any cache scheme, no matter what level.

In the page/swap partition?  Show me a Unix for the '386 ISA systems which
does this.  Every one I have seen so far goes right out to a raw,
unbuffered, I/O device.

>> o  The on-board processor takes the load of the transfer, ordering, and
>>    cache search instead of your physical processor [... giving more
>>    cycles to users].
>
>As others elsewhere continue to argue, given fixed and finite dollars
>(i.e. price is an object), it is better to have two general purpose CPU's
>than a mixture of specialized ones.  An expression of this fact is the
>greater respect "symmetric multiprocessor UNIX'es" receive compared to
>"master-slave" or other modes.
>Since most machines are sometimes CPU bound and not doing any I/O, a
>machine with two *86's working on user CPU cycles will be faster than a
>machine with one *86 for users and one asleep on the disk.

What "symmetric multiprocessor" systems are you speaking of?  Is there
even a single such implementation that doesn't end up talking, ONE AT A
TIME, to the disk hardware?  Assume one (or two) disks here -- no fair
comparing the typical 80386 installation to a mainframe "disk farm".  If
you talk to the disk hardware one processor at a time you must be, in the
end analysis, SLOWER than the "one for the disk, one for the data" crowd,
as you now have to arbitrate disk I/O between processors as well as
physically move the data!

>Imagine a dual processor, with a kernel process running the code now on
>the DPT board and the old disk driver talking to this daemon.  Put all of
>the RAM in a common pool, but let the DPT-emulator grab as much as it
>wants.  The dual processor would cost about the same, would never be
>slower, and would usually make the smart controller, single processor
>look ridiculously slow.  (Obviously, the AT bus would be the wrong
>choice.)

Obviously you aren't working in the real world, and in addition are
posting about a utopia that isn't relevant to the purpose of THIS
newsgroup.  The people here, who read and post to this group, are working
with and talking about ISA 80386 machines.  That means AT bus.  Like it or
not, there are lots of compelling reasons for using the AT bus, and
companies such as DPT make it more than reasonable to do so.  Items like
multiple-sourcing, inexpensive price points, interchangeability of
operating environments (pick your Unix), etc.

>> o  You can retune for a minimal number of buffers, giving even more RAM
>>    to the users.  This reduces the page fault rate and thus further
>>    increases performance.
>
>Giving the same amount of RAM to the buffer cache instead of to the
>controller would have the same effect.

Only if your Unix knows how to use it intelligently.  None of the current
systems do.  Your argument sounds good, but doesn't cut it with today's OS
environments.  Again, you need to rethink within real-world constraints.

>> o  You can have more than 16MB in the system without trouble, and use
>>    nearly none of your "base" 16MB for disk cache....
>
>Agreed.  On PC-AT/386 clones, this controller sounds like a neat idea.
>Such a machine is not a "real computer."  (Never mind the vast (for me)
>amount of my own $ I've spent on this one.)

Such a machine IS a real computer.  It has 3-5X the power of a Vax 780,
which only a few years ago was quite a "real computer".  The 33 MHz
systems are royal screamers, and put many of the so-called "workstations"
to shame.  And they offer something NONE of the proprietary monsters do --
no hardware single-sourcing.  That alone is enough reason for many people
to buy into the technology.

>> o  If the power fails and you do not have a UPS, you can get royally
>>    screwed.  Since the RAM onboard is not NV, quite a bit of data can
>>    be lost if the power goes off....
>>
>> Some people have pointed out potential pitfalls in the setup when the
>> system crashes or the like, or with databases.  These DO NOT EXIST.
>> The reason is simple -- the processor on the DPT board is independent
>> from the main processor.  If the system crashes, it crashes -- so what?
>> The DPT board will finish writing everything in the cache before it
>> stops! ...
>
>A purist might disagree with you.  "Commit" means "commit to stable
>storage."
>Does the DPT board have "proven correct" hardware and firmware?  Can it
>not crash like any other system?  However, being impure, I'll agree if
>you say RAM+battery or RAM+capacitor+nonvolatile storage is stable
>storage, whether it is made by IBM, Storage Tech., Legato, or DPT.

Cute.  Now please differentiate between:

  o  A DPT board which crashes, losing the data in the cache.
  o  A drive which head-crashes, losing the data on the disk.
  o  A system which loses power, losing the data in the memory.

Oh, you can't tell the difference from the user perspective?  Wow.
Neither can I.  "Purists" can be damned; it's the user's perspective that
matters in every case.  When you say "commit" you can't possibly achieve
"commit to stable storage" because there is no such animal!  The best you
can do is "commit to MORE STABLE storage".  Is your disk "proven correct"?
For how many hours?  (What's its MTBF?)  I bet the DPT board has a higher
MTBF than your drive -- it almost has to, being that there are no moving
parts!

Besides, no one said anything about RAM + battery.  The point was made
that the DPT board is a bad investment if you don't also have a UPS.
GIVEN the UPS, however, the DPT is the best way to supercharge your 386
system that we have seen, bar none.  IF your system crashes or panics, as
they sometimes do, the DPT board will finish flushing the cache to the
drive(s) before it shuts down.  No data loss has been seen here beyond
what you would lose anyway with a "straight" system.

>> Disclaimer:  We handle the DPT boards, as well as systems and the
>> standard controllers/drives.
>
>Oh.  Now I understand your enthusiasm.  I trust you also sell UPS's.

We sell whatever the customer wants.  If someone comes to us and says "I
want a fast '386 based system, no holds barred", that is what we sell
them, yes.  We also sell "Standard" '386 systems, without these
controllers, and DEC gear, and lots of other stuff.  We are in the
business of solving people's problems, not pushing one board over another.
We, like most other ISV's, operate in a real world bounded by many
constraints -- most of them set upon us by our customer base.

There are a lot of people who won't consider DEC RISC machines, or Sun
SPARCstations, for the simple reason that they are single-sourced.  If you
look at those systems you'll also find rather disappointing I/O
performance compared to a DPT + ESDI disk + 33 MHz 386 machine.

TRY one of the DPT boards before you go off half-cocked berating them.
You might be (very pleasantly) surprised.  Then try to find one of your
utopian multi-processor symmetrical '386 systems, for the same price, that
beats the DPT + 33 MHz 80386 for both CPU and I/O performance in the real
world.  You'll miss on the price, performance, or (more likely) both.
-- 
Karl Denninger (karl@ddsw1.MCS.COM, <well-connected>!ddsw1!karl)
Public Access Data Line: [+1 312 566-8911], Voice: [+1 312 566-8910]
Macro Computer Solutions, Inc.  "Quality Solutions at a Fair Price"
zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (09/04/89)
>There are a lot of people who won't consider DEC RISC machines, or Sun
>SPARCstations, for the simple reason that they are single-sourced.  If you
>look at those systems you'll also find rather disappointing I/O
>performance compared to a DPT + ESDI disk + 33 MHz 386 machine.

I find the DEC RISC machines' disk i/o terrible, but the new Sun machines
are about the same as a '386 ESDI system with Interactive Unix.  Sun also
uses any available memory for disk caching (which works well).  A '386
SCSI system is faster.

Some rough results for raw throughput:

	DEC 3100	150
	'386 ESDI	522
	Sun SPARC/1	550
	'386 SCSI	677

-- 
Branch Technology  |  zeeff@b-tech.ann-arbor.mi.us  |  Ann Arbor, MI
vjs@calcite.UUCP (Vernon Schryver) (09/05/89)
In article <1989Sep4.024559.13279@ddsw1.MCS.COM>, karl@ddsw1.MCS.COM (Karl Denninger) writes:
> ...[a loud rebuttal]...
> Karl Denninger (karl@ddsw1.MCS.COM, <well-connected>!ddsw1!karl)

I got Mr. Denninger's dander up.  Be careful or thick skinned when not
complimenting what people sell.  He is obviously wrong and inconsistent on
several technical issues, but such technicalities are not germane to what
he is saying.

As I agreed in the article which got him excited, the DPT controller
sounds good for small, slow computers such as this 20MHz, 8MB clone, for
those of us without the money, time, talent, or source to fix the SV
kernel.  This part of my life qualifies, so I would happily accept one for
an extended evaluation.  However, the simple professional honesty required
in another part of my life compels me to say the DPT controller is
architecturally wrong.  That AT&T et al currently make it useful and
desirable is irrelevant.  There is no reason to lie to customers and say
it is more than a neat and cheap kludge around design flaws in some
versions of SVR3, or to claim those flaws are part of "UNIX."

In 1989, "fast" uniprocessor workstations are >=20 times a VAX 780, 3-5
times faster than 386 clones.  True, they use 88K's, SPARC's, or
MIPS-R3000's instead of 386's, and cost more than $2,500 (but <$25,000).
You can buy multiprocessor workstations well over 100 VUPS (1 VUP = 1 VAX
780).  At least one such SVR3 until recently mapped raw disk I/O to
cooked.  (Disagreement on that point from a PC VAR is boring to a hack
paid to work in that kernel.)  That we are now stuck with no more than a
half-dozen VUPS is no reason to get religious.

This is comp.unix.i386, not biz.att or biz.MacroComputerSolutions.  People
here sneer at the silliness of DOS extended memory.  Let's not permanently
wed an analogous kludge for what are hopefully temporary O/S bugs.

Vernon Schryver
vjs@calcite.uucp   or   ...!{sgi,pyramid}!calcite!vjs
vjs@calcite.UUCP (Vernon Schryver) (09/05/89)
In article <1125@virtech.UUCP>, cpcahil@virtech.UUCP (Conor P. Cahill) writes:
> Raw device support is not provided by the kernel, it is provided by the
> device driver.  Sometimes it makes sense for the raw interface and the
> block interface to both pass through the buffer cache, but for disk i/o
> device drivers *should* not pass the raw i/o through the cache.  If they
> did, fsck would have lots of trouble (especially when it told you to
> reboot-no sync).

Running fsck thru the buffer cache helps.  You don't suffer some of the
inconsistencies that otherwise occur.  Most of the need for the "reboot-no
sync" message goes away.  (BTW, in SVR3 for separate reasons, fsck does
not say that; it automatically re-mounts root.)  As long as the character
device reads & writes the correct bytes quickly, what does it matter to a
user process such as fsck how the driver (which many consider part of the
SVR3 kernel, even if it need not be in MACH) does its thing?

Vernon Schryver
vjs@calcite.uucp   or   ...{sgi,pyramid}!calcite!vjs
cpcahil@virtech.UUCP (Conor P. Cahill) (09/05/89)
In article <72@calcite.UUCP>, vjs@calcite.UUCP (Vernon Schryver) writes:
> Running fsck thru the buffer cache helps.  You don't suffer some of the
> inconsistencies that otherwise occur.  Most of the need for the
> "reboot-no sync" message goes away.  (BTW, in SVR3 for separate reasons,
> fsck does not say that; it automatically re-mounts root.)  As long as
> the character device reads & writes the correct bytes quickly, what does
> it matter to a user process such as fsck how the driver (which many
> consider part of the SVR3 kernel, even if it need not be in MACH) does
> its thing?

There is a big difference for programs that wish to read lots of data from
a disk partition.  I have used database software that allows a user to
specify raw volumes for portions of the database.  The database used the
raw partitions for backups/restores and sequential access.  Using the raw
interface with a 200K blocking factor allowed the data to be processed
about an order of magnitude faster than using the blocked interface.  I
think the user would notice this (in fact he did: when I wrote a
replacement for the restore program and forgot to use the raw device, it
took over 3 hours to do what normally would be a 20-30 minute operation,
AND the entire system was at a standstill).

In another response you made references to the bad coding in the System V
OSs available for the i386 and said the problem could be solved by fixing
the OS.  I don't have the money to buy a source license, nor the time to
spend making a fix to gain some additional performance, when I can buy a
hardware solution for a lot less.  Besides, as I said before, most larger
systems do not have all the i/o brains in the OS; they use dedicated I/O
subsystems to do all that work.

As a side note, I do not have anything to do with DPT.  I am just an end
user who is very satisfied with the performance of my system using the
DPT board.

+-----------------------------------------------------------------------+
| Conor P. Cahill        uunet!virtech!cpcahil          703-430-9247    |
| Virtual Technologies Inc.,  P. O. Box 876,  Sterling, VA 22170        |
+-----------------------------------------------------------------------+
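[The kind of test behind Conor's order-of-magnitude figure is easy to
reproduce.  A minimal C sketch follows: time large sequential reads from
the character (raw) device, then point it at the block device and compare.
The device path is a placeholder, gettimeofday(2) is assumed to be
available, and the numbers only mean something run as root on an otherwise
idle system.  Some raw drivers also require sector-aligned buffers, which
this sketch ignores.]

/*
 * Crude raw-throughput test: read NREADS buffers of BUFSZ bytes
 * sequentially from the named device and report KB/s.
 */
#include <sys/time.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

#define BUFSZ   (200 * 1024)   /* the big blocking factor, per above */
#define NREADS  50

int main(int argc, char **argv)
{
    const char *dev = argc > 1 ? argv[1] : "/dev/rdsk/0s0"; /* placeholder */
    char *buf = malloc(BUFSZ);
    struct timeval t0, t1;
    double secs, kb;
    int fd, i;

    if ((fd = open(dev, O_RDONLY)) < 0) { perror(dev); return 1; }

    gettimeofday(&t0, 0);
    for (i = 0; i < NREADS; i++)
        if (read(fd, buf, BUFSZ) != BUFSZ) { perror("read"); return 1; }
    gettimeofday(&t1, 0);

    secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    kb = (double)NREADS * BUFSZ / 1024.0;
    printf("%.0f KB in %.2f s = %.0f KB/s\n", kb, secs, kb / secs);
    return 0;
}

[Run it once against the raw device and once against the block device: the
gap is the cost of the kernel copying everything through its own buffers
in small pieces.]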
hedrick@geneva.rutgers.edu (Charles Hedrick) (09/05/89)
>As I agreed in the article which got him excited, the DPT controller
>sounds good for small, slow computers such as this 20MHz, 8MB clone, for
>those of us without the money, time, talent, or source to fix the SV
>kernel.

The analog of the DPT controller is also useful for larger systems.  I
think most people believe that the best disk performance from a Sun 4/280
(sorry -- the new models are too new for much experience in the community)
is when using a Ciprico Rimfire SMD controller or something similar.  It
has a 512K cache and does various kinds of readahead.  (However, it is
normally used with a write-through cache, so as not to require a UPS,
although that is under the sysadmin's control.)

I think this discussion has been in all-or-nothing terms that don't make
any sense.  My feeling is that modern I/O systems have optimizations at
several levels, and that the best performance comes from doing appropriate
things in both the kernel and the controller.  Someone pointed to Ethernet
controllers as an analog, noting the fact that the TCP/IP community
generally recommends against intelligent controllers.  This failed to note
that even "dumb" Ethernet controllers have a fair amount of intelligence.
All of the processing within timing constraints in the range of 1 usec to
1 msec is done by Ethernet controller chips these days.  What you want in
the kernel is the protocol processing.  Protocol processing has timing
constraints in the range of multiple msec to seconds.

Similarly, I think disk performance comes in two types.  What belongs in
the kernel is file system optimization and caching that does not depend
upon real-time knowledge of what the disk is doing.  What belongs in the
controller are things like zero-latency read/write (picking up a transfer
in the middle because that's where the head happens to be -- which
requires a track-level cache) and track-level lookahead.  Track-level
lookahead is time-critical because you want to abandon the lookahead
operation if there are pending requests that you'd be better off doing.
On a 3600 RPM disk with 67 sectors, a sector comes by every 250 microsec.
I think that's a bit fast for a typical Unix kernel to be doing
track-level optimization.  There's a grey area in the middle with
head-motion optimization.  It would certainly be practical to do that in
the kernel, and in fact many kernels have done so.  However, since many
don't, on pragmatic grounds it makes sense to include the ability to
reorder seeks in the controller.  Presumably those whose kernels do a
better job of head-motion optimization will tell the controller not to do
so.

There's also a marketing issue: track-level optimizations are largely
OS-independent, assuming they provide a few parameters to adjust.  Ciprico
can sell their controllers for anything that uses a VME bus.  The DPT
controller can be used on MS/DOS, Unix, etc.  If you were an expert in
low-level disk optimizations, it would probably make more sense for you to
put your time into a controller than into software.  How would you sell
your software?  Given the market, probably an MS-DOS TSR...  Of course the
Unix vendors should have staffs doing this kind of thing anyway, but
experience shows that few Unix development staffs have either the interest
or the necessary expertise.
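[Hedrick's 250 microsecond figure, worked out in a minimal C sketch.  The
arithmetic is from the numbers above; the decision rule at the end is
illustrative, not any particular controller's.]

/*
 * At 3600 RPM a revolution takes 60/3600 = 16.7 ms; with 67 sectors
 * per track, one sector passes the head every ~250 microseconds.
 * That is the whole per-sector budget a controller has to decide
 * whether to keep reading ahead or abandon the track for a pending
 * host request.
 */
#include <stdio.h>

/* Continue track read-ahead only while nothing better is queued. */
static int keep_reading_ahead(int pending_host_requests)
{
    return pending_host_requests == 0;
}

int main(void)
{
    double rpm = 3600.0, sectors_per_track = 67.0;
    double rev_ms = 60.0 * 1000.0 / rpm;
    double sector_us = rev_ms * 1000.0 / sectors_per_track;

    printf("revolution: %.2f ms, sector time: %.0f us\n",
           rev_ms, sector_us);
    printf("read ahead, empty queue:  %s\n",
           keep_reading_ahead(0) ? "continue" : "abandon");
    printf("read ahead, 1 pending:    %s\n",
           keep_reading_ahead(1) ? "continue" : "abandon");
    return 0;
}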
aland@infmx.UUCP (Dr. Scump) (09/08/89)
In article <1989Sep4.024559.13279@ddsw1.MCS.COM> karl@ddsw1.MCS.COM (Karl Denninger) writes:
>...
>The people here, who read and post to this group, are working with and
>talking about ISA 80386 machines.  That means AT bus.  Like it or not,

Says who?

>are lots of compelling reasons for using the AT bus, and companies such as
>DPT make it more than reasonable to do so.  Items like multiple-sourcing,
>inexpensive price points, interchangeability of operating environments
>(pick your Unix), etc.

Since when is the charter of this group limited to ISA?  Are you telling
us that MCA-based (Micro Channel) machines don't belong here?  Remember,
ISC is shipping their current 2.0.2 release for the MCA bus also.  Do you
want MCA people kicked out into a separate newsgroup?

>Such a machine IS a real computer.  It has 3-5X the power of a Vax 780,

Yeah, but who says the VAX 780 is a "real computer"?  :-]  :-]  :-]

-- 
 Alan S. Denney  @  Informix Software, Inc.   {pyramid|uunet}!infmx!aland
   "I want to live!              -----------------------------------------
    as an honest man,             Disclaimer: These opinions are mine
    to get all I deserve          alone.  If I am caught or killed, the
    and to give all I can."       secretary will disavow any knowledge
        - S. Vega                 of my actions.
karl@ddsw1.MCS.COM (Karl Denninger) (09/09/89)
In article <2296@infmx.UUCP> aland@infmx.UUCP (alan denney) writes:
>In article <1989Sep4.024559.13279@ddsw1.MCS.COM> karl@ddsw1.MCS.COM (Karl Denninger) writes:
>>The people here, who read and post to this group, are working with and
>>talking about ISA 80386 machines.  That means AT bus.  Like it or not,
>
>Says who?
>
>Since when is the charter of this group limited to ISA?  Are you telling
>us that MCA-based (Micro Channel) machines don't belong here?  Remember,
>ISC is shipping their current 2.0.2 release for the MCA bus also.  Do you
>want MCA people kicked out into a separate newsgroup?

No, not at all.  I was referring to ideas of (symmetrical) multiprocessor
monsters which aren't related.  I should have qualified my statement
further.

Of course MCA systems are applicable; they are also 80386 based, and are a
considerable force in the market.  (SCO is shipping their release for the
MCA bus too.  There isn't much difference really, outside of a few
low-level hardware interface details.)

Now as to my opinion of the _real_ reasons for the MCA bus...  but that's
not relevant here, nor is it likely to bring anything other than heat :-)
-- 
Karl Denninger (karl@ddsw1.MCS.COM, <well-connected>!ddsw1!karl)
Public Access Data Line: [+1 312 566-8911], Voice: [+1 312 566-8910]
Macro Computer Solutions, Inc.  "Quality Solutions at a Fair Price"
fr@icdi10.UUCP (Fred Rump from home) (09/12/89)
In article <71@calcite.UUCP> vjs@calcite.UUCP (Vernon Schryver) writes:
>In article <1989Sep4.024559.13279@ddsw1.MCS.COM>, karl@ddsw1.MCS.COM (Karl Denninger) writes:
>> ...[a loud rebuttal]...
>
>I got Mr. Denninger's dander up.  Be careful or thick skinned when not
>complimenting what people sell.  He is obviously wrong and inconsistent
>on several technical issues, but such technicalities are not germane to
>what he is saying.

Now wait a minute!  There are many of us out here whose dander is up.
Don't just blame Karl for accepting the world as it is today.  Your points
are fine, but they simply aren't relevant to what the market has to offer.

Sure, some fine day there'll be better kernels and better everything else
too.  And the discussion will continue.  In the meantime some of us need
to make a living with what the best technology offers our customers NOW.
We often call this 'bang for the buck'.

Sure, a Sequent with all its 386's will do all your fancy footwork.  Prime
has announced a 1000-user 386 system too.  Dynix is a tuned UNIX for just
such an environment, but it starts at a price point we can't even reach
with the products Karl is selling.  So why are we even discussing it?
It's such a limited market in that $500,000-and-up world that you can
spend a lot of time tuning one system while we've installed another 1000
Unix systems.  Ok, so they'll only run 30 users.  But that's all we and
our customers need.

End of story.

Fred Rump
-- 
This is my house.  My castle will get started right after I finish with news.
26 Warren St.        uucp:   ...{bpa dsinc uunet}!cdin-1!icdi10!fr
Beverly, NJ 08010    domain: fred@cdin-1.uu.net  or  icdi10!fr@cdin-1.uu.net
609-386-6846         "Freude... Alle Menschen werden Brueder..." - Schiller