cliffhanger@cup.portal.com (Cliff C Heyer) (09/25/89)
>In <22308@cup.portal.com> Cliff C Heyer wrote:
>> the *big iron* guys use MIPS as a trojan horse to hide the *real*
>> performance issue - "real world" I/O bandwidth. Let's blow their
>> cover!)

Eric S. Raymond responded...
>Huh? The `big iron' crowd is happy to talk I/O bandwidth, it's about the
>only place general-purpose mainframes got room to shine any more. It's
>the *PC/workstation* crowd that's prone to yell about MIPS and whisper
>about I/O performance.

Guess you got my point...but still I can express it better:

To clarify, the *big iron* guys emphasize I/O on their *mainframes*, but not
on their *PCs*. Instead, they emphasize MIPS on their PCs. Of course many
would say that PC users are not "interested" in I/O BW, and I agree with
this. BUT there is a convenient dual purpose here for the *big iron* guys.

The following is my theory: Let me play *Joe Mr. Marketing VP*, who will get
a BIG promotion if profits are maximized. Joe wants two things: (1) saturate
the market with *slow* hardware so users will have to *buy more* in the
future, and (2) don't tell people how slow their PC hardware is compared to
*big iron*, so as not to call attention to the relatively slow speed. Why
have people complaining and demanding better performance if you can avoid it?

This is the only answer I can come up with to explain why IBM consistently
puts out PCs that are substantially below average in "real" disk I/O speed:
200KB/sec. Just look at Byte benchmarks. Plus I'm getting LOTS of mail now
further confirming this. Now, I'm talking about hardware bought & configured
by IBM. I know you can plug in aftermarket disks w/custom drivers that may
blow away the original equipment.

Companies are trying to promote the image that they are selling "state of
the art" hardware, but if you look at "mainframe" specs you see how far from
"state of the art" you really are. Let's just be honest here: why let a
company tell you that you're buying the "best" when the truth is you are
buying mid-70's-level performance? I know, I know, it doesn't cost
$5,000,000. That is an accomplishment! My beef is how we are told all the
time that we are buying the "best" when in fact we are often buying below
the current average - as with IBM.

I've had lots of input from usenet about SRAM prices, etc., and I can
appreciate that there is no way to build a 33MHz 80386 PC with no wait
states 100% of the time without the price going to $50,000. 'Nuf said. BUT
my point is that the alleged "leaders" in the industry are not even keeping
up with what the small companies are doing. For example, the Amiga does
700-900KB/s "real" disk I/O. I've had this confirmed now by several people,
some of whom I know personally. And new PS/2s come out with SCSI doing
200KB/s, at THREE TIMES the price of the Amiga. THEN we have the $30,000
MIPS M120/5 only doing 600KB/sec, the AViiON doing 300KB/sec, the
Sparcstation doing 200KB/s, etc. etc.

So my belief is that some companies are trying to save I/O BW for their
*big iron* by purposefully handicapping the speed of their PCs. They
accomplish two things: first, saving BW for the *big iron* by making big
databases run too slowly on PCs, and second, encouraging hardware upgrades
sooner by limiting your I/O so you'll need more hardware sooner. And even
though workstation vendors don't have *big iron*, they DO have a $200,000
"high end" they NEED to sell, which won't sell if their entry-level models
do 1MB/sec disk I/O.
What I would like to see happen is for trade papers to place much more
emphasis on disk I/O, so that the *big iron* PC makers will no longer be
able to play this game with the consumer. Infoworld, PC Week, are you
listening? (Or are you going along with this because of all the advertising
dollars you are being paid?) At least PCs are cheap enough that you can buy
& test them without signing a non-disclosure agreement!

Barry Traylor writes...
>In article <7981@cbmvax.UUCP> jesup@cbmvax.UUCP (Randell Jesup) writes:
>> Well, I just tried it on my machine (old, slower disk controller,
>>medium fast SCSI disk (Quantum)). Read 3Meg file into memory: 609K/s.
>>Copy 3 meg file (on slightly fragged partition) to another file on the
>>same disk partition: ~550K/s. On a newer controller with a fast SCSI disk
>>(170Meg CDC): ~900K/s and ~800K/s.
>
>Ok, ok, so we've now seen two pretty impressive transfer rates for MICROs.
>I would even go so far as to say that the rates reported beat by a little
>the PDP11/70 I used 10 years ago. I hope, however, that you don't think
>this comes even close to what is attainable SUSTAINED on a mainframe.

Hmmm. I'm talking about sustained rates *per job*, not for the *system*. I
know overall throughput is in excess of 100MB/sec. But who makes a disk
drive that does 100MB/sec transfers? The best now is 3-4MB/sec. So when we
get right down to it, a COBOL program reading a file can expect less than
3-4MB/sec on a mainframe. (The same reasoning explains how a 100 MIPS
4-processor mainframe can only support 25 MIPS *per job*.)

>much of the CPU was chewed up while these transfers were underway?

If you are running a single-tasking OS (NOT UNIX), who cares? You have to
wait until the transfer is done anyway, so it might as well be as fast as
possible. Hopefully SCSI does DMA while the CPU is busy elsewhere (comments
please..)

>I have seen mainframes do 50 times that rate (on a 1 processor system) and
>only utilize 10% of a CPU.

Yup. That is because of intelligent channel processors that do DMA to
multi-ported memory. The same thing SCSI can do. Except with one user, we
only need one channel (or one for each file). But with UNIX we could use a
few more.

>I have seen I/O rates at 4000-5000 i/os per second
>where the CPU is less than 75% utilized. How many SCSI channels do these
>micros support?

One, I think. Comments, others, please!!!!!

>On a strict connectivity basis, the mainframe I am associated with can
>support over 100!

But they also support 1000's of users at once, which is not needed on a PC.

>So go ahead and feed steroids to SCSI. It will help mainframes as much as
>everyone else. We would love to sell our mainframe customers hundreds of
>the things to squeeze into their pack farm acreage.

I guess you aren't in marketing! MF vendors don't want to *encourage* users
to port 500MB databases to micros, because the profit margin is so low on
PCs. I believe they enact plans that discourage users from leaving the *big
iron*, which would include limiting PC I/O to 300KB/sec. Look at IBM's
OfficeVision - it's mainframe-centered.

I'm hoping some engineers might speak up who have actually designed PC disk
I/O subsystems and could tell us why they didn't try for 900KB/sec like on
the Amiga.

Cliffhanger@cup.portal.com
-----
ccplumb@rose.waterloo.edu (Colin Plumb) (09/25/89)
In article <22488@cup.portal.com> cliffhanger@cup.portal.com (Cliff C Heyer) writes:
> I'm hoping some engineers might speak up who have actually designed PC disk
> I/O subsystems and could tell us why they didn't try for 900KB/sec like on
> the Amiga.

In their defense, they're handicapped by the MS-DOS file system, which is
pretty piss-poor. Randell's figures are using the rewritten file system;
replacing MS-DOS's is trickier.

A 2090A SCSI controller with a CDC Wren III can do 1.2MB/sec through the
device driver, and the 2091 is probably faster, so it's possible I will be
able to get 1MB/sec I/O out of my 7.14MHz 68000 one day.
-- 
	-Colin
lamaster@ames.arc.nasa.gov (Hugh LaMaster) (09/26/89)
In article <22488@cup.portal.com> cliffhanger@cup.portal.com (Cliff C Heyer) writes:
>To clarify, *big iron* guys emphasize I/O on
>their *mainframes*, but not on their *PCs*. Instead, they emphasize
>This is the only answer I can come up with to explain why IBM
>consistently puts out PCs that are substantially below average in "real"
>disk I/O speed: 200KB/sec. Just look at Byte benchmarks. Plus I'm
>So my belief is that some companies are trying to save I/O BW for their
>*big iron* by purposefully handicapping the speed of their PCs. They

Many of your points are well taken. In fact, many big companies don't make
it a secret that they limit their users' options to force certain migration
paths. The industry trade rags are full of speculation about such things,
and sometimes even print a lot of criticism of the big boys for introducing
new, high-performance products too quickly - it is hard on the used
equipment market.

However, I think you are painting with too broad a brush to include Sun,
MIPSCo, etc. in your list. Remember that the controllers you have been
using for your comparisons to get ~1 MB/sec. through a filesystem are
relatively new. Most of these controllers have been thoroughly *debugged*
and in volume production (two prerequisites for full-service companies to
buy) for only 6 mos. to one year. Sun now sells faster controllers that
will do almost 1 MB/sec. on SMD disks through a Unix filesystem. I haven't
had a chance to measure any IPI or synchronous SCSI disks. But it is unfair
to use today's controllers to criticize systems shipped 1-2 years ago.

The other thing that would probably help would be if more people said to
the salesrep from company X: "I am buying the system from company Y. Even
though the CPU is only 10 MIPS instead of 20, it can stream data from 4
controllers simultaneously at 2.5MB/sec. each, with negligible CPU
overhead."

  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035
  Phone:  (415)694-6117
jesup@cbmvax.UUCP (Randell Jesup) (09/26/89)
In article <22488@cup.portal.com> cliffhanger@cup.portal.com (Cliff C Heyer) writes:
>>much of the CPU was chewed up while these transfers were underway?
>
>If you are running single tasking OS (NOT UNIX), who cares? You have to
>wait until the transfer is done anyway, so it might as well be as fast as
>possible. Hopefully SCSI does DMA while the CPU is busy elsewhere (comments
>please..)

Well, I don't want to sound commercial here, but the Amiga (referenced by
the above quote) is multitasking. I don't have any CPU benchmarks run
during intense disk I/O handy, but I'll post some when I get time to dig
them out.

BTW, most Unix machines are handicapped by the "standard" Unix fs/disk
cache. This cache requires them to do single-block reads, while under
AmigaDos the filesystem can ask for large blocks and have them transferred
by DMA directly from disk to where the application's read goes. This works
quite well with SCSI. On the same hardware, the Amiga Unix (Amix) gets
significantly lower I/O throughput because of this, and because of the
extra transfer via the CPU to the application's buffer.

>>I have seen I/O rates at 4000-5000 i/os per second
>>where the CPU is less than 75% utilized. How many SCSI channels do these
>>micros support?
>
>One I think. Comments others please!!!!!

You can add up to 5 SCSI controllers to an Amiga (limited by the 5 slots).
The other limit is the bus bandwidth of the current Amiga, at about
3.5 MB/s. Of course, each SCSI controller can talk to at least 7 drives, if
you don't use multi-LUN drives.
-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.cbm.commodore.com  BIX: rjesup
Common phrase heard at Amiga Devcon '89: "It's in there!"
elgie@canisius.UUCP (Bill Elgie) (09/26/89)
In article <22488@cup.portal.com>, cliffhanger@cup.portal.com (Cliff C Heyer) writes:
>
> BUT my point is that the alleged "leaders" in the industry are not even
> keeping up with what the small companies are doing. For example, the
> Amiga doing 700-900KB/s "real" disk I/O....... THEN
> we have the $30,000 MIPS M120/5 only doing 600KB/sec, .......
>

Actually, I believe that Amiga is somewhat bigger than MIPS (tho the latter
should catch up...).

The "$30,000 MIPS M/120" is 1) more than a year old and slated for an
upgrade, and 2) includes quite a bit of memory, a disk, ethernet, serial
ports, etc., as well as a very well-done UNIX and its associated software,
with an unlimited user license. It runs considerably faster than anything I
have seen from Amiga.

We support a good-sized database application on one of these systems, in
spite of the limited "600KB/sec" transfer rate: that measure is not very
meaningful, and inaccurate in any case.

  greg pavlov (under borrowed account), fstrf, amherst, ny
henry@utzoo.uucp (Henry Spencer) (09/26/89)
In article <7997@cbmvax.UUCP> jesup@cbmvax.UUCP (Randell Jesup) writes:
>... most Unix machines are handicapped by the "standard" unix fs/disk
>cache. This cache requires them to do single-block reads, while under AmigaDos
>the filesystem can ask for large blocks and have it transferred by DMA directly
>from disk to where the application's read goes to...

It's quite possible to do this under Unix as well, of course, if you've got
kernel people who seriously care about I/O performance.
-- 
"Where is D.D. Harriman now,   |  Henry Spencer at U of Toronto Zoology
when we really *need* him?"    |  uunet!attcan!utzoo!henry henry@zoo.toronto.edu
sritacco@hpdml93.HP.COM (Steve Ritacco) (09/27/89)
This seems like an appropriate time to mention the channel controller in the NeXT machine. There is an example of a micro with attention paid to I/O bandwidth. I don't know what transfer rates it can sustain, but maybe someone on the net can tell us.
markb@denali.sgi.com (Mark Bradley) (09/27/89)
In article <32512@ames.arc.nasa.gov>, lamaster@ames.arc.nasa.gov (Hugh LaMaster) writes:
>
> However, I think you are painting with too broad a brush to include Sun,
> MIPSCo, etc. in your list. Remember that the controllers you have been
> using for your comparisons to get ~1 MB/sec. through a filesystem are
> relatively new. Most of these controllers have been thoroughly *debugged*
> and in volume production (two prerequisites for full service companies to
> buy) for 6 mos. to one year. Sun now sells faster controllers that will
> do almost 1 MB/sec. on SMD disks through a Unix filesystem. I haven't had
> a chance to measure any IPI or synchronous SCSI disks. But it is unfair
> to use today's controllers to criticize systems shipped 1-2 years ago.

I can't yet publish our IPI numbers due to signed non-disclosure, but
suffice it to say that it would not make sense to go to a completely
different controller and drive technology for anything less than VERY LARGE
performance wins or phenomenal cost savings....

On the other hand, using a controller from the same company, SGI gets over
2 MB/sec. on SMD *through* the filesystem. See comp.sys.sgi for discussions
on the Extent File System designed by our own Kipp Hickman and Donovan Fong
(who is no longer with SGI). However, it is also true that a decent job on
drivers, caching, and selection of the right technology, both in terms of
controllers and disk drives, and actually marrying these all together, will
yield a more coherent disk subsystem that is capable of providing nearly
theoretical maximum throughput. This is something many companies seem to
miss the boat on. Clearly I am somewhat biased, but the numbers don't lie
(see below for our mid-range ESDI numbers, which are handy in my current
directory).

>
> The other thing that would probably help would be if more people said to
> salesrep from company X: "I am buying the system from company Y. Even
> though the CPU is only 10 MIPS instead of 20, it can stream data from 4
> controllers simultaneously at 2.5MB/sec. each, with negligible CPU overhead."

This is reasonable. Even more so if there is negligible CPU overhead. 2.5
on 4 controllers seems low and/or expensive, however.

>
> Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
> NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
> Moffett Field, CA 94035
> Phone:  (415)694-6117

IP9,nfs,bbs

Sequential write test
  total  33554432   time  19.910   ms/IO  9   Kb/S 1685
Sequential read test
  total  33554432   time  22.420   ms/IO 10   Kb/S 1496
Random read test
  total   8388608   time  14.990   ms/IO 29   Kb/S  559
Multiple processes writing separate files simultaneously
  total 268435456   time 187.660   ms/IO 11   Kb/S 1430
Multiple processes reading the intermixed files
  total 268435456   time 216.950   ms/IO 13   Kb/S 1237
Multiple processes reading randomly from the intermixed files
  total  67108864   time 128.580   ms/IO 31   Kb/S  521
Write 8 files, one at a time
  total  33554432   time  20.140   ms/IO  9   Kb/S 1666
  total  33554432   time  20.110   ms/IO  9   Kb/S 1668
  total  33554432   time  20.570   ms/IO 10   Kb/S 1631
  total  33554432   time  20.230   ms/IO  9   Kb/S 1658
  total  33554432   time  20.870   ms/IO 10   Kb/S 1607
  total  33554432   time  19.930   ms/IO  9   Kb/S 1683
  total  33554432   time  20.290   ms/IO  9   Kb/S 1653
  total  33554432   time  20.510   ms/IO 10   Kb/S 1636
Multiple processes reading the sequentially laid-out files
  total 268435456   time 202.900   ms/IO 12   Kb/S 1322
Multiple processes reading randomly from the sequentially laid-out files
  total  67108864   time 123.270   ms/IO 30   Kb/S  544

Disclaimer:  This is my opinion.
But in this case, it might just be that of my employer as well.

					markb
-- 
Mark Bradley				"Faster, faster, until the thrill of
IO Subsystems				 speed overcomes the fear of death."
Silicon Graphics Computer Systems
Mountain View, CA			 ---Hunter S. Thompson
nelson@udel.EDU (09/27/89)
In article <22488@cup.portal.com> cliffhanger@cup.portal.com (Cliff C Heyer) writes:
>
>I'm talking about sustained rates *per job*, not for the *system*. I know
>overall throughput is in excess of 100MB/sec. But who makes a disk drive that
>does 100MB/sec transfers? The best now is 3-4MB/sec. So when we get right
>down to it, a COBOL program reading a file can expect less than 3-4MB/sec on
>a mainframe. (The same reasoning explains how a 100 MIPS 4 processor mainframe
>can only support 25 MIPS *per job*)
>

Since we are talking about "*big iron*", let's talk about real big iron.
Cray DD-40 disk drives can support >10MB/sec through the operating system
(at least COS; I assume the same is true for UNICOS). And COS also supports
disk striping at the user level, so for sequential reads of a file striped
across an entire DS-40 disk subsystem (20+ GB, 4 drives) a process can
achieve sustained rates of 40MB/sec. Of course, this is for relatively
large (~ 0.5MB) reads, but these aren't uncommon for the sort of processing
Crays do.

Disk I/O is one of Cray's big selling points vs. the Japanese supercomputer
manufacturers--their machines generally have mainframe-style (read
4MB/sec) disk channels.

Mark Nelson ...!rutgers!udel!nelson or nelson@udel.edu
This function is occasionally useful as an argument to other functions
that require functions as arguments.  -- Guy Steele
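The user-level striping described above can be pictured with a small sketch:
map each stripe-sized piece of the logical file round-robin across the
drives. The drive count, stripe size, and raw-device descriptors below are
illustrative only; the real speedup comes from issuing the per-drive reads
in parallel (asynchronous I/O or one process per drive), which this serial
loop omits for clarity.

/*
 * Minimal sketch of user-level disk striping, in the spirit of the COS
 * striping described above.  Stripe size and drive count are made up for
 * illustration; error handling is minimal.
 */
#include <sys/types.h>
#include <unistd.h>

#define NDRIVES     4
#define STRIPE_SIZE (64 * 1024)                    /* bytes per stripe unit */

/* Read `len' bytes starting at logical offset `off' from a file striped
 * round-robin across NDRIVES devices (stripe 0 -> drive 0, stripe 1 ->
 * drive 1, ...).  fd[] holds an open descriptor per drive. */
long striped_read(int fd[NDRIVES], long off, char *buf, long len)
{
    long done = 0;
    while (done < len) {
        long stripe = (off + done) / STRIPE_SIZE;  /* which stripe unit     */
        long within = (off + done) % STRIPE_SIZE;  /* offset inside it      */
        int  drive  = (int)(stripe % NDRIVES);     /* drive holding it      */
        long drvoff = (stripe / NDRIVES) * STRIPE_SIZE + within;
        long chunk  = STRIPE_SIZE - within;

        if (chunk > len - done)
            chunk = len - done;
        if (lseek(fd[drive], drvoff, SEEK_SET) == (off_t)-1 ||
            read(fd[drive], buf + done, chunk) != chunk)
            return -1;
        done += chunk;
    }
    return done;
}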
pl@etana.tut.fi (Lehtinen Pertti) (09/27/89)
From article <1989Sep26.163307.17238@utzoo.uucp>, by henry@utzoo.uucp (Henry Spencer):
> In article <7997@cbmvax.UUCP> jesup@cbmvax.UUCP (Randell Jesup) writes:
>>... most Unix machines are handicapped by the "standard" unix fs/disk
>>cache. This cache requires them to do single-block reads, while under AmigaDos
>>the filesystem can ask for large blocks and have it transferred by DMA directly
>>from disk to where the application's read goes to...
>
> It's quite possible to do this under Unix as well, of course, if you've got
> kernel people who seriously care about I/O performance.

Yes, and if your DMA-controller can manage user buffers spreading across
several pages all over your memory.
-- 
pl@tut.fi				! All opinions expressed above are
Pertti Lehtinen				! purely offending and in subject
Tampere University of Technology	! to change without any further
Software Systems Laboratory		! notice
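What "managing user buffers spread across several pages" amounts to is
building a scatter/gather list: walk the buffer page by page and hand the
DMA engine a list of (physical address, length) pairs. A minimal sketch
follows; virt_to_phys(), PAGE_SIZE, and the entry limit are placeholders
for whatever the real kernel and controller provide, not an actual API.

/*
 * Sketch: building a scatter/gather list so a DMA controller can transfer
 * into a user buffer whose pages are not physically contiguous.
 * This is an illustration, not a driver.
 */
#include <stddef.h>

#define PAGE_SIZE 4096
#define MAX_SG    64

struct sg_entry {
    unsigned long phys;   /* physical address of this piece */
    size_t        len;    /* length of this piece in bytes  */
};

extern unsigned long virt_to_phys(void *va);   /* hypothetical lookup */

/* Returns the number of entries built, or -1 if the list would overflow. */
int build_sg_list(void *user_buf, size_t count, struct sg_entry *sg)
{
    char *va = user_buf;
    int n = 0;

    while (count > 0) {
        size_t in_page = PAGE_SIZE - ((unsigned long)va % PAGE_SIZE);
        size_t len = (count < in_page) ? count : in_page;

        if (n == MAX_SG)
            return -1;                 /* too fragmented for this engine */
        sg[n].phys = virt_to_phys(va); /* page must be locked in memory  */
        sg[n].len  = len;
        n++;
        va    += len;
        count -= len;
    }
    return n;
}

A controller without scatter/gather has to fall back to a contiguous bounce
buffer plus a copy - the extra pass Randell Jesup mentioned - or to one DMA
operation per page.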
lamaster@ames.arc.nasa.gov (Hugh LaMaster) (09/28/89)
In article <24950@louie.udel.EDU> nelson@udel.EDU writes:
>Since we are talking about "*big iron*", let's talk about real big iron.
>Cray DD-40 disk drives can support >10MB/sec through the operating
>system (at least COS; I assume the case is also true for UNICOS).

I just tested this on our Cray running Unicos. The speed was almost exactly
10MB/sec.

  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035
  Phone:  (415)694-6117
lamaster@ames.arc.nasa.gov (Hugh LaMaster) (09/28/89)
In article <42229@sgi.sgi.com> markb@denali.sgi.com (Mark Bradley) writes:
>I can't yet publish our IPI numbers due to signed non-disclosure, but suffice
>it to say that it would not make sense to go to a completely different
>controller and drive technology for anything less than VERY LARGE performance
>wins or phenomenal cost savings....

You might, however, be able to say what architectural features of your
system and the controller contributed. For example, is there anything about
cache, memory, etc. that helps a lot? What controller features are needed?
Which ones are bad?

>maximum throughput. This is something many companies seem to miss the boat
>on. Clearly I am somewhat biased, but the numbers don't lie (see below

I agree. *Big iron* machines have been able to provide sustained sequential
reads at 70% of theoretical channel/disk speed on multiple channels, while
providing 70% of CPU time in user CPU state to other CPU-bound jobs, for at
least the past 10 years. Many of today's workstations have CPUs as fast as
those machines did then, but, needless to say, the I/O hasn't been there. I
am glad to see that this is getting a lot more attention in industry now.

  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035
  Phone:  (415)694-6117
rod@venera.isi.edu (Rodney Doyle Van Meter III) (09/28/89)
Someone mentioned the Crays as examples of hot I/O boxes. A friend of mine
who knows these things much better than I do called the Cray peripherals
"incestuous". Sure they work, but they apparently rely on a nearly intimate
knowledge of the timing quirks of the processors and other peripherals.
Makes them expensive, and makes it hard to use off-the-shelf parts.

What about Thinking Machines' Data Vault? I've been given to understand
it's actually better than the machines themselves in some respects.

							--Rod
brooks@vette.llnl.gov (Eugene Brooks) (09/28/89)
In article <9911@venera.isi.edu> rod@venera.isi.edu.UUCP (Rodney Doyle Van Meter III) writes:
>
>What about Thinking Machines' Data Vault? I've been given to
>understand it's actually better than the machines themselves in some
>respects.

Thinking Machines' Data Vault is a fine example of the right way to build
an IO system these days. Instead of using limited-production high
performance drives, you build a highly parallel system using the same
mass-production drives you can buy for workstations, and throw in a SECDED
controller while you are at it. The system has 72 drives implementing a
64-bit wide data path with one bit per drive.

Using current 1.2 Gbyte drives, each having a bandwidth of more than a
megabyte per second, you could build a self-healing disk system of more
than 64 gigabytes and having more than 64 megabytes a second throughput.
For one of the future supercomputers built of 1000 microprocessors, each
having 8 to 32 Mbytes of memory, you would need more than one of these disk
systems to keep the thing fed.

brooks@maddog.llnl.gov, brooks@maddog.uucp
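The "one bit per drive" layout is easy to picture in code: drive i stores
bit i of every 64-bit word, so a word is transferred by touching all the
data drives at once. A minimal sketch of the data-bit spread follows; the
drive count is taken from Eugene's description (a later post from Thinking
Machines gives the actual current configuration), and the check-bit drives
would receive per-word ECC bits computed the same way.

/*
 * Sketch of "one bit per drive": drive i stores bit i of every 64-bit
 * word.  Only the data-bit spread is shown; check-bit drives would get
 * ECC bits computed per word.
 */
#include <stdint.h>
#include <string.h>

#define NDATA 64               /* data drives, one bit of each word apiece */

/* Spread `nwords' 64-bit words into per-drive bit buffers.
 * drivebuf[d] must hold at least nwords/8 bytes (nwords a multiple of 8). */
void spread_words(const uint64_t *words, long nwords,
                  unsigned char *drivebuf[NDATA])
{
    long w;
    int d;

    for (d = 0; d < NDATA; d++)
        memset(drivebuf[d], 0, nwords / 8);

    for (w = 0; w < nwords; w++)
        for (d = 0; d < NDATA; d++)
            if ((words[w] >> d) & 1)
                drivebuf[d][w / 8] |= (unsigned char)(1 << (w % 8));
}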
johng@cavs.syd.dwt.oz (John Gardner) (09/28/89)
In article <22488@cup.portal.com> cliffhanger@cup.portal.com (Cliff C Heyer) writes:
>Yup. That is because of intelligent channel processors that do DMA to
>multi-ported memory. The same thing SCSI can do. Except with one user, we
>only need one channel (or one for each file). But with UNIX we could use a
>few more.
>

One small point to add: the Amiga does DMA to dual-port ram. All DMA goes
through a bank of ram called chip ram (because the graphics coprocessors
also use this area), while everything else runs in fast ram. This is a big
help, as the Amiga does have a multitasking operating system (usually
single user, though).
-- 
/*****************************************************************************/
PHONE : (02) 436 3438	ACSnet : johng@cavs.dwt.oz
#include <sys/disclaimer.h>
gsh7w@astsun3.acc.Virginia.EDU (Greg Scott Hennessy) (09/28/89)
In article <34298@lll-winken.LLNL.GOV> brooks@maddog.llnl.gov (Eugene Brooks) writes:
#Thinking Machines' Data Vault is a fine example of the right way to
#build an IO system these days.
#The system has 72 drives
#implementing a 64 bit wide data path with one bit per drive.

What are the extra 8 drives used for? Parity?

-Greg Hennessy, University of Virginia
 USPS Mail:  Astronomy Department, Charlottesville, VA 22903-2475 USA
 Internet:   gsh7w@virginia.edu
 UUCP:       ...!uunet!virginia!gsh7w
vaughan@mcc.com (Paul Vaughan) (09/28/89)
Ok, so we've straightened out the definitions of mini, micro, mainframe, and super computer for this month. Now we have to define *big iron*? Give me a break! (Or was choosing a nice nebulous term that everybody could interpret their own way the whole idea? :-) Paul Vaughan, MCC CAD Program | ARPA: vaughan@mcc.com | Phone: [512] 338-3639 Box 200195, Austin, TX 78720 | UUCP: ...!cs.utexas.edu!milano!cadillac!vaughan
rlk@think.com (Robert Krawitz) (09/29/89)
In article <34298@lll-winken.LLNL.GOV>, brooks@vette (Eugene Brooks) writes:
]Thinking Machines' Data Vault
]The system has 72 drives
84, actually. Specifically, 64 data + 14 ECC, and 6 hot spares (they
can be brought on-line immediately).
--
ames >>>>>>>>> | Robert Krawitz <rlk@think.com> 245 First St.
bloom-beacon > |think!rlk Cambridge, MA 02142
harvard >>>>>> . Thinking Machines Corp. (617)876-1111
bruce@sauron.think.com (Bruce Walker) (09/29/89)
In article <34298@lll-winken.LLNL.GOV> brooks@maddog.llnl.gov (Eugene Brooks) writes:
>Thinking Machines' Data Vault is a fine example of the right way to
>build an IO system these days. Instead of using limited production
>high performance drives, you build a highly parallel system using
>the same mass production drives you can buy for workstations and throw
>in a SECDED controller while you are at it. The system has 72 drives
>implementing a 64 bit wide data path with one bit per drive.

Actually, the current DataVaults have 42 drives. Though the bus to the DV
is 64 bits wide, it is broken down into a 32-bit data path inside the DV.
There are 32 data drives, 7 ECC drives, and 3 hot spares, each of which can
be switched into any of the other 39 channels. We also offer
double-capacity DVs with 84 drives; no more bandwidth, just a 2nd tier of
drives off of each channel.

--Bruce Walker (Nemnich), Thinking Machines Corporation, Cambridge, MA
  bruce@think.com, think!bruce, bjn@mitvma.bitnet; +1 617 876 1111
brooks@vette.llnl.gov (Eugene Brooks) (09/29/89)
In article <2045@hudson.acc.virginia.edu> gsh7w@astsun3 (Greg Scott Hennessy) writes:
>In article <34298@lll-winken.LLNL.GOV> brooks@maddog.llnl.gov (Eugene
>Brooks) writes:
>What are the extra 8 drives used for? Parity?

Actually, the data in my article was just a wild guess. There is nothing
like incorrect data on the USENET to get the boys at TM to speak up and
reveal the facts that don't appear in their publicly available literature.

brooks@maddog.llnl.gov, brooks@maddog.uucp
mcdonald@aries.uiuc.edu (Doug McDonald) (09/29/89)
>Thinking Machines' Data Vault is a fine example of the right way to
>build an IO system these days. Instead of using limited production
>high performance drives, you build a highly parallel system using
>the same mass production drives you can buy for workstations and throw
>in a SECDED controller while you are at it. The system has 72 drives
>implementing a 64 bit wide data path with one bit per drive. Using current

I remember with great fondness a similar setup on the Illiac IV. It was so
unreliable when that machine first got (sort-of) running that my program,
which didn't use it, got to run for hours while others were waiting for the
disk farm to be fixed.

SECDED sounds OK for reading - but what about writing? Don't they need to
have an extra disk to take the data that should go to a sick disk being
replaced?

Doug McDonald
pa1159@sdcc13.ucsd.EDU (pa1159) (09/29/89)
In article <24950@louie.udel.EDU> nelson@udel.EDU () writes:
>In article <22488@cup.portal.com> cliffhanger@cup.portal.com (Cliff C Heyer) writes:
>>
>Since we are talking about "*big iron*", let's talk about real big iron.
>Cray DD-40 disk drives can support >10MB/sec through the operating
>system (at least COS; I assume the case is also true for UNICOS).
>And COS also supports disk striping at the user level, so for
>sequential reads of a file striped across an entire DS-40 disk
>subsystem (20+ GB, 4 drives) a process can achieve sustained rates
>of 40MB/sec. Of course, this is for relatively large (~ 0.5MB)
>reads, but these aren't uncommon for the sort of processing Crays
>do.
>

This brings up a point: in what processing regimes is total sustained disk
transfer rate the performance-limiting factor? For a mini/single-user
workstation configuration I'd think that the average access time rather
than sustained throughput would be most important, as most I/O transfers
would be relatively small. So, given equal access times, how much of a
difference in interactive workloads does a jump from say 500 KB/s (low-end
micro disks) to 3-4 MB/s make in performance?

Of course, for things like massive image processing applications sustained
throughput is a Good Thing, but for the Rest Of Us, how much does it really
matter?

Matt Kennel
pa1159@sdcc13.ucsd.edu

PS: The Connection Machine parallel disk subsystem is pretty nifty. 40
simultaneous bitstreams, which when error-corrected &c make a 32-bit word
per tick. You can trash one drive and then reconstruct its contents from
the 39 others. I don't know the numbers, but I suspect that it's very fast.

>Disk I/O is one of Cray's big selling points vs. the Japanese
>super-computer manufacturers--their machines generally have
>mainframe (read 4MB/sec) style disk channels.
>
>Mark Nelson ...!rutgers!udel!nelson or nelson@udel.edu
>This function is occasionally useful as an argument to other functions
>that require functions as arguments.  -- Guy Steele
Don_A_Corbitt@cup.portal.com (09/29/89)
Warning - posting from newcomer - Disk IO data enclosed

System: Northgate 386 16MHz
        4MB 32-bit memory on motherboard (paged - 0WS in page, else 1WS)
        RLL hard disk - 7.5Mbit/sec transfer rate
        No RAM or disk cache

Test 1 - How does transfer buffer size affect throughput under MS-DOS?

        Buffer    RLL KB/s    RAMDrive KB/s
           512      156            446
          1024      192            714
          2048      284           1027
          4096      352           1316
          8192      409           1511
         16384      445           1633
         32768      471           1700

Test 2 - Using low-level calls, how does throughput differ? These are still
the MS-DOS calls, but using read/write sector, not read/write file,
commands.

        Buffer    RLL KB/s    RAMDrive KB/s
           512      196           1245
          1024      336           2203
          2048      381           3206
          4096      387           5266
          8192      489           6526
         16384      567           7367
         32767      611           7856

Conclusion - it appears that MS-DOS does a MOVSB to copy data from an
internal buffer to the user area. I did the timing, and that almost exactly
matched the speedup we see going from file IO to raw IO on the RAM disk.

Note that this disk drive has a maximum burst transfer rate of 937KB/s, and
a maximum sustained rate of around 800KB/s (assuming 0ms seek, etc). So we
are able to get >1/2 of max performance using the filesystem, and 3/4 of
max using the low-level calls. Also, it appears that the memory-memory
bandwidth is sufficient for anything that can get into a 32-bit 16MHz slot.
Of course, generic peripherals are looking at an 8MHz 16-bit slot with slow
DMA.

Don_A_Corbitt@cup.portal.com - this non-lurking could ruin my reputation

PS - in 1984 I wrote the firmware for a 3.5" floppy drive with performance
in mind - 1:1 interleave, track skewing, etc. - for the portable Tandy
Model 100. It ran faster than any desktop PC I could find to benchmark it
against. And I haven't noticed anyone making the effort to do the same
since. So nobody cares about Disk IO?
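For anyone who wants to reproduce Test 1, a minimal sketch of the
measurement: time sequential reads of the same file at each buffer size and
report KB/s. The file name is a placeholder, and plain read() with
gettimeofday() stands in for the MS-DOS calls Don used; on a system with a
disk cache, use a file much larger than memory (or flush between passes) to
keep the numbers honest.

/*
 * Sketch of the buffer-size throughput test above: sequential reads of
 * one large file with several request sizes.  "testfile.dat" is a
 * hypothetical test file; make it big enough that each pass takes a
 * few seconds.
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

static double now(void)                 /* wall-clock seconds */
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(void)
{
    static const size_t sizes[] = { 512, 1024, 2048, 4096, 8192, 16384, 32768 };
    char *buf = malloc(32768);
    int i;

    for (i = 0; i < 7; i++) {
        int fd = open("testfile.dat", O_RDONLY);
        long total = 0;
        ssize_t n;
        double t0, secs;

        if (fd < 0 || buf == NULL) { perror("setup"); return 1; }
        t0 = now();
        while ((n = read(fd, buf, sizes[i])) > 0)   /* sequential pass */
            total += n;
        close(fd);

        secs = now() - t0;
        if (secs < 1e-6)
            secs = 1e-6;                            /* avoid divide-by-zero */
        printf("buffer %6lu: %8ld bytes, %.0f KB/s\n",
               (unsigned long)sizes[i], total, total / 1024.0 / secs);
    }
    free(buf);
    return 0;
}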
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (09/29/89)
In article <1186@sdcc13.ucsd.EDU>, pa1159@sdcc13.ucsd.EDU (pa1159) writes:
| This brings up a point: in what processing regimes is total
| sustained disk transfer rate the performance-limiting factor?

On a Cray-2, swapping! You can have programs using 2GB (yeah, that's GB) of
*real* memory, and when you swap those suckers out... disk throughput is
very important as program size gets larger.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon
chen@pooh.cs.unc.edu (Dave) (09/30/89)
In article <2045@hudson.acc.virginia.edu> gsh7w@astsun3 (Greg Scott Hennessy) writes:
>In article <34298@lll-winken.LLNL.GOV> brooks@maddog.llnl.gov (Eugene
>Brooks) writes:
>#Thinking Machines' Data Vault is a fine example of the right way to
>#build an IO system these days.
>#The system has 72 drives
>#implementing a 64 bit wide data path with one bit per drive.
>
>What are the extra 8 drives used for? Parity?
>

They are there for SEC-DED, i.e. single error correction, double error
detection. If one of the 64 drives goes bad, the data can be completely
recovered simply by accessing every word in the vault. When doing a read,
the extra 8 bits allow you to tell which bit is wrong. If two bits are
wrong, it can be detected but not corrected. The method is described in
many computer architecture books, I think, and is used in most mainframe
memory systems.

Dave
_________________________David_T._Chen_(chen@cs.unc.edu)_______________________
It's funny, I hate the itching, but I don't mind the swelling.
                                                -- David Letterman
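Dave's description maps onto the textbook Hamming-plus-overall-parity
construction. A minimal sketch follows, scaled down to 4 data bits and 4
check bits so it fits in a few lines; the 64 data + 8 check layout discussed
above (or 32 + 7) is the same idea at a larger width. This is an
illustration only, not Thinking Machines' actual controller logic.

/*
 * Minimal SEC-DED (Hamming code plus overall parity) for 4 data bits.
 * Bit positions 1..7 hold p1 p2 d0 p3 d1 d2 d3 (parity at powers of two);
 * bit 0 holds the overall parity of the other seven bits.
 */
#include <stdio.h>

static unsigned encode(unsigned d)
{
    unsigned b[8] = {0}, i, overall = 0, code = 0;

    b[3] = (d >> 0) & 1;  b[5] = (d >> 1) & 1;
    b[6] = (d >> 2) & 1;  b[7] = (d >> 3) & 1;
    b[1] = b[3] ^ b[5] ^ b[7];          /* covers positions with bit 0 set */
    b[2] = b[3] ^ b[6] ^ b[7];          /* covers positions with bit 1 set */
    b[4] = b[5] ^ b[6] ^ b[7];          /* covers positions with bit 2 set */
    for (i = 1; i < 8; i++) { overall ^= b[i]; code |= b[i] << i; }
    return code | overall;              /* overall parity in bit 0 */
}

/* Returns 0 = ok (single errors corrected in place), -1 = double error. */
static int decode(unsigned *word)
{
    unsigned b[8], i, syn = 0, overall = 0;

    for (i = 0; i < 8; i++) b[i] = (*word >> i) & 1;
    for (i = 1; i < 8; i++) { if (b[i]) syn ^= i; overall ^= b[i]; }
    overall ^= b[0];                    /* parity over all 8 bits */

    if (syn == 0 && overall == 0) return 0;       /* no error              */
    if (overall == 1) {                           /* single error: fix it  */
        *word ^= (syn == 0) ? 1u : (1u << syn);
        return 0;
    }
    return -1;                                    /* two bits flipped      */
}

int main(void)
{
    unsigned w = encode(0xB);           /* data 1011                 */
    w ^= 1u << 5;                       /* flip one "drive" (bit 5)  */
    printf("single flip corrected: %s\n", decode(&w) == 0 ? "yes" : "no");
    w ^= (1u << 3) | (1u << 6);         /* flip two bits             */
    printf("double flip detected:  %s\n", decode(&w) < 0 ? "yes" : "no");
    return 0;
}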
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (09/30/89)
Thanks for posting that data. It looks like a lot of other stuff previously
posted at the DOS call level, so I have more faith in the BIOS-level
numbers. Did you try a CORE test? It shows about 600KB/s for a cached RLL
controller.

If a disk is rotating at 3600 rpm, and there are 26 sectors of 512 bytes on
each track, the burst rate is:

	26*512*3600/60/1024 = 780

  Where:    26	sectors
           512	bytes/sector
          3600	rpm
            60	sec/min
          1024	1K
           780	kilobytes/sec

I think that shows you are getting close to "all there is."
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon
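The arithmetic generalizes to any geometry. A trivial helper, using the RLL
numbers from the post and the common 17-sector/track MFM layout for
comparison; real drives also lose a little of this to inter-sector gaps and
headers.

/* Media burst rate in KB/s from drive geometry (same arithmetic as above). */
#include <stdio.h>

static double burst_kb_per_sec(int sectors_per_track, int bytes_per_sector,
                               int rpm)
{
    double revs_per_sec = rpm / 60.0;
    return sectors_per_track * bytes_per_sector * revs_per_sec / 1024.0;
}

int main(void)
{
    printf("RLL 26 x 512 @ 3600 rpm: %.0f KB/s\n",
           burst_kb_per_sec(26, 512, 3600));   /* 780 */
    printf("MFM 17 x 512 @ 3600 rpm: %.0f KB/s\n",
           burst_kb_per_sec(17, 512, 3600));   /* 510 */
    return 0;
}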
toivo@uniwa.uwa.oz (Toivo Pedaste) (10/02/89)
>Actually, the current DataVaults have 42 drives. Though the bus to
>the DV is 64 bits wide, it is broken down into a 32-bit data path
>inside the DV. There are 32 data drives, 7 ECC drives, and 3 hot
>spares, each of which can be switched into any of the other 39
>channels.

What I've wondered about such a configuration is how you bring a disk back
on line after it has failed. Do you rebuild the information on it by
reading the other drives and using the ECC? If so, how long does it take,
and what effect does it have on the performance of the system?

Just curious.
-- 
Toivo Pedaste                     ACSNET: toivo@uniwa.uwa.oz
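The usual answer (and what Matt Kennel described earlier for the Connection
Machine disks) is to stream through the whole array, recomputing each
missing piece from the survivors and writing it to the replacement drive.
A minimal sketch follows, simplified to a single parity drive; a SEC-DED
arrangement permits the same streaming rebuild with a decode step in place
of the plain XOR. read_block()/write_block() and the drive count are
hypothetical.

/*
 * Sketch of rebuilding a failed drive from the survivors, using single
 * parity (missing block = XOR of all the others).
 */
#include <string.h>

#define NDRIVES 8             /* data + parity drives, illustrative */
#define BLOCK   4096          /* rebuild granularity in bytes       */

extern int read_block(int drive, long blkno, unsigned char *buf);        /* hypothetical */
extern int write_block(int drive, long blkno, const unsigned char *buf); /* hypothetical */

int rebuild_drive(int failed, long nblocks)
{
    unsigned char acc[BLOCK], buf[BLOCK];
    long blk;
    int d, i;

    for (blk = 0; blk < nblocks; blk++) {
        memset(acc, 0, BLOCK);
        for (d = 0; d < NDRIVES; d++) {
            if (d == failed)
                continue;                       /* skip the dead drive     */
            if (read_block(d, blk, buf) != 0)
                return -1;                      /* a second failure: stuck */
            for (i = 0; i < BLOCK; i++)
                acc[i] ^= buf[i];               /* parity of the survivors */
        }
        if (write_block(failed, blk, acc) != 0) /* write onto replacement  */
            return -1;
    }
    return 0;
}

Rebuild time is then roughly one drive's capacity divided by the streaming
rate of the slowest survivor, and any user I/O issued during the rebuild
competes with it - presumably part of why the hot spares are there, so the
rebuild can begin immediately.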
daveb@rtech.rtech.com (Dave Brower) (10/03/89)
some people wrote:
>>Cray DD-40 disk drives can support >10MB/sec through the operating
>>system (at least COS; I assume the case is also true for UNICOS).
>
>This brings up a point: in what processing regimes is total
>sustained disk transfer rate the performance-limiting factor?
>

In many tp/database/business applications, CPU is fast enough that disk
bandwidth will soon be the limiting factor for many applications. Some
airline reservation systems are said to have huge farms of disks where only
one or two tracks are used on the whole pack to avoid seeks, for instance.
A 1000 tp/s database benchmark might easily require 10MB/sec of I/O
throughput.

Maybe Cray should change markets...

-dB
-- 
"Did you know that 'gullible' isn't in the dictionary?"
{amdahl, cbosgd, mtxinu, ptsfa, sun}!rtech!daveb daveb@rtech.uucp
philf@xymox.metaphor.com (Phil Fernandez) (10/06/89)
In article <3752@rtech.rtech.com> daveb@rtech.UUCP (Dave Brower) writes:
> ... Some
>airline reservation systems are said to have huge farms of disk where
>only one or two tracks are used on the whole pack to avoid seeks, for
>instance.

No, I don't think so. I did a consulting job for United Airlines' Apollo
system a couple of years ago, looking for architectures to break the
1000 t/s limit. We looked at distributing transactions to many processors
and disks, etc., etc., but nothing quite so profligate as using only a
couple of tracks (or cyls) on a 1GB disk pack in order to minimize seeks.

On the *big iron* that UAL and other reservations systems use, the
operating systems (TPFII and MVS/ESA) implement very sophisticated disk
management algorithms, and in particular, implement elevator seeking.

With elevator seeking, disk I/O's in the queue are ordered in such a way as
to minimize seek latency between I/O operations. In an I/O-intensive tp
application with I/O's spread across multiple disk packs, a good elevator
scheduling scheme is all that's needed to get the appropriate disk I/O
bandwidth.

Makes for a good story, tho!

phil

+-----------------------------+----------------------------------------------+
| Phil Fernandez              | philf@metaphor.com                           |
|                             | ...!{apple|decwrl}!metaphor!philf            |
| Metaphor Computer Systems   |"Does the body rule the mind, or does the mind|
| Mountain View, CA           | rule the body?  I dunno..." - Morrissey      |
+-----------------------------+----------------------------------------------+
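For readers who haven't met it, a minimal sketch of elevator (SCAN)
ordering as Phil describes it: sort the pending requests by cylinder, sweep
from the current head position in one direction, then come back for the
rest. The request structure and cylinder numbers are made up for
illustration.

/*
 * Minimal elevator (SCAN) ordering of a pending-request queue: service
 * requests in cylinder order in the current direction of head travel,
 * then reverse.
 */
#include <stdio.h>
#include <stdlib.h>

struct request { int cylinder; /* ... buffer, length, etc. ... */ };

static int by_cyl(const void *a, const void *b)
{
    return ((const struct request *)a)->cylinder -
           ((const struct request *)b)->cylinder;
}

/* "Service" (here: print) the queue in elevator order, starting from the
 * head's current cylinder, sweeping upward first, then downward. */
static void elevator(struct request *q, int n, int head_pos)
{
    int i, first_above = n;

    qsort(q, n, sizeof *q, by_cyl);
    while (first_above > 0 && q[first_above - 1].cylinder >= head_pos)
        first_above--;                    /* index of first request >= head */

    for (i = first_above; i < n; i++)     /* sweep up   */
        printf("service cyl %d\n", q[i].cylinder);
    for (i = first_above - 1; i >= 0; i--)/* sweep down */
        printf("service cyl %d\n", q[i].cylinder);
}

int main(void)
{
    struct request q[] = { {98}, {183}, {37}, {122}, {14}, {124}, {65}, {67} };
    elevator(q, 8, 53);                   /* head currently at cylinder 53 */
    return 0;
}

A real driver keeps the queue sorted as requests arrive, inserting each new
one on the proper side of the head, rather than sorting per pass; the
ordering idea is the same.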
jesup@cbmvax.UUCP (Randell Jesup) (10/07/89)
>>Yup. That is because of intelligent channel processors that do DMA to
>>multi-ported memory. The same thing SCSI can do. Except with one user, we
>>only need one channel (or one for each file). But with UNIX we could use a
>>few more.
>>
> One small point to add, the amiga does DMA to dual port ram. All DMA
>goes through a bank of ram called chip ram (because the graphics coprocessors
>also use this area) while everything else runs in fast ram. This is a big
>help as the amiga does have a multitasking operating system (usually single
>user though.)

A correction: some Amiga disk controllers DMA to dual-ported memory. Some
DMA directly to system memory. A few don't use DMA at all (but are slightly
cheaper).

DMA to DP memory has some advantages, but speed isn't usually one of them
on the Amiga. This is because the data has to cross the bus at least one
extra time, and of course the processor load increases. DMA straight to
system memory (essentially any memory in the system) is faster if done
right. FIFOs are important here to avoid DMA overruns. This, combined with
the Amiga FastFileSystem, allows data to often be DMA'd directly into the
application's destination for the read (or from it for a write). This
improves performance even more.

"Chip ram" is ram that the graphics/audio/floppy/etc. coprocessors can
access directly. This is currently 1Meg (used to be 512K). Expansion
devices (like HD controllers) can DMA to any memory they want to (including
"chip ram").
-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.cbm.commodore.com  BIX: rjesup
Common phrase heard at Amiga Devcon '89: "It's in there!"
cliffhanger@cup.portal.com (Cliff C Heyer) (10/08/89)
Cliff wrote...
>> Yes, they (IBM and DEC - and all the rest, including DG) will
>>save the bandwidth and fast i/o for the *big iron* machines AND the
>>high-end "workstations". This is just what the main problem is!

Robert Cousins responded....
>If you decide that $8000 is high end, then you are right, but frankly,
>I think your facts need to be double checked. There are currently only 4
>DG 88k workstations which all have approximately the same I/O bandwidth.
>While I will admit that it is quite fast, the I/O performance of the
>low end is almost the same as the high end. The major difference is in
>CPU speed.

The "high end" is the SERVER with the VME bus. Note that the "low cost" VME
bus to be introduced for the workstations is alleged not to do block I/O,
thus limiting its throughput.

>DMA is a must for performance at any level whenever you have
>more than one task running at a time.

Yup, which is why I've been looking for a 386 board maker that has put an
ESDI or SCSI controller "on board" and bypassed the AT-bus with direct DMA
channel(s). (Like the Amiga.) So far, it looks like the Mylex MX386 is the
only one (?)

Hoping if I subscribe to Computer Architecture News I might learn more!
news@rtech.rtech.com (USENET News System) (10/10/89)
In article <829@metaphor.Metaphor.COM> philf@xymox.metaphor.com (Phil Fernandez) writes:
>In article <3752@rtech.rtech.com> daveb@rtech.UUCP (Dave Brower) writes:
>> ... Some
>>airline reservation systems are said to have huge farms of disk where
>>only one or two tracks are used on the whole pack to avoid seeks, for
>>instance.
>With elevator seeking, disk I/O's in the queue are ordered in such a
>way to minimize seek latency between I/O operations.

A number of techniques which we used on a VAX-based TP exec called the
Transaction Management eXecutive-32 (TMX-32) were:

- per-disk seek ordering - as stated above

- which-disk seek ordering - with mirrored disks, choose the disk with the
  heads closest to the part of the disk you're gonna read. (Sometimes just
  flip-flopping between the two is enough.)

- coalesced transfers - for instance, if you need to read track N, N+3 and
  N+7, it's sometimes faster to read tracks N to N+7 and sort out the
  transfers in memory (sketched after this post).

- single-read-per-spindle-per-transaction - split up heavily accessed files
  over N spindles, mapping logical record M to disk (M mod N), physical
  record (M div N), such that on average only one disk seek needs to be
  made per transaction (in parallel, of course). This is worthwhile when
  the transactions are well defined.

This task became considerably more difficult when DEC introduced the HSC-50
super-smart, caching disk controller for the VAXcluster and the RA-style
disks:

1) It was impossible to know the PHYSICAL location of a disk block, due to
   dynamic, transparent bad-block revectoring and lack of on-line
   information about the disk geometry. We placed the files carefully on
   the disk so that they started on a cylinder boundary, adjacent to other
   files, and assumed that they were "one-dimensional."

2) Some of the optimizations were done in the HSC itself, so we didn't do
   them on HSC disks (seek ordering and command ordering).

3) HSC volume shadowing made the optimizations to our home-grown shadowing
   obsolete. We kept our shadowing to use in non-HSC environments, like
   uVAXes and locally connected disks, and because it was per-file based,
   not per-volume.

Using these techniques, I ran the million-customer TP benchmark @76 TPS on
a VAX 8600 (~4 MIPS). I don't remember the $/TPS (of course), but it might
have been pretty high because there were a LOT of disk drives. We might
have eked out a few more TPS if we had physical control over the placement
of the disk blocks, but probably not more than a few. I also felt that I
never knew what the disk was 'really doing' because so much was hidden in
the HSC; being the computer programmer that I am, I wanted to know where
each head was at each milli-second :->.

(The 76 TPS bottleneck was the mirrored journal disk, which, although it
was written sequentially, still had to be written for the close of each
transaction. The next step would have been to allow multiple journal files,
but since the runner-up was about 30 TPS, we never got around to it :->.)

As an aside, for you HSC fans building this kind of stuff: it is possible
that large write I/Os to an HSC-served disk will be broken up into multiple
physical I/O operations to the disk. This means that if you are just
checking headers and trailers for transaction checkpoint consistency, you
may have bogus stuff in the middle with perfectly valid header and trailer
information if the HSC crashed during the I/O.

- bob

+-------------------------+------------------------------+--------------------+
! Bob Pasker              ! Relational Technology        !                    !
! pasker@rtech.com        ! 1080 Marina Village Parkway  ! INGRES/Net         !
! <use this address>      ! Alameda, California 94501    !                    !
! <replies will fail>     ! (415) 748-2434               !                    !
+-------------------------+------------------------------+--------------------+
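A minimal sketch of the "coalesced transfers" item in Bob's list: cover the
requested blocks with one contiguous read and copy out the pieces
afterward. The block size and read_range() below are illustrative
stand-ins, not any particular system's interface.

/*
 * Sketch of coalesced transfers: instead of issuing one I/O per requested
 * block (N, N+3, N+7, ...), read the covering range N..N+7 in a single
 * transfer and pick the wanted blocks out of memory.
 */
#include <stdlib.h>
#include <string.h>

#define BLKSIZE 512

extern int read_range(long first_blk, long nblks, unsigned char *buf);  /* hypothetical */

/* Satisfy `nreq' block requests (blks[] sorted ascending) with one device
 * I/O.  dest[i] receives the contents of block blks[i]. */
int coalesced_read(const long *blks, unsigned char **dest, int nreq)
{
    long first = blks[0];
    long span  = blks[nreq - 1] - first + 1;      /* blocks covered          */
    unsigned char *buf = malloc(span * BLKSIZE);
    int i;

    if (buf == NULL || read_range(first, span, buf) != 0) {
        free(buf);
        return -1;
    }
    for (i = 0; i < nreq; i++)                    /* sort out the transfers  */
        memcpy(dest[i], buf + (blks[i] - first) * BLKSIZE, BLKSIZE);
    free(buf);
    return 0;
}

Whether the coalesced read wins depends on the size of the gaps: dragging a
few extra blocks past the head is usually cheaper than paying another seek
plus rotational latency, which is the trade-off being exploited here.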