cliffhanger@cup.portal.com (Cliff C Heyer) (09/08/89)
Feel free to ax the following. I'm not pretending to be an IBM expert! I've tried to summarize what I've learned over the years, using non-IBM terminology. Also, don't think I'm biased for only mentioning DEC; I'm too lazy to research other hardware just for this article!

Lawrence Butcher initially stated:
>The biggest, fastest business computers seem to
>run 1960's operating systems without protection,
>and allow user programs to do I/O without OS
>assistance.

First of all, let's understand what we are comparing: "server/IPCF" vs. "timesharing" architecture. "1960's operating systems" were optimized for a specific need. A UNIX-type OS is optimized for a different need. Yes, "1960's operating systems" do "manage" their own DMA I/O in a sense. Users "share" one copy of the screen control program in "core" and declare chunks of memory ("queue"; global section w/AST in VMS terms) for communication with a "server". A server process "sleeps" until a request comes into its queue ("interrupt") for data (e.g. an airline ticket key). The server collects up to 100 keys from terminals, organizes them by disk drive and disk address, and retrieves them in sequence (just like the ladder algorithms in hard disk controllers). Large segments of the database are kept in memory on post-3033 hardware.

Remember, 1960s computers had a *blazing* 1/2 MIPS and about 500KB of core. The timesharing architecture, where you set up a separate process for each user ("ticket clerk"), simply wouldn't cut it. The overhead was too high then, and it still is today. Here's why:

1. Setting up a "user process" for each user creates a vast amount of OS overhead compared with using a simple control program which services TTY queues in round-robin fashion. The "login" to the OS which prompts for and checks your password is offloaded to communications processors, OR is not used at all due to direct leased-line connections. With timesharing, too many unnecessary resources are needlessly duplicated for each user.
Core & MIPS get used up too fast.

2. Timesharing OS "file channels" for each user create a vast amount of overhead compared with communicating through memory with interrupts (IPCF) with a server. With timesharing, too many resources are needlessly duplicated for each user.

3. Simultaneous-update lock overhead is vast compared to server lock management. With timesharing, too many resources are needlessly duplicated for each user.

4. Security. The overhead of maintaining user directories, file protections, etc. is vast compared to a server architecture. With timesharing, too many unnecessary resources are needlessly duplicated for each user. Also, with the control program approach there are no "directories" or "commands" that can be abused by hackers to break the system.

5. No data independence. If you modify the database you have to tinker with the applications. With a server, the communication is through memory, so anything can be changed so long as you maintain the communications protocol.

It's really funny, but now, 20 years later, "the rest of us" can take part in this type of technology (now that IBM's patents have expired?). The emergence of PC LANs has created a need for "server" nodes that operate on much the same principle. Products such as SYBASE are an example. I hope DEC will take the plunge into this type of architecture with its new VAX 9000 line (if they ever hope to compete with IBM). This "airline system" architecture, however, will never become as popular as timesharing systems because it's basically a vertical market product.

The conclusion of this section is that a specialized OS contributes to massive I/O rates. (IBM keeps with this trend via ESA.)

Lawrence Butcher writes:
>My new question. Is fast I/O all that micros need
>to bury mainframes? Or is user level I/O needed?
>If needed, how can simple hardware be built which
>allows direct user level DMA I/O?

Yes & yes, except watch out for an apples vs. oranges comparison.
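The server batching described above (collect keys, group by drive, sweep each drive in disk-address order like a controller's ladder algorithm) can be sketched roughly as follows. This is only an illustration: the data layout and function names are invented here, and real systems of that era did this in assembler against channel programs, not Python.

```python
# Sketch of the "ladder" (elevator) batching a transaction server might
# apply to its request queue. All names and the tuple layout are
# invented for illustration.

def schedule_batch(requests, head_pos):
    """Group queued requests by drive, then sweep each drive's requests
    in ascending disk-address order, starting at the current head
    position and wrapping around (one elevator sweep per drive)."""
    by_drive = {}
    for req in requests:                        # req = (drive, disk_addr, key)
        by_drive.setdefault(req[0], []).append(req)
    ordered = []
    for drive in sorted(by_drive):
        pending = sorted(by_drive[drive], key=lambda r: r[1])
        ahead = [r for r in pending if r[1] >= head_pos]   # serviced first
        behind = [r for r in pending if r[1] < head_pos]   # after the wrap
        ordered.extend(ahead + behind)
    return ordered

# Four queued terminal requests, two drives, head at address 100:
batch = [(1, 500, "k1"), (0, 90, "k2"), (0, 300, "k3"), (1, 10, "k4")]
print(schedule_batch(batch, head_pos=100))
```

The point of the sweep is the same one made above: servicing requests in address order turns random seeks into one pass across the platter per batch.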
What you're referring to as "user level I/O" is NOT the same as what most people might think. "User level I/O" in terms of the "1960s operating systems" refers to a vertical market control program that replaces a typical UNIX-type timesharing system. That VAST user-level I/O was & is achieved by sacrificing ALL the capability that a UNIX user has come to love.

VAX ACMS systems can't compete with IBM because they attempt to layer the "control program" on top of VMS. VMS sucks up all the CPU cycles with overhead not needed for the job being done. "Mainframe I/O" where you have 100+ channels cooking at 3MB/sec does not mesh with a timesharing-type OS. That's why IBM LAYERS all its timesharing environments (CMS, TSO) ON TOP of MVS or VM (and they run slow as death). You can't always have your cake and eat it too. The OS has to be created for the application. DEC only gives us ONE type of OS. DEC's software is the reverse of IBM's - ACMS (control program) sits on top of VMS (timesharing system), but with IBM, TSO (timesharing system) sits on top of MVS (control program). What DEC needs is a specialized OS to run the ACMS environment once it's been created on VMS.

ALSO keep in mind that the alleged "100+ MIPS" processors are actually multiprocessors. The IBM 3090-600 has 6 15-MIPS processors. The *fastest* single-user processing is 15 MIPS - NOT 100 MIPS! IBM's max uniprocessor MIPS is about the same as others in the industry. Channel CPUs to offload main-CPU interrupts help big time, as does multi-ported memory. But DEC added intelligent processors long ago also. In 1974 when the KL10 came out they started using a PDP-11 front end to offload terminal interrupts from the main processor. I think IBM is stronger with the channel CPU & multi-ported memory, though. VAX CPUs are still managing DMA and character I/O transfers. I think IBM has some *broad* patents in the multi-ported memory area that will expire soon. Enter the VAX 9000.
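The channel figures above make for easy back-of-the-envelope arithmetic. A quick sanity check (the channel count is taken at its stated minimum; this is an illustration, not a measured figure):

```python
# Aggregate channel bandwidth implied by the figures above:
# 100+ channels, each sustaining 3 MB/sec.

channels = 100                 # "100+ channels", lower bound
per_channel_mb_s = 3           # "cooking at 3MB/sec"
aggregate_mb_s = channels * per_channel_mb_s

print(aggregate_mb_s)          # 300 MB/sec aggregate, at minimum
```

Even at the lower bound, the aggregate dwarfs what a single shared bus of the period could move, which is the point being made about channel architecture.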
Also, IBM does not use DEC's "single bus" subsystem connection scheme but rather connects subsystems via "channel architecture" and multi-ported memory. This gives >100MB/sec throughput. You might wonder how IBM keeps its memory subsystem fast enough to keep up with its 15-MIPS processors. They do it the brute force way: use expensive 3ns ECL cache and 512MB of 50ns SRAM main memory to run at a cool 10ns cycle time, or 100MHz. (Compare to the *fastest* 80386 PC at 33MHz.) No wonder they cost $10 million (quoted from Sam Fuller/Amdahl).

The conclusion of this section is that massive I/O BW (again) comes from a specialized OS, but ALSO from *expensive*, *fast* memory subsystems AND channel architecture w/multi-ported memory.

Now, on the subject of PCs and disk I/O... Yes, the only thing that distinguishes an 8-MIPS PC board from an 8-MIPS VAX 6000 board is the I/O bandwidth. This comes primarily from the memory cycle time.

The PC industry is suffering from the "low margin & compatibility blues". In other words, because the PC is now a commodity, the profit margins are too slim to allow for *high speed* SRAM and ECL memory, and too slim to finance R & D - which is what is required to engineer new bus technology. New bus technology is risky to introduce and make a profit on because *everyone* has standardized on the AT bus and *most* are happy enough not to yell loud. If you were to buy a PC with minicomputer or mainframe I/O, it would cost you as much as a minicomputer or mainframe, because the *engineering* costs & material costs would be the same as the big guys'. You'd have to have a *fast* memory subsystem which would cost *big bucks*. You don't get something for nothing. These PC makers have to compete with 50 other companies that have cut-throat margins. They can't fuss around trying to make a high-bandwidth bus (w/ associated fast memory) and expect to get to market on time to keep up with the others. If you snooze you lose.
You have to keep the product flowing to keep the cash flowing.

What *will* change this, though, is the emergence of new low cost chips such as the NCR53C700 SCSI chip, and the fall in price of SRAM or faster DRAM (actually *vastly* faster DRAM). When the PC makers can use low cost chips for high bandwidth rather than engineer their own bus design, they will do it. They will be able to put ESDI/SCSI interfaces ON the motherboard and BYPASS the AT bus.

The thing about "mainframes" is that these guys ALWAYS thought BIG. Did you know the first hard disk in the late 50s contained 5MB? NOT 50KB like early floppies. These guys knew how much data they had to store and weren't going to fuss around with a 50KB hard drive. Sure, the 5MB drive was as big as a refrigerator, but so what? You're charging $100K for it.

The same is true for I/O bandwidth. They knew how much data they had to process and what the time limits were (e.g. get 100,000 paychecks of 64 bytes each out by Friday). If you ran the system 24 hours, that came out to, say, 10MB/sec of I/O. This increased to 16MB/sec in the early 70s, and to 70MB/sec in the early 80s. Did they ever fuss around with a 500KB/sec AT bus? Yeah, right. Mainframe disk I/O exceeded the AT bus around 1959. These guys KNOW how to do it fast.

But these companies are also publicly owned and must make money for their stockholders. If they start blowing out 80386 PCs with 2MB/sec sustained disk I/O on multiple simultaneous drives, who will buy their mainframes? (E.g., how many American Airlines Sabre systems are there?) Also, they can't do that and stay competitive with the clones at the moment. So they market "incremental improvements" which are, say, 10% faster than the last desktop model, but still use "off the shelf" chips and boards. So you are continually forced to buy new hardware to keep up with the best. It *does* cost more to engineer a fast bus than to use off-the-shelf components and slap together another clone.
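The batch-window sizing argument above is just throughput arithmetic: total data divided by the deadline gives the sustained I/O rate you must engineer for. A small sketch (the input volume here is invented to make the numbers come out round; the article's own payroll figures are only a qualitative example):

```python
# Batch-window sizing: given a data volume and a deadline, what
# sustained I/O rate is required? The 864,000 MB figure below is an
# invented example volume, chosen only to illustrate the calculation.

def required_rate_mb_s(total_mb, window_hours):
    """Sustained MB/sec needed to move total_mb within window_hours."""
    return total_mb / (window_hours * 3600)

# A job that must pass 864,000 MB of master files in a 24-hour window:
print(required_rate_mb_s(864_000, 24))   # -> 10.0 MB/sec sustained
```

Work the same arithmetic backwards from a 500KB/sec AT bus and you see why mainframe designers never considered it: a 24-hour window at that rate moves only about 42 GB.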
And since the market is NOT demanding a fast PC bus, why go to all the trouble? Maybe you'll lose money. If you do put a fast bus together, you'll point it towards the workstation market where the profit margins will support the engineering costs. But as I said above, when fast cheap I/O chips can be used by board makers to bypass the AT bus, that's when the sparks will fly and the mainframe folks will be running for the hills. Ask yourself why DEC is planning to lay off 25,000 people over the next few years. The writing is on the wall. Soon CISC computers will be "all invented." Please POST your comments...
mat@uts.amdahl.com (Mike Taylor) (09/09/89)
In article <21962@cup.portal.com>, cliffhanger@cup.portal.com (Cliff C Heyer) writes:
> Feel free to ax the following.

OK. Just a few editorial corrections on the "facts."

> "Mainframe I/O" where you have 100+ channels
> cooking at 3MB/sec does not mesh with a
> timesharing-type OS.

Check Amdahl's UTS for a counter-example.

> ALSO keep in mind that the alleged "100+ MIPS"
> processors are actually multiprocessors. The IBM
> 3090-600 has 6 15 MIPS processors.

Uniprocessor performance is over 20 MIPS. But be careful. This same processor would run more like 40-60 MIPS with a Unix system that has a good compiler. The reason for the difference is the relatively poor locality of reference encountered in an MVS environment, causing many more cache misses, and the fact that most of the code has had no attention paid to pipeline scheduling. (Oversimplification.)

> IBM's max uniprocessor MIPS is about the same as
> others in the industry.

Amdahl's uniprocessor MIPS are more than 50% higher than IBM's. We achieve more throughput with four CPUs than IBM does with 6.

> You might wonder how IBM keeps its memory
> subsystem fast enough to keep up with its 15 MIPS
> processors. They do it the brute force way. Use
> expensive 3ns ECL cache and 512MB of 50ns SRAM
> main memory to run at a cool 10ns cycle time or
> 100MHz.

Actually, Amdahl runs at 10ns. IBM's cycle time is (I think) 15 ns.
-- 
Mike Taylor ...!{hplabs,amdcad,sun}!amdahl!mat
[ This may not reflect my opinion, let alone anyone else's. ]
mash@mips.COM (John Mashey) (09/09/89)
In article <feMc02Wi53Sr01@amdahl.uts.amdahl.com> mat@uts.amdahl.com (Mike Taylor) writes:
>Uniprocessor performance is over 20 MIPS. But be careful.
>This same processor would run more like 40-60 MIPS with

Mike said it, but it bears re-emphasis: IBM-mips and VAX-mips are not the same, so be real careful when comparing them. The moderate number of benchmarks we've seen in common bear out Mike's description of 40-60 (conservative-vs-VMS) VAX-mips / CPU.
-- 
-john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: {ames,decwrl,prls,pyramid}!mips!mash OR mash@mips.com
DDD: 408-991-0253 or 408-720-1700, x253
USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
seanf@sco.COM (Sean Fagan) (09/10/89)
In article <21962@cup.portal.com> cliffhanger@cup.portal.com (Cliff C Heyer) writes:

Lots of misconceptions.

>Remember 1960s computers had a *blazing* 1/2 MIPS
>and about 500KB core.

Hmm. 360's and 370's, which are from the '60's, certainly had more than that. CDC Cybers, with which I am most concerned (of course 8-)), also had more than that (256 Kwords of memory, 3-10 MIPS / MFLOPS).

>"Mainframe I/O" where you have 100+ channels
>cooking at 3MB/sec does not mesh with a
>timesharing-type OS.

*What*?! It works *best* with a timesharing OS, because it's more likely to be utilized!

>That's why IBM LAYERS all
>its timesharing environments (CMS, TSO) ON TOP of
>MVS or VM (and they run slow as death). You can't
>always have your cake and eat it too.

CDC's NOS, which runs on CDC 170-state Cybers, is an interactive (of sorts 8-)) timesharing OS. True, it was based on a batch system, but it is now a true TSOS, and it is *very* efficient. In fact, I believe that NOS nowadays is more efficient (for I/O, running jobs, etc.) than the last version of the batch-only OS was...

>The OS has
>to be created for the application.

Not necessarily. Witness UNIX, which can, with a few modifications, be *very* quick and efficient. I, personally, think Mach is going in the correct direction. If you make your OS do minimal work, efficiently and properly, everything else falls out as a library, which need not be used. What you're talking about is running a *single* job on a computer, and this is not cost-efficient anymore (not for minis and higher, I don't believe). It is cheaper to have a slightly slower OS and application pair.

>ALSO keep in mind that the alleged "100+ MIPS"
>processors are actually multiprocessors. The IBM
>3090-600 has 6 15 MIPS processors. The *fastest*
>single user processing is 15 MIPS - NOT 100 MIPS!
>IBM's max uniprocessor MIPS is about the same as
>others in the industry.

Not everybody has that problem.
CDC Cybers are blazingly fast; no single-CPU version gets 100+ MIPS, but they are very fast, faster than 15 MIPS to be sure (I think the 990 [single CPU] gets something like 50-60 MIPS / MFLOPS). Don't think that the way IBM does it is the way it has to be done. As the saying goes, there's more than one way to skin a cat...

>Yes, the only thing that distinguishes an 8-MIPS
>PC board from an 8-MIPS VAX 6000 board is the I/O
>bandwidth. This comes primarily from the memory
>cycle time.

Not really. It comes from the memory bandwidth, which is a bit different (ok, not a whole lot, but a bit).

>If you were to buy a PC with minicomputer or
>mainframe I/O - it would cost you as much as a
>minicomputer or mainframe. Because the
>*engineering* costs & material costs will be the
>same as the big guys. You'd have to have a *fast*
>memory subsystem which would cost *big bucks*.

Naturally. It also wouldn't be PC compatible (bus-wise). However, both MCA and EISA offer some interesting possibilities (I haven't seen anybody *use* them, so there might be good reasons for that) in Real(tm) I/O.

>[the first] disk in the late 50s contained 5MB? NOT 50KB like
>early floppies. These guys knew how much data they
>had to store and weren't going to fuss around with
>a 50KB hard drive.

Did you know how *large* the thing was compared to a floppy? That's why the floppy exists: as an alternative to hard disks (well, there are, again, some other reasons, but that's what it can be considered as). If you're going by that reasoning, you should also point out that the tape drives held several hundred megabytes. Are they a viable alternative? No. Different needs, different uses, different solutions, different markets.
-- 
Sean Eric Fagan | "Time has little to do with infinity and jelly donuts."
seanf@sco.COM | -- Thomas Magnum (Tom Selleck), _Magnum, P.I._
(408) 458-1422 | Any opinions expressed are my own, not my employers'.
lamaster@ames.arc.nasa.gov (Hugh LaMaster) (09/11/89)
In article <3289@scolex.sco.COM> seanf@sco.COM (Sean Fagan) writes:
>In article <21962@cup.portal.com> cliffhanger@cup.portal.com (Cliff C Heyer) writes:
>>bandwidth. This comes primarily from the memory
>>cycle time.
>
>Not really. It comes from the memory bandwidth, which is a bit different
>(ok, not a whole lot, but a bit).

*If* this were really true, it ought to be cheap to build a mainframe-style memory subsystem. One I know of has 128 banks and 32 ports. Each port can do one read/write per clock period. I *wish* it were simple to build micros with this capability.

Hugh LaMaster, m/s 233-9, UUCP ames!lamaster
NASA Ames Research Center ARPA lamaster@ames.arc.nasa.gov
Moffett Field, CA 94035 Phone: (415)694-6117
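The peak bandwidth of a banked, multi-ported memory like the one Hugh describes can be estimated with simple arithmetic. Note the word size and clock period below are assumptions added for illustration; the post gives only the bank and port counts.

```python
# Peak-bandwidth estimate for a multi-ported, banked memory subsystem.
# Assumed for illustration: 8-byte words and a 10 ns clock. The post
# itself specifies only "128 banks and 32 ports, one read/write per
# clock period per port".

ports = 32
word_bytes = 8                         # assumption
clock_ns = 10                          # assumption
refs_per_port_per_sec = 1e9 / clock_ns
peak_bytes_per_sec = ports * word_bytes * refs_per_port_per_sec

print(peak_bytes_per_sec / 1e9)        # -> 25.6 GB/sec, if every port fires every clock
```

The 128 banks exist so that, with addresses interleaved across them, 32 simultaneous references rarely collide on the same bank; the port count, not the bank count, sets the peak.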
ggw@wolves.uucp (Gregory G. Woodbury) (09/12/89)
In article <21962@cup.portal.com> cliffhanger@cup.portal.com (Cliff C Heyer) writes:
>Now, on the subject of PCs and disk I/O...
:
>What *will* change this though is the emergence of
>new low cost chips such as the NCR53C700 SCSI
>chip, and the fall in price of SRAM or faster DRAM
>(actually *vastly* faster DRAM) When the PC makers
>can use low cost chips for high bandwidth rather
>than engineer their own bus design, they will do
>it. They will be able to put ESDI/SCSI interfaces
>ON the motherboard and BYPASS the AT bus.
:
:
>And since the market is NOT demanding a fast PC
>bus, why go to all the trouble? Maybe you'll lose
>money. If you do put a fast bus together, you'll
>point it towards the workstation market where the
>profit margins will support the engineering costs.

So what in the hell is a "workstation"? I have been unable to make any real distinction between a "workstation" and a "high-performance PC". The distinction (if there is one) is purely marketing hype.

We are/have been replacing our dependency on mainframe computing by acquiring a network of dedicated, "high performance" (and relatively) low cost "PC's". Some of these things are "workstations", but they all use the AT-style PC bus, and take too bloody long to do the disk I/O. It is cheaper (more effective) for us to have an Intergraph Clipper chipset mounted in an AT-class PC take 8 hours to run our application than to have an IBM-3081K do it in 3 and then send us back the results. With the batch-environment set-up and post-processing overhead, we get better throughput from the local PC's. What's more, given the typical restrictions placed on jobs in botched (pun intended) mainframe environments, we can deal with bigger memory sizes (with no change in cost) on the "PC's".

Here are some real numbers: an 88000-based co-processor in a '386 33 MHz PC with 1.2 GBytes of disk and 24 MB of memory can cost ~$30,000. This is amortizable over several years.
The full cost recovery for this pc/workstation (assuming 40 hr weeks!) is < $6.00/hour! The University-owned (consortium) mainframe limits normal jobs to 5.5MB of memory and costs >$150.00/hour! Our application benchmark took ~1 hour to run on the 3081. It takes about 1.8 hours to run on the 88000, and costs a hell of a lot less 'cause there isn't a memory premium overhead on the cost!

If I could have this processor (88000) in a machine with a decently fast bus AND at nearly the same cost, then it would be perfect. But once you get away from the AT-bus and such, the costs get unreasonable. We priced an NS32532 in an AT-bus card (co-processor) and an NS32532 in a VME-bus box. Holding all other things equal/equivalent, the VME-bus-based box would have cost us THREE times the cost of the co-processor configuration!

I like reading the architecture group here; it just gets so frustrating when I hear the technical people waxing ecstatic about their latest new toy and not realizing that just behind their "cutting edge" of research and development there is the "bleeding edge" of people trying to use this stuff in a reasonable (and cost-efficient!) manner.
-- 
Gregory G. Woodbury Sysop/owner Wolves Den UNIX BBS, Durham NC
UUCP: ...dukcds!wolves!ggw ...dukeac!wolves!ggw [use the maps!]
Domain: ggw@cds.duke.edu ggw@ac.duke.edu ggw%wolves@ac.duke.edu
Phone: +1 919 493 1998 (Home) +1 919 684 6126 (Work)
[The line eater is a boojum snark! ] <standard disclaimers apply>
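Gregory's "< $6.00/hour" figure can be reproduced with straight amortization arithmetic. The three-year service life below is an assumption added for illustration (his post says only "several years"):

```python
# Amortization arithmetic behind the "< $6.00/hour" cost-recovery
# figure above. The three-year life is an assumed value; the post
# says only "amortizable over several years".

cost_dollars = 30_000                  # "~$30,000", per the post
years = 3                              # assumption
hours = years * 52 * 40                # 40-hour weeks, as the post assumes

print(round(cost_dollars / hours, 2))  # -> 4.81 dollars/hour, i.e. under $6
```

Against the quoted >$150.00/hour mainframe rate, the 1.8x slower runtime is a bargain, which is the whole argument of the post.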
suitti@haddock.ima.isc.com (Stephen Uitti) (09/13/89)
In article <1989Sep12.031453.22947@wolves.uucp> ggw@wolves.UUCP (Gregory G. Woodbury) writes:
>In article <21962@cup.portal.com> cliffhanger@cup.portal.com (Cliff C Heyer) writes:
>>Now, on the subject of PCs and disk I/O...
>>What *will* change this though is the emergence of
>>new low cost chips such as the NCR53C700 SCSI chip...[RAM, etc]
>
>We are/have been replacing our dependency on mainframe computing
>by acquiring a network of dedicated, "high performance" (and relatively)
>low cost "PC's". Some of these things are "workstations", but they all
>use the AT style PC bus, and take too bloody long to do the disk i/o.
>
>It is cheaper (more effective) for us to have...
>...an 88000 based co-processor in a '386
>33 MHz PC with 1.2 GBytes of disk and 24 MB of memory can cost ~$30,000.
>This is amortizable over several years.

Three years, max.

>The full cost recovery for this
>pc/workstation (assuming 40 hr weeks!) is < $6.00/hour! The University
>owned (consortium) mainframe limits normal jobs to 5.5MB of memory and costs
>>$150.00/hour! Our application benchmark took ~1 hour to run on the 3081.
>It takes about 1.8 hours to run on the 88000, and costs a hell of a lot less
>'cause there isn't a memory premium overhead on the cost!

I've worked at "University comp centers". The big cost is not the hardware. It is the people. This is true even when the machines are huge/expensive. These centers typically serve many people with small needs. They purchase machines that are extremely cost effective for the work that gets done on them. They typically don't optimize machines for individual users. They tend to discourage serious use by charging for hardware time, rather than charging for the (expensive) administrative overhead. The sites i worked at used VAXen, IBMs, etc. One site had about 100 PC's (some were Macs). It is cheaper (capital outlay) to be your own sysop on your own PC. Most people only do word processing and a few compiles. No big deal.
My 4.77 MHz 8088-based PC/XT can do this. $1000 and you are set for life. Still, we had users who didn't know how to boot a PC.

If your comp center bought your system for your exclusive use, it would obviously cost more than $6/hr. My salary was more than that. Ok, so i ran several machines... a big comp center has lots more overhead. The comp center will attempt to account for time on the machine. Sigh. If you're lucky, there will be resources left to grant to users.

I did see one prof get a machine, real cheap. He then expected it to run itself. The OS didn't magically get easier to run just because the machine was cheap. When his disk crashed, and no backups had been done, there wasn't a whole lot that anyone could do, at any price.

Oh, yes, maintenance. Do you think your PC won't break, being used 40 hrs/week? Get a good contract. If you don't need a sysop, then don't hire one. If you are the default sysop, make sure it doesn't take all your time - pay some attention to the application.

For new and clueless users who don't have a lot of money, buy a Mac. Sign up for a maintenance contract. Maybe take some classes in word processing. Or read the manuals...

Stephen.
karl@ficc.uu.net (Karl Lehenbauer) (09/15/89)
In article <21962@cup.portal.com> cliffhanger@cup.portal.com (Cliff C Heyer) writes:
>[the first] disk in the late 50s contained 5MB? NOT 50KB like
>early floppies. These guys knew how much data they
>had to store and weren't going to fuss around with
>a 50KB hard drive.

The SAGE early-warning computer had a drum storage unit that was about five by eight feet by six feet high. It stored around 128 Kbytes. It was retired in the early 1980s (!). They have one (drum unit, front panel, etc.; not everything) at the Boston Computer Museum.

You wanna tell us how good computing was in the fifties, go hop on a period machine, say a Bendix G15 (I think I know where you can find one), and try to use it to do *anything*.
-- 
-- uunet!ficc!karl "Have you debugged your wolf today?"
mcdonald@uxe.cso.uiuc.edu (09/18/89)
>Many 386 PCs give values of about only 200KB/sec which is not
>a "SCREAM" unless you compare to the PC XT or something (which
>is what manufacturers do to mislead you.) This is with SCSI too!!!
>Perhaps you could try this file copy on your machine and see
>what you get? I have heard of a few machines other than SUN that
>can do 750KB/sec. Use MSDOS to do this so there is no program
>overhead. If you only have one disk, you can copy to the NULL
>device (in which case don't divide B by 2).

I tried it on my Dell 310 with a 150 meg disk. Reading a 3-megabyte file (copying to an 8192*7 byte buffer) and doing nothing with the results gave 570 kByte/second. Copying the file from one disk to a different part of the same disk gave 440 kByte/second (after the divide by two). This is not stupendous, but then again it is by no means 200 kByte/second. This was using pure DOS calls, contiguous files, written using interrupt calls in C (not the C library). However, the actual copy made fearsome chuggings of the disk head, probably due to the necessity of updating the FAT. It was not as bad as the vendor-supplied disk test program, though. Oh yes, I am using a 160 kbyte disk cache (Microsoft Smartdrive). Obviously, this cache cannot contain the whole 3 meg file! (What it does is buffer tracks.)

Doug McDonald
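For anyone who wants to repeat Doug's read test on their own machine, the same measurement can be sketched portably: read a large file through a fixed-size buffer and divide bytes moved by elapsed time. This is only a rough stand-in for his DOS interrupt-call version (it goes through the OS's buffered I/O, so, as with his SMARTDrive caveat, the file must be much larger than any cache), and the file name is a placeholder.

```python
# Portable sketch of the read-throughput test described above: read a
# file through an 8192*7-byte buffer (Doug's buffer size) and report
# KB/sec. A stand-in for the DOS interrupt-call original, not a
# faithful reproduction of it.

import time

def read_throughput_kb_s(path, bufsize=8192 * 7):
    """Return sustained read rate in KB/sec for the file at `path`."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(bufsize)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / elapsed / 1024

# Usage (file name is a placeholder; use a file well over cache size):
#   print(read_throughput_kb_s("bigfile.dat"))
```

As in the thread's copy test, remember that a copy involves both a read and a write, hence the "divide B by 2" convention when timing copies rather than plain reads.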