johnl@iecc.cambridge.ma.us (John R. Levine) (05/06/91)
In article <8324@uceng.UC.EDU> dmocsny@minerva.che.uc.edu (Daniel Mocsny) writes:
>Are we likely to see the fastest CPU in year X being able to run,
>without change, a binary program more than 5 years old? ...

Well, there's always the IBM 360.  You can still run 1965 vintage 360
binaries on IBM's latest 3090 mainframe.

The 360 architecture has stood the test of time surprisingly well, better
I think than the 360 extensions.  The 360 had simple instruction decoding,
strict data alignment rules, and a large and uniform register set, which
made it relatively easy to speed up.  (Yes, it also had things like
edit-and-mark, which is a disaster in a paging system; they weren't
totally prescient.)
--
John R. Levine, IECC, POB 349, Cambridge MA 02238, +1 617 492 3869
johnl@iecc.cambridge.ma.us, {ima|spdcc|world}!iecc!johnl
Cheap oil is an oxymoron.
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (05/06/91)
In article <1991May05.174756.9026@iecc.cambridge.ma.us> johnl@iecc.cambridge.ma.us (John R. Levine) writes:
| Well, there's always the IBM 360.  You can still run 1965 vintage 360
| binaries on IBM's latest 3090 mainframe.

And Honeywell.  Things compiled in GECOS-II in 1963 or so seem to run on
the latest Honeywell DPS systems as well.
--
bill davidsen (davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"Most of the VAX instructions are in microcode, but halt and no-op
are in hardware for efficiency"
haynes@felix.ucsc.edu (99700000) (05/07/91)
In article <1991May05.174756.9026@iecc.cambridge.ma.us> johnl@iecc.cambridge.ma.us (John R. Levine) writes:
| Well, there's always the IBM 360.  You can still run 1965 vintage 360
| binaries on IBM's latest 3090 mainframe.

Hmmm.  Wonder if you can still emulate 1401 machine code on a 3090?
dmocsny@minerva.che.uc.edu (Daniel Mocsny) (05/07/91)
In article <1991May05.174756.9026@iecc.cambridge.ma.us> johnl@iecc.cambridge.ma.us (John R. Levine) writes:
>In article <8324@uceng.UC.EDU> dmocsny@minerva.che.uc.edu (Daniel Mocsny) writes:
>>Are we likely to see the fastest CPU in year X being able to run,
>>without change, a binary program more than 5 years old? ...
>Well, there's always the IBM 360.  You can still run 1965 vintage 360
>binaries on IBM's latest 3090 mainframe.

That is truly impressive; in fact, it's rather astounding.  But I see I
left cost out of my question.  So let me try another wrinkle:

How does a 3090 stack up against modern workstations on the usual
measures of performance/price, such as SPECmarks/$?  My guess would be
that the large backwards compatibility comes at a price.

Also, how much slower and/or more expensive is the 3090 as a result of
maintaining such backwards compatibility?  (I realize that might be hard
to get a handle on.)
--
Dan Mocsny
Internet: dmocsny@minerva.che.uc.edu
johnl@iecc.cambridge.ma.us (John R. Levine) (05/07/91)
In article <8346@uceng.UC.EDU> you write:
>How does a 3090 stack up against modern workstations on the usual
>measures of performance/price, such as SPECmarks/$?  My guess would
>be that the large backwards compatibility comes at a price.

It's hard to compare, since the 3090 is a mainframe, not a workstation,
which means that it has I/O bandwidth orders of magnitude better than
anything you'd see on or next to a desk.  High-end 3090 installations run
on-line systems which handle 1000 transactions/second (that's per second,
not per hour), and there's nothing anywhere near comparable in
workstation-land.  A bunch of micros sharing data over a network turns
out not to do the trick, because you end up with intolerable hot spots in
the data base.

>Also, how much slower and/or more expensive is the 3090 as a result
>of maintaining such backwards compatibility?  (I realize that might be
>hard to get a handle on.)

No question, it's not cheap.  Some of the stuff they have to do is
extremely gross.  The worst example is an execute instruction which
points to a translate-and-test (TRT) instruction.  The TRT has two memory
operands and looks up each byte of the first, using the second as the
lookup table, until it finds a table entry that's non-zero.  This means
that the length of the first operand depends on its contents.  360
instructions are not continuable, so since the execute, the TRT, and both
operands can each potentially span a page boundary, the CPU can need to
touch as many as 8 pages.  To tell whether it needs the 8th page, it does
a "trial execution" of the instruction that doesn't store any results
before actually doing the instruction.  There's even more internal hair
than that, since the 3090 has lots of fault-tolerance hardware and takes
microcode checkpoints at several places in a complex instruction.
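[For readers who haven't met TRT, here is a rough C model of the semantics described above. The function name `trt` and the C types are illustrative, not IBM's; a real TRT also sets a condition code and registers, which this sketch reduces to a return value.]

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Approximate TRT (translate-and-test) semantics: scan up to `len`
 * bytes of `src`, using `table` (256 entries) as a lookup table, and
 * stop at the first byte whose table entry is non-zero.  Returns the
 * index of that byte, or -1 if no non-zero entry was found.  The
 * data-dependent stopping point is exactly what makes the length of
 * the first operand unpredictable before execution.
 */
long trt(const uint8_t *src, const uint8_t table[256], size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (table[src[i]] != 0)
            return (long)i;   /* non-zero "function byte": scan stops */
    return -1;                /* scanned everything; nothing found    */
}
```

A table with non-zero entries only for delimiter characters turns TRT into a one-instruction character-class scan, which is why compilers and parsers on the 360 loved it.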
That's the worst; a more typical instruction is "add", which computes an
address by adding together one or two registers and a 12-bit offset in
the instruction, picks up the word at that address, and adds it to a
target register.  Other than the three-input adder for address
generation, that's pretty straightforward.
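[The address computation just described can be sketched in C. The `Regs` type and `rx_addr` name are invented for illustration; the genuine 360 details reflected here are that a base or index field of 0 contributes zero rather than the contents of register 0, and that original-360 addresses were 24 bits.]

```c
#include <stdint.h>

typedef struct { uint32_t r[16]; } Regs;  /* the 16 general registers */

/*
 * RX-format effective-address generation: base register + index
 * register + 12-bit displacement, masked to the 24-bit address space
 * of the original 360.  Register number 0 means "no register", i.e.
 * contributes zero to the sum.
 */
uint32_t rx_addr(const Regs *regs, unsigned b, unsigned x, uint32_t d12)
{
    uint32_t base  = (b == 0) ? 0 : regs->r[b];
    uint32_t index = (x == 0) ? 0 : regs->r[x];
    return (base + index + (d12 & 0xFFF)) & 0x00FFFFFF;
}
```

The "add" instruction would then fetch the word at this address and add it to the target register; the three-input adder in the sketch is the one line computing `base + index + d12`.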
greg@organia.sce.carleton.ca (Greg Franks) (05/07/91)
In article <8346@uceng.UC.EDU> dmocsny@minerva.che.uc.edu (Daniel Mocsny) writes:
>In article <1991May05.174756.9026@iecc.cambridge.ma.us> johnl@iecc.cambridge.ma.us (John R. Levine) writes:
>>In article <8324@uceng.UC.EDU> dmocsny@minerva.che.uc.edu (Daniel Mocsny) writes:
>>>Are we likely to see the fastest CPU in year X being able to run,
>>>without change, a binary program more than 5 years old? ...
>>Well, there's always the IBM 360.  You can still run 1965 vintage 360
>>binaries on IBM's latest 3090 mainframe.
>
>That is truly impressive, in fact, it's rather astounding.  But I see
>I left cost out of my question.  So let me try another wrinkle:
>
>How does a 3090 stack up against modern workstations on the usual
>measures of performance/price, such as SPECmarks/$?  My guess would
>be that the large backwards compatibility comes at a price.
>
>Also, how much slower and/or more expensive is the 3090 as a result
>of maintaining such backwards compatibility?  (I realize that might be
>hard to get a handle on.)

However, people who purchase IBM 3090's are not interested in
SPECmarks/$$, because they are more interested in MBytes/second of
transfer from the disk farm to the CPU and back.  Most workstations with
good SPECmarks/$$ fall flat on their faces when it comes to I/O systems.
(They also want to run five levels of emulation so that their ancient
accounting program doesn't have to be changed :-)
--
Greg Franks, (613) 788-5726               | "The reason that God was able to
Systems Engineering, Carleton University, | create the world in seven days is
Ottawa, Ontario, Canada K1S 5B6.          | that he didn't have to worry about
greg@sce.carleton.ca ...!cunews!sce!greg  | the installed base" -- Enzo Torresi
ward@vlsi.waterloo.edu (Paul Ward) (05/07/91)
In article <8346@uceng.UC.EDU> dmocsny@minerva.che.uc.edu (Daniel Mocsny) writes:
>That is truly impressive, in fact, it's rather astounding.  But I see
>I left cost out of my question.  So let me try another wrinkle:
>
>How does a 3090 stack up against modern workstations on the usual
>measures of performance/price, such as SPECmarks/$?  My guess would
>be that the large backwards compatibility comes at a price.

A meaningless question - how can you possibly compare the price/performance
of a workstation (typically a single- or few-user machine) with that of an
IBM mainframe which can support 400+ users concurrently?  If anything, the
major difference between PCs, workstations, minis, and mainframes is the
I/O bandwidth, not the processor performance.  What good are 500 MIPS and
50 MFLOPS if you are waiting so long for an I/O operation to complete that
the real performance is ~5 MIPS and 0.5 MFLOPS?  (The same applies to the
memory subsystem - you have to keep the processor fed, or it will stall.)

>Also, how much slower and/or more expensive is the 3090 as a result
>of maintaining such backwards compatibility?  (I realize that might be
>hard to get a handle on.)

Paul Ward.
--
"One can certainly imagine the myriad of uses for a hand-held
iguana maker." - Hobbes.
sysmgr@KING.ENG.UMD.EDU (Doug Mohney) (05/07/91)
In article <1991May7.130302.22332@vlsi.waterloo.edu>, ward@vlsi.waterloo.edu (Paul Ward) writes:
>A meaningless question - how can you possibly compare price performance of
>a workstation (typically a single or a few user machine) with an IBM mainframe
>which can support 400+ users concurrently?  If anything, the major difference
>between PCs, workstations, minis and mainframes is the IO bandwidth, not
>the processor performance.

And the service contracts :-)

Signature envy: the tendency of some people to put 24+ lines in their .sigs
--
 > SYSMGR@CADLAB.ENG.UMD.EDU <
--
c506634@umcvmb.missouri.edu (Eric Edwards) (05/09/91)
In article <1991May7.130302.22332@vlsi.waterloo.edu> ward@vlsi.waterloo.edu (Paul Ward) writes:
> A meaningless question - how can you possibly compare price performance of
> a workstation (typically a single or a few user machine) with an IBM mainframe
> which can support 400+ users concurrently?  If anything, the major difference
> between PCs, workstations, minis and mainframes is the IO bandwidth, not
> the processor performance.  What good is 500 MIPS and 50 MFLOPS if you are
> waiting so long for an I/O operation to complete that the real performance is
> ~5 MIPS and 0.5 MFLOPS.  (The same applies to the memory subsystem - you have

It's not entirely meaningless.  Many jobs are CPU-, not I/O-bound.
Compatibility aside, you would have to be a fool to use an IBM mainframe
for that.  They don't even come close to being competitive with
workstations on CPU performance.  Can *all* the price/performance
difference be attributed to the presence or absence of a high-speed I/O
system?

Also, is there anything to prohibit a RISC-based machine from having a
high-speed I/O subsystem?  Would adding this make the machine cost as
much as a 3090?

Eric Edwards: c506634 @        "I say we take off and nuke the entire site
Inet: umcvmb.missouri.edu       from orbit.  It's the only way to be sure."
Bitnet: umcvmb.bitnet           -- Sigourney Weaver, _Aliens_
ward@vlsi.waterloo.edu (Paul Ward) (05/09/91)
In article <c506634.3284@umcvmb.missouri.edu> c506634@umcvmb.missouri.edu (Eric Edwards) writes:
>It's not entirely meaningless.  Many jobs are CPU, not I/O bound.
>Compatibility aside, you would have to be a fool to use an IBM mainframe
>for that.  They don't even come close to competitive with workstations on
>CPU performance.  Can *all* the price performance difference be attributed
>to the presence or absence of a high speed IO system?

I agree that many jobs are CPU-bound - but take a closer look at them.
Take simulation as an example - suppose you want to simulate 10,000,000
logic gates in some design.  (BTW, there is nothing on the market that
can do this at the moment.)  It looks like a classic CPU-bound problem.
However, it is so large that no workstation memory system can handle it.
You need virtual memory.  But again, it is so large that you will spend
forever just swapping pages between disk and memory.  What is required is
a very large memory (100s of MB) and a very fast disk I/O subsystem.

>Also, is there anything to prohibit a RISC based machine from having a high
>speed IO subsystem?  Would adding this make the machine cost as much as a
>3090?

I don't know, but it is an interesting question.  Do you have $20,000,000?
We can try a little experiment.  :-)

Paul Ward
University of Waterloo
--
"One can certainly imagine the myriad of uses for a hand-held
iguana maker." - Hobbes.
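[A back-of-envelope check of the sizes Ward claims. The per-gate cost below is a pure assumption for illustration - say one state byte plus four 4-byte fanout pointers per simulated gate - not anything from the thread.]

```c
/*
 * Estimate the memory footprint of a gate-level simulation, in MB,
 * given a gate count and an assumed per-gate byte cost.
 */
long gate_sim_megabytes(long gates, long bytes_per_gate)
{
    return gates * bytes_per_gate / (1024L * 1024L);
}
```

With 10,000,000 gates at an assumed 17 bytes each, this comes out at roughly 160 MB - comfortably in the "100s of MB" range Ward cites, and well beyond typical early-1990s workstation memory.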
johnl@iecc.cambridge.ma.us (John R. Levine) (05/09/91)
In article <c506634.3284@umcvmb.missouri.edu> you write:
>Can *all* the price performance difference be attributed
>to the presence or absence of a high speed IO system?

No, a lot of it is due to high-performance shared memory and a lot of
reliability and serviceability hardware and microcode.

The 370/ESA I/O system does some impressive stuff in the interest of
performance.  For example, each disk drive is typically attached to
several controllers, each of which is attached to several channels.  This
means that there are typically four or more different physical device
addresses for each disk.  One of the improvements in the new I/O system
is that the CPU just issues an operation for the logical disk, and the
channels find a path that isn't already in use.  There is also a lot of
buffering in disk controllers, as much as 128MB (that's MB, not KB).

In fairness, there is also a lot of glop, particularly in the disks, to
support designs that made a lot more sense in 1964 than they do now.
Traditional IBM disks allow variable-length hardware disk blocks, and
each block can have a key of up to 256 bytes.  You can have the disk
controller search down a track or cylinder looking for a particular key.
This made perfect sense for ISAM on a 360/30, when the CPU stopped during
disk I/O anyway, but it's pretty awful now.  IBM has for 20 years had
more reasonable index schemes based on B-trees, and disks with fixed-size
blocks addressed by block number rather than by cylinder, track, and
record, but there is still support for the old stuff.  One might
reasonably expect a new design not to have hardware keys on the disk.

>Also, is there anything to prohibit a RISC based machine from having a high
>speed IO subsystem?

In most cases, no.  In some cases there might be problems with bus
contention, cache collisions, etc.  Almost all 3090 systems have more
than one CPU, and they all have many channels, which affects the design
quite a lot.

By the way, the 3090 channels each contain an 801 RISC micro to control
the I/O, so in that sense there is already a RISC with a fast I/O system.

>Would adding this make the machine cost as much as a 3090?

I expect it'd be close enough that the price difference wouldn't be
compelling.
--
Regards, John Levine, johnl@iecc.cambridge.ma.us, {spdcc|ima|world}!iecc!johnl
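[The key-search operation Levine describes can be modeled roughly in C. The struct layout and function name are invented for illustration - real count-key-data records live on disk tracks and are searched by the controller hardware, not by the CPU iterating over an array - but the logic of "scan records until a key matches" is the same.]

```c
#include <string.h>

/* Illustrative stand-in for a CKD record: a key plus a data area. */
typedef struct {
    unsigned char key[256];   /* keys can be up to 256 bytes */
    unsigned char key_len;
    const void   *data;       /* variable-length data block  */
} Record;

/*
 * Model of a "search key equal" down one track: the controller
 * compares each record's key against the search argument and stops
 * at the first match, with no per-record CPU involvement.
 */
const Record *search_key_equal(const Record *track, int nrecords,
                               const unsigned char *key,
                               unsigned char key_len)
{
    for (int i = 0; i < nrecords; i++)
        if (track[i].key_len == key_len &&
            memcmp(track[i].key, key, key_len) == 0)
            return &track[i];
    return 0;                 /* key not found on this track */
}
```

This is why the scheme made sense on a 360/30: the linear scan happened at disk-rotation speed inside the controller, so offloading it cost nothing when the CPU was stalled anyway.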
cet1@cl.cam.ac.uk (C.E. Thompson) (05/10/91)
In article <1991May05.174756.9026@iecc.cambridge.ma.us> johnl@iecc.cambridge.ma.us (John R. Levine) writes:
> ... w.r.t. the IBM 360 architecture ...
>(Yes, it also had things like edit-and-mark which is a disaster in a paging
>system, they weren't totally prescient.)

EDMK was an implementation problem on early IBM 370s with virtual memory,
because the length of the area to be modified could only be determined by
trial execution, while address translation exceptions were required to
nullify all side-effects of the instruction.  But this isn't a problem in
modern implementations (e.g. IBM 308x and 3090), which have general
mechanisms for rolling back the logical state of a CPU, including recent
storage modifications.

Chris Thompson
JANET:    cet1@uk.ac.cam.phx
Internet: cet1%phx.cam.ac.uk@nsfnet-relay.ac.uk
cet1@cl.cam.ac.uk (C.E. Thompson) (05/10/91)
In article <9105070005.AA24446@iecc.cambridge.ma.us> johnl@iecc.cambridge.ma.us (John R. Levine) writes:
> {in re IBM 360 architecture}
>
>No question. it's not cheap.  Some of the stuff they have to do is extremely
>gross.  The worst example is an execute instruction which points to a
>translate-and-test (TRT) instruction.  The TRT has two memory operands and
>looks up each byte of the first using the second as the lookup table until
>it finds a table entry that's non-zero.  This means that the length of the
>first operand depends on its contents.  360 instructions are not
>continuable, so since the execute, the TRT, and both operands can each
>potentially span a page boundary, the CPU can need to touch as many as 8
>pages.  To tell whether it needs the 8th page it does a "trial execution" of
>the instruction that doesn't store any results before actually doing the
>instruction.

There is something seriously wrong with this example.  TRT doesn't
*modify* storage, so rolling back the state of the CPU on a paging
exception is almost trivial.  You can't predict how early the TRT will
stop, and so which pages will be touched, but you don't *need* to.  It is
no worse in this respect than a CLC instruction.

A straight TR instruction is actually somewhat worse, because it does
modify storage, and one can't tell without trial execution whether the
whole of the 256-byte translation table needs to be translatable.  The
notoriously awful instruction is EDMK, as pointed out in another posting.
Anyway, all these problems are finessed by the general rollback
mechanisms of the IBM 308x and 3090 series machines.

>There's even more internal hair than that, since the 3090 has
>lots of fault-tolerance hardware and takes microcode checkpoints several
>places in a complex instruction.

Even with a non-interruptible instruction?  (Obviously this happens for
interruptible instructions like MVCL and CLCL.)  Do you know this for a
fact (about 3090s, specifically)?  It rather surprises me.

Chris Thompson
JANET:    cet1@uk.ac.cam.phx
Internet: cet1%phx.cam.ac.uk@nsfnet-relay.ac.uk
herrickd@iccgcc.decnet.ab.com (05/14/91)
In article <9105091145.AA04421@iecc.cambridge.ma.us>, johnl@iecc.cambridge.ma.us (John R. Levine) writes:
> Traditional IBM disks allow variable length hardware disk blocks, and each
> block can have a key of up to 256 bytes.  You can have the disk controller
> search down a track or cylinder looking for a particular key.  This made
> perfect sense for ISAM on a 360/30, when the CPU stopped during disk I/O
> anyway, but it's pretty awful now.  IBM has for 20 years had more reasonable

A minor nit: my 30 had a DASD controller that took channel programs from
the memory of the 30 and went off and did them.  If the 30 wanted to
wait, it could, but nothing compelled it to.

Having the DASD controller do the key lookup while the general-purpose
computer goes on about its business makes good sense.  Finding the
correct record on a track is a mechanical (as opposed to electronic)
process.  The 30 could continue doing things at electronic speeds while
the DASD controller watched the disk rotate.

DASD - Direct Access Storage Device.  We had three 2311 drives on that
system.  They looked like small washing machines, with removable disk
packs under transparent covers.  The packs had ten recording surfaces on
six discs.  I believe the capacity was about two and a half million bytes
per pack.

dan herrick
herrickd@iccgcc.decnet.ab.com
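[Herrick's point - the controller executes a pre-built command list while the CPU goes on about its business - can be caricatured like this. The command codes and struct layout are invented for illustration; real S/360 channel command words are packed 8-byte words, not C structs.]

```c
#include <stdio.h>

/* Illustrative channel commands for a DASD channel program. */
typedef enum { SEEK, SEARCH_KEY_EQUAL, READ_DATA } Op;

typedef struct {
    Op   op;
    long arg;   /* cylinder number, key address, or buffer address */
} ChannelCommand;

/*
 * The "controller" side: walk the command list with no further CPU
 * involvement, returning the number of commands completed.  The CPU's
 * only jobs are building the list and starting the channel.
 */
int run_channel_program(const ChannelCommand *prog, int n)
{
    for (int i = 0; i < n; i++)
        printf("channel executes op %d (arg %ld)\n",
               prog[i].op, prog[i].arg);
    return n;
}
```

A typical ISAM-style program would be `SEEK` to the cylinder, `SEARCH_KEY_EQUAL` down the track, then `READ_DATA` - the mechanical-speed search happens entirely inside the loop above, i.e. inside the controller.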
haw30@duts.ccc.amdahl.com (Henry A Worth) (05/15/91)
In article <1991May9.144406.20558@vlsi.waterloo.edu> ward@vlsi.waterloo.edu (Paul Ward) writes: >In article <c506634.3284@umcvmb.missouri.edu> > c506634@umcvmb.missouri.edu (Eric Edwards) writes: > >>Also, is there anything to prohibit a RISC based machine from having a high >>speed IO subsystem? Would adding this make the machine cost as much as a >>3090? The HIPPI? interface with RAID (Redundant Arrays of Inexpensive Disks) disk systems is one possible solution. > >I don't know, but it is an interesting question. Do you have $20,000,000 ? >We can try a little experiment. :-) > So, you want to build a mainframe-class RISC system. Well, to compete with recently announced high-end CISC products from IBM, Amdahl, et. al., your going to need at least: 4-8 CPU's with 50+ honest MIPS - none of those inflated marketing RISC mips. ~1GB of fast static RAM - can't let those CPU's wait on paging, or have to twiddle their thumbs for too long after cache hits. A couple of hundred SCSI controllers - for that 10 million dollar disk farm in your garage. The capability for a like number of FDDI controllers - to keep up with the latest net-news. Several additional processors to help manage the I/O, encrypion/ decryption, ... Perhaps a few GB of dynamic RAM for disk caching, SSD and such. The busses and logic to make all this work... Features to ensure high availability... Oh, and don't forget a 100kw power source and cooling system. Now, if your a bit shy of the 100's of millions of dollars it would take to develop such a system -- several vendors (including Amdahl) have announced that they are working on "RISC-based" mainframes (I believe a few low-end systems have already been announced or are even avaliable from a couple of the Japanese mainframe producers) -- just wait until they become commodity items, and buy off the shelf. :-) -- Henry Worth -- haw30@duts.ccc.amdahl.com No, I don't speak for Amdahl -- I'm not even sure I speak for myself.