davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (04/12/91)
Another posting got me thinking about this, but I'll start a separate thread, since it's somewhat off topic. Will the day ever come when we can fast build custom CPUs? By that I mean the customer will be able to order a CPU built with certain instructions in hard code, perhaps some in microcode, designed to run an o/s which will emulate the rest. Obviously this would have an upper bound, so you couldn't have ALL the features, but consider trading a few register windows for another 8k of cache, or giving up some parallelism to get hardware divide.

What this requires is a set of capabilities which I believe could be available in the next decade:
 - fully automated chip layout. It doesn't have to be optimal; fully functional and nominal would do.
 - a program the customer could run on a PC or workstation to select the options, and then send them in by email, floppy, or whatever. (I think this capability is possible today.)
 - direct computer controlled chip generation without a mask. If you don't have this to keep cost down, the idea is too expensive to do.

Okay, now everyone tell me what technology I missed. Remember that all of these CPUs would still run the same software, so there is no need to generate custom anything but silicon, or whatever we are using by the time the rest of this could be done.
--
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
  "Most of the VAX instructions are in microcode, but halt and no-op
   are in hardware for efficiency"
miklg@sono.uucp (Michael Goldman ) (04/15/91)
Bill Davidsen suggested a custom CPU as a chip for the future. I read about, and got some literature on, a chip from Philips which is a programmable gate array with a programming time on the order of ~1 ms. Their idea is that people would put their code into Boolean form and *swap* it in and out of the gate array with the process - e.g., it could be TCP/IP code one time slice and X.25 code the next. It's still fairly new, and the idea seems exciting, but possibly ahead of its time.

It would be nice if it really caught on, and we could get away from this concept of a single CPU with registers, and simply have our algorithms redesigned by the compiler to take advantage of the generality and parallelism inherent in general logic. We could have a number of different architectures for the gate arrays, with our compiler translating our code into Boolean, and the manufacturer's utility translating that into the particular architecture of their gate array. There are a lot of applications that would benefit from parallelism - matrix multiplication, searching, numerical integration, etc. I'm afraid without a certain momentum from a number of big users it won't catch on, but maybe !?
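[Editor's note: the idea of a compiler translating ordinary code into Boolean logic for a gate array can be made concrete with a toy netlist interpreter. The sketch below is purely illustrative and is not Philips' actual format; the gate representation is invented for this example. It describes a 1-bit full adder as a flat list of two-input gates, the kind of low-level description a compiler might hand to a programmable array, and evaluates it in software.]

```python
# Toy illustration of "code as Boolean": a 1-bit full adder expressed
# as a gate netlist. The (op, in1, in2, out) tuple format is invented
# for this sketch, not any vendor's real configuration format.

def eval_netlist(netlist, signals):
    """Evaluate gates in listed order; each gate is (op, in1, in2, out)."""
    ops = {"AND": lambda a, b: a & b,
           "OR":  lambda a, b: a | b,
           "XOR": lambda a, b: a ^ b}
    for op, a, b, out in netlist:
        signals[out] = ops[op](signals[a], signals[b])
    return signals

# Full adder: sum = a ^ b ^ cin, cout = (a & b) | ((a ^ b) & cin)
FULL_ADDER = [
    ("XOR", "a",  "b",   "t1"),
    ("XOR", "t1", "cin", "sum"),
    ("AND", "a",  "b",   "t2"),
    ("AND", "t1", "cin", "t3"),
    ("OR",  "t2", "t3",  "cout"),
]

s = eval_netlist(FULL_ADDER, {"a": 1, "b": 1, "cin": 1})
print(s["sum"], s["cout"])  # 1 1
```

In a real flow the manufacturer's utility would map such a netlist onto the array's physical cells; here software evaluation stands in for the hardware.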
henry@zoo.toronto.edu (Henry Spencer) (04/16/91)
In article <1991Apr15.154955.2452@sono.uucp> miklg@sono.uucp (Michael Goldman) writes:
>I read about, and got some literature, on a chip from Phillips
>which is a programmable gate array with a programming time on the
>order of ~1 ms. Their idea is that people would put their code
>into boolean, and *swap* it in and out of the gate array...
>I'm afraid without a certain momentum from a number of big users it
>won't catch on, but maybe !?

I don't think it's going to be popular unless Philips is willing to publish complete programming specifications, so you can generate programs for the array without using proprietary software. So far, the programmable-logic manufacturers as a class get a grade of F- for their willingness to tell mere mortals how to program the chips. (I.e., they won't.) Turkeys.
--
And the bean-counter replied,  | Henry Spencer @ U of Toronto Zoology
"beans are more important".    | henry@zoo.toronto.edu  utzoo!henry
cs191049@cs.brown.edu (Aaron Smith) (04/17/91)
The EE department at Brown University (LEMS lab) is working on a reconfigurable machine based on programmable gate arrays. Right now, our major obstacle is the reconfigure time. With a reconfigure time of 1 ms, several gate arrays could be combined to produce a different configuration on every other clock cycle, or somewhere close to that. I think reconfigurability holds great promise. Our current weapon of choice is the Xilinx line.

Aaron Smith
Graduate Student
ats@lems.brown.edu
lindsay@gandalf.cs.cmu.edu (Donald Lindsay) (04/22/91)
In article <3329@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
>Will the day ever come when we can fast build custom CPUs?
> - fully automated chip layout.
> - A program the customer could run on a PC or workstation to select
>   the options, and then send then in by email, floppy, or whatever.
> - direct computer controlled chip generation without a mask.

There was a project a few years ago (U Arizona??) with the motto "Ada to Silicon". I believe they claimed some success. So, the idea of highly automated chip layout has been around a few times.

On another tack, see "Building and Using a Highly Parallel Programmable Logic Array", IEEE Computer, Jan 1991, p. 81. This is the Splash system: software, plus a VME board containing 32 Xilinx FPGAs, each with RAM. The software isn't as high level as you wish, but there is a class of problems where they routinely run an order of magnitude faster than a Cray-2 or a CM-2.

On the fabrication side, the trend has been to what you might call megafabs, with long construction times and nine-digit price tags. I have been listening for years for hints that a minifab or microfab could be built, and things are looking up. The military tends to need very small production runs, so they have pushed for flexible multipurpose equipment - for instance, machines that can do several steps to a set of wafers before being reloaded. Add in ideas like "clean boxes", the rise of fab-equipment interface standards, FIBs (Focussed Ion Beams), laser and E-beam direct-write technologies, solid-state laser/plasma x-ray sources, x-ray mirrors, etc., and it seems guaranteed that fab technology will evolve.

Will the evolution allow microfabs? Gee, I don't know. I have this dream of a truck pulling up to the EE department's loading dock and leaving a 0.1-micron facility .. but just because an x-ray source will be small (take that, IBM!) does not imply that the whole fab will be a small enough number of units, and state-of-the-art, too.
To end on an enthusiastic note: I just saw a wonderful photo. It showed a corner of a 68040, before-and-after they used FIB to cut two traces (!!) and run a patch wire (!!!!!!). The claim was that they had designed unconnected bits of logic into odd places around the chip, so that they could, say, cut out an inverter, and patch in a nand in its place. Wow. In fact, gosh golly wow. -- Don D.C.Lindsay Carnegie Mellon Robotics Institute
rod@isi.edu (Rodney Doyle Van Meter III) (04/23/91)
Some of you are probably already familiar with MOSIS, but since the discussion seems to be headed that way, I'll put in a plug for us. If you're hooked in to MOSIS, you can design a chip (usually in scalable CMOS design rules, which we can provide you a copy of), submit it to be fabbed (preferably by email), and in usually 8-12 weeks (depending on which process, glitches we hit, etc.) you get back some number of copies of your chip.

If you're at a university with VLSI design classes, the odds are good you're already set up with us. Commercial people can get in, too. Call (213) 822-1511 and ask for MOSIS. Tell whoever answers that you want to find out about fabbing through us -- they should know enough to get you started on the paperwork.

Price? I don't know, since I'm not connected with production stuff (I do unrelated programming & have never been familiar with any of that), but I think the bottom is around $500 for four copies of a tiny chip, which, I seem to recall somebody telling me, is good for about 10K gates, depending on your design style. That's in 2 micron CMOS. We also support 1.6, and maybe 1.2. You can submit designs of virtually any size and request virtually any number of parts, but it'll cost you more.

As for "automatic" design, my boss wrote a book called _VLSI: Silicon Compilation and the Art of Automatic Microchip Design_, Ron Ayres, Prentice-Hall, 1983. He can take logic equations and generate layout (though I think it's pretty inefficient, it does work). This is separate from MOSIS. There is (or used to be) a company called Silicon Compilers which used his stuff. There are also techniques for taking logic circuit designs and producing layout, but I'm completely unfamiliar with them. I have no doubt that this area is being explored in the research community, though I haven't a clue where you'd find out about it.

That's virtually everything I know about both topics, but I could probably refer you to people who know more if you're interested.

--Rod
manley@optilink.UUCP (Dave Manley) (04/23/91)
From article <12742@pt.cs.cmu.edu>, by lindsay@gandalf.cs.cmu.edu (Donald Lindsay):
> In article <3329@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com
> (bill davidsen) writes:
>>Will the day ever come when we can fast build custom CPUs?
>> - fully automated chip layout.
>> - A program the customer could run on a PC or workstation to select
>>   the options, and then send then in by email, floppy, or whatever.
>> - direct computer controlled chip generation without a mask.
>
> On the fabrication side, the trend has been to what you might call
> megafabs, with long construction times and nine digit price tags. I
> have been listening for years for hints that a minifab or microfab
> could be built, and things are looking up.

Someone (I could try to find the reference) does sell a gate array 'microfab'. I believe it is a 2u CMOS process. Physically I think it (the fab, not the array) is about 20 feet on a side. I don't remember how large the array sizes were. I think it is priced at ~$1M.

If fast is four weeks, United Silicon Structures (no, I don't work for them) advertises 1-2u CMOS full custom, no minimum quantity.

Now, maybe your question should be: Will the day ever come when we can cheaply, fast build custom CPUs?
mash@mips.com (John Mashey) (04/23/91)
In article <6286@optilink.UUCP> manley@optilink.UUCP (Dave Manley) writes:
>Now, maybe your question should be: Will the day ever come when we can
>cheaply, fast build custom CPUs?

If only it were so easy .... One also needs to:
 a) Cheaply and quickly generate the corresponding set of diagnostics
    for both design verification and production. I especially want to
    see the ones for custom new designs with multi-processor, 2-level
    cache coherency... generated quickly... and worse:
 b) Cheaply and quickly generate the corresponding set of compilers,
    debuggers, libraries....

Now, there has been progress in both of those areas, so it's hardly hopeless..... but not easy :-)
--
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems MS 1/05, 930 E. Arques, Sunnyvale, CA 94088-3650
colwell@pdx023.pdx023 (Robert Colwell) (04/23/91)
In article <2548@spim.mips.COM> mash@mips.com (John Mashey) writes:
>In article <6286@optilink.UUCP> manley@optilink.UUCP (Dave Manley) writes:
>>Now, maybe your question should be: Will the day ever come when we can
>>cheaply, fast build custom CPUs?
>
>If only it were so easy .... One also needs to:
> a) Cheaply and quickly generate the corresponding set of diagnostics
>    for both design verification and production.
> b) Cheaply and quickly generate the corresponding set of compilers,
>    debuggers, libraries....
>
>Now, there has been progress in both of those areas, so it's hardly
>hopeless..... but not easy :-)

Said the spider to the fly... the micro guys have ruined the CPU design game for most folks, IMHO (yeah, I know, I'm one of them now). There's so very much more to this than the cost of designing and fabbing your first working production parts. You need to design the system, too. You need software, including OS, compilers, assemblers, debuggers, linkers, & profiling tools. You need a sales force that understands your product and can sell to customers. You need a marketing organization that knows where the customers hide and how to reach them. You need a benchmarking crew, because nobody's technology is so much better than everyone else's that they can live with off-the-shelf performance across the board. You need field service. And you need a story as to why somebody should take a chance on your system or processor instead of going with a sure bet by somebody bigger.

I believe that the only hope for future garage-shop hardware designers is to get faster & much cheaper fabs, but also to get faster & much cheaper logic synthesis and simulation tools.
Ultimately, I believe it's hopeless to try to design "custom CPUs"; if you do manage to overcome the big guys' economies of scale and captive process technology, and you also manage to get to market quicker than they do, and you achieve all of the things mentioned above, you still need a significant performance edge (or some other value-added). Good luck with that, too.

Personally, I believe the day has already come and gone, just as it has in the auto industry. There's an auto museum near Cape Cod, Mass., with a display near the front door of the logos of all the car companies that existed from 1910 through the present day. It's quite sobering to see how many there once were compared to how many survive today. Imagine what it would take to start up a new one nowadays.

Bob Colwell    colwell@ichips.intel.com    503-696-4550
Intel Corp. JF1-19
5200 NE Elam Young Parkway
Hillsboro, Oregon 97124
bhoughto@pima.intel.com (Blair P. Houghton) (04/24/91)
In article <COLWELL.91Apr23090526@pdx023.pdx023> colwell@pdx023.pdx023 (Robert Colwell) writes:
>There's an auto museum near Cape Cod, Mass., with a
>display near the front door of the logos of all the car
>companies that existed from 1910 through the present day.
>It's quite sobering to see how many there once were
>compared to how many survive today. Imagine what it would
>take to start up a new one nowadays.

Ca. 1926 there were over 350 auto manufacturing companies in the USA alone. Now there are 3. (If you count Saturn as 4, you aren't paying attention.)

--Blair
  "It's trivial, it's irrelevant, it's the only thing you'll
   remember from today's news... Welcome to Usenet."
lindsay@gandalf.cs.cmu.edu (Donald Lindsay) (04/25/91)
In article <2548@spim.mips.COM> mash@mips.com (John Mashey) writes:
>In article <6286@optilink.UUCP> manley@optilink.UUCP (Dave Manley) writes:
>>Now, maybe your question should be: Will the day ever come when we can
>>cheaply, fast build custom CPUs?
>One also needs to:
> a) Cheaply and quickly generate the corresponding set of
>    diagnostics for both design verification and production.
> b) Cheaply and quickly generate the corresponding set of
>    compilers, debuggers, libraries....

In article <COLWELL.91Apr23090526@pdx023.pdx023> colwell@pdx023.pdx023 (Robert Colwell) writes:
[stuff I agree with, reinforcing the above]

What you say is true, but not relevant to most of the published daydreaming about instant custom chips. Usually, the suggestion is that the chip will fit a special niche - such as a radar autocorrelator chip or a pattern matcher chip - or will be a coprocessor (in some loose sense of the word). The general-purpose market is to be avoided, not only for the good reasons which you gave, but also because it's increasingly hard to find big wins there. In a niche, it may be possible to get an enormous win: the Splash board is sometimes 200 times faster than a 16K-PE CM-2.

In particular, most daydreams have been about casting a single specific algorithm to hardware. If a chemical-bonding problem is going to take days to grind, why not make an overnight chip, that has parallel execution units, one for each aspect of that particular molecule? And (more down to earth, or anyway MOSIS) why shouldn't an encryption chip have a 500-bit-wide ALU?

In the science cases, the assumption is that the hardware verification will be somewhat application-specific, too. The user would be expected to have some test cases, and perhaps include e.g. checks to see if a physically conserved property (angular momentum?) has actually been conserved.
--
Don		D.C.Lindsay 	Carnegie Mellon Robotics Institute
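[Editor's note: the 500-bit-wide-ALU remark is easy to quantify. A conventional 32-bit CPU must split a wide addition into limbs and propagate carries serially; a single wide adder collapses all of that into one operation. A hedged sketch in Python follows, where the limb width and operand values are illustrative only:]

```python
# Simulate what a 32-bit machine does for one 512-bit addition:
# 16 limb additions with serial carry propagation. A 500-bit ALU
# would do the same work in a single hardware operation.

LIMB_BITS = 32
MASK = (1 << LIMB_BITS) - 1

def to_limbs(x, n):
    """Split x into n little-endian limbs of LIMB_BITS bits each."""
    return [(x >> (LIMB_BITS * i)) & MASK for i in range(n)]

def add_limbwise(xs, ys):
    """Add two equal-length limb vectors, propagating carries serially."""
    out, carry = [], 0
    for a, b in zip(xs, ys):
        s = a + b + carry
        out.append(s & MASK)
        carry = s >> LIMB_BITS
    return out, carry

n = 16  # 16 limbs = 512 bits
a, b = (1 << 500) - 1, 12345  # illustrative operands
limbs, carry = add_limbwise(to_limbs(a, n), to_limbs(b, n))
result = sum(l << (LIMB_BITS * i) for i, l in enumerate(limbs))
assert result == a + b and carry == 0
```

The 16-iteration carry chain here is the serial cost a wide hardware adder eliminates.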
buschman@tubsibr.uucp (Andreas Buschmann) (04/26/91)
lindsay@gandalf.cs.cmu.edu (Donald Lindsay) writes:
> And (more down to earth, or anyway MOSIS) why shouldn't an
>encryption chip have a 500-bit-wide ALU?

Some years ago an RSA encryption chip was built here as a project; it used a 900-bit-wide adder. It was full custom. I don't know whether more than a few sample chips were ever built, or whether it is available anywhere. The rights are with Siemens now, at least I think so, but I haven't heard of it since.

 /|)    Andreas Buschmann
/-|)    TU Braunschweig, Germany (West)
^^^^
was bitnet: buschman%tubsibr@dbsinf6.bitnet
uucp:       buschman@tubsibr.uucp
rph@cs.brown.edu (Richard Hughey) (04/26/91)
In article <12785@pt.cs.cmu.edu> lindsay@gandalf.cs.cmu.edu (Donald Lindsay) writes:
>Usually, the suggestion is that the chip will fit a special niche -
>such as a radar autocorrelator chip or a pattern matcher chip - or
>will be a coprocessor (in some loose sense of the word). The
>general-purpose market is to be avoided, not only for the good
>reasons which you gave, but also because it's increasingly hard to
>find big wins there. In a niche, it may be possible to get an
>enormous win: the Splash board is sometimes 200 times faster than a
>16K-PE CM-2.
>
>Don D.C.Lindsay Carnegie Mellon Robotics Institute

Comparing co-processors against the Connection Machine isn't exactly the way to go - the CM-2 can be regarded as a massively parallel (and massively COSTLY) general-purpose co-processor, in great contrast to slightly- or non-parallel supercomputers. Splash's main advantage over the CM-2 is its cost - the CM-2 is more realistically about 10 times slower than the Splash board on 100x100 sequence comparison. (The version mentioned in Computer is for distributed sequence comparison, using 100 of the 16K PEs; a 100x100 (or, equivalently, 100x128) comparison can be done in 0.17 seconds, compared to Splash's 0.020 seconds. CM-2 performance could be further increased by a factor of 4 or more by using minimum-size words, at the cost of a somewhat more complicated program.)

Where Splash does win (vs. the CM-2) is on size (cost) and its ability to prototype hardware designs before fabrication - programming can be slow (the sequence comparison program has many many many lines of code) but is much faster than designing and fabricating a new system, which when up and running might not be the perfect solution to a problem.

As part of my thesis, I've implemented a programmable linear systolic array, designed specifically for combinatorial applications (sequence comparison prime among them).
The system (the Brown Systolic Array, or B-SYS) has traditional SIMD programming with very efficient systolic communication. Sequence comparison variations run 5-40 lines of B-SYS code per cell program, though some systolic programming issues I'm looking at should make this much easier. There's a running 10-chip (470-processor) prototype system that does simple sequence comparison at about 1/20 the speed of Splash; it is slow because each instruction execution requires 3 I/O writes over an ISA bus (ugh!). A full implementation (32 chips (1504 PEs) on a single board with a local instruction sequencer) could perform 3-5 8-bit GOPS (2x faster than Splash). A redesign of the chip in 0.8 micron CMOS could increase PE density (and performance) by a factor of 10.

There's a paper upcoming in ICPP '91 on this, which I can send preprints of to anyone interested. Also, the tech report version of my thesis should be out in a couple of months.

- Richard
---------------------------------------
Richard Hughey            INTERNET: rph@cs.brown.edu
Brown University          BITNET:   rph@browncs
Box 1910                  (decvax, allegra, ...)!brunix!rph
Providence, RI 02912
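[Editor's note: for readers unfamiliar with the kernel that Splash and B-SYS accelerate, sequence comparison is a dynamic program in which every cell on an anti-diagonal depends only on earlier diagonals, so a linear systolic array can compute a whole diagonal per step. The sketch below uses plain edit distance as a simplification of the weighted comparisons discussed above, with the loops ordered by anti-diagonal the way a systolic sweep would proceed:]

```python
# Edit-distance dynamic program, swept by anti-diagonal. All cells with
# the same i + j are independent of each other, which is exactly the
# parallelism a linear systolic array exploits (one diagonal per step).

def edit_distance(s, t):
    m, n = len(s), len(t)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        D[i][0] = i  # boundary: delete i characters
    for j in range(n + 1):
        D[0][j] = j  # boundary: insert j characters
    for k in range(2, m + n + 1):            # anti-diagonal k = i + j
        for i in range(max(1, k - n), min(m, k - 1) + 1):
            j = k - i
            D[i][j] = min(D[i-1][j] + 1,                     # deletion
                          D[i][j-1] + 1,                     # insertion
                          D[i-1][j-1] + (s[i-1] != t[j-1]))  # substitution
    return D[m][n]

print(edit_distance("kitten", "sitting"))  # 3
```

In hardware, each PE holds one column and the inner loop above becomes one parallel step, which is where the systolic speedups quoted in this thread come from.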
chased@rbbb.Eng.Sun.COM (David Chase) (04/26/91)
In article <12785@pt.cs.cmu.edu> lindsay@gandalf.cs.cmu.edu (Donald Lindsay) writes:
>In particular, most daydreams have been about casting a single
>specific algorithm to hardware. If a chemical-bonding problem is
>going to take days to grind, why not make an overnight chip, that has
>parallel execution units, one for each aspect of that particular
>molecule? And (more down to earth, or anyway MOSIS) why shouldn't an
>encryption chip have a 500-bit-wide ALU?

There's also a middle ground -- (speaking of 500-bit-wide ALUs) check out Computer Architecture News of March 1991, "Hardware Speedups in Long Integer Multiplication", by Shand, Bertin, Vuillemin. They used "Programmable Active Memory" to implement (among other things) a 32 by 512 bit multiplier, and 200 Kbit/sec RSA en/decryption (512-bit keys). A PAM consists of a 5 by 5 array of LCA (Xilinx PGA data book) chips, plus 4 megabits of static RAM. Several of these were used to implement fast RSA.

One point worth noting is that (as I understand it) the PAM is reconfigured for each key -- the (automatic) "compilation" to do this takes about 30 minutes (downloading to the PAM is much faster). Note the benefits -- extreme customization allows high performance, programmability allows turnaround in under one hour, and you can do something else with the hardware when you are done with that problem.

Read the paper. It's very interesting.

David Chase
Sun
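[Editor's note: the workload behind those RSA numbers is modular exponentiation on ~512-bit integers, i.e. hundreds of big-integer multiplies per block, which is what a 32 by 512 bit multiplier accelerates. A toy sketch of the square-and-multiply loop follows; the key below is a textbook-sized example for illustration, not a real 512-bit key:]

```python
# Square-and-multiply modular exponentiation: the RSA inner loop.
# Each iteration costs one or two big-integer multiplies, which is
# the operation the PAM's wide multiplier speeds up in hardware.

def square_and_multiply(base, exp, mod):
    """Left-to-right binary exponentiation: result = base**exp % mod."""
    result = 1
    for bit in bin(exp)[2:]:
        result = (result * result) % mod      # square on every bit
        if bit == "1":
            result = (result * base) % mod    # multiply on set bits
    return result

# Tiny textbook RSA key (p=61, q=53): n=3233, e=17, d=2753.
n, e, d = 3233, 17, 2753
m = 65
c = square_and_multiply(m, e, n)   # encrypt
assert square_and_multiply(c, d, n) == m  # decrypt round-trips
print(c)  # 2790
```

With a real 512-bit exponent this loop runs ~512 times, so the per-multiply speedup compounds directly into the 200 Kbit/sec figure quoted above.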
guy@auspex.auspex.com (Guy Harris) (04/26/91)
>>It's quite sobering to see how many there once were
>>compared to how many survive today. Imagine what it would
>>take to start up a new one nowadays.
>
>Ca. 1926 there were over 350 auto manufacturing companies in
>the USA alone. Now there are 3. (If you count Saturn as 4,
>you aren't paying attention.)

There are, however, some start-ups in the automotive industry; the ones I know about, though, are all building supercars. (If you count Acura, Lexus, or Infiniti as startups, you aren't paying attention. :-)) How many will survive, I dunno.

There may be an analogy to be drawn here; cf. Robert Colwell's previous employer....
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (04/29/91)
At the time there were hundreds of companies, each producing a few hundred cars a year, at a high price, mostly by hand. And there are still a number of companies which build small runs, since there's a cutoff in the EPA regs at something like 100 or 200 units/year; smaller firms have less stringent regulation. If I hit the lottery one of my cars will be an Avanti...
--
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
  "Most of the VAX instructions are in microcode, but halt and no-op
   are in hardware for efficiency"
peter@ficc.ferranti.com (peter da silva) (04/30/91)
In article <3394@crdos1.crd.ge.COM>, davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) writes: > And there are still a number of companies which build small runs... > If I hit the lottery one of my cars will be an Avanti... You mean a Corvette or Trans Am with a fiberglass shell based on an old Studebaker? There are plenty of companies that do the same thing in the computer world today. They're called VARs. -- Peter da Silva. `-_-' peter@ferranti.com +1 713 274 5180. 'U` "Have you hugged your wolf today?"
zaphod@madnix.UUCP (Ron Bean) (05/03/91)
In Article <B0.A_07@xds13.ferranti.com>, peter@ficc.ferranti.com (peter da silva) writes:
>> And there are still a number of companies which build small runs...
>> If I hit the lottery one of my cars will be an Avanti...
>
>You mean a Corvette or Trans Am with a fiberglass shell based on an
>old Studebaker?
     ^^^^^^^^^^

Try "built with Studebaker's original tooling". And that includes the chassis; only the engine and transmission come from Chevrolet. They've updated the design a bit in recent years, so I don't know how much of the old tooling remains, but they still build it from the ground up (Avantis have always had fiberglass bodies).

A few years ago it was said that there were enough companies making replacement parts for Ford Model A's that you could build a brand-new one, with no original parts (I suppose the same could be said of the IBM PC :-).

Perhaps the most likely motivation for a small-run custom CPU would be for nostalgic reasons (i.e., PDP-10 on-a-chip). That way, you don't need a rational reason to do it.

==================
zaphod@madnix.UUCP (Ron Bean)
{harvard|rutgers|ucbvax}!uwvax!astroatc!nicmad!madnix!zaphod
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (05/06/91)
In article <1835@madnix.UUCP> zaphod@madnix.UUCP (Ron Bean) writes:
| Perhaps the most likely motivation for
| a small-run custom CPU would be for nostalgic reasons (ie, PDP-10
| on-a-chip). That way, you don't need a rational reason to do it.

I think there would be a nice market for someone building a DPS-whatever (a MULTICS engine) even today. MULTICS could become the PC operating system of choice for the next millennium.
--
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
  "Most of the VAX instructions are in microcode, but halt and no-op
   are in hardware for efficiency"