mark@mips.UUCP (Mark G. Johnson) (10/30/86)
Since the subject has been brought up about transistor speeds and
differences between various microprocessor technologies, it might be
useful to discuss technology parameters that are of interest in
comparing uP's. There is _more_than_one_ interesting tradeoff to be
made in designing/selecting a technology for microprocessors, so the
standard datasheet phrase "this chip is built in 1.8 micron CMOS"
doesn't tell the whole story.

In this article, I'll mention several of the important technology
parameters, and tradeoffs among them, to illuminate what the different
meanings of "1.8 micron CMOS" might be. I'll limit the discussion to
CMOS technology, since it's probably of greatest interest at the
moment, having been used in many 32-bit micros (e.g. Intel 80386,
Fairchild Clipper, Motorola 68020, MIPS R2000, etc.)

WARNING: CMOS process technology is a large field which would require
a multi-volume reference work to adequately describe. So I will
indulge in broad generalizations and oversimplifications to keep this
posting to a reasonable length; please forgive me if I omit 2nd (or
1st!) order phenomena.

-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --

When selecting a CMOS process for building microprocessors, three
technology aspects (at least) are of major interest:

    1. Speed of the fundamental logic gates
    2. Packing density
    3. Cost

These parameters are somewhat interrelated; in general you can select
('design') any two of the parameters, and these choices determine the
third. [ However, this isn't an ideal equation like PV=nRT, and
clever engineers have found and will find ways to improve all three
simultaneously. ] Let's examine each parameter in some detail.


SPEED OF THE FUNDAMENTAL LOGIC GATES

Very roughly speaking, the technology variable which most strongly
determines gate speed in CMOS processes is the "Effective Channel
Length" (abbreviated 'Leff') of the individual transistors.
In fact, theoreticians assert that speed is inversely proportional to
the *square* of Leff if everything else remains constant. [ This is
called the "CV scaling theory"; in practice the speed ratio is closer
to (Leff**-1.5) than to (Leff**-2). Also, in practice it is rare for
everything else to remain constant. ] Nevertheless, the benefits of
shorter Effective Channel Lengths are unmistakable and dominant.

So, if it is known that the 'Fubar_4' CMOS chip uses Leff=2.0 microns,
and the 'Poot_5' uses Leff=1.5 microns, we can conclude that NAND
gates, inverters, and other circuits are probably faster in the
'Poot_5'. Of course we should acknowledge that good Fab Engineering
can make a major difference; a shoddy Leff=1.5 micron process can
indeed be slower than a well-engineered, well-controlled Leff=2.0
micron process.


PACKING DENSITY

Density is governed by the "layout groundrules" (often called Design
Rules) for the particular CMOS technology. These dictate the minimum
allowed widths of conductors, minimum spaces between conductors,
minimum size of inter-layer contacts, etc.

For a CMOS process such as might be used in a 32-bit microprocessor,
the groundrules will include minimum distances that vary over a large
range. For example:

    Minimum spacing of NMOS transistor to PMOS transistor:  10  microns
    Minimum width of N+ diffusion:                          4.0 microns
    Minimum space between adjacent metal lines:             3.0 microns
    Minimum channel length of MOS transistor:               2.0 microns
    Minimum metal enclosure of contact:                     1.0 microns

How do you describe the hypothetical process above?? 10 micron CMOS?
1 micron CMOS?

Perhaps the "best" way to estimate layout density is to use an average
of all layout groundrules, since they all contribute to the size of a
chip. This technique was pioneered by IBM, who used (still uses?)
the following (unweighted averaging) formula for computing a density
figure-of-merit:

    Density (um) = [DW + DS + PW + PS + CW + CS + MW + MS] / 8

where DW = diffusion width, PW = polysilicon width, CW = contact
width, MW = metal width, DS = diffusion spacing, etc.

The IBM formula neglects the effects of inter-layer rules (enclosures
etc) but overall it does a respectable job of predicting the relative
densities of two CMOS technologies. However, the unweighted average
is sometimes unfortunate; the size of arrayed layout structures (RAM,
PLA, datapath) tends to be set by transistor spacing, while the size
of random logic structures tends to be set by interconnect spacing.

SHAZAM!! We now have two *different* numbers, Effective Channel
Length and Layout Density Figure-of-Merit, which describe two
*different* aspects of a CMOS process, yet both appear
(unfortunately!) in the units of microns. No wonder confusion occurs!
It's quite possible for the 'Poot_5' microprocessor to use 1.5 micron
Leff transistors, and average layout groundrules of 2.6 microns.
Then, depending on who you talk to and what they're describing, the
'Poot_5' is sometimes a "1.5 micron" and sometimes a "2.6 micron"
chip!


COST

As you'd expect, the smaller a layout groundrule gets, the more
difficult it is to fabricate. So, as a general rule-of-thumb,
processes with tight design rules have lower yields (yield == good
chips/total chips) for equal die sizes [all other things being equal,
like the competence of the fab engineers and equipment!!]

A shabby zeroth-order metric for predicting yield is the smallest
pitch (linewidth + spacing) allowed on any layer: the smaller this
pitch, the lower the yield. This presumes that the major yield-loss
occurs on the most difficult process step, obviously a gross
oversimplification!

Yield is not the only component of cost; it is just a multiplier on
top of the basic manufacturing cost of a wafer.
If wafer cost is high, even a high-yielding product will be expensive.
Wafer cost is affected by the number of masks used in the process
(more masks --> more opportunities to screw up + more labor content
--> higher cost). Other components of wafer cost are:

    * unusual process steps that require special equipment
    * expensive raw materials
    * total number of process steps ("activities")

Examples include: double-level-metal (extra process steps, extra
masks, special equipment), epitaxial wafers (raw materials),
retrograde wells (expensive equipment), silicides (equipment and
process steps), precious and/or refractory metal systems (materials),
lightly-doped-drain transistors (extra process steps), etc.

These factors lead to classic engineering tradeoffs such as:

    "Maximize performance subject to such-and-such cost constraint"
    "Minimize cost subject to such-and-such performance constraint"


TRADEOFFS

Here are some of the more common tradeoffs found in microprocessor
technology selection, along with example commercial uP's that use
them:

Faster gates --> higher cost

Keep layout groundrules identical, but etch the transistor effective
channel lengths down to small dimensions. This results in unchanged
layout density but markedly faster gates. Cost increases due to the
lower yield of the small transistors. This is basically the Intel
thrust from 1977 to 1985. The much-heralded Intel "HMOS-1" process
employed groundrules averaging about 4.0 microns, yet gave transistors
of Effective Channel Lengths at the incredible size (for 1977) of
2.5 um.

Higher density *and* faster gates --> higher cost

Use additional levels of low-impedance interconnect. This improves
layout density by permitting wires to "fly overhead" across active
circuitry. It also improves speed by eliminating parasitic resistance
from signal paths. Western Electric (WE 32000) and Inmos (Transputer)
have chosen to use silicides to reduce poly resistance and hence gain
another "metal-like" layer.
Intel (80386), Fairchild (Clipper), and MIPS (R2000) have all decided
to add a second level of metal. This costs more than silicides,
partly due to extra labor & equipment, and partly because the metal-2
process is sometimes tricky and therefore yield-depressing. However,
true double-level-metal is more flexible than silicides because the
resistance is low enough (100X lower than silicides) to route any node
(including VCC/ground) in either metal for basically any distance.

Extremely high density --> high cost

By attacking those layout groundrules most detrimental to layout
density, it is possible to achieve remarkable results. However,
building the smaller groundrules implies lower yield, hence higher
cost. Engineers at Hewlett Packard developed a process which reduced
basically every inter-layer rule (contact enclosure, etc) to zero,
using innovative self-alignment techniques and exotic processing.
This technology, used on the Focus/9000 processor, allowed H.P. to
build 450K transistors on a uP chip in 1981, a density record that
remains unbroken five years later. (Admittedly, the majority of the
devices were used in a huge control ROM.) The average layout
groundrule was approximately 1.3 microns. No cost figures were
quoted, but it is apparent that the Focus process was quite a bit more
expensive than a vanilla MOS technology.

Big chipsize --> higher cost

This particular design decision seems to be made in practically all
32-bit micros: make the die as big as we possibly can (thus cramming
as much circuitry onto the chip as possible) subject to the constraint
that the yield has to be greater than zero. Yield is usually modeled
by the equation

    Y = exp(-KA)

where A is the area of the die and K is a process-dependent constant
(in effect, the defect density). Large die experience a double
whammy: there are few of them on the wafer, and each one of them has
an (exponentially!) low probability of working. So it's quite a
gamble to build a really huge die: the cost of a working chip could be
enormous (1/yield).
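
To put rough numbers on the double whammy, here is a minimal Python
sketch of the Y = exp(-KA) model above. The wafer cost, wafer
diameter, and K value are made-up illustrative figures (not data from
any real fab), the function names are mine, and the dice-per-wafer
count is a crude area ratio that ignores edge loss and scribe lanes.

```python
import math


def poisson_yield(k_per_cm2, area_cm2):
    """Fraction of good dice under the model Y = exp(-K*A)."""
    return math.exp(-k_per_cm2 * area_cm2)


def gross_dice_per_wafer(wafer_diameter_cm, area_cm2):
    """Crude upper bound: wafer area divided by die area (no edge loss)."""
    wafer_area = math.pi * (wafer_diameter_cm / 2.0) ** 2
    return int(wafer_area // area_cm2)


def cost_per_good_die(wafer_cost, wafer_diameter_cm, k_per_cm2, area_cm2):
    """Wafer cost spread over the dice that actually work."""
    dice = gross_dice_per_wafer(wafer_diameter_cm, area_cm2)
    good_dice = dice * poisson_yield(k_per_cm2, area_cm2)
    return wafer_cost / good_dice


# Hypothetical inputs, chosen only to show the trend.
WAFER_COST = 500.0      # assumed cost of one processed wafer
WAFER_DIAMETER = 10.0   # assumed wafer diameter, cm
K = 2.0                 # assumed constant in Y = exp(-K*A), per cm^2

for area in (0.25, 0.5, 1.0):
    dice = gross_dice_per_wafer(WAFER_DIAMETER, area)
    y = poisson_yield(K, area)
    cost = cost_per_good_die(WAFER_COST, WAFER_DIAMETER, K, area)
    print(f"A={area:4.2f} cm^2  dice={dice:4d}  yield={y:6.3f}  "
          f"cost/good die={cost:8.2f}")
```

With these assumed inputs, doubling the die area more than doubles the
cost of a working chip: the count of dice per wafer halves and the
yield drops on top of that.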

In general, big chips (if they yield!) can offer higher performance,
because they spend the extra gates on goodies that reduce latency.
These might include register windows, prefetch queues, onboard memory
management, onboard cache control, onboard I- and D-caches, onboard
floating-point, branch caches, stack caches, etc., (but nobody yet
dares to put ALL of these on a single chip - it'd be bigger than a
credit card!). Of course, it's always possible to screw the pooch and
waste a lot of gates for little incremental performance [e.g. put a
Wallace Tree integer multiplier on the 80286 and see how much faster
PC-DOS runs!].


SUMMARY

When you read the statement "the Fubar_4 is built in 1.8 micron CMOS"
you have actually learned very little. The technical parameters you'd
probably *like* to know would include:

    (1) What is the average layout groundrule of the technology?
        This translates into layout density, a measure of how many
        gates can be packed onto a chip.

    (2) What is the Effective Channel Length of the transistors?
        This translates (approximately) into logic gate speed.

    (3) What is the ->smallest<- layout pitch (width + space), and
        what non-vanilla process technologies are being used? These
        translate (approximately) into the difficulty of building the
        device, hence yield, hence cost.

--
-Mark Johnson
DISCLAIMER: The opinions above are personal.
UUCP: {decvax,ucbvax,ihnp4}!decwrl!mips!mark   TEL: 408-720-1700 x208
USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086