[net.arch] uP Technology: What's in a Micron?

mark@mips.UUCP (Mark G. Johnson) (10/30/86)

Since the subject has been brought up about transistor speeds and
differences between various microprocessor technologies, it might be
useful to discuss technology parameters that are of interest in
comparing uP's.

There is _more_than_one_ interesting tradeoff to be made in designing/
selecting a technology for microprocessors, so the standard datasheet
phrase "this chip is built in 1.8 micron CMOS" doesn't tell the whole
story.  In this article, I'll mention several of the important technology
parameters, and tradeoffs among them, to illuminate what the different
meanings of "1.8 micron CMOS" might be.  I'll limit the discussion to
CMOS technology, since it's probably of greatest interest at the moment,
having been used in many 32-bit micros (e.g. Intel 80386, Fairchild Clipper,
Motorola 68020, MIPS R2000, etc.).

WARNING:
	CMOS process technology is a large field which would require a
	multi-volume reference work to adequately describe.  So I will
	indulge in broad generalizations and oversimplifications to keep
	this posting to a reasonable length; please forgive me if I omit
	2nd (or 1st!) order phenomena.
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --


When selecting a CMOS process for building microprocessors, three technology
aspects (at least) are of major interest:

	1. Speed of the fundamental logic gates
	2. Packing density
	3. Cost
These parameters are somewhat interrelated; in general you can select
('design') any two of the parameters, and these choices determine the third.
	[ However, this isn't an ideal equation like PV=nRT, and clever
	  engineers have found and will find ways to improve all three
	  simultaneously. ]
Let's examine each parameter in some detail.



SPEED OF THE FUNDAMENTAL LOGIC GATES

     Very roughly speaking, the technology variable which most strongly
determines gate speed in CMOS processes is the "Effective Channel Length"
(abbreviated 'Leff') of the individual transistors.  In fact, theoreticians
assert that speed is inversely proportional to the *square* of Leff if
everything else remains constant.
	[ This is called the "CV scaling theory"; in practice the speed
	  scales closer to (Leff**-1.5) than to (Leff**-2).  Also, in
	  practice it is rare for everything else to remain constant ].
Nevertheless, the benefits of shorter Effective Channel Lengths are
unmistakable and dominant.
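
     As a rough illustration, here is a small sketch (in Python) of that
scaling rule.  The reference delay, the exponent, and the Leff values are
illustrative assumptions, not measured data:

	# Rough gate-delay scaling with Effective Channel Length.
	def relative_gate_delay(leff_um, ref_leff_um=2.0, ref_delay_ns=1.0,
	                        exponent=1.5):
	    """Scale a reference gate delay by (Leff/ref_Leff)**exponent.

	    exponent=2.0 is the ideal CV-scaling prediction; ~1.5 is
	    closer to what is seen in practice.
	    """
	    return ref_delay_ns * (leff_um / ref_leff_um) ** exponent

	print(relative_gate_delay(2.0))  # 1.00 -- the Leff=2.0um reference
	print(relative_gate_delay(1.5))  # ~0.65, i.e. roughly 1.5x faster gates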

     So, if it is known that the 'Fubar_4' CMOS chip uses Leff=2.0 microns,
and the 'Poot_5' uses Leff=1.5 microns, we can conclude that NAND gates,
inverters, and other circuits are probably faster in the 'Poot_5'.
Of course we should acknowledge that good Fab Engineering can make a major
difference; a shoddy Leff=1.5 micron process can indeed be slower than a
well-engineered, well-controlled Leff=2.0 micron process.
	

PACKING DENSITY

     Density is governed by the "layout groundrules" (often called Design
Rules) for the particular CMOS technology.  These dictate the minimum
allowed widths of conductors, minimum spaces between conductors, minimum
size of inter-layer contacts, etc.  For a CMOS process such as might
be used in a 32-bit microprocessor, the groundrules will include minimum
distances that vary over a large range.  For example:

	Minimum spacing of NMOS transistor to PMOS transistor:	10.0 microns
	Minimum width of N+ diffusion:				4.0 microns
	Minimum space between adjacent metal lines:		3.0 microns
	Minimum channel length of MOS transistor:		2.0 microns
	Minimum metal enclosure of contact:			1.0 microns
How do you describe the hypothetical process above??  10 micron CMOS?
1 micron CMOS?

     Perhaps the "best" way to estimate layout density is to use an
average of all layout groundrules, since they all contribute to the size
of a chip.  This technique was pioneered by IBM, who used (still uses?)
the following (unweighted averaging) formula for computing a density
figure-of-merit:

	Density (um) = [DW + DS + PW + PS + CW + CS + MW + MS] / 8

where DW = diffusion width, PW = polysilicon width, CW = contact width,
MW = metal width, DS = diffusion spacing, etc.  The IBM formula neglects
the effects of inter-layer rules (enclosures etc.) but overall it does
a respectable job of predicting the relative densities of two CMOS
technologies.  However, the unweighted average is sometimes unfortunate;
the size of arrayed layout structures (RAM, PLA, datapath) tends to be set
by transistor spacing, while the size of random logic structures tends to
be set by interconnect spacing.
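
     To make the formula concrete, here is a tiny Python version of that
figure-of-merit.  The groundrule numbers below are made up for the example;
they are not any vendor's actual design rules:

	# IBM-style density figure-of-merit: unweighted average of the
	# width and spacing rules on the four main layers.
	def density_figure_of_merit(rules_um):
	    keys = ["DW", "DS", "PW", "PS", "CW", "CS", "MW", "MS"]
	    return sum(rules_um[k] for k in keys) / len(keys)

	hypothetical_process = {
	    "DW": 4.0, "DS": 3.5,   # diffusion width / space
	    "PW": 2.0, "PS": 2.5,   # polysilicon width / space
	    "CW": 2.0, "CS": 2.0,   # contact width / space
	    "MW": 3.0, "MS": 3.0,   # metal width / space
	}
	print(density_figure_of_merit(hypothetical_process))  # 2.75 (microns)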

     SHAZAM!!  We now have two *different* numbers, Effective Channel
Length and Layout Density Figure-of-Merit, which describe two *different*
aspects of a CMOS process, yet both appear (unfortunately!) in the units
of microns.  No wonder confusion occurs!  It's quite possible for the
'Poot_5' microprocessor to use 1.5 micron Leff transistors, and average
layout groundrules of 2.6 microns.  Then, depending on who you talk to
and what they're describing, the 'Poot_5' is sometimes a "1.5 micron"
and sometimes a "2.6 micron" chip!


COST

     As you'd expect, the smaller a layout groundrule gets, the more
difficult it is to fabricate.   So, as a general rule-of-thumb, processes
with tight design rules have lower yields (yield == good chips/total chips)
for equal die sizes [all other things being equal, like the competence of
the fab engineers and equipment!!].

     A shabby zeroth-order metric for predicting yield is the smallest
pitch (linewidth + spacing) allowed on any layer: the smaller this
pitch, the lower the yield.  This presumes that the major yield-loss
occurs on the most difficult process step, obviously a gross
oversimplification!

     Yield is not the only component of cost; it is just a multiplier on
top of the basic manufacturing cost of a wafer.  If wafer cost is high,
even a high-yielding product will be expensive.
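
     A quick sketch of that arithmetic, with invented numbers: yield is just
a divisor on the per-wafer cost, so an expensive wafer can still produce
expensive chips even when most of them work:

	# Cost per good die = wafer cost spread over the die that work.
	def cost_per_good_die(wafer_cost, gross_die_per_wafer, yield_fraction):
	    return wafer_cost / (gross_die_per_wafer * yield_fraction)

	# A cheap wafer at modest yield vs. an expensive wafer at high yield:
	print(cost_per_good_die(200.0, 100, 0.30))  # ~6.67 per good die
	print(cost_per_good_die(800.0, 100, 0.80))  # 10.00 per good die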

     Wafer cost is affected by the number of masks used in the process
(more masks --> more opportunities to screw up + more labor content -->
higher cost).  Other components of wafer cost are:
	* unusual process steps that require special equipment
	* expensive raw materials
	* total number of process steps ("activities")
Examples include: double-level-metal (extra process steps, extra masks,
special equipment), epitaxial wafers (raw materials), retrograde wells
(expensive equipment), silicides (equipment and process steps), precious
and/or refractory metal systems (materials), lightly-doped-drain
transistors (extra process steps), etc.

These factors lead to classic engineering tradeoffs such as:
      "Maximize performance subject to such-and-such cost constraint"
      "Minimize cost subject to such-and-such performance constraint"


TRADEOFFS

Here are some of the more common tradeoffs found in microprocessor
technology selection, along with example commercial uP's that use them:

Faster gates --> higher cost
     Keep layout groundrules identical, but etch the transistor effective
  channel lengths down to small dimensions.  This results in unchanged
  layout density but markedly faster gates.  Cost increases due to the
  lower yield of the small transistors.  This is basically the Intel
  thrust from 1977 to 1985.  The much-heralded Intel "HMOS-1" process
  employed groundrules averaging about 4.0 microns, yet gave transistors
  with Effective Channel Lengths of 2.5 um, an incredibly small size for 1977.

Higher density *and* faster gates --> higher cost
     Use additional levels of low-impedance interconnect.  This improves
  layout density by permitting wires to "fly overhead" across active
  circuitry. It also improves speed by eliminating parasitic resistance
  from signal paths.  Western Electric (WE 32000) and Inmos (Transputer) have
  chosen to use silicides to reduce poly resistance and hence gain another
  "metal-like" layer.
     Intel (80386), Fairchild (Clipper), and MIPS (R2000) have all decided
  to add a second level of metal.  This costs more than silicides, partly
  due to extra labor & equipment, and partly because the metal-2 process
  is sometimes tricky and therefore yield-depressing.  However, true
  double-level-metal is more flexible than silicides because the resistance
  is low enough (100X lower than silicides) to route any node (including
  VCC/ground) in either metal for basically any distance.
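     To see why the resistance difference matters over long routes, here
  is a lumped-RC guess at wire delay.  The sheet resistances, wire geometry,
  and capacitance per micron below are assumed round numbers (chosen to give
  roughly the 100X ratio mentioned above), not characterized process data:

	# Lumped R*C delay of a long signal wire, in nanoseconds.
	def wire_rc_delay_ns(sheet_res_ohm_sq, length_um, width_um,
	                     cap_ff_per_um=0.2):
	    r_ohms = sheet_res_ohm_sq * (length_um / width_um)
	    c_farads = cap_ff_per_um * length_um * 1e-15
	    return r_ohms * c_farads * 1e9

	# A 5000 um route in a 3 um-wide wire:
	print(wire_rc_delay_ns(3.0, 5000, 3.0))   # silicided poly: ~5 ns
	print(wire_rc_delay_ns(0.03, 5000, 3.0))  # metal:          ~0.05 ns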

Extremely high density --> high cost
     By attacking those layout groundrules most detrimental to layout
  density, it is possible to achieve remarkable results.  However,
  building the smaller groundrules implies lower yield, hence higher cost.
     Engineers at Hewlett Packard developed a process which reduced basically
  every inter-layer rule (contact enclosure, etc) to zero, using innovative
  self-alignment techniques and exotic processing.  This technology, used
  on the Focus/9000 processor, allowed H.P. to build 450K transistors on
  a uP chip in 1981, a density record that remains unbroken five years
  later.  (Admittedly, the majority of the devices were used in a huge
  control ROM.)  The average layout groundrule was approximately 1.3 microns.
  No cost figures were quoted, but it is apparent that the Focus process
  was quite a bit more expensive than a vanilla MOS technology.

Big chipsize --> higher cost
     This particular design decision seems to be made in practically all
  32-bit micros:  make the die as big as we possibly can (thus cramming
  as much circuitry onto the chip as possible) subject to the constraint
  that the yield has to be greater than zero.
     Yield is usually modeled by the equation  Y = exp(-KA)  where A is
  the area of the die.  Large die experience a double whammy: there
  are few of them on the wafer, and each one of them has an (exponentially!)
  low probability of working.  So it's quite a gamble to build a really
  huge die: the cost of a working chip could be enormous, scaling as
  1/yield (see the sketch below).
     In general, big chips (if they yield!) can offer higher performance,
  because they spend the extra gates on goodies that reduce latency.
  These might include register windows, prefetch queues, onboard memory
  management, onboard cache control, onboard I- and D-caches, onboard
  floating-point, branch caches, stack caches, etc., (but nobody yet
  dares to put ALL of these on a single chip - it'd be bigger than
  a credit card!).  Of course, it's always possible to screw the pooch
  and waste a lot of gates for little incremental performance
  [e.g. put a Wallace Tree integer multiplier on the 80286 and see how
  much faster PC-DOS runs!].
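     Plugging Y = exp(-K*A) into the cost-per-good-die arithmetic from the
  COST section shows the double whammy numerically.  K, the wafer cost, and
  the die areas below are invented for illustration:

	import math

	# Bigger die: fewer gross die per wafer AND exponentially worse yield.
	def cost_per_good_chip(die_area_mm2, wafer_cost=500.0,
	                       usable_wafer_area_mm2=7000.0,
	                       k_defects_per_mm2=0.02):
	    gross_die = usable_wafer_area_mm2 / die_area_mm2
	    yield_fraction = math.exp(-k_defects_per_mm2 * die_area_mm2)
	    return wafer_cost / (gross_die * yield_fraction)

	for area_mm2 in (50, 100, 200):
	    print(area_mm2, round(cost_per_good_chip(area_mm2), 2))
	#  50 mm2 -> ~9.7  per good chip (yield ~37%)
	# 100 mm2 -> ~53   per good chip (yield ~14%)
	# 200 mm2 -> ~780  per good chip (yield ~1.8%)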


SUMMARY

     When you read the statement "the Fubar_4 is built in 1.8 micron CMOS"
you have actually learned very little.  The technical parameters you'd
probably *like* to know would include:

       (1) What is the average layout groundrule of the technology?
             This translates into layout density, a measure of how
             many gates can be packed onto a chip.
       (2) What is the Effective Channel Length of the transistors?
             This translates (approximately) into logic gate speed.
       (3) What is the ->smallest<- layout pitch (width + space), and
           what non-vanilla process technologies are being used?
             These translate (approximately) into the difficulty of
             building the device, hence yield, hence cost.
-- 
-Mark Johnson
	DISCLAIMER: The opinions above are personal.
UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!mips!mark   TEL: 408-720-1700 x208
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086