[comp.arch] chip cost

chris@mimsy.umd.edu (Chris Torek) (11/09/90)

In article <2857@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.COM
(Wm E Davidsen Jr) writes:
>The cost is a factor of die size and process.  The cost of the same area
>in 2.5 micron CMOS (static memory) is a lot less than the same area filled
>with 0.8 micron CMOS (a CPU). If it was really cheaper to use less area
>everyone would use 6000 angstrom design rules for everything, right?

A good point (although I can think of various objections).

As I understand it (this information comes from my younger brother, who
is doing device research at Penn State) you can shrink as much as you
want.  The problem is that yield goes to zero.  (`We put 100 billion
angels on the head of this pin.  The only problem is, they are all
dead.'  `Yeah, but can they still dance?' :-) )  There are two main
reasons for this.  One is dirt: one small speck on a 1.2u line is not
necessarily a disaster; the same speck on a .6u line is.  The other is
solid state physics matters that are way over my head (my brother has
been heard mumbling something about `hot electrons').

Anyway, assuming equipment and processing costs are equal for all sizes
(probably false, but...), all you have to do is maximize n_working_chips
in the following:

	area(chip) ~= devices(chip) * size(process)
	n_chips = area(die) / area(chip)
	P(dirt on chip) = total_dirt_on_die / area(chip)	[*]
	P(failure) = P(dirt) * coefficient(process)
	n_working_chips = n_chips * (1.0 - P(failure))

				[* assumes uniform random distribution]

If coefficient(process) is a curve, this makes n_working_chips a
corresponding curve (if my rusty mathematical mental imaging is still
functioning).  If it were a constant then minimizing area would be
best since a finite number of specks of dirt would take out a finite
number of chips (~1 each) as n_chips approaches infinity, so that
n_working_chips would also approach infinity.  (This of course assumes
infinitely small dirt. :-) )  Therefore it must be a curve; you want
the point where n_chips is large but coefficient(process) is still
fairly small.
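
To make that concrete, here is a quick numeric sketch of the formulas
above (Python; every constant, and the shape I picked for
coefficient(process), is invented purely for illustration):

import math

def coefficient(size):
    # Assumed shape (not from the post): the failure multiplier blows up
    # as the process shrinks.
    return math.exp(40.0 / size)

def n_working_chips(size, devices=50000, die_area=5.0e9, total_dirt=50.0):
    # All areas in square microns; size(process) read as area per device.
    area_chip = devices * size
    n_chips = die_area / area_chip
    p_dirt = total_dirt / area_chip                  # P(dirt on chip), as above
    p_failure = min(1.0, p_dirt * coefficient(size))
    return n_chips * (1.0 - p_failure)

for size in (400.0, 100.0, 25.0, 6.0, 4.0, 1.5):
    print(size, round(n_working_chips(size)))

Sweeping size(process) makes n_working_chips climb and then collapse,
which is the `n_chips large but coefficient(process) still small' sweet
spot described above.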
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (11/09/90)

In article <27547@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:

| As I understand it (this information comes from my younger brother, who
| is doing device research at Penn State) you can shrink as much as you
| want.  The problem is that yield goes to zero.  (`We put 100 billion
| angels on the head of this pin.  The only problem is, they are all
| dead.'  `Yeah, but can they still dance?' :-) )  

  You have it right. Going smaller will either cost more to keep the
same yield, or give a lower yield. Either way, the actual cost per working
chip is likely to go up. Vendors have tuned their processes so that
the cost goes up less than the yield goes down, and the net cost of a
chip still ends up going down.

  One problem is that making small features has a price jump as you
reach the limit of resolution of visible light. Vendors went to UV, and
I don't know what they're using now (other than here, which I would ask
before stating). Other tricks include a coating which makes the cuts
cleaner (the edges of a feature are more nearly perpendicular to the
surface), and I'm sure there are other tricks, too.

  I'm fairly sure that a good deal of the early cost of a chip is the
research, and of course it's a lot easier to make a chip than to design it.
I'm sure in the long run the prices of chips will fall, because that's
the way the vendor can make the most money. You can buy into any
technology at the point where it gets cost effective for you.

  I suspect that the techniques IBM used to write "IBM" in letters
something like seven atoms high could be used to build *very* small
chips; I just don't know if the performance would be better than what we
have now. One problem seems to be curable by going to lower voltage to
reduce power dissipation.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
    VMS is a text-only adventure game. If you win you can use unix.

@crdgw1:raje@dolores.stanford.edu (exos:) (11/09/90)

Hi,
This is wrt the chip yield articles in comp.arch. I think we might be
straying from comp.arch material so I decided to send you private mail.
If you feel this is appropriate for comp.arch, you are welcome to post it.

Your arguments are right for the most part; I just wanted to clarify a
few points.

>In article <27547@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>   Anyway, assuming equipment and processing costs are equal for all sizes
>   (probably false, but...), all you have to do is maximize n_working_chips
>   in the following:
>
>	   area(chip) ~= devices(chip) * size(process)
>	   n_chips = area(die) / area(chip)
>	   P(dirt on chip) = total_dirt_on_die / area(chip)	[*]

Actually the probability that there will be "dirt on chip" is proportional
to the area of the chip, not inversely proportional; i.e. a larger chip is
more likely to have a defect. The distribution is not quite so
straightforward, but using binomial statistics the simplistic yield equation
is

  yield = fraction of working chips in a wafer = e^(-lambda * Chiparea)   ...(1)

  where lambda = average defects per unit area of the wafer
 
lambda is a function of the technology - i.e. lambda is larger for more
aggressive technologies - and also of the "maturity" of the technology: IC
fabs strive to reduce lambda on the same process by "getting better at
manufacturing the technology".

Now the total number of working chips is

  Working chips = Total chips in the wafer * yield

                = N * e^(-lambda * Waferarea / N)          ...(2)

This is of course a monotonically increasing function of N for constant lambda.
So for any given technology you could get more and more working chips
from a wafer, but they would hold fewer and fewer devices, and of course this
would be pointless to pursue too far.
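
In code, for anyone who wants to play with (1) and (2) (Python; the
wafer area and defect density below are made-up numbers):

import math

def yield_fraction(lam, chip_area):
    # Eqn (1): lam = average defects per unit area of the wafer
    return math.exp(-lam * chip_area)

def working_chips(lam, wafer_area, n_chips):
    # Eqn (2): chips per wafer times yield, with Chiparea = Waferarea / N
    return n_chips * yield_fraction(lam, wafer_area / n_chips)

# e.g. a 125 cm^2 wafer at 0.5 defects per cm^2:
for n in (50, 100, 200, 400):
    print(n, round(working_chips(0.5, 125.0, n), 1))

As expected from (2), the count keeps rising with N for a fixed lambda.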

By scaling devices down, the same function can be accomplished
in a smaller Chiparea but lambda increases and the yield (Eqn 1)
may initially go down.

There are two factors that mitigate the loss of yield in a more aggressive
technology:
1. lambda eventually reduces as one progresses along the learning curve
2. The number of chips (N) in the wafer is larger because of the smaller
Chiparea and hence the total number of Working chips (Eqn 2) could be
made larger even if the lambda stays at a higher value than before.

And of course keep in mind that these chips will be faster and consume
less power.
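
A hypothetical comparison illustrating factor 2 above (Python; both the
defect densities and the chip areas are invented, just to show how the
smaller chip can come out ahead):

import math

def working_chips(lam, wafer_area, chip_area):
    # Eqn (2) again, written in terms of chip area
    n = wafer_area / chip_area
    return n * math.exp(-lam * chip_area)

# Mature process, larger chip vs. aggressive process, smaller chip but
# three times the defect density (all numbers invented):
old = working_chips(lam=0.5, wafer_area=125.0, chip_area=1.0)
new = working_chips(lam=1.5, wafer_area=125.0, chip_area=0.4)
print(round(old), round(new))    # the smaller chip still wins

With these (invented) numbers the aggressive process yields more working
chips per wafer despite the higher lambda.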

Current predictions are that device scaling will pretty much stop at the
0.25um MOSFET gate length level. The problems are 
1. reliability of devices
 - hot electron effects
 - breakdown of oxides
 - punchthrough
 - unstable threshold voltages
 - statistical behaviour: a 0.25um x 0.25um MOSFET with an 80A oxide at 2V
 gate bias has only around 3000 electrons in the channel!
 
2. reliability of interconnects 
 - worsens as the fifth power of device scaling
 - electromigration
 - contact resistance
 - interconnect delay dominating total delay of device

3. Manufacturing issues
 - lithography limits (could be solved by e-beam)
 - etching limits (aspect ratios in DRAMs are as high as 10)

4. Design issues, testing, and a whole host of others ..

But the march continues, relentless.

Prasad

tgg@otter.hpl.hp.com (Tom Gardner) (11/09/90)

|As I understand it (this information comes from my younger brother, who
|is doing device research at Penn State) you can shrink as much as you
|want.  The problem is that yeild goes to zero.

Not quite. Two interesting effects become apparent:
  - as the conductors become smaller and closer together the electrons
  tunnel from one conductor to a neighbouring one
  - as the transistors and conductors become smaller, then for a given
  current density (which is often a limiting factor) the current must be
  reduced.  Eventually you reach the point where, due to the quantised nature
  of current, there is a non-zero probability that there is no electron in
  the conductor/transistor, even though there is, on average, a current
  flowing.
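
A back-of-the-envelope sketch of the second effect (Python; treating
electron arrivals as Poisson is my own simplifying assumption, and the
currents and time windows are arbitrary):

import math

Q = 1.6e-19   # electron charge, coulombs

def p_no_electron(current_amps, window_s):
    # Expected electrons crossing during the window, and the Poisson
    # probability that none at all show up.
    expected = current_amps * window_s / Q
    return math.exp(-expected)

# 1 uA over 1 ns: thousands of electrons expected, P(none) is negligible.
# 1 nA over 1 ps: well under one electron expected, P(none) is about 99%.
for i, t in ((1e-6, 1e-9), (1e-9, 1e-12)):
    print(i, t, p_no_electron(i, t))

With these (arbitrary) numbers the effect only shows up once both the
current and the time window get very small.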

About 8 years ago John Barker was indicating that these effects would begin
to become apparent as feature sizes shrink below 0.1um. Is there any more 
recent information?

vinoski@apollo.HP.COM (Stephen Vinoski) (11/10/90)

The cost of the testing component of chip production continues to rise as
geometries shrink and pin count increases.  This is mainly because the
gate-to-pin ratio keeps climbing, resulting in faults which are difficult to
test via traditional test methods because they're buried so deep inside the
device.  (Methodologies such as scan testing are helping to reduce this cost.) 
Another part of the test cost is the test hardware required to really put the
device through its paces; some VLSI test systems cost over 5 million dollars,
and guess who ultimately pays for it...


-steve


| Steve Vinoski  (508)256-6600 x5904       | Internet: vinoski@apollo.hp.com  |
| Testability and Diagnostics              | UUCP: ...mit-eddie!apollo!vinoski|
| HP Apollo Division, Chelmsford, MA 01824 |       ...uw-beaver!apollo!vinoski|
|                             I feel crapulous today.                         |

przemek@liszt.helios.nd.edu (Przemek Klosowski) (11/10/90)

In article <780020@otter.hpl.hp.com> tgg@otter.hpl.hp.com (Tom Gardner) writes:
>Chris Torek:
>   |want.  The problem is that yield goes to zero.
>
>Not quite. Two interesting effects become apparent:
>  - as the conductors become smaller and closer together the electrons
>  tunnel from one conductor to a neighbouring one
>  - as the transistors and conductors become smaller, then for a given
>  current density (which is often a limiting factor) the current must be
>  reduced.  Eventually you reach the point where, due to the quantised nature
>  of current, there is a non-zero probability that there is no electron in
>  the conductor/transistor, even though there is, on average, a current
>  flowing.
Well, you are right, but also you are wrong. These effects do exist,
but at a much smaller scale than discussed. The tunnelling effect range
depends on the wavelength of the electron and on the depth of the
potential well in which it is trapped (i.e. the work function---the energy
needed to take a free electron out of the conductor). In practical cases it
will be on the order of 10-100 angstroms (10-100 * 1e-10 m, or .001 to .01
micron). Again, the electron charge is 1.6e-19 coulomb, so you need roughly
1e13 of the little critters per second for each microampere of current, or
about 1e4 per nanosecond. That is still plenty. There is a whole
'semi'science called nanotechnology that deals with devices on the
nanometer scale, where these things matter. Nothing has been demonstrated
yet, although I would say that MBE-deposited superlattices look very
promising (my thesis is on the physics of magnetism in such systems).
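
For the record, that arithmetic in a couple of lines of Python (nothing
assumed beyond the electron charge):

Q = 1.6e-19                          # electron charge in coulombs
per_second = 1e-6 / Q                # electrons per second at 1 microampere: ~6e12
per_nanosecond = per_second * 1e-9   # ~6e3 per nanosecond
print(per_second, per_nanosecond)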

	przemek
--
			przemek klosowski (przemek@ndcva.cc.nd.edu)
			Physics Dept
			University of Notre Dame IN 46556

staff@cadlab.sublink.ORG (Alex Martelli) (11/10/90)

I suggest Hennessy and Patterson's "Computer Architecture: A
Quantitative Approach", published by Morgan Kaufmann, for a simple and
clear model of how one goes about minimizing chip cost.  Great book!

-- 
Alex Martelli - CAD.LAB s.p.a., v. Stalingrado 45, Bologna, Italia
Email: (work:) staff@cadlab.sublink.org, (home:) alex@am.sublink.org
Phone: (work:) ++39 (51) 371099, (home:) ++39 (51) 250434; 
Fax: ++39 (51) 366964 (work only), Fidonet: 332/401.3 (home only).