[comp.arch] ETA-10: CMOS or ECL?

roy@phri.UUCP (Roy Smith) (10/14/88)

	I was at JVNC yesterday gaping through the plate-glass windows at
the ETA-10 they have there (very sexy packaging; looks sort of like you
would expect a HAL-9000 to look).  Anyway, some guy who sounded pretty
knowledgeable about the machine came up to me and started telling me about
how it worked.

	At one point I said something like "The CPU is all ECL, right?",
and he said, "No, it's CMOS".  This really surprised me.  Is it really
CMOS?  I always thought of CMOS as pretty slow stuff, just pushing the
speed of normal TTL, and certainly not supercomputer stuff.
-- 
Roy Smith, System Administrator
Public Health Research Institute
{allegra,philabs,cmcl2,rutgers}!phri!roy -or- phri!roy@uunet.uu.net
"The connector is the network"

kaul@icarus.eng.ohio-state.edu (Rich Kaul) (10/15/88)

In article <3539@phri.UUCP> roy@phri.UUCP (Roy Smith) writes:
>	At one point I said something like "The CPU is all ECL, right?",
>and he said, "No, it's CMOS".  This really surprised me.  Is it really
>CMOS?  I always thought of CMOS as pretty slow stuff, just pushing the
>speed of normal TTL, and certainly not supercomputer stuff.

It's CMOS.  CMOS can be pretty fast when you get the integration level up
and pay some attention to speed in the design.  From what I've seen, ETA
did a pretty good job on that machine. 

The really sexy version of the ETA is CMOS cooled in liquid nitrogen.  This
baby really cooks (or is that freezes? :-)  The cooling in nitrogen gets
you a factor of 2 in speed when you use CMOS devices.  Of course, your
mileage will vary with your process, supply voltages, etc., but a factor of
2 is pretty common at nitrogen temperatures.
-=-
Rich Kaul				kaul@icarus.eng.ohio-state.edu

smryan@garth.UUCP (Steven Ryan) (10/16/88)

>                                                          Is it really
>CMOS?  I always thought of CMOS as pretty slow stuff, just pushing the
>speed of normal TTL, and certainly not supercomputer stuff.

Yes, it uses slow chips. The trick is the chips are dense and the whole
CPU fits on one board, I think it was about a foot x foot. The faster
models are bathed in liquid nitrogen.

Lincoln decided to use a technology that is slow but very dense, so that
the shorter off-chip delays dominate.

gillies@p.cs.uiuc.edu (10/16/88)

There are some people who predict that most supercomputers of the
future will be made of CMOS.

CMOS gates use power mainly when they switch logic levels.  The power
consumption curve is a bell curve, with a peak halfway between the logic
levels.  Very few other types of logic (ECL, TTL) have this property, i.e.
that in steady state, the power consumption is almost NIL.

To increase CMOS speed, you just shrink the design rules of the CMOS
circuit (less than 1 micron?) and pay attention to transmission line
effects, etc.  You want this in a supercomputer, so that data
transmission time is minimized, maximizing CPU speed.

Try this in any other technology (esp. ECL), and your supercomputer
will probably melt down, because so much power is being dissipated
constantly, in a very small area (like 1 cubic foot).  It's a physics
/ thermodynamics homework problem to prove that a sufficiently small
ECL computer with enough gates cannot be cooled with known cooling
technologies.


P.S. I stopped taking VLSI courses 4 years ago, so please excuse my
ignorance if some of this is inaccurate.


Don Gillies, Dept. of Computer Science, University of Illinois
1304 W. Springfield, Urbana, Ill 61801      
ARPA: gillies@cs.uiuc.edu   UUCP: {uunet,ihnp4,harvard}!uiucdcs!gillies

friedl@vsi.COM (Stephen J. Friedl) (10/17/88)

In article <76700054@p.cs.uiuc.edu>, gillies@p.cs.uiuc.edu writes:
> 
> There are some people who predict that most supercomputers of the
> future will be made of CMOS.
> 
> CMOS gates use power mainly when they switch logic levels.  The power
> consumption curve is a bell curve, with a peak halfway between the logic
> levels.  Very few other types of logic (ECL, TTL) have this property, i.e.
> that in steady state, the power consumption is almost NIL.

This also means that as speeds go up, the low-power benefits of
CMOS drop pretty dramatically.  CMOS in a standby state can run
off two strips of metal in a potato, but when you clock it very
fast the current rises quickly.  There are some parts of a system
that will be used only occasionally, but [stretching memory here]
a CPU made of CMOS can take nearly as much power as NMOS.
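
A rough sketch of why the current climbs with clock rate, using the usual
first-order model for CMOS switching power, P = a * C * V^2 * f.  The
capacitance, supply voltage and activity factor below are made-up
illustrative values, not figures for the ETA-10 or any real part.

```python
def dynamic_power_watts(capacitance_f, supply_v, clock_hz, activity=1.0):
    """First-order CMOS switching power: P = activity * C * V^2 * f."""
    return activity * capacitance_f * supply_v ** 2 * clock_hz


# Illustrative numbers only (assumed, not taken from any real part):
# 1 nF of total switched capacitance on a 5 V supply.
C_TOTAL_F = 1e-9
VDD_V = 5.0

for clock_hz in (1e3, 1e6, 50e6):
    watts = dynamic_power_watts(C_TOTAL_F, VDD_V, clock_hz)
    print(f"{clock_hz:>12.0f} Hz -> {watts:.6f} W")
```

In this model the power scales linearly with clock frequency, so a part
that idles on microwatts can still draw real power when clocked fast.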

> P.S. I stopped taking VLSI courses 4 years ago, so please excuse my
> ignorance if some of this is inaccurate.

P.S. - I've never had any VLSI courses, so please excuse my
ignorance if some of this is inaccurate

     Steve
-- 
Steve Friedl    V-Systems, Inc.  +1 714 545 6442    3B2-kind-of-guy
friedl@vsi.com     {backbones}!vsi.com!friedl    attmail!vsi!friedl
---------Nancy Reagan on the Three Stooges: "Just say Moe"---------

phil@diablo.amd.com (Phil Ngai) (10/18/88)

In article <76700054@p.cs.uiuc.edu> gillies@p.cs.uiuc.edu writes:
|To increase CMOS speed, you just shrink the design rules of the CMOS
|circuit (less than 1 micron?) and pay attention to transmission line
|effects, etc.  

Transmission line effects on a chip? Just how big *are* your chips
anyway? 

"In the West, to waste water is not to consume it, to let it flow unimpeded 
and undiverted down rivers. Use of water is, by definition, beneficial use."
(from _Cadillac Desert_)
Phil Ngai, {ucbvax,decwrl,allegra}!amdcad!phil or phil@amd.com

grunwald@m.cs.uiuc.edu (10/18/88)

Again, the caveat of old knowledge, but..

The switching speed of CMOS is limited by charge dissipation time. By reducing
the voltage difference between zero and one states, you can increase the
switching speed. This also takes less current, because you're pushing fewer
electrons around to charge an area.
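
The argument above can be sketched with Q = C * dV and t = Q / I.  This
assumes the drive current stays fixed while the swing shrinks, which is a
simplification (in a real process, lowering the supply usually lowers the
drive current too).  The node capacitance and drive current are assumed
values chosen only for illustration.

```python
def charge_moved_c(node_capacitance_f, swing_v):
    """Charge needed to move a node through the given voltage swing: Q = C * dV."""
    return node_capacitance_f * swing_v


def switch_time_s(node_capacitance_f, swing_v, drive_current_a):
    """Time to move that charge at a constant drive current: t = Q / I."""
    return charge_moved_c(node_capacitance_f, swing_v) / drive_current_a


# Assumed illustrative values: 1 pF node, 1 mA of constant drive current.
NODE_C_F = 1e-12
DRIVE_A = 1e-3

for swing_v in (5.0, 2.5, 1.0):
    t_ns = switch_time_s(NODE_C_F, swing_v, DRIVE_A) * 1e9
    print(f"swing {swing_v:.1f} V -> {t_ns:.2f} ns")
```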

krish@jetsun.WEITEK.COM (Krishnan Sridhar) (10/19/88)

In article <23287@amdcad.AMD.COM>, phil@diablo.amd.com (Phil Ngai) writes:
> In article <76700054@p.cs.uiuc.edu> gillies@p.cs.uiuc.edu writes:

> Transmission line effects on a chip? Just how big *are* your chips
> anyway? 
> 

 Size of the chip is not the only determining factor here. A few years ago, I
 worked on a piece of logic (for a supercomputer) using Fairchild's (may its
 soul rest in peace) 100K ECL gate arrays. The characterization curves for
 the gate array cells assumed transmission line effects inside the chip ...
 The gate array could support about 3000 gates.

 - Krish

dre%ember@Sun.COM (David Emberson) (10/20/88)

In article <23287@amdcad.AMD.COM>, phil@diablo.amd.com (Phil Ngai) writes:
> 
> Transmission line effects on a chip? Just how big *are* your chips
> anyway? 

Well, if you clock the chip at ~100 GHz I suppose you would have to worry
about it! :-)
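
A back-of-envelope check of that figure, as a sketch only.  It assumes a
relative permittivity of 4.0 (roughly that of silicon dioxide), a 1 cm
wire, and the common rule of thumb that a wire is "electrically long" once
it exceeds a tenth of a wavelength.  In practice the edge rate matters
more than the clock frequency, so treat this as an order-of-magnitude
estimate.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def wavelength_m(freq_hz, rel_permittivity):
    """Wavelength of a signal in a dielectric with the given permittivity."""
    velocity = SPEED_OF_LIGHT_M_S / math.sqrt(rel_permittivity)
    return velocity / freq_hz


def is_electrically_long(length_m, freq_hz, rel_permittivity, fraction=0.1):
    """Rule of thumb (assumed): long if the wire exceeds fraction * wavelength."""
    return length_m > fraction * wavelength_m(freq_hz, rel_permittivity)


EPS_R = 4.0      # assumed, roughly SiO2
WIRE_M = 0.01    # assumed 1 cm on-chip wire

for freq_hz in (50e6, 1e9, 100e9):
    lam_mm = wavelength_m(freq_hz, EPS_R) * 1e3
    print(f"{freq_hz:.0e} Hz: wavelength {lam_mm:.2f} mm, "
          f"long wire: {is_electrically_long(WIRE_M, freq_hz, EPS_R)}")
```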

				Dave

aglew@urbsdc.Urbana.Gould.COM (10/20/88)

..> CMOS power consumption.

Dynamic power consumption dominates in CMOS, especially
as frequency goes up. As I understand it, there are two
components of dynamic power consumption: (1) the inherent
power consumption as you go through charge/discharge cycles
on your nodes; (2) the transient that occurs when both the
n and p networks are briefly switched on at the same time during a
transition.

The first seems to be inherent.

Cannot the second be controlled by making it impossible to have
both sides on at the same time?  E.g., instead of using PHI and PHIBAR
to control the top and bottom halves of a dynamic circuit, why not use
PHI_P = 011000 and PHI_N = 000011, i.e. non-overlapping clocks
in the same clock module, instead of in different cascaded stages?
NB. this is *not* the same as two or four phase clocking with
overlapping clocks, at least in the textbooks I've seen.
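
A small sketch of what "non-overlapping" means for the two patterns above.
It reads each string as one clock period split into time slots, with '1'
meaning that half of the circuit is conducting in that slot.  That reading
of the notation is an assumption, and overlap_slots is a hypothetical
helper, not anything from a real tool.

```python
def overlap_slots(phi_p, phi_n):
    """Return the time slots in which both clock patterns are '1'."""
    if len(phi_p) != len(phi_n):
        raise ValueError("clock patterns must cover the same number of slots")
    return [i for i, (p, n) in enumerate(zip(phi_p, phi_n))
            if p == "1" and n == "1"]


# The patterns from the post: never on together, so no crowbar slot.
print(overlap_slots("011000", "000011"))

# A skewed pair (made up for illustration) that does overlap in one slot.
print(overlap_slots("011100", "000111"))
```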

This is probably a very novice question, but I'm trying...

ralphw@ius3.ius.cs.cmu.edu (Ralph Hyre) (10/22/88)

In article <23287@amdcad.AMD.COM>, phil@diablo.amd.com (Phil Ngai) writes:
> 
> Transmission line effects on a chip? Just how big *are* your chips
> anyway? 
I've heard rumors to the effect that this is what broke the first RSA VLSI
chip.  (Which was done in the early days while the design tools were still
evolving.)
-- 
					- Ralph W. Hyre, Jr.
Internet: ralphw@ius3.cs.cmu.edu    Phone:(412) CMU-BUGS
Amateur Packet Radio: N3FGW@W2XO, or c/o W3VC, CMU Radio Club, Pittsburgh, PA
"You can do what you want with my computer, but leave me alone!8-)"

clewis@ecicrl.UUCP (Chris Lewis) (10/23/88)

In article <3539@phri.UUCP> roy@phri.UUCP (Roy Smith) writes:
>
>	I was at JVNC yesterday gaping through the plate-glass windows at
>the ETA-10 ...
>
>	At one point I said something like "The CPU is all ECL, right?",
>and he said, "No, it's CMOS".  This really surprised me.  Is it really
>CMOS?  I always thought of CMOS as pretty slow stuff, just pushing the
>speed of normal TTL, and certainly not supercomputer stuff.

Modern high-speed CMOS IC's are actually outstripping much of TTL.  For example,
I believe that the 74HC series is approximately the same speed as 74AS
(advanced schottky).  Coupled with the fact that it is possible to
create considerably denser CMOS than TTL you no longer have as much
capacitance to charge - capacitance is one of the determining factors
in CMOS speed.  CMOS has come a long way from the old RCA CD4000 series
parts...  To answer someone else's question which is connected:

In article <558@quintus.UUCP> Jabir Hussain (jabir@quintus.uucp) writes:

> could anyone tell me what ASIC stands for?  thanks

ASIC stands for "Application Specific Integrated Circuit".  Various
IC manufacturers have developed a process by which a customer designs
what he wants his ICs to do, and the IC manufacturer programs their
machinery and makes it.  Effectively, an ASIC part is a medium to large
gate array, where the final connections between each of the elements
can be easily rearranged as the last step in fabrication.  The customer
is often given some CAD software that knows the geometry and capacity
of an unprogrammed ASIC part, the customer builds his circuitry using
CAD and a library of "macros" which describe basic building blocks.
A tape of the result is sent to the foundry to be used in constructing
a mask for the final etching phase.  The chip is then encapsulated
and tested and shipped to the customer.  Usually costs something like
$30K to $100K to run through one iteration of CAD -> finished part.
[Not to mention running the CAD (though many manufacturers rent out
time on their own systems) and engineering costs]  Thus, you gotta
be real careful that you get it right the first time.  Many manufacturers
also supply simulators so you can test the heck out of it before
committing to silicon.

Think of it as a mask-ROM, where instead of programming ones and zeros,
you're programming connections between gates.

Now, some of the modern ASICs are pretty spectacular.  I believe that
the ETA-10 is built with a 6000-gate ASIC part, with gate delays
on the order of .2 to .5 ns, resulting in what a CDC person described
as "effectively a Cyber 205 on a board".  Something like 15ns cycle
time for the whole machine, as opposed to something around 7 for the 205.
Motorola has a 10K-gate ASIC family (though this is ECL, I think) which
has gate delays of 60-100 ps!

The libraries are also pretty spectacular - some libraries come complete
with entire CPU's as basic building blocks.  You want multiple CPUs
on a single chip?  Easy!  Many PC/AT/386 clones rely upon
these parts for reducing their parts count, e.g. Chips and Technologies
as well as Zymos market chip sets (4-7 components) that replace most
of the 100 or so MSI parts required on a motherboard.   Zymos actually
built a single ASIC part that replaced *everything* on an AT motherboard
except for the CPU chip.  The only reason you don't see such things around
now, is that they figgered they'd need something like 300 pins, and
the yield would be lousy.  So it never went in production.  Many of the
new display adapters on PCs consist of a single ASIC part.   If I'm not
mistaken, IBM pioneered a lot of this technology in building their
big iron, and other manufacturers (eg: machine manufacturers like
CDC, or chip companies like TI, Motorola) have followed suit or 
independently developed their own ASIC fabrication and design process.
There are now companies exclusively doing ASIC or ASIC tools (C&T, Zymos, 
CDC has a subsidiary doing it, LSI etc).
-- 
Chris Lewis
Ferret Mailing list:{uunet!attcan,yunexus,utzoo}!lsuc!gate!eci386!ferret-request
{uunet!mnetor,yunexus,utzoo}!lsuc!ecicrl!clewis
(or lsuc!gate!eci386!clewis or lsuc!clewis)

phil@diablo.amd.com (Phil Ngai) (10/26/88)

In article <126@ecicrl.UUCP> clewis@ecicrl.UUCP (Chris Lewis) writes:
|Modern high-speed CMOS IC's are actually outstripping much of TTL.  For example,
|I believe that the 74HC series is approximately the same speed as 74AS

Not by a long shot. 74HC is comparable to 74LS. 74AC competes nicely
with 74AS and 74F, however, particularly for driving RAM arrays.
Fairchild had a really nice CMOS line. Unfortunately, National lost
the data book when they bought Fairchild. (ever try to get a 74AC data
book from National?) I just hope they didn't lose anything *important*. 

"In the West, to waste water is not to consume it, to let it flow unimpeded 
and undiverted down rivers. Use of water is, by definition, beneficial use."
(from _Cadillac Desert_)
Phil Ngai, {ucbvax,decwrl,allegra}!amdcad!phil or phil@amd.com

earl@wright (Earl Killian) (10/26/88)

In article <23368@amdcad.AMD.COM>, phil@diablo (Phil Ngai) writes:
>Not by a long shot. 74HC is comparable to 74LS. 74AC competes nicely
>with 74AS and 74F, however, particularly for driving RAM arrays.
>Fairchild had a really nice CMOS line. Unfortunately, National lost
>the data book when they bought Fairchild. (ever try to get a 74AC
>data book from National?) I just hope they didn't lose anything
>*important*.

Don't worry about Fairchild.  FACT is now slow CMOS.

Comparing the same part (374), clock to output (max), in different families:
	TI ALS (TTL) 74ALS374			16.0ns (50pF)
	Fairchild FACT (CMOS) 74AC374 		11.0ns (50pF)
	Fairchild FAST (TTL) 74F374		10.0ns (50pF)
	TI AS (TTL) 74AS374			 9.0ns (50pF)
	IDT FCT (CMOS) 74FCT374A		 7.2ns (50pF)
For comparison, equivalent ECL parts would be
	Fairchild 100K (ECL) 100141		 2.3ns (50 ohm)
	Sony CXB1109Q				 0.8ns (50 ohm)

Conclusion: In 1988, ECL < CMOS < TTL, at least for MSI.  It's still
the case that VLSI ECL < VLSI CMOS, but that gap is closing.
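
The ordering can be checked mechanically from the numbers above.  This is
just a sketch over the quoted figures: it takes the fastest part per
technology and sorts by delay.  Note the ECL parts are specified into a
50 ohm load rather than 50 pF, so the comparison is not strictly like for
like.

```python
# (part, technology, max clock-to-output delay in ns), as listed in the post.
delays_ns = [
    ("74ALS374", "TTL", 16.0),
    ("74AC374", "CMOS", 11.0),
    ("74F374", "TTL", 10.0),
    ("74AS374", "TTL", 9.0),
    ("74FCT374A", "CMOS", 7.2),
    ("100141", "ECL", 2.3),
    ("CXB1109Q", "ECL", 0.8),
]

# Fastest (smallest) delay seen for each technology.
best = {}
for part, tech, delay in delays_ns:
    best[tech] = min(delay, best.get(tech, delay))

# Technologies ordered from shortest to longest best-case delay.
ordering = sorted(best, key=best.get)
print(" < ".join(ordering))  # prints "ECL < CMOS < TTL"
```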
-- 

laurel@super.ORG (Our Friends Up the Way) (10/26/88)

In article <126@ecicrl.UUCP> clewis@ecicrl.UUCP (Chris Lewis) writes:

>as "effectively a Cyber 205 on a board".  Something like 15ns cycle
>time for the whole machine, as opposed to something around 7 for the 205.

The ETA-10 has cycle times between 7ns and 22ns, depending upon which
configuration you buy. The 205 has a cycle time of 20ns.
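
For anyone who thinks in clock rates rather than cycle times, a quick
conversion of the figures above (f = 1 / t).  Only the 7, 20 and 22 ns
values come from this post; clock_mhz is a throwaway helper name.

```python
def clock_mhz(cycle_ns):
    """Convert a cycle time in nanoseconds to a clock rate in MHz."""
    return 1000.0 / cycle_ns


for label, cycle_ns in (("ETA-10 fastest", 7.0),
                        ("ETA-10 slowest", 22.0),
                        ("Cyber 205", 20.0)):
    print(f"{label:>15}: {cycle_ns:4.1f} ns -> {clock_mhz(cycle_ns):6.1f} MHz")
```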
-- 
Michael Tighe
Supercomputer Research Center
ARPA: laurel@super.org

lindsay@k.gp.cs.cmu.edu (Donald Lindsay) (10/31/88)

My file system turned up 
"A Cryogenically Cooled CMOS VLSI Supercomputer"
VLSI Systems Design, June 1987

Some highlights:
3,000,000 circuits per CPU, on 240 gate arrays on one board
gate arrays have 284 pins (surface mount TAB) with 11-mil-center leads;
   1.25 micron; 20k gates max; 2nd sourced (Honeywell & Performance Semi)
electronic clock tuning on each chip, so all chips are within 100 pico
   of each other
6 pins/chip dedicated to self-test: everyone should be so smart.
The board is 17" x 23", 20 signal layers, 40 layers total.
Their CMOS is 1.6 times faster at liquid nitrogen temperatures.  (Both N
and P devices improve the same amount, so there aren't skew problems.)
Plus, the metallization is six times less resistive, which improves any
RC delays. (Plus lower noise, which I assume they can't count on, since
the machine has to also work at ambient.)
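
Some quick arithmetic on those highlights, as a sketch.  It assumes the
article's "circuits" and "gates" are counted in the same units, which may
not be the case, so the utilization figure is only a rough indication.

```python
CIRCUITS_PER_CPU = 3_000_000   # from the article highlights
GATE_ARRAYS_PER_CPU = 240      # from the article highlights
MAX_GATES_PER_ARRAY = 20_000   # "20k gates max"

# Average load per gate array, and how full that is relative to the maximum.
avg_per_array = CIRCUITS_PER_CPU / GATE_ARRAYS_PER_CPU
utilization = avg_per_array / MAX_GATES_PER_ARRAY

print(f"average per gate array: {avg_per_array:.0f}")
print(f"fraction of 20k maximum: {utilization:.1%}")
```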

On the other hand, the Crays are faster. The Great White Hope at ETA is
that they will learn how to roll these things out like cookies and 
then the customers will spring for multiprocessors.

- plus the inevitable chip shrink -
-- 
Don		lindsay@k.gp.cs.cmu.edu    CMU Computer Science

brooks@maddog.llnl.gov (Eugene Brooks) (10/31/88)

In article <3453@pt.cs.cmu.edu> lindsay@k.gp.cs.cmu.edu (Donald Lindsay) writes:
>On the other hand, the Crays are faster. The Great White Hope at ETA is
>that they will learn how to roll these things out like cookies and 
>then the customers will spring for multiprocessors.
ETA likes to call the thing that their processor boards share "shared memory",
but when you look at it the only efficient accesses are stride 1 with transfers
to and from local memory.  To paraphrase a politician of the 50's, if it looks
like a shared SSD, walks like a shared SSD, and quacks like a shared SSD;
it must be a shared SSD.

The Crays are not only faster, but they also have real honest to God shared
memory.