[comp.arch] dedicated vs general-purpose CPUs

henry@utzoo.uucp (Henry Spencer) (08/04/88)

In article <2379@sugar.uu.net> karl@sugar.uu.net (Karl Lehenbauer) writes:
>I don't know at what point this becomes weirding out...

I'm not quite sure myself, but I suspect that the borderline is farther
to the weird side than most people think.  Sometime I may have a try at
proving this.

>Isn't it OK to have a
>dedicated processor on the multiport serial I/O boards? ...

If you're expecting long interrupt latency or some other such problem on
the main processor(s), then sure it's okay.  Otherwise, why bother?  The
non-smart serial ports on a Sun 3 are faster than the ALM-1 "smart" multiport
board, even with the slow interrupt handling of a 68020.  Granted, this is
an extreme example; the ALM-1 is pretty old.  But phrase it another way:
do you want your CPU power in one block that you can allocate as you please,
or divided up into fixed-size chunks, most of which are not under your
control?  If allocating it yourself doesn't impose excessive overhead or
latency problems, clearly the former is preferable.
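
To put rough numbers on the one-block-versus-fixed-chunks question, here is
a minimal Python sketch.  The demand figures, the chunk sizes, and both
function names are made up for illustration, and the model deliberately
ignores the allocation overhead and latency that are the deciding caveats
above.

```python
def work_done_pooled(demands, total_capacity):
    """One block of CPU power, handed out wherever the demand is."""
    return min(sum(demands), total_capacity)


def work_done_partitioned(demands, chunk_sizes):
    """Fixed chunks: each kind of job can only use its own processor."""
    return sum(min(d, c) for d, c in zip(demands, chunk_sizes))


# Hypothetical load, in units of work wanted per second by
# (user programs, serial I/O, graphics).  The numbers are invented.
demands = [70, 5, 10]
chunks = [40, 30, 30]  # the same 100 units, split three ways

pooled = work_done_pooled(demands, sum(chunks))
partitioned = work_done_partitioned(demands, chunks)
print(pooled, partitioned)
```

With these invented numbers the pooled machine completes 85 units of work
and the partitioned one 55, because the idle capacity on the I/O and
graphics chunks cannot be lent to user programs.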

>... Why can't we put a processor anywhere that's useful
>and have general multiprocessing as well?

Because processors cost money and have to be programmed, and one would
prefer to get maximum use out of the former while minimizing the complexity
of the latter.

>Why should my graphics CPU, with
>dedicated memory and, eventually, parallel processing, hardware transforms
>and such have to be complicated by the need to run user programs?

Why should user programs be denied the use of parallel processing, transform
hardware, and such?  Do you really think there's only one use for that
equipment?
-- 
MSDOS is not dead, it just     |     Henry Spencer at U of Toronto Zoology
smells that way.               | uunet!mnetor!utzoo!henry henry@zoo.toronto.edu

peter@ficc.UUCP (Peter da Silva) (08/05/88)

In article ... henry@utzoo.uucp (Henry Spencer) writes:
> But phrase it another way:
> do you want your CPU power in one block that you can allocate as you please,
> or divided up into fixed-size chunks, most of which are not under your
> control?

As big a block as possible. Unfortunately in the real world cheap computers
(such as Amigas or Suns or IRISes) have to use prepackaged parts: 68000s,
or 68020s, or 68030s, or 80386es, or SPARCs, or 88000s, or...

So, you *have* to use fixed size blocks. Once you have pulled all the
power you can out of your 680x0 or RISC chip, and you need more MIPS,
you have to add more processors. So, you have the graphics (and serial-IO
and disk IO and any other IO you care to name) wheel of life.

The other alternative is to build your own custom minicomputer or mainframe
with a single processor that gives you a zillion MIPS in one package. I
think you have already shot down that idea by pointing out that general
purpose processors will overtake you. I worry about SUN and SPARC in this
context...

Better to use a GP processor as your main CPU, and use a graphics library
that's implemented in hardware or software, whichever's cheaper. When you
run out of MIPS, stick a graphics accelerator (like the Amiga Blitter)
in and pop in a new shared library. When you get a bigger CPU, go back to
your software-only library. Or write a new one that uses the new 68040
instructions.

The most bang for your buck. Forever.

-- 
PS: The Amiga Blitter is mostly accessed through libraries. So when the
680x0 becomes faster than the blitter you can just change graphics.library.
Don't even have to recompile.
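
A rough sketch of that library swap, in Python for brevity.  All the names
here (SoftwareBackend, BlitterBackend, open_graphics_library, draw) are
hypothetical, the "blitter" backend is only a stand-in with no hardware
behind it, and the selection rule (take whichever is faster) is an
assumption standing in for "whichever's cheaper".

```python
class SoftwareBackend:
    """Rectangle copy done by the main CPU."""
    name = "software"

    def copy_rect(self, src, dst, sx, sy, w, h, dx, dy):
        for row in range(h):
            dst[dy + row][dx:dx + w] = src[sy + row][sx:sx + w]


class BlitterBackend(SoftwareBackend):
    """Stand-in for a driver that would program a hardware blitter.

    There is no hardware here, so it inherits the software copy and only
    the name differs.  A real backend would write device registers.
    """
    name = "blitter"


def open_graphics_library(cpu_speed, blitter_speed=None):
    """Pick the faster backend; speeds are in arbitrary made-up units."""
    if blitter_speed is not None and blitter_speed > cpu_speed:
        return BlitterBackend()
    return SoftwareBackend()


def draw(lib):
    """Application code: identical no matter which backend is loaded."""
    src = [[1, 2, 3], [4, 5, 6]]
    dst = [[0] * 4 for _ in range(3)]
    lib.copy_rect(src, dst, sx=1, sy=0, w=2, h=2, dx=2, dy=1)
    return dst


old_machine = open_graphics_library(cpu_speed=1, blitter_speed=4)
new_machine = open_graphics_library(cpu_speed=10, blitter_speed=4)
print(old_machine.name, new_machine.name)
print(draw(old_machine) == draw(new_machine))
```

The application only ever calls draw(); which backend sits behind the
library is decided once, when the library is opened.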
-- 
Peter da Silva, Ferranti International Controls Corporation, sugar!ficc!peter.
"You made a TIME MACHINE out of a VOLKSWAGEN BEETLE?"
"Well, I couldn't afford another deLorean."
"But how do you ever get it up to 88 miles per hour????"

df@nud.UUCP (Dale Farnsworth) (08/07/88)

Peter da Silva (peter@ficc.UUCP) writes:

> Better to use a GP processor as your main CPU, and use a graphics library
> that's implemented in hardware or software, whichever's cheaper. When you
> run out of MIPS, stick a graphics accelerator (like the Amiga Blitter)
> in and pop in a new shared library.

Yes, that does work, but what Henry has been saying is that when you
run out of MIPS, stick in another GP processor as an additional main CPU.
That way you can dynamically assign functions to processors rather than
having them "hard" partitioned.

Modern processors typically contain a superset of the performance and
functionality of typical graphics accelerator chips.  I believe that
we are near the point (if we haven't already crossed it) where the GP
chip beats the special purpose chip in price as well as performance.

-Dale

-- 
Dale Farnsworth		602-438-3092	noao!nud!df

smryan@garth.UUCP (Steven Ryan) (08/07/88)

If you want an expensive example of a machine made of diverse processors,
consider the CDC 6600->170s. A machine consists of 1 or 2 60-bit CPs and
10 to 20 12-bit PPs. All the number crunching goes in the CP and the
io in the PPs with all processors in parallel. (Well actually, PPs share
a barrel.)

Most of the operating system for NOS actually runs in the PPs so that the
CP spends much of its time in user state.

The 180 retains the same philosophy with 64-bit CPs and 16-bit PPs, though
apparently NOS/VE uses the PPs as just drivers with operating system mostly
in the CP.

henry@utzoo.uucp (Henry Spencer) (08/07/88)

In article <1221@ficc.UUCP> peter@ficc.UUCP (Peter da Silva) writes:
>> do you want your CPU power in one block that you can allocate as you please,
>> or divided up into fixed-size chunks, most of which are not under your
>> control?
>
>As big a block as possible. Unfortunately in the real world cheap computers
>(such as Amigas or Suns or IRISes) have to use prepackaged parts...
>So, you *have* to use fixed size blocks. Once you have pulled all the
>power you can out of your 680x0 or RISC chip, and you need more MIPS,
>you have to add more processors.  So, you have the graphics (and serial-IO
>and disk IO and any other IO you care to name) wheel of life.
>The other alternative is to build your own custom [processor]...

You forgot the third alternative:  add a second (third, etc.) 680x0 or
RISC or whatever.  That adds to your pool of centrally-managed power,
rather than balkanizing it as specialized auxiliary processors do.  A
multiprocessor system isn't quite as good as one fast processor, but with
competent designers it can come close.
-- 
MSDOS is not dead, it just     |     Henry Spencer at U of Toronto Zoology
smells that way.               | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

aglew@urbsdc.Urbana.Gould.COM (08/08/88)

..> Talk about general purpose vs. special purpose processors,
..> expanding by adding a general purpose CPU, or special
..> purpose, graphics, etc., devices.

I am continually reminded of something a Gould designer
said: in our (super-mini ECL) systems, the cheapest source
of CPU cycles is the main (multiboard, ECL) processor,
*not* the stock microprocessors that we have scattered
through the system.

lamaster@ames.arc.nasa.gov (Hugh LaMaster) (08/08/88)

In article <1173@garth.UUCP> smryan@garth.UUCP (Steven Ryan) writes:
>If you want an expensive example of a machine made of diverse processors,
>consider the CDC 6600->170s. A machine consists of 1 or 2 60-bit CPs and
>10 to 20 12-bit PPs. All the number crunching goes in the CP and the
>io in the PPs with all processors in parallel.

It is my belief that this is the primary example of what special purpose
processors are for:  when you have a requirement for at least two high
speed data paths, it is usually cheaper to keep those paths separate
than to try to combine the paths on an even faster common bus.  A separate
processor for that data path may be very useful.  Consider the channel,
for example.  Usually, a channel is a bus with only two interfaces on it,
and typically some means by which the CPU can start up channel commands.

What makes channels useful is that because the channels operate in parallel,
you can use, on a large mainframe, 64 (say) 3 Megabyte/Second channels
instead of trying to build a bus that can operate at 200 MB/second over
a distance of 100 feet while supporting 128 devices.  Graphics is another
tempting place to use a dedicated data path and special purpose hardware,
since the flow of data tends to be unidirectional from program to image.
(But watch out for designs that are so unidirectional that you can't read
your processed image back into main memory.)
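
The arithmetic behind that comparison, using only the figures quoted above
(the variable names are mine):

```python
channels = 64
channel_rate_mb_s = 3    # per channel, from the text
bus_rate_mb_s = 200      # the single shared bus being compared against

# Total bandwidth of all channels running in parallel.
aggregate_mb_s = channels * channel_rate_mb_s

# How much faster than one channel the single bus would have to run.
speedup_needed = bus_rate_mb_s / channel_rate_mb_s

print(aggregate_mb_s)            # 192
print(round(speedup_needed, 1))  # 66.7
```

So the 64 slow channels together deliver 192 MB/second, roughly the same as
the one 200 MB/second bus, while each individual path only has to run at
3 MB/second.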

Since, as many people have observed, off the shelf components are usually
much better wrt price/performance during periods of rapid technology
advancement, a logical solution seems to be to use general purpose CPUs
as the processors for special purpose systems.  So, for example, the disk
controller with a 68020 in it supporting a system with a faster 68020 in
it as the main CPU.  Or, as I understand the recent Ardent announcement, a
machine which uses the same CPU type as its graphics engine and main CPU,
eschewing custom graphics engine hardware.  At times in the past, when
progress was slower, the motivation to use custom hardware to reduce costs
was there, and I assume that there will be periods like that in the future
as well.  So, I don't assume that the dedicated CPU will disappear; I think
it will just fade rapidly for the next few years.

-- 
  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035     
  Phone:  (415)694-6117

jesup@cbmvax.UUCP (Randell Jesup) (08/09/88)

In article <1216@nud.UUCP> df@nud.UUCP (Dale Farnsworth) writes:
>Yes, that does work, but what Henry has been saying is that when you
>run out of MIPS, stick in another GP processor as an additional main CPU.
>That way you can dynamically assign functions to processors rather than
>having them "hard" partitioned.

	This can work, but is not always optimal.  It depends heavily on
either separate address spaces or very fast shared memory.

>Modern processors typically contain a superset of the performance and
>functionality of typical graphics accelerator chips.  I believe that
>we are near the point (if we haven't already crossed it) where the GP
>chip beats the special purpose chip in price as well as performance.

	What is a typical "graphics accelerator chip"?  Very few people make
such things, since it requires a foundry and silicon expertise.  You're
right, add-on accelerators rarely produce big improvements, for a number of
reasons.  One is that often the software architecture isn't suited for
hardware assist, or the chip was seen as a stopgap, and is thus often very
simple, and doesn't help with much.  An example is the (still not released,
I believe) Atari blitter.  It was just a rectangle-copy chip, no special ops,
nothing else.  Compare this to the Amiga chips, where the blitter was part of
the original design.  The blitter has 256 operations (3-source,1 dest), can
also do line-draw and fills, and is only a small part of the chip it's on, and
shares hardware with other special purpose functions of the chips, like
dma channel addressing and arbitration.
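
The figure of 256 is what you get from three sources and one destination:
three input bits give 8 combinations, and an 8-bit truth table over them
gives 2^8 = 256 possible functions.  A minimal Python sketch of applying
such a table one word at a time follows.  The function name blit_word, the
16-bit width, and the bit ordering of the table are assumptions made for
illustration, not a description of the chip's registers.

```python
def blit_word(table, a, b, c, width=16):
    """Combine three source words bit by bit using an 8-bit truth table.

    Bit n of `table` is the output for the input triple whose (a, b, c)
    bits spell n in binary, with a as the most significant bit.
    """
    out = 0
    for bit in range(width):
        index = ((((a >> bit) & 1) << 2)
                 | (((b >> bit) & 1) << 1)
                 | ((c >> bit) & 1))
        out |= ((table >> index) & 1) << bit
    return out


# 0xF0 selects every row where a = 1, i.e. "dest = a" (a plain copy).
copy = blit_word(0xF0, 0x1234, 0xFFFF, 0x0000)

# 0xCA is "where a is set take b, else take c" (a masked copy).
masked = blit_word(0xCA, a=0xFF00, b=0xAAAA, c=0x5555)

print(hex(copy), hex(masked), 2 ** (2 ** 3))
```

One table lookup per bit is slow in software; the point of the hardware is
that it does the whole word in one go.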

	Custom hardware can do operations in a cycle that even the best RISC
cpus will take several to do, since they are general purpose.  In some cases
a general purpose CPU is fine (memory transfer is usually reasonable), but
it cannot keep up with custom hardware for every use.

-- 
Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup

prh@actnyc.UUCP (Paul R. Haas) (08/09/88)

In article <1173@garth.UUCP> smryan@garth.UUCP (Steven Ryan) writes:
>If you want an expensive example of a machine made of diverse processors,
>consider the CDC 6600->170s. A machine consists of 1 or 2 60-bit CPs and
>10 to 20 12-bit PPs. All the number crunching goes in the CP and the
>io in the PPs with all processors in parallel. (Well actually, PPs share
>a barrel.)
[a PP is a peripheral processor, a CP is a Central Processor.]
>
>Most of the operating system for NOS actually runs in the PPs so that the
>CP spends much of its time in user state.
Nice idea; unfortunately, it was frequently faster and easier to do things
in the CP.  The PPs are slow compared to the CP, and have only 4096 12-bit
words.  If you try to do all of the operating system in the PPs, what
really happens is the CP spends much of its time in the idle state waiting
on the PP.
>
>The 180 retains the same philosophy with 64-bit CPs and 16-bit PPs, though
>apparently NOS/VE uses the PPs as just drivers with operating system mostly
>in the CP.
CDC does learn from experience.

On a machine with slow context switches, an IO processor which is just bright
enough to buffer transactions, and so avoid some context switches, is a net
win.  There doesn't seem to be any reason to divert any more resources from
the CP.
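
A back-of-envelope model of that trade-off, in Python.  The function name
cp_time and every cost figure below are invented for illustration; the only
idea taken from the text is that buffering cuts the number of context
switches while the real work stays on the CP.

```python
import math


def cp_time(transactions, buffer_size, switch_cost, work_per_transaction):
    """CP time when the IO processor hands over `buffer_size`
    transactions per context switch (all costs in arbitrary units)."""
    switches = math.ceil(transactions / buffer_size)
    return switches * switch_cost + transactions * work_per_transaction


# Hypothetical costs: a context switch is 25x the work of one transaction.
unbuffered = cp_time(1000, 1, switch_cost=50, work_per_transaction=2)
buffered = cp_time(1000, 64, switch_cost=50, work_per_transaction=2)
print(unbuffered, buffered)
```

With these invented costs the CP spends 52000 units without buffering and
2800 with a 64-entry buffer.  Once the switches are mostly gone, what is
left is the per-transaction work itself, which is the part that is not
worth moving off the CP.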
------
Paul Haas
uunet!actnyc!prh

elg@killer.DALLAS.TX.US (Eric Green) (08/09/88)

In message <1988Aug7.013952.7842@utzoo.uucp>, henry@utzoo.uucp (Henry Spencer) says:
>In article <1221@ficc.UUCP> peter@ficc.UUCP (Peter da Silva) writes:
>>So, you *have* to use fixed size blocks. Once you have pulled all the
>>power you can out of your 680x0 or RISC chip, and you need more MIPS,
>>you have to add more processors.  So, you have the graphics (and serial-IO
>>and disk IO and any other IO you care to name) wheel of life.
>>The other alternative is to build your own custom [processor]...
>
>You forgot the third alternative:  add a second (third, etc.) 680x0 or
>RISC or whatever.  That adds to your pool of centrally-managed power,
>rather than balkanizing it as specialized auxiliary processors do.  A
>multiprocessor system isn't quite as good as one fast processor, but with
>competent designers it can come close.

One problem with that is cost. It takes a lot of "glue" to build a
multiprocessor, whereas the central-CPU/smart-peripheral interface
only requires a DMA protocol of some sort, which most computers will
have already. I'm not quite sure how much a 68030 costs. But I just
read a brief product announcement of a new 32-bit TI video chip that
will cost $95 or so in quantity, for most of the logic needed for a
video display, plus blitting etc.... I doubt that you could add
another 68030 for that price, especially considering the "glue" etc.
Of course the dual-68030 system would be more powerful. But when
you're building a low-end workstation, you have to shave dollars and
cents wherever you can find them, while still meeting minimum
performance standards... else, you find your market share quickly
eroding. 

I love raw CPU power as much as the other guy. But, general-purpose
multiprocessing and "cheap" do not fit in the same sentence
(especially when you get finished paying the development costs of a
multiprocessor OS kernel & appropriate modifications to the rest of
the system -- software is not free, tho EEs would like to think so
;-).

--
Eric Lee Green    ..!{ames,decwrl,mit-eddie,osu-cis}!killer!elg
          Snail Mail P.O. Box 92191 Lafayette, LA 70509              
       MISFORTUNE, n. The kind of fortune that never misses.

peter@ficc.UUCP (Peter da Silva) (08/10/88)

In article <1216@nud.UUCP>, df@nud.UUCP (Dale Farnsworth) writes:
> Peter da Silva (peter@ficc.UUCP) writes:
> > Better to use a GP processor as your main CPU, and use a graphics library
> > that's implemented in hardware or software, whichever's cheaper. When you
                                                ^^^^^^^^^^^^^^^^^^^ -- NOTE
> > run out of MIPS, stick a graphics accelerator (like the Amiga Blitter)
> > in and pop in a new shared library.

> Yes, that does work, but what Henry has been saying is that when you
> run out of MIPS, stick in another GP processor as an additional main CPU.

Yes. Yes. Of course. If that's the cheapest way to go, then do it. If it's
not, then go whatever other way gives you more bang for the buck. If you
have your own silicon foundry and are already designing a bunch of custom
chips to get the part count down, then what do you think you'll find?

680[23]0s are *not* cheap chips.
-- 
Peter da Silva, Ferranti International Controls Corporation, sugar!ficc!peter.
"You made a TIME MACHINE out of a VOLKSWAGEN BEETLE?"
"Well, I couldn't afford another deLorean."
"But how do you ever get it up to 88 miles per hour????"

peter@ficc.UUCP (Peter da Silva) (08/10/88)

In article ... henry@utzoo.uucp (Henry Spencer) writes:
> In article <1221@ficc.UUCP> peter@ficc.UUCP (Peter da Silva) writes:
> >Once you have pulled all the
> >power you can out of your 680x0 or RISC chip, and you need more MIPS,
> >you have to add more processors....

> You forgot the third alternative:  add a second (third, etc.) 680x0 or
> RISC or whatever.

This costs money. A lot of money. Last I checked, a complete Amiga cost
less than a single 68030. When you're already making heavy use of a silicon
foundry to cut your parts count...

Besides, I said "you have to add more processors". Not "you have to add special
purpose CPUs". People have been using 68000s and other CPUs as I/O processors
for years.
-- 
Peter da Silva, Ferranti International Controls Corporation, sugar!ficc!peter.
"You made a TIME MACHINE out of a VOLKSWAGEN BEETLE?"
"Well, I couldn't afford another deLorean."
"But how do you ever get it up to 88 miles per hour????"

maverick@cl2devy.SGI.COM (Steve Whitney) (08/10/88)

In article <4438@cbmvax.UUCP>, jesup@cbmvax.UUCP (Randell Jesup) writes:
...
> ...or the chip was seen as a stopgap, and is thus often very
> simple, and doesn't help with much.  An example is the (still not released,
> I believe) Atari blitter.  It was just a rectangle-copy chip, no special ops,
> nothing else...

Atari has released its blitter.  It's included in all of the Mega computers.
It is my understanding (and I may be wrong) that the blitter does line 
drawing and area fills as well as block copies.  In fact, the latest 520
and 1040 ST boards have sockets for this blitter.

			--Steve

henry@utzoo.uucp (Henry Spencer) (08/13/88)

In article <4438@cbmvax.UUCP> jesup@cbmvax.UUCP (Randell Jesup) writes:
>>... stick in another GP processor as an additional main CPU.
>
>	This can work, but is not always optimal.  It depends heavily on
>either separate address spaces or very fast shared memory.

The same is true, of course, of memory-intensive graphics processors.
Not quite to the same extent, since general-purpose CPUs run the memory
a bit harder than specialized processors... but close.
-- 
Intel CPUs are not defective,  |     Henry Spencer at U of Toronto Zoology
they just act that way.        | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

henry@utzoo.uucp (Henry Spencer) (08/13/88)

In article <1241@ficc.UUCP> peter@ficc.UUCP (Peter da Silva) writes:
>> You forgot the third alternative:  add a second (third, etc.) 680x0 or
>> RISC or whatever.
>
>This costs money. A lot of money. Last I checked, a complete Amiga cost
>less than a single 68030...

So add another 68000 instead of a 68030.  The memory bandwidth is there
anyway, since one 68000 only uses about half...

Also, you get what you pay for.
-- 
Intel CPUs are not defective,  |     Henry Spencer at U of Toronto Zoology
they just act that way.        | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

peter@ficc.UUCP (Peter da Silva) (08/16/88)

In article ... henry@utzoo.uucp (Henry Spencer) writes:
> In article <1241@ficc.UUCP> peter@ficc.UUCP (Peter da Silva) writes:
> >This costs money. A lot of money. Last I checked, a complete Amiga cost
> >less than a single 68030...

> So add another 68000 instead of a 68030.  The memory bandwidth is there
> anyway, since one 68000 only uses about half...

You left out the other half of my statement, which is that for the application
the blitter is far faster than another 68000. And of course 2 more 68000s
would burn more CPU.

> Also, you get what you pay for.

And you pay for what you can afford. This is a personal computer. If it
costs as much as a small car I can't afford it... I can just barely afford
a small car right now (Mazda 323, a nice little machine, for all it's got
less than 64K of RAM).
-- 
Peter da Silva, Ferranti International Controls Corporation, sugar!ficc!peter.
"You made a TIME MACHINE out of a VOLKSWAGEN BEETLE?"
"Well, I couldn't afford another deLorean."
"But how do you ever get it up to 88 miles per hour????"