[comp.sys.amiga.hardware] A3000 CPU Wars!!

aaf4808@venus.tamu.edu (FORD, ANDREW ALLEN) (06/05/91)

This is my first post, so bear with me....
My bottom line question regarding the recent A3000 16/50
vs A3000 25/50 (or 100) is this:

Suppose you have three systems sitting in front of you.
One is an A3000 with 040 that disables motherboard CPU.
One is an A3000 16 mhz with 040 that uses motherboard CPU.
One is an A3000 25 mhz with 040 that uses motherboard CPU.

All three run the same benchmark program.

Which one is slowest and which one is fastest?

Andy.

n368bq@tamuts.tamu.edu (Raoul Rodriguez) (06/05/91)

All right out of all three A3000's... you have

1) A3000 with 040 that disables motherboard CPU
2) A3000 w/040 AND 16Mhz Motherboard 
3) A3000 w/040 and 25 Mhz on motherboard

and you run a benchmark to see which is fastest, for the sake of the argument
we will assume that the 040 in 1) is running at 25 Mhz, and that the
benchmark you are going to run is optimized for multiple CPU's...
the winner is number 3.

But, if the benchmark is NOT set up to use multiple processors, it
should be a tie between 1 and 3, (maybe not depending on the hardware
'hacks' in the 040 board that you use, because the hardware might use the
motherboard CPU on its own accord...), but, in either case number 2
will lose because if you leave the motherboard CPU 'on' and it is
running at 16 Mhz it will 'pull' the 040 down to run at 16 Mhz...

I think... :)

Raoul "My 500 Has a Detachable Keyboard" Rodriguez
n368bq@tamuts.tamu.edu
Standard Disclaimers Apply (Within)

taab5@isuvax.iastate.edu (Marc Barrett) (06/05/91)

In article <16888@helios.TAMU.EDU>, aaf4808@venus.tamu.edu (FORD, ANDREW ALLEN) writes:
>This is my first post, so bear with me....
>My bottom line question regarding the recent A3000 16/50
>vs A3000 25/50 (or 100) is this:
>
>Suppose you have three systems sitting in front of you.
>One is an A3000 with 040 that disables motherboard CPU.
>One is an A3000 16 mhz with 040 that uses motherboard CPU.
>One is an A3000 25 mhz with 040 that uses motherboard CPU.
>
>All three run the same benchmark program.
>
>Which one is slowest and which one is fastest?

   If I am interpreting your message right, you are trying to compare
three systems: one with just an '040 running, and two with both an '030
and an '040 running.  If this is what you are asking, I have an easy
answer: they will all produce exactly identical results.

   The reason is that neither AmigaDOS nor UNIX (the two operating
systems you can run on an A3000) support multiprocessing.  For this
reason, you can only have one processor running the benchmark, and the
'040 is the one that runs the benchmark.  In all three systems, the
'040 in the CPU slot will be running at 25Mhz, with the '030s 
sitting idly by waiting for a multiprocessor operating system to
be written.

      I hope this answers your question.

>
>Andy.
  -------------------------------------------------------------
 / Marc Barrett  -MB- | BITNET:   XGR39@ISUVAX.BITNET        /   
/  ISU COM S Student  | Internet: XGR39@CCVAX.IASTATE.EDU   /      
------------------------------------------------------------    
\  ISU : The Home of the Goon                             /
 \       Who wants to Blow Up the Moon                   /
  -------------------------------------------------------

rjc@geech.gnu.ai.mit.edu (Ray Cromwell) (06/05/91)

In article <1991Jun5.060518.9683@news.iastate.edu> taab5@isuvax.iastate.edu writes:
>   The reason is that neither AmigaDOS nor UNIX (the two operating
>systems you can run on an A3000) support multiprocessing.  For this

  Not quite. AT&T(or was it Sequent?) is working on a version of 
SysV which will support multiple processors. There are also some
microkernels floating around that support multiprocessing. Multiprocessing
isn't prohibited by AmigaDOS. For instance, it would be easy to add in
a 34010, 68030, or i860 and have certain graphic or i/o operations offloaded
to it. (Easy in the sense that once you get the board working
with its own RAM and graphics kernel, you could patch gfxlib to
call the functions on the 34010. You could also add in some
super-duper floating point co-processor chip and write a new
mathtrans/doubbas.library to handle it.) 
  Multiprocessing in the area of 'several processors sharing the same
bus, memory, and code space, and executing different program threads to
accomplish the final goal (output the solution)' is a different problem.
It has less to do with the OS and more to do with _how_ you code it.
It's an ongoing problem in CS. Some operations are sped up by using
many processors (vector and matrix operations) other operations get
no bonus for multiple processors. The language extension Linda was
created to help the problem. I don't know how far it has gone, but
I don't think it has reached a point yet where you can keep adding
processors into a system and get a linear speed up in the execution.
Sometimes the algorithm may have to be rewritten to work in parallel.
Most Connection Machines I've seen have Sun/HP front ends that offload
stuff like Mathematica onto the CM? 
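
The point above about not getting a linear speed up can be put into numbers
with Amdahl's law, which is a standard textbook bound and not something
taken from this thread. The sketch below is only an illustration: the
function name and the 90% parallel fraction are assumptions chosen for the
example, not measurements of any real machine.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part never shrinks; only the parallel part is divided up.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Assumed workload: 90% of the run time parallelizes, 10% does not.
    for n in (1, 2, 4, 16, 1024):
        print(f"{n:5d} processors -> speedup {amdahl_speedup(0.9, n):.2f}")
```

Under that assumed 90/10 split the speedup stays below 10 no matter how many
processors are added, which is why some algorithms have to be rewritten
before more processors help.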

>      I hope this answers your question.

  I think the original poster was slightly confused and thought that
adding processors together is like adding numbers. It isn't that
easy. (reminds me of relativity and how people thought velocity
vectors were added like any other quantity)

  I'm waiting for the day when they can build a 1 teraflop computer and
actually have benchmarks run at that speed (without recoding the
benchmark to run in parallel)

  Hmm, anyone wanna tell me how transputers work? (multiprocessing)?


--
/ INET:rjc@gnu.ai.mit.edu     *   // The opinions expressed here do not      \
| INET:r_cromwe@upr2.clu.net  | \X/  in any way reflect the views of my self.|
\ UUCP:uunet!tnc!m0023        *                                              /

taab5@isuvax.iastate.edu (Marc Barrett) (06/05/91)

In article <1991Jun5.072620.18879@mintaka.lcs.mit.edu>, rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:
>In article <1991Jun5.060518.9683@news.iastate.edu> taab5@isuvax.iastate.edu writes:
>>   The reason is that neither AmigaDOS nor UNIX (the two operating
>>systems you can run on an A3000) support multiprocessing.  For this
>
>  Not quite. AT&T(or was it Sequent?) is working on a version of 
>SysV which will support multiple processors. There are also some
>microkernels floating around that support multiprocessing. Multiprocessing
>isn't prohibited by AmigaDOS.  [rest deleted]

   I should have said that multiprocessing is not supported by Amiga UNIX
yet.  Yes, there are versions of System V floating around that support 
symmetric multiprocessing, modified by Compaq, Solbourne, and other companies
that have multiple-processor UNIX systems.  OSF/1 is also supposed to
support symmetric multiprocessing as a standard feature.  However, none of
these multiprocessing operating systems are available for the Amiga.
AT&T is supposed to be working with Solbourne on a version of Sys V R4
that supports symmetric multiprocessing, and I am sure the Amiga will
have it as soon as it is available.

  -------------------------------------------------------------
 / Marc Barrett  -MB- | BITNET:   XGR39@ISUVAX.BITNET        /   
/  ISU COM S Student  | Internet: XGR39@CCVAX.IASTATE.EDU   /      
------------------------------------------------------------    
\  ISU : The Home of the Goon                             /
 \       Who wants to Blow Up the Moon                   /
  -------------------------------------------------------

mks@cbmvax.commodore.com (Michael Sinz) (06/05/91)

In article <1991Jun5.060518.9683@news.iastate.edu> taab5@isuvax.iastate.edu writes:
>In article <16888@helios.TAMU.EDU>, aaf4808@venus.tamu.edu (FORD, ANDREW ALLEN) writes:
>>This is my first post, so bear with me....
>>My bottom line question regarding the recent A3000 16/50
>>vs A3000 25/50 (or 100) is this:
>>
>>Suppose you have three systems sitting in front of you.
>>One is an A3000 with 040 that disables motherboard CPU.
>>One is an A3000 16 mhz with 040 that uses motherboard CPU.
>>One is an A3000 25 mhz with 040 that uses motherboard CPU.
>>
>>All three run the same benchmark program.
>>
>>Which one is slowest and which one is fastest?
>
>   If I am interpreting your message right, you are trying to compare
>three systems: one with just an '040 running, and two with both an '030
>and an '040 running.  If this is what you are asking, I have an easy
>answer: they will all produce exactly identical results.

Well, not quite... If the software runs on only one CPU, the first
situation will run faster as there would be no competition for
RAM/motherboard/etc resources with the 68030 and thus the 68040 will
have the fastest access to those resources.

/----------------------------------------------------------------------\
|      /// Michael Sinz  -  Amiga Software Engineer                    |
|     ///                   Operating System Development Group         |
|    ///   BIX:  msinz      UUNET:  rutgers!cbmvax!mks                 |
|\\\///    Programming is like sex:                                    |
| \XX/     One mistake and you have to support it for life.            |
\----------------------------------------------------------------------/

daveh@cbmvax.commodore.com (Dave Haynie) (06/06/91)

In article <16888@helios.TAMU.EDU> aaf4808@venus.tamu.edu writes:

>Suppose you have three systems sitting in front of you.
>One is an A3000 with 040 that disables motherboard CPU.
>One is an A3000 16 mhz with 040 that uses motherboard CPU.
>One is an A3000 25 mhz with 040 that uses motherboard CPU.

>All three run the same benchmark program.

>Which one is slowest and which one is fastest?

That depends considerably on the system software, and what you're doing with 
the second processor.  Also depends a lot on the benchmark.  If the second
processor is sleeping, 1 and 3 go fast, 2 a bit slower.  Most likely, no 
matter what you get the 68030 doing, it's not going to offset the faster
68040, so my **GUESS** would be that 1 is faster than 3.  Also, even with a
true SMP operating system, you aren't likely to split a process between
processors.  The existence of the second processor, though, implies that the
task load for the processor running your benchmark is less than for that of
the single processing system.  Since you have one 68030 and one 68040, you
aren't truly symmetric here anyway.  So even if SMP were available, it would
not likely apply here (neither UNIX nor AmigaOS support full symmetric 
multiprocessing now, though AT&T has announced plans for a future SVr4MP,
which will).  So the most likely use of the 68030 is as an assistant to the
OS in a more controlled fashion, which either AmigaOS or UNIX will support
just dandily.  For instance, if you ran the filesystem and device driver on the
68030, then the 68040 would be free of any disk management activities.  This
might not affect a CPU intensive benchmark, but it would affect a disk intensive
one.  You might have the 68030 manage X under UNIX, freeing the '040 of that
arduous task.  In any case, example 3 will be as fast or faster, depending on
the setup, as any of the others.  

-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
	"This is my mistake.  Let me make it good." -R.E.M.

lord_zar@ucrmath.ucr.edu (wayne wallace) (06/06/91)

taab5@isuvax.iastate.edu (Marc Barrett) writes:
>AT&T is supposed to be working with Solbourne on a version of Sys V R4
>that supports symmetric multiprocessing, and I am sure the Amiga will
>have it as soon as it is available.

Wow!!!! Marc said something POSITIVE about the Amiga!
Way to go, Marc!!!!!!

Wayne

aaf4808@venus.tamu.edu (FORD, ANDREW ALLEN) (06/06/91)

Yes, you understood me correctly. Thanks for your reply, Marc.

Now for my next question: Why was everyone getting so worked up about this
040 / 030 multiprocessor thing if no OS will soon take advantage of it?

Andy. [AAF4808@VENUS.tamu.edu]

sschaem@starnet.uucp (Stephan Schaem) (06/06/91)

 On that matter: Will there be a 'library' so the 68030 can replace the
 blitter?
 How hard will it be to use the 68030 with the 68040 running?

							Stephan.

m0154@tnc.UUCP (GUY GARNETT) (06/06/91)

In article <16892@helios.TAMU.EDU> n368bq@tamuts.tamu.edu (Raoul Rodriguez) writes:
>1) A3000 with 040 that disables motherboard CPU
>2) A3000 w/040 AND 16Mhz Motherboard 
>3) A3000 w/040 and 25 Mhz on motherboard
>[stuff deleted]
>should be a tie between 1 and 3, (maybe not depending on the hardware
>'hacks' in the 040 board that you use, because the hardware might use the
>motherboard CPU on its own accord...), but, in either case number 2
>will lose because if you leave the motherboard CPU 'on' and it is
>running at 16 Mhz it will 'pull' the 040 down to run at 16 Mhz...
>
>I think... :)
>
>Raoul "My 500 Has a Detachable Keyboard" Rodriguez
>n368bq@tamuts.tamu.edu
>Standard Disclaimers Apply (Within)


Well, no.  First off, it is unlikely that the Amiga OS will support
multiple CPUs in the near future (the next three years, or until v3 is
released, whichever comes first).  So the multiple CPU question is a
moot point (unless you are running a multi-CPU OS that you wrote
yourself :).

Assuming that the 040 boards are all the same, your benchmark results
should be the same no matter whether it is installed in a 16Mhz or a
25Mhz A3000.  This is because the ZorroIII bus, Coprocessor slot, and
motherboard RAM subsystems are identical on both machines.  The clock
speed of the 68030 on the motherboard need not "drag down" the 040
board in the coprocessor slot, if I understood correctly what Dave H.
was saying about it (correct me if I'm wrong, Dave :).

Wildstar
"Once again, I cut a worthless object."

m0154@tnc.UUCP (GUY GARNETT) (06/06/91)

[re: Multiprocessing]

There are two factors which cause a multiple processor computing
system to show diminishing returns (instead of adding up processor
power linearly as you add processors).  One is the communications
overhead: in order for more than one processor to work on a problem,
both processors involved must spend some time communicating with each
other (passing partial results, status information, or whatever)
rather than directly working on the problem.  The other, which occurs
when the processors share a bus, memory, or communications subsystem,
is contention: when one processor is on the bus (or using the memory,
or sending data) the other processors cannot use that resource.  If
the resource involved is memory the problem is particularly acute; the
other processors can't get new instructions or data until the bus or
memory is free again.
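
The communications overhead described above can be illustrated with a toy
model. Everything in this sketch is an assumption made for illustration: the
function names are hypothetical, the work is assumed to split evenly, and
each extra processor is assumed to add one fixed unit of communication time.
Real systems are messier.

```python
def run_time(work: float, processors: int, comm_cost: float) -> float:
    """Toy model: work splits evenly, but every additional processor adds a
    fixed communication cost (partial results, status information, etc.)."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return work / processors + comm_cost * (processors - 1)


def best_processor_count(work: float, comm_cost: float, max_processors: int) -> int:
    """Return the processor count with the shortest modeled run time."""
    return min(range(1, max_processors + 1),
               key=lambda n: run_time(work, n, comm_cost))


if __name__ == "__main__":
    # Assumed numbers: 100 units of work, 1 unit of overhead per extra CPU.
    for n in (1, 2, 5, 10, 20, 40):
        print(f"{n:3d} processors -> {run_time(100.0, n, 1.0):6.2f} time units")
    print("best count:", best_processor_count(100.0, 1.0, 64))
```

With these assumed numbers the modeled run time is shortest at 10 processors
and gets worse beyond that, which is the diminishing (and eventually
negative) return the paragraph describes.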

The first bottleneck can be addressed by arranging the processors in a
"topology" (connecting them in a certain order or array) which is
suited to the problem being computed.  A snag is that there is no one
topology which is optimal for all problems.

The second bottleneck can be addressed by larger and smarter caches on
each processor, or by giving each processor its own, independent
memory and bus.

The Transputer is an interesting attempt at proposing a solution to
these problems.  It is a small RISC CPU, which can control its own,
local memory bus.  Each one also has 4 high-speed serial links for
communicating with other transputers.  The serial links are DMA
driven, and require very minimal processor overhead.  Since the links
are a single serial line, it is easy to use a plugboard or a
programmable switching array to change the topology of the transputer
array to suit the problem at hand.  Transputers also come with a
special programming language, Occam, which is supposed to make it easy
to program an application for parallel processing.  Transputer C and
FORTRAN compilers emit Occam code.

For something closer to an Amiga owner's heart, the 68040 is, at least
in theory, quite capable of operating in a multiprocessor environment,
especially with other 68040s.  The large cache means that several
processors could share memory, with a relatively low amount of
contention (only when the cache missed would the processor need the
memory subsystem: hopefully less than 20% of the time, if Motorola is
to be believed).  In theory, a version of the Amiga OS could be
constructed to work with an arbitrary number of processors, but such
an operating system would only work on 68030 and 68040 systems.  Hmmm,
v3 or v4 anyone?
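
As a rough feel for what the 20% figure above would mean on a shared bus,
here is a crude saturation sketch. It is an assumption-laden model, not a
description of real 68040 behaviour: it treats the 20% as the fraction of
time each processor needs the bus, ignores queuing delays entirely, and the
function name is hypothetical.

```python
def effective_processors(processors: int, bus_fraction: float) -> float:
    """Crude saturation model: each processor needs the shared bus for
    bus_fraction of its time, and the bus can be busy at most 100% of
    the time in total."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    if not 0.0 < bus_fraction <= 1.0:
        raise ValueError("bus_fraction must be in (0, 1]")
    # Below saturation every processor runs freely; above it the bus caps
    # the useful work at 1 / bus_fraction processors' worth.
    return min(float(processors), 1.0 / bus_fraction)


if __name__ == "__main__":
    # Assumed: each CPU needs the bus 20% of the time (the figure quoted above).
    for n in (1, 2, 4, 5, 8, 16):
        print(f"{n:2d} CPUs -> about {effective_processors(n, 0.2):.1f} CPUs of work")
```

In this simplified picture the shared memory stops being a free lunch at
around five processors; a real bus would start showing contention well
before that point.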

For another interesting thought:  While it is not possible to run
the Amiga OS as a task under UNIX (Amiga OS is a real-time system
while UNIX is not), it might be possible to run UNIX under the Amiga
OS.  The UNIX would have to have kernel modifications to operate
correctly, but it could allocate RAM from the Amiga OS, and then use
the MMU to map it for UNIX operation.  The UNIX file system could be
directed into an ASSIGNed logical device (UNIXFS:) and the swap space
to another (UNIXSW:).  How about it?

Wildstar
"Any lesser duck would have given up by now!" -- D.Duck

daveh@cbmvax.commodore.com (Dave Haynie) (06/07/91)

In article <1991Jun5.072620.18879@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:

>  Hmm, anyone wanna tell me how transputers work? (multiprocessing)?

Transputers define a loosely coupled multiprocessing system based on message
passing.  Essentially, each transputer has four hard message ports, which can
be used to send messages reasonably fast to four neighboring transputers.  This
is a decent architecture for building hypercube style loose multiprocessing
systems.  There's really nothing that special about using a transputer rather
than some other system.  In fact, they're pretty weird.  The T800 series had
decent floating point, but weak integer instructions.  You do get some really
outrageous native MIPS figures for Transputer code, but each instruction does
much less work than a typical CISC or RISC instruction.  So a 68030 with a 
decent link chip as a peripheral would work better than a T800 at this.  
Which wouldn't have been a big issue if the Transputers had gone down in price,
since an '030 or any similar chip would need a link peripheral to do the same
job.  However, INMOS kept the pricing such that Transputers couldn't compete,
even with the built-in links.  As always, some people chose the Transputer
(easier solution) over a faster/cheaper but more work intensive solution based
on standard parts.  Others didn't, in fact, many of the hypercube architecture
machines are based around Intel processors.
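
To see why four links per processor fit a hypercube, note that in a
hypercube each node is wired to the nodes whose address differs from its own
in exactly one bit. The sketch below shows that general property only; the
function name and node numbering are assumptions for illustration and say
nothing about how any particular transputer board was actually wired.

```python
from typing import List


def hypercube_neighbors(node: int, dimensions: int) -> List[int]:
    """Nodes directly linked to `node` in a hypercube: flip one address bit each."""
    if dimensions < 1:
        raise ValueError("dimensions must be at least 1")
    if not 0 <= node < (1 << dimensions):
        raise ValueError("node id out of range for this hypercube")
    return [node ^ (1 << bit) for bit in range(dimensions)]


if __name__ == "__main__":
    # Four links per processor allow a 4-dimensional cube of 16 nodes.
    for node in (0, 5, 15):
        print(f"node {node:2d} links to {hypercube_neighbors(node, 4)}")
```

With four links per node, a 4-dimensional cube of 16 processors is the
largest full hypercube that can be built without extra switching hardware.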

On the other hand, INMOS was purchased by Thomson, and with some new money was
able to define a next generation Transputer.  It's not out yet, but the new
T9000 sounds much more intriguing.  It's considerably faster, so it could very
well stand a chance of competing with other modern CPUs on a 1:1 basis.  The
links are faster now, and they have a neat cross-point switch that works with
them.  The hardware supports virtual messages now.  Basically, a header in the
message structure indicates where it's going in your transputer network 
somehow.  When the message gets sent out the hardware port, it can get to its
ultimate destination through this message router, which will wait until a link
to the destination is free, create a temporary routing path from one to the
other, then dissolve it when the message has passed.  

Anyway, this new one looks to be pretty cool.  Yet, it doesn't solve the main
problem with loosely coupled systems, which is, how to schedule work such that
it gets done faster on multiple processors than it does on a single one.  With
this model, you're basically dependent on splitting things at the task level.
This generally means that you have to write your code much differently, 
adapting the problem to the solution.  Some problems adapt, others don't. 

-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
	"This is my mistake.  Let me make it good." -R.E.M.

daveh@cbmvax.commodore.com (Dave Haynie) (06/07/91)

In article <16953@helios.TAMU.EDU> aaf4808@venus.tamu.edu writes:

>Now for my next question: Why was everyone getting so worked up about this
>040 / 030 multiprocessor thing if no OS will soon take advantage of it?

I don't know if everyone was getting all worked up about it or not.  Someone
asked (actually, whole tribes of people asked) what the differences were 
between accelerated 16MHz and 25MHz systems.  The multiprocessing feature is
the only difference.  

-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
	"This is my mistake.  Let me make it good." -R.E.M.

pwappy@well.sf.ca.us (Jeff Walkup) (06/08/91)

RE: the discussion of running an '040 board on a 25MHz vs. a 16MHz A3000.
 
I thought the whole idea of having the Fast Slot in the A3000 was that
you could add co-processor boards like an '040 board, and they would use
the motherboard's RAM and bus.  Thereby making it cheap relative to an
A2000 co-pro. board that would need its own RAM, bus, and SCSI host.
 
SO,  the realistic speed limit for an '040 board in an A3000 sounds like
25MHz, since going any faster means you wouldn't be able to use the
motherboard's resources.  And likewise, if you have a 16MHz A3000, then
16Mhz would be the limit.
 
Not that you *couldn't* go faster, just that the card would be more
expensive as it would need its own RAM, etc...
 
Am I right?
 

daveh@cbmvax.commodore.com (Dave Haynie) (06/11/91)

In article <25302@well.sf.ca.us> pwappy@well.sf.ca.us (Jeff Walkup) writes:

>I thought the whole idea of having the Fast Slot in the A3000 was that
>you could add co-processor boards like an '040 board, and they would use
>the motherboard's RAM and bus.  Thereby making it cheap relative to an
>A2000 co-pro. board that would need its own RAM, bus, and SCSI host.

That's correct.  To get any sort of noticeable performance increase with a 
fast 32 bit coprocessor board, an A2000 coprocessor device needs its own
memory.  It can, of course, still use the motherboard memory, but only at
that memory's speed, and only 16 bits wide.  Since most 32 bit processors are
optimized for 32 bit wide data transfers, they can actually go slower with 
only 16 bit memory than the 16 bit 68000.
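
The width part of that argument is simple arithmetic: a 32 bit operand needs
two trips across a 16 bit bus. The sketch below only counts bus cycles; the
function name is hypothetical, and it deliberately ignores clock rates, wait
states, and how many clocks each processor spends per bus cycle.

```python
def bus_cycles(transfer_bits: int, bus_width_bits: int) -> int:
    """Bus cycles needed to move one operand across a bus of the given width."""
    if transfer_bits < 1 or bus_width_bits < 1:
        raise ValueError("sizes must be positive")
    return -(-transfer_bits // bus_width_bits)  # ceiling division


if __name__ == "__main__":
    print("32 bit fetch on 32 bit memory:", bus_cycles(32, 32), "cycle(s)")
    print("32 bit fetch on 16 bit memory:", bus_cycles(32, 16), "cycle(s)")
    print("16 bit fetch on 16 bit memory:", bus_cycles(16, 16), "cycle(s)")
```

A processor that fetches everything as 32 bit longwords therefore pays
double on 16 bit memory for every access, whether or not it needed all 32
bits.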

>SO,  the realistic speed limit for an '040 board in an A3000 sounds like
>25MHz, since going any faster means you wouldn't be able to use the
>motherboard's resources.  And likewise, if you have a 16MHz A3000, then
>16Mhz would be the limit.

>Not that you *couldn't* go faster, just that the card would be more
>expensive as it would need its own RAM, etc...

>Am I right?

No, you're confused.  But that's OK...

First of all, as mentioned about a quadrillion times in this very group, a 
coprocessor board can, if designed properly, up the motherboard clock rate of
a 16MHz A3000 to 25MHz.  This works because it was designed to work that way.
That's only speaking to how fast the motherboard clocks go.  The clock on the
coprocessor board's processor can theoretically be any rate the designer
chooses.  However, if the clock is unrelated to the motherboard clock (which,
as mentioned, can be either 16MHz or 25MHz, pick one), the designer will have
to solve the synchronization problems such a setup will inherently create.
The motherboard clock fixes the speed of the motherboard memory system, no 
matter what speed your coprocessor board's processor runs at.  A faster 
coprocessor can talk just fine to the A3000's motherboard memory (in fact,
it's required to).  And if the processor is faster, it's very likely to go
faster than a 25MHz system, regardless of the fact that the memory system
isn't going faster.  To use such a faster CPU to its fullest ability, though,
the designer can stick a full speed cache or some faster 32 bit wide memory
on the coprocessor board.



-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
	"This is my mistake.  Let me make it good." -R.E.M.

jdickson@jato.jpl.nasa.gov (Jeff Dickson) (06/13/91)

In article <22213@cbmvax.commodore.com> daveh@cbmvax.commodore.com (Dave Haynie) writes:
>In article <1991Jun5.072620.18879@mintaka.lcs.mit.edu> rjc@geech.gnu.ai.mit.edu (Ray Cromwell) writes:
>
>>  Hmm, anyone wanna tell me how transputers work? (multiprocessing)?
>
>Transputers define a loosely coupled multiprocessing system based on message
>passing.  Essentially, each transputer has four hard message ports, which can
>be used to send messages reasonably fast to four neighboring transputers. 

	If memory serves (BYTE a few years ago), it was something like 80
mega bits per second. They also did some benchmark, where a 1MHZ T800 was
as fast as a 4MHZ Z80. It was an attractive processor, because for instance
it performed multitasking in hardware. 

> The T800 series had decent floating point, but weak integer instructions.

	I didn't know that.

>On the other hand, INMOS was purchased by Thomson, and with some new money was
>able to define a next generation Transputer.  It's not out yet, but the new
>T9000 sounds much more intriguing.  It's considerably faster, so it could very
>well stand a chance of competing with other modern CPUs on a 1:1 basis.  The
>links are faster now, and they have a neat cross-point switch that works with
>them.  The hardware supports virtual messages now.  Basically, a header in the
>message structure indicates where it's going in your transputer network 
>somehow.  When the message gets sent out the hardware port, it can get to its
>ultimate destination through this message router, which will wait until a link
>to the destination is free, create a temporary routing path from one to the
>other, then dissolve it when the message has passed.  

	Oooh, nifty! Sure'd be nice if this one could make it down to some
platform so I could experiment with it. 
>
>Anyway, this new one looks to be pretty cool.  Yet, it doesn't solve the main
>problem with loosely coupled systems, which is, how to schedule work such that
>it gets done faster on multiple processors than it does on a single one.  With
>this model, you're basically dependent on splitting things at the task level.
>This generally means that you have to write your code much differently, 
>adapting the problem to the solution.  Some problems adapt, others don't. 

	But perhaps it could better serve the system as an intelligent
coprocessor. I haven't even studied the T800 instruction set, so I don't know
its true possibilities. I just suspect that with its high speed message port
for instance, it could coordinate activities with other T800 coprocessors.
I'm excited, just can't get my hands on one!

	
>
>-- 
>Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
>   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
>	"This is my mistake.  Let me make it good." -R.E.M.

-jeff

pwappy@well.sf.ca.us (Jeff Walkup) (06/14/91)

In article (22301@cbmvax.commodore.com), daveh@cbmvax.commodore.com
(Dave Haynie) writes:
 
>First of all, as mentioned about a quadrillion times in this very
>group, a coprocessor board can, if designed properly, up the
>motherboard clock rate of a 16MHz A3000 to 25MHz.  This works because
>it was designed to work that way.
 
Does this involve installing a new clock crystal on the motherboard?
Or does the motherboard use the co-pro.'s clock in this case?
 
>However, if the clock is unrelated to the motherboard clock ... the
>designer will have to solve the synchronization problems such a setup
>will inherently create.
[Stuff deleted]
>To use such a faster CPU to its fullest ability, though,the designer
>can stick a full speed cache or some faster 32 bit wide memory on the
>coprocessor board.
 
So it still sounds like an economically reasonable limit would be an
'040 running at 25MHz (for under $1000), since going any faster would
mean handling the sync problems and adding extra RAM on the board.
Although I *was* confused about the 16MHz situation.
 
*However*, I can see someone making an '040 board at say 50MHz, and
having a small (512K) static RAM cache, that wouldn't be too expensive.
Although the lack of many MB of super-fast 32-bit RAM might hamper the
speed a bit, I can see it really racing through raytracing and other
floating-point-intensive operations.
Of course, this is all speculation at this point, since the 68040
doesn't seem to be shipping in quantity yet, and the only version
Motorola has done so far is the 25MHz one.
Waddaya think?

daveh@cbmvax.commodore.com (Dave Haynie) (06/14/91)

In article <25432@well.sf.ca.us> pwappy@well.sf.ca.us (Jeff Walkup) writes:
>In article (22301@cbmvax.commodore.com), daveh@cbmvax.commodore.com
>(Dave Haynie) writes:

>>First of all, as mentioned about a quadrillion times in this very
>>group, a coprocessor board can, if designed properly, up the
>>motherboard clock rate of a 16MHz A3000 to 25MHz.  This works because
>>it was designed to work that way.

>Does this involve installing a new clock crystal on the motherboard?
>Or does the motherboard use the co-pro.'s clock in this case?

The coprocessor board supplies the clock.  You adjust a few strip-post 
jumpers on the motherboard and it'll accept the two system clocks from the 
coprocessor slot rather than the motherboard.

>>To use such a faster CPU to its fullest ability, though,the designer
>>can stick a full speed cache or some faster 32 bit wide memory on the
>>coprocessor board.

>So it still sounds like an economically reasonable limit would be an
>'040 running at 25MHz (for under $1000), since going any faster would
>mean handling the sync problems and adding extra RAM on the board.

Unlikely.  Certainly the "cheap" '040 boards might be without extra memory,
and certainly without external cache.  But I'm certain some companies will
build them as fast as Motorola can get their 68040s going.  Also, it's not
overly expensive to build a 68040 memory control circuit without any memory;
just a couple of PALs and some additional board space (maybe $15-$20 extra,
if you don't get too clever).  The synchronization problems aren't all that
much extra work; in fact, due to the speeds we're talking about here, it's
just about as hard to get a purely synchronous board working as an asynchronous
one.  You have to be real careful about clock skews if you're trying to make
a 68040 board fully synchronous to the A3000, even if it sources the clocks.

>*However*, I can see someone making an '040 board at say 50MHz, and
>having a small (512K) static RAM cache, that wouldn't be too expensive.

That's what I've been thinking.  You CAN do it that way, and it has some 
advantages.  First of all, the cache will tend to speed up your CPU's operation
everywhere.  You can cache A3000 motherboard RAM (which many people will 
already have lots of before they add '040s) and any Zorro III RAM that comes
along (memory on the Zorro III bus isn't quite as fast as on the motherboard,
but it has advantages: there's room there for lots of memory, and you can be
sure that a Zorro III card will work in all future high end slotted Amigas).

In any case, it's up to the designer.  We made it flexible on purpose.  It
should make A3000 coprocessor boards much more interesting.  The A2000 boards
got kind of boring, they all wound up looking more or less like the A2630
(asynchronous design, 25MHz or more, DRAM, a hard disk controller on some).

>Although the lack of many MB of super-fast 32-bit RAM might hamper the
>speed a bit, I can see it really racing through raytracing and other
>floating-point-intensive operations.

With 128K-512K of cache, in addition to the '040's internal cache, you should
be getting an excellent combined primary/secondary hit rate.  Obviously, you
aren't going to go as fast as with 0 wait state memory, but keep in mind you
don't get close to 0 wait states at 25MHz, and by the time you're at 50MHz,
you're lucky if the secondary cache doesn't have wait states of its own (the
68040, however, should make this easier than the '030 did).
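
The combined primary/secondary hit rate argument can be sketched with the
usual average-access-time formula. All of the numbers below (hit rates and
cycle counts) are made-up assumptions for illustration, not measured 68040
or A3000 figures, and the function name is hypothetical.

```python
def average_access_cycles(l1_hit_rate: float, l2_hit_rate: float,
                          l1_cycles: float, l2_cycles: float,
                          memory_cycles: float) -> float:
    """Average cycles per access for an on-chip cache backed by a secondary
    cache and then main memory. l2_hit_rate applies only to accesses that
    already missed the on-chip cache."""
    for rate in (l1_hit_rate, l2_hit_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("hit rates must be between 0 and 1")
    l1_miss = 1.0 - l1_hit_rate
    l2_miss = 1.0 - l2_hit_rate
    return l1_cycles + l1_miss * (l2_cycles + l2_miss * memory_cycles)


if __name__ == "__main__":
    # Assumed: 90% on-chip hits, 1 cycle on-chip, 10 cycles to motherboard RAM.
    no_l2 = average_access_cycles(0.9, 0.0, 1.0, 0.0, 10.0)
    # Assumed: a secondary cache catching 80% of on-chip misses at 3 cycles.
    with_l2 = average_access_cycles(0.9, 0.8, 1.0, 3.0, 10.0)
    print(f"without secondary cache: {no_l2:.2f} cycles per access")
    print(f"with secondary cache:    {with_l2:.2f} cycles per access")
```

The exact figures matter less than the shape: the slower main memory is
relative to the CPU clock, the more a secondary cache on the coprocessor
board pays off.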



-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
	"This is my mistake.  Let me make it good." -R.E.M.

doconnor@srg.UUCP (Dennis O'Connor x4982 room 6-230N) (06/17/91)

Every once in a while technology whips right by my expectations,
upsetting my onboard fuzzy logic, like what "lots of memory" is.
For example:

daveh@cbmvax.commodore.com (Dave Haynie) writes:
] memory on the Zorro III bus isn't quite as fast as on the motherboard,
] but it has advantages: there's room there for lots of memory

See, my A3000 motherboard already has room for 18 megabytes of memory,
and I thought that WAS a lot. Sigh. Time to re-calibrate my English.
:-)

Of course, I'm not running UNIX on my A3000 yet. Maybe that's why I
thought 18meg was a lot. ;-)
--
--
Dennis O'Connor,      		uunet!srg!titania!doconnor
non-representative.