[comp.sys.amiga.tech] 80860 as a math processor

richard@gryphon.COM (Richard Sexton) (04/15/89)

In article <865@savax.UUCP> thompson@savax.UUCP (thompson mark) writes:
>In article <1619@dretor.dciem.dnd.ca> king@dretor.dciem.dnd.ca (Stephen King) writes:
>
>Not foolish at all. Actually, the 120 MOPS is achieved on the 40MHz part
>by simultaneously executing a floating point add, floating point multiply,
>and an integer ALU operation in a single clock cycle. This doesn't even
>include whatever operation the graphics block is performing. Consequently,
>the 50 MHz part can attain 150 MOPS.

Yes, the 80860 looks like the only decent piece of silicon to come
out of intel since the 8051.

However, I believe it would be impossible to sustain that FLOP rate
as a math coprocessor in an Amiga.

Consider the 68881/2 or the Weitek chip: when a floating point
instruction is hit, BINGO, the hardware executes it.

But since the 80860 is not a 680x0 coprocessor, some hardware/software
interface will need to be worked out. The exchange below is a simplified
version; it could take a dozen instructions each for the 680x0 and 80860
to completely pass the operands and result back and forth.

	68000: Hmm, I need to take the square root of 6.6

					80860: dum de dum dd dum

	68000: hey 80860, are you busy ?

					80860: nope

	68000: here, 80860, here's a number

					80860: ok, got it

	68000: hey 80860, are you busy ?

					80860: nope ?

	68000: Ok, do a square root will ya ?

	68000: 80860 are you done

					80860: of course!



A true 680x0 math coprocessor has none of this nonsense; it
just executes inline floating point instructions.

So it will take some calculations to determine if the 80860
is actually a superior math processor. Where it would really
shine though, would be where you could unbundle a monolithic
process, such as a matrix inversion or the like. Clearly this
is something a pure math coprocessor could not do.
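
A rough way to frame that calculation, as a sketch only: offloading pays
when the handshake plus the remote compute time beats doing the work
inline, and a monolithic job pays the handshake once instead of once per
operation. The function names and every timing below are made-up
illustrations, not measurements of any 680x0, 6888x, or 80860 part.

```python
def offload_pays(t_inline_us, t_remote_us, t_handshake_us):
    """Return True when handing one operation to the remote chip is faster.

    All three arguments are hypothetical per-operation times in microseconds.
    """
    return t_handshake_us + t_remote_us < t_inline_us


def batched_cost_us(n_ops, t_remote_us, t_handshake_us):
    """Total time when one handshake is amortised over n_ops remote operations."""
    return t_handshake_us + n_ops * t_remote_us


# Illustrative numbers only: a slow handshake swamps a single square root...
single_op_wins = offload_pays(t_inline_us=10.0, t_remote_us=1.0, t_handshake_us=20.0)

# ...but the same handshake is noise against a 1000-operation job.
inline_total = 1000 * 10.0
remote_total = batched_cost_us(1000, t_remote_us=1.0, t_handshake_us=20.0)
```

With these invented figures the single operation loses and the batched
job wins, which is the shape of the argument above; real numbers would
have to come from the actual bus and parts.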

-- 
       ``Parents who have children, have children who have children''
richard@gryphon.COM  decwrl!gryphon!richard   gryphon!richard@elroy.jpl.NASA.GOV

bakken@arizona.edu (Dave Bakken) (04/15/89)

In article <14716@gryphon.COM>, richard@gryphon.COM (Richard Sexton) writes:
> Yes, the 80860 looks like the only decent piece of silicon to come
> out of intel since the 8051.
Close.  Their 80960 (aka P7), which is targeted for embedded
applications, is very interesting.  It has good performance, a
4-deep register cache (registers map into the stack frame) and, more
significantly, the hardware can schedule tasks for you. (Interestingly,
but perhaps not surprisingly, I have not seen one trade rag mention the
HW scheduling.  About 1.5 years ago I benchmarked Ada on the chip and 
worked on a debug monitor for it, and I think that feature is 
very significant).  Also, I think they are offering versions with 
fault tolerant support (last I heard it was QMR, not TMR, 
unfortunately).  Of course, this chip will live in niche applications 
that most of us never see.
-- 
Dave Bakken
bakken@arizona.edu
uunet!arizona!bakken
"The fact that I'm paranoid doesn't prove everyone's *not* out to get me"

karl@sugar.hackercorp.com (Karl Lehenbauer) (04/15/89)

In article <14716@gryphon.COM>, richard@gryphon.COM (Richard Sexton) writes:
> 	68000: Hmm, I need to take the square root of 6.6
> 					80860: dum de dum dd dum

I was thinking that the 860 could be a concurrent coprocessor; that is, it could
run concurrently with the 68000, mostly executing within its own private memory.

You could use it to get "reasonable" times on your photorealistic HAM 
raytracings (just kidding about the HAM part...hilk hilk) and your 
Mandelbrots -- kind of what the IBM "Wizard" board for the PiS/2 appears to do.
-- 
-- uunet!sugar!karl  | "Time is an illusion.  Lunchtime doubly so."
--		     |				-- Ford Prefect
-- Usenet BBS (713) 438-5018

thompson@savax.UUCP (thompson mark) (04/21/89)

In article <14716@gryphon.COM> richard@gryphon.COM (Richard Sexton) writes:
>Yes, the 80860 looks like the only decent piece of silicon to come
>out of intel since the 8051.
>
>However, I believe it would be impossible to sustain that FLOP rate
>as a math coprocessor in an Amiga.
>
>Consider the 68881/2 or the Weitek chip: when a floating point
>instruction is hit, BINGO, the hardware executes it.
>
>But since the 80860 is not a 680x0 coprocessor..[stuff about a slow interface]

I didn't realize the Weitek part was so compatible with 680x0's.
Actually, I never envisioned the i860 (Intel N10) as replacing the 68881/2
but using it as a somewhat dedicated auxiliary processor in some sort of
multi-processor scheme with shared memory. The application I specifically had
in mind was a graphics card using the i860 for display list traversal,
3D transformations, smooth shaded solid rendering, and hidden surface removal.
The Amiga would simply DMA the display list into the i860's memory and off
it would go cranking out Phong shaded unicycles in real time. (This implies
that you put the display buffer on the i860 graphics card.) The thing to
remember is that the i860 is a processor all by itself (unlike the 68881/2),
but I guess you knew that already.

>Where it would really
>shine though, would be where you could unbundle a monolithic
>process, such as a matrix inversion or the like. Clearly this
>is something a pure math coprocessor could not do.

Yeah.
--------------------------------------------------------------------------
|      Mark Thompson                                                     |
|      decvax!savax!thompson       Designing high performance graphics   |
|      (603)885-9583               silicon today for a better tomorrow.  |
--------------------------------------------------------------------------

raz@kilowatt.uucp (Raz- Berry) (04/22/89)

In article <877@savax.UUCP> thompson@savax.UUCP (thompson mark) writes:
)In article <14716@gryphon.COM> richard@gryphon.COM (Richard Sexton) writes:
))Yes, the 80860 looks like the only decent piece of silicon to come
))out of intel since the 8051.
))
))However, I believe it would be impossible to sustain that FLOP rate
))as a math coprocessor in an Amiga.
))
))Consider the 68881/2 or the Weitek chip: when a floating point
))instruction is hit, BINGO, the hardware executes it.
))
))But since the 80860 is not a 680x0 coprocessor..[stuff about a slow interface]
)
)I didn't realize the Weitek part was so compatible with 680x0's.
)Actually, I never envisioned the i860 (Intel N10) as replacing the 68881/2

I think we are a little confused here. There is no way that the Weitek
can possibly emulate a 68881. Not without a major hardware kluge. I think 
that richard means that the Weitek chip SET (IU and FPU) together function
similarly to the 680X0 and coprocessor series.

)but using it as a somewhat dedicated auxiliary processor in some sort of
)multi-processor scheme with shared memory. The application I specifically had
)in mind was a graphics card using the i860 for display list traversal,
)3D transformations, smooth shaded solid rendering, and hidden surface removal.

God this sounds familiar. Didn't there used to be a company that did stuff like this?
You know what else this fictional board needs... a standard graphics language.
How about PHIGS+? Call me an Amiga PHIG.

)The Amiga would simply DMA the display list into the i860's memory and off
)it would go cranking out Phong shaded unicycles in real time. (This implies
)that you put the display buffer on the i860 graphics card.) The thing to
)remember is that the i860 is a processor all by itself (unlike the 68881/2),
)but I guess you knew that already.

Exactly! This would be the IDEAL application for this chip. Putting
this baby on or at the mercy of the Amiga bus would strangle its performance.
Of course if you do the board right, you could include the Clown...

This would be neat, but I fear that it would easily cost more than the
entire machine (+Caligrapher).
-- 
Steve -Raz- Berry      Disclaimer: I didn't do nutin!
UUCP: sun!kilowatt!raz                    ARPA: raz%kilowatt.EBay@sun.com
"Fate, it protects little children, old women, and ships named Enterprize"

richard@gryphon.COM (Richard Sexton) (04/26/89)

In article <33739@kilowatt.uucp> raz@sun.UUCP (Steve -Raz- Berry) writes:
>
>I think we are a little confused here. There is no way that the Weitek
>can possibly emulate a 68881. Not without a major hardware kluge. I think 
>that richard means that the Weitek chip SET (IU and FPU) together function
>similarly to the 680X0 and coprocessor series.

No, Richard meant it was a co-processor, like the 68881/2. That's what
the ads implied.

So I called Weitek.

Buggers.

The stupid thing is memory mapped. You write the operands into
memory addresses, then give it an operation, then poll it for
completion. Some co-processor.
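
The write-operands, issue-command, poll sequence described above can be
sketched as a toy simulation. This is not Weitek's actual register
layout or timing: the class, the register names, and the number of busy
polls are all assumptions chosen to show how many bus accesses a single
operation costs under such a scheme.

```python
import math


class MemoryMappedFPU:
    """Toy model of a memory-mapped math unit (all names are hypothetical)."""

    BUSY_POLLS = 3  # assumed: status reads that report "busy" before completion

    def __init__(self):
        self.operand = 0.0
        self.result = 0.0
        self._polls_left = 0
        self.bus_accesses = 0  # every register read or write costs one access

    def write_operand(self, value):
        self.bus_accesses += 1
        self.operand = value

    def write_command(self, op):
        self.bus_accesses += 1
        if op != "sqrt":
            raise ValueError("this toy model only knows 'sqrt'")
        self.result = math.sqrt(self.operand)
        self._polls_left = self.BUSY_POLLS

    def read_status(self):
        self.bus_accesses += 1
        if self._polls_left > 0:
            self._polls_left -= 1
            return "busy"
        return "done"

    def read_result(self):
        self.bus_accesses += 1
        return self.result


def host_sqrt(fpu, x):
    """Host side: write the operand, issue the command, poll, fetch the result."""
    fpu.write_operand(x)
    fpu.write_command("sqrt")
    while fpu.read_status() == "busy":
        pass  # spin until the unit reports completion
    return fpu.read_result()
```

In this model one square root costs seven bus accesses (two writes, four
status reads, one result read); the real count depends entirely on the
hardware.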

Some performance data: 

Linpack   - single precision 8 MFLOPS
            double precision 6 MFLOPS
Whetstone - single precision 2.0 MWhets (is that what he said ? MWhets ?)
            double precision 1.2 MWhets

How do these figures compare with '881/2 and 860 ?

Some interesting stuff about this part - it doesn't exist. It seems
Weitek has a nice math chip, the 3116 or something like that. They
sell a daughterboard that has some glue logic and this chip
for use with the 386. What they have for the 68000 is a board
that uses the 3116 and some glue.

Every time I pumped him for information, he pumped me
wanting to know how many I needed, so they could make their 
projections and decide to make this chip or not.

Hell, I told 'em 10,000. Couldn't hurt. Weitek is in Sunnyvale.

-- 
               ``Bring me the head of fettucini alfredo''
richard@gryphon.COM  decwrl!gryphon!richard   gryphon!richard@elroy.jpl.NASA.GOV

raz@kilowatt.uucp (Raz- Berry) (04/28/89)

In article <15147@gryphon.COM> richard@gryphon.COM (Richard Sexton) writes:
)In article <33739@kilowatt.uucp> raz@sun.UUCP (Steve -Raz- Berry) writes:

))I think we are a little confused here. There is no way that the Weitek
))can possibly emulate a 68881. Not without a major hardware kluge. I think 
))that richard means that the Weitek chip SET (IU and FPU) together function
))similarly to the 680X0 and coprocessor series.

)No, Richard meant it was a co-processor, like the 68881/2. That's what
)the ads implied.

You're kidding. I don't claim to know all, but I thought I woulda 
heard that. Oh well.

)So I called Weitek.

)Buggers.

)The stupid thing is memory mapped. You write the operands into
)memory addresses, then give it an operation, then poll it for
)completion. Some co-processor.

Not a true co-processor if you ask me, ok so it fits the description...
but it's a kluge.

)Some performance data: 

)Linpack   - single precision 8 MFLOPS
)            double precision 6 MFLOPS
)Whetstone - single precision 2.0 MWhets (is that what he said ? MWhets ?)
)            double precision 1.2 MWhets

Did it say how this was set up? What I mean is, is it a 680x0 peripheral
or is it running out of its own code memory (probably)? If it needs its own
special memory, like cache, then you might as well build a separate board
and do it up as a black box math server. If you are going to go to that
much trouble, might as well go with the '860 and get REAL speed.
I guess I define co-processor as a transparent hardware accelerator.
If you can hook it up to the main processor, with a minimum amount of
hassle, and have it run out of processor memory space, then it's a 
co-processor. My definition. Weitek would probably disagree.

)How do these figures compare with '881/2 and 860 ?

I have no clue. I looked in the 881 manual and couldn't find the performance
figures.

)Some interesting stuff about this part - it doesn't exist. It seems
)Weitek has a nice math chip, the 3116 or something like that. They
)sell a daughterboard that has some glue logic and this chip
)for use with the 386. What they have for the 68000 is a board
)that uses the 3116 and some glue.

Is it compatible with the 881 instruction set? If not, why bother? You'd
have to write your own compiler for it. YECHH. Unless it does sin(x)/cos(y)
in two cycles...

)Every time I pumped him for information, he pumped me
)wanting to know how many I needed, so they could make their 
)projections and decide to make this chip or not.

Should have asked for a sample ;-)

)Hell, I told 'em 10,000. Couldn't hurt. Weitek is in Sunnyvale.

Gee, I can pick one up on my way to work!

)richard@gryphon.COM  decwrl!gryphon!richard   gryphon!richard@elroy.jpl.NASA.GOV

I don't like Weitek. When I was at Raster Tech. I talked to a few of the
engineers that had worked on previous projects involving their 64 bit
FPU/IU chip set. I heard nothing but horror stories about hardware bugs
and instructions that didn't work as advertised. Abort and stall are good
examples.
-- 
Steve -Raz- Berry      Disclaimer: I didn't do nutin!
UUCP: sun!kilowatt!raz                    ARPA: raz%kilowatt.EBay@sun.com
"Fate, it protects little children, old women, and ships named Enterprize"

dillon@POSTGRES.BERKELEY.EDU (Matt Dillon) (05/01/89)

:No, Richard meant it was a co-processor, like the 68881/2. That's what
:the ads implied.
:
:So I called Weitek.
:
:Buggers.
:
:The stupid thing is memory mapped. You write the operands into
:memory addresses, then give it an operation, then poll it for
:completion. Some co-processor.

	The neat thing about the Weitek chip is that it has a stack.
Results are automatically pushed on the stack.  So, something like:

	(a + b) + (c + d)

	is: fpush a, fpush b, fadd, fpush c, fpush d, fadd, fadd, fpop (to) Mem

	That is, temporaries need not be moved off the FP unit.
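
	The push/add/pop sequence above can be traced with a small stack
machine. This is only a sketch of the evaluation order described here:
the mnemonics are the informal ones from this post, not confirmed Weitek
opcodes, and the `run` function and the memory dictionary are
hypothetical.

```python
def run(program, memory):
    """Execute a list of 'fpush X' / 'fadd' / 'fpop X' steps.

    Returns the updated memory and the deepest the stack ever got.
    """
    stack = []
    max_depth = 0
    for instr in program:
        op, *args = instr.split()
        if op == "fpush":
            stack.append(memory[args[0]])  # load an operand from memory
        elif op == "fadd":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)  # the temporary stays on the stack
        elif op == "fpop":
            memory[args[0]] = stack.pop()  # only the final result goes back
        else:
            raise ValueError("unknown instruction: " + instr)
        max_depth = max(max_depth, len(stack))
    return memory, max_depth


# (a + b) + (c + d), in the order given above
program = ["fpush a", "fpush b", "fadd",
           "fpush c", "fpush d", "fadd",
           "fadd", "fpop result"]
memory, depth = run(program, {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0})
```

	Four loads and one store cross the bus; both intermediate sums
stay on the stack, which never holds more than three entries here.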

	The Sequent (a parallel-processor UNIX machine running DYNIX) uses
the Weitek chip.  Our current configuration has 12 386 boards (when you are 
running UNIX you generally don't care about the processor), each with a
Weitek fp unit.  While not a supercomputer, the thing can, running 12
compiles in parallel, compile all 246 C files (about half of Postgres, the
rest being in Lisp at the moment) in less than 5 minutes.

					-Matt

mcp@ziebmef.uucp (Marc Plumb) (05/04/89)

In article <15147@gryphon.COM> richard@gryphon.COM (Richard Sexton) writes:
>The stupid thing is memory mapped. You write the operands into
>memory addresses, then give it an operation, then poll it for
>completion. Some co-processor.

Well, it still counts as a coprocessor.  The coprocessor instructions
it uses aren't the F-line ones, they're loads and stores with magic
values, but there are bytes you can stick in the instruction stream
to make the thing compute.  Isn't that the point?

It is true that dedicated instructions can be decoded faster *if you
really work at it*, but have you ever looked at the 68020 coprocessor
protocol?  Nice and general, but *slow*.  The 6888[12] is also memory
mapped, albeit in a separate address space (FC = 111), and the processor
writes the instruction word to a certain place, then reads another to
see what needs doing, and generally keeps polling the coprocessor until
it tells the 020 it can go away.  Generally disgusting.

The only difference is the FC bits (that's why a 68010 helps a lot - it
lets you control them explicitly) and the fact that the load/store sequence
is done by microcode, not user code.  And I'll bet long odds Weitek's
protocol is more efficient.
-- 
	-Colin