[comp.unix.i386] What would you use for Fast Floating-Point?

edhall@randvax.UUCP (Ed Hall) (06/20/89)

I want to put together a reasonably priced 386-based system which will
support blindingly fast floating-point calculations.  To give you an
idea of how fast I want, I currently have a 286-based system with a
10MHz '287 in it--and I need something at least 10 times faster.
This means something with performance close to 1 MFlop.

Does anyone know just how much faster the '387 is compared to its
lowly cousin the '287?  How about a coprocessor like the Weitek?
And, before you get any ideas about some add-in vector processor,
whatever I get will have to be supported by some reasonable version
of UNIX, and be accessible from C.

If there is any interest, I'll summarize any advice I get and post.

Thanks!

		-Ed Hall
		edhall@rand.org
		..!uunet!edhall@rand.org
		..!decvax!randvax!edhall

chasm@killer.DALLAS.TX.US (Charles Marslett) (06/23/89)

In article <2105@randvax.UUCP>, edhall@randvax.UUCP (Ed Hall) writes:
> I want to put together a reasonably priced 386-based system which will
> support blindingly fast floating-point calculations.  To give you an
> idea of how fast I want, I currently have a 286-based system with a
> 10MHz '287 in it--and I need something at least 10 times faster.
> This means something with performance close to 1 MFlop.
> 
> Does anyone know just how much faster the '387 is compared to its
> lowly cousin the '287?  How about a coprocessor like the Weitek?

The Intel '387 processor is about 3 or 4 times faster than the '287 at
the same clock speed.  I think the Weitek chip is perhaps 2 or 3 times faster
than the '387 if the code is well written (code that is just copied over
unchanged can actually run slower, however).

With Microsoft C (5.1), I get about 160,000 Flops on a 20MHz Everex Step/386
with a 387.  And on a 10 MHz AST Premium/286 with a 287, I get about 28,000
(AST runs the 287 at 8 MHz).
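
A minimal sketch of how a Flops figure like the ones above could be measured
from C.  This is not the benchmark behind those numbers (the post does not say
which one was used); the kernel, the array size N and the function names are
all assumptions made for illustration.

```c
#include <assert.h>
#include <time.h>

#define N 1000  /* assumed array length, small enough to stay in memory */

/* Hypothetical kernel: y[i] += a * x[i], i.e. one multiply and one add
   (two floating-point operations) per element. */
static void daxpy(int n, double a, const double *x, double *y)
{
    int i;
    for (i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* Runs the kernel `repeats` times and returns the estimated number of
   floating-point operations per second.  Returns 0.0 if the run was too
   short for clock() to measure.  The sum of y is stored in *checksum so
   the compiler cannot discard the work. */
double measure_flops(long repeats, double *checksum)
{
    static double x[N], y[N];
    clock_t start, stop;
    double seconds, sum = 0.0;
    long r;
    int i;

    assert(repeats > 0);
    for (i = 0; i < N; i++) {
        x[i] = 1.0;
        y[i] = 0.0;
    }

    start = clock();
    for (r = 0; r < repeats; r++)
        daxpy(N, 0.5, x, y);
    stop = clock();

    for (i = 0; i < N; i++)
        sum += y[i];
    *checksum = sum;

    seconds = (double)(stop - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0)
        return 0.0;
    return 2.0 * N * (double)repeats / seconds;
}
```

Call measure_flops() with a repeat count large enough that the run takes a
few seconds, since clock() has coarse resolution on many systems.  A figure
from a loop like this depends heavily on the compiler and its options, so it
is only comparable with other runs of the same code.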

For those watching IIT (and their 287/387 clones), the IIT 287 is a lot faster
than the Intel part (say 2x or 3x), but the IIT 387 is much closer to Intel's
speed.  The IIT '387 will run about 200,000 Flops in the example above (25-35%
faster, that is).  I really recommend the IIT 287 for anyone needing just a
little push doing fp stuff on a 286!

As soon as I get a real production IIT part, I'll post some real measurements.

Charles Marslett
chasm@killer.dallas.tx.us

gors@well.UUCP (Gordon Stewart) (06/24/89)

>With Microsoft C (5.1), I get about 160,000 Flops on a 20MHz Everex Step/386
>with a 387.  And on a 10 MHz AST Premium/286 with a 287, I get about 28,000
>(AST runs the 287 at 8 MHz).
>
>Charles Marslett
>chasm@killer.dallas.tx.us


I have a Mylex MI-386/20 with a 387 chip, running the Microway NDP C compiler,
which requires the Phar Lap assembler and DOS extender to run under MS-DOS
(btw, they have a UNIX version for V/386).

My results are (based on Linpack and Whetstone and Dhrystone):
	
	~4 MIPS
	560,000 FLOPS

An optimizing compiler will give better results than a faster FPU, since
even with an infinitely fast FPU, a CPU-bound program may only run twice
as fast.  The true 32 bit code helps a whole lot, a smart global optimizer
helps a whole lot, and writing optimizable code helps a lot.
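
The "only twice as fast" limit can be made concrete with a little arithmetic.
The sketch below assumes the program spends half its run time in
floating-point work; that 0.5 fraction is an assumption chosen to match the
2x figure, not a measurement, and the function names are made up for
illustration.

```c
#include <assert.h>

/* Overall speedup when a fraction `fp_fraction` of the original run time
   is floating-point work and only that part becomes `fpu_speedup` times
   faster.  The rest of the program runs at its old speed. */
double overall_speedup(double fp_fraction, double fpu_speedup)
{
    assert(fp_fraction >= 0.0 && fp_fraction <= 1.0);
    assert(fpu_speedup > 0.0);
    return 1.0 / ((1.0 - fp_fraction) + fp_fraction / fpu_speedup);
}

/* Limit of the speedup as the FPU becomes infinitely fast: only the
   non-floating-point part of the run time remains. */
double speedup_limit(double fp_fraction)
{
    assert(fp_fraction >= 0.0 && fp_fraction < 1.0);
    return 1.0 / (1.0 - fp_fraction);
}
```

With fp_fraction = 0.5, speedup_limit() gives 2.0, and an FPU that is 4 times
faster gives only 1.6 overall.  A better compiler speeds up the other half of
the run time as well, which is why it can pay off more than a faster FPU.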

NDP have a Fortran compiler, and if that's yer thang, read the new book
called (paraphrase) "a guide to Fortran on Supercomputers.." from Academic
Press by a coupla guys from Pacific Sierra Research.

They discuss issues of coding style and how it affects an optimizer's
ability to generate good code.
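
As one small illustration of coding style helping the optimizer (an example
made up here, not taken from the book), compare the two routines below.  In
the first, the compiler must assume that storing into v[] might change *scale,
so it may have to reload the scale and redo the divide on every pass.  In the
second, the invariant is copied into a local before the loop and the divide is
replaced by a multiply.  The function names are hypothetical.

```c
#include <assert.h>

/* Straightforward version: reads *scale and divides on every iteration.
   Because v and scale could point at overlapping memory, the compiler
   cannot safely hoist the load out of the loop on its own. */
void normalize_slow(double *v, int n, const double *scale)
{
    int i;
    assert(*scale != 0.0);
    for (i = 0; i < n; i++)
        v[i] = v[i] / *scale;
}

/* Optimizer-friendly version: the reciprocal is computed once into a
   local, so the loop body is a single multiply.  Note that multiplying
   by a reciprocal can differ from dividing in the last bit, and that
   this version behaves differently if scale points into v[]. */
void normalize_fast(double *v, int n, const double *scale)
{
    double inv;
    int i;
    assert(*scale != 0.0);
    inv = 1.0 / *scale;
    for (i = 0; i < n; i++)
        v[i] *= inv;
}
```

Whether the second form is actually faster depends on the compiler and the
FPU; on a part where a divide costs several times as much as a multiply it
usually is.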

-- 
    Michael Sierchio (in his guise as Gordon Stewart)

{apple, pacbell, hplabs, ucbvax}!well!gors	or	well!gors@apple.com

golds@rlgvax.UUCP (Rich Goldschmidt) (06/27/89)

In article <2105@randvax.UUCP>, edhall@randvax.UUCP (Ed Hall) writes:
> I want to put together a reasonably priced 386-based system which will
> support blindingly fast floating-point calculations.  

> This means something with performance close to 1 MFlop.
> -Ed Hall edhall@rand.org ..!uunet!edhall@rand.org

These are the numbers I have seen, approximately:

386/20MHz  with 80387: 0.4 MFlop  ~$250 (?)
  with Weitek 1167/20: 2.0 MFlop  ~$2500 with software
  32 bit integer math: 2.4 MFlop  your time to convert to integer math
  FP/AP board       : 12.5 MFlop  ~$2500 with software

The FP/AP board above is a floating point array processor for DOS only, from
Eighteen Eight (something like that, if anyone really needs a followup, I will
find all the details from my brochure at home).  The software support appears
to be pretty good based on the brochure I saw:  it comes with a library of 
nearly 500 functions you can call from several different languages.
However, it does not support Unix on a 386/AT.  It can't cope with virtual
memory...

Rich Goldschmidt
uunet!rlgvax!golds  or  golds@rlgvax.uu.net

ralf@b.gp.cs.cmu.edu (Ralf Brown) (06/29/89)

In article <1216@rlgvax.UUCP> golds@rlgvax.UUCP (Rich Goldschmidt) writes:
}386/20MHz  with 80387: 0.4 MFlop  ~$250 (?)
}  with Weitek 1167/20: 2.0 MFlop  ~$2500 with software
}  32 bit integer math: 2.4 MFlop  your time to convert to integer math
			^^^^^^^^^
			0.0 MFlop -- you're doing 0 floating point operations
(sorry, couldn't resist picking that nit....)

-- 
{harvard,uunet,ucbvax}!b.gp.cs.cmu.edu!ralf -=-=- AT&T: (412)268-3053 (school)
ARPA: RALF@CS.CMU.EDU     |"The optimist is the kind of person who believes a
FIDO: Ralf Brown 1:129/46 | housefly is looking for a way out."--Geo.J.Nathan
BITnet: RALF%CS.CMU.EDU@CMUCCVMA -=-=-=-=-=- DISCLAIMER? I claimed something?