[net.micro.amiga] Floating point libraries

eve@ssc-bee.UUCP (Michael Eve) (10/14/86)

	With an Amiga purchase in the immediate future, I have been following
	the Lattice/Manx discussions, and found them very useful.  One area
	remains unclear (actually, many areas, but one of particular concern):
	floating point libraries.

	Manx is about 7 times faster than Lattice for floating point
	operations, which seems to indicate that Manx uses assembly-coded
	routines while Lattice has C-coded ones.  Is it possible
	to write and substitute one's own floating point libraries
	for either compiler?  I am considering hardware add-ons
	such as the 32081 or 68881 (possibly home-brewed), and want a
	compiler which can take advantage of the new hardware.
	Also, it should be possible to speed up Lattice if the fp
	library can be replaced.
-- 
	Mike Eve     Boeing Aerospace, Seattle
	...uw-beaver!ssc-vax!ssc-bee!eve

dillon@CORY.BERKELEY.EDU (Matt Dillon) (10/16/86)

	I believe Manx allows you to use the FFP and IEEE rt-libraries via
float/double declarations.  W/ Lattice, you have to make the calls manually.
Manx wins hands-down on floating point operations.  From the sorry state
Lattice's math library is in, it looks like they simply ported it from
the IBM (or other) version... extremely slow.
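
	For the curious, making the calls manually looks roughly like
this.  A sketch from memory, untested -- SPFlt/SPMul/SPFix are the
amiga.lib stub names for mathffp.library as I remember them, so check
them against the ROM Kernel Manual before believing me:

	#include <exec/types.h>

	struct Library *MathBase;	/* base the FFP stubs expect */
	extern struct Library *OpenLibrary();
	extern LONG SPFlt(), SPMul(), SPFix();

	main()
	{
		LONG a, b, c;			/* FFP values live in LONGs */

		MathBase = OpenLibrary("mathffp.library", 0L);
		if (MathBase == 0)
			exit(10);
		a = SPFlt(3L);			/* integer -> FFP */
		b = SPFlt(4L);
		c = SPMul(a, b);		/* FFP multiply */
		printf("%ld\n", SPFix(c));	/* FFP -> integer: 12 */
		CloseLibrary(MathBase);
	}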

				-Matt

hadeishi@husc4.harvard.edu (mitsuharu hadeishi) (10/16/86)

In article <8610160630.AA06315@cory.Berkeley.EDU> dillon@CORY.BERKELEY.EDU (Matt Dillon) writes:
>	I believe Manx allows you to use the FFP and IEEE rt-libraries via
>float/double declarations.  W/ Lattice, you have to make the calls manually.
>Manx wins hands-down on floating point operations.  From the sorry state
>Lattice's math library is in, it looks like they simply ported it from
>the IBM (or other) version... extremely slow.

	Actually the old Manx uses FFP only.  This makes Manx fast,
but inaccurate.  Also, the FFP mathtrans.library doesn't support all of
the transcendental functions that are available in most IEEE
implementations; I'm not sure which functions aren't implemented.
(FFP is only 32 bits, single precision.)
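
	To see how little headroom 32 bits leaves, compare single- and
double-precision accumulation.  FFP and IEEE single both carry about
24 mantissa bits, so this plain C (nothing Amiga-specific) shows the
kind of drift involved:

	main()
	{
		float s;
		double d;
		int i;

		s = 0.0;
		d = 0.0;
		for (i = 0; i < 10000; i++) {
			s = s + 0.1;	/* each store rounds to ~24 bits */
			d = d + 0.1;	/* ~53 bits: error stays tiny */
		}
		/* the single sum drifts visibly from 1000 */
		printf("single %f  double %f\n", s, d);
	}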

	The next release of Manx will have options to use FFP, IEEE,
68881 IEEE, and perhaps some others.

	The old Lattice used IEEE but converted all arguments
to double precision before doing computations.  The algorithms
were also very slow and unoptimized.  Thus old Lattice IEEE was about
five to ten times slower than FFP.  However, it was a lot more
accurate than FFP and also provided the full gamut of transcendental
functions.
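
	In source terms, every float multiply behaved as though you had
written the widening out by hand.  This is my paraphrase of the effect
(the real library did it through subroutine calls, which is where the
time went), not Lattice's actual code:

	float
	fpmul(a, b)
	float a, b;
	{
		/* widen both operands, multiply in double,
		   truncate the result back to float */
		return ((float)((double)a * (double)b));
	}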

	However, the new release of Lattice C contains a floating
point library which is IEEE but is reputed to be about five times
faster or more.  This makes it competitive with FFP, and also
the data will be compatible with the 68881 (a VERY large plus for
portability of floating point data between applications and machines.)
Overall I would prefer to use the new Lattice fast IEEE libraries
(highly optimized) over FFP since FFP is nonstandard (will not work
with the 68881) and only works in single precision.
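
	The interchange claim is easy to check, because an IEEE single
is just a sign, an 8-bit exponent (bias 127), and a 23-bit fraction
packed into 32 bits.  Assuming floats and longs are both 32 bits (true
on the Amiga), you can pick one apart with shifts and masks:

	main()
	{
		union {
			float f;
			unsigned long bits;
		} u;

		u.f = -6.25;			/* -1.5625 * 2^2 */
		printf("sign %ld  exp %ld  frac 0x%lx\n",
			(u.bits >> 31) & 1,	/* sign: 1 */
			(u.bits >> 23) & 0xff,	/* biased exponent: 129 */
			u.bits & 0x7fffff);	/* fraction: 0x480000 */
	}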

				-Mitsu

lel@wuphys.UUCP (10/20/86)

In article <417@husc6.HARVARD.EDU> hadeishi@husc4.UUCP (mitsuharu hadeishi) writes:
>In article <8610160630.AA06315@cory.Berkeley.EDU> dillon@CORY.BERKELEY.EDU (Matt Dillon) writes:
>
>	Actually the old Manx uses FFP only.  This makes Manx fast,
>but inaccurate.  Also, the FFP mathtrans.library doesn't support all of
>the transcendental functions that are available in most IEEE
>implementations; I'm not sure which functions aren't implemented.
>(FFP is only 32 bits, single precision.)
>
>	The next release of Manx will have options to use FFP, IEEE,
>68881 IEEE, and perhaps some others.
>
>	However, the new release of Lattice C contains a floating
>point library which is IEEE but is reputed to be about five times
>faster or more.  This makes it competitive with FFP, and also
>the data will be compatible with the 68881 (a VERY large plus for
>portability of floating point data between applications and machines.)
>Overall I would prefer to use the new Lattice fast IEEE libraries
>(highly optimized) over FFP since FFP is nonstandard (will not work
>with the 68881) and only works in single precision.
>
>				-Mitsu

I just got my new Absoft F77 compiler last week and have been
familiarizing myself with it and running benchmarks.  The first
tests I did were of the included IEEE floating point libraries;
they compare Lettuce (Lattice), AmigaBasic, and Fortran on a simple
multiplication loop:
          a=1
          loop i=1 to 100000
               a=a*1.0001
          end loop
Results (from memory...):

            Lattice double:  ~120 sec
            Lattice float:   ~180 sec
            Basic (double):   140 sec
            Basic (float):    115 sec
            Fortran float:    11.5 sec
            Fortran double:   ~26 sec

The Fortran math library is IEEE compatible.
Lattice double is faster than Lattice single because single-precision
operands must be padded out to double and the results truncated back.
What surprised me was the agonizing slowness of their floating point
calculations.  Absoft, on the other hand, clearly did a good job of
optimizing their floating point libraries.  When I get time I'll run
some more benchmarks.  If there is any interest, I'll post the
results.

By the way, when is Lattice's new release due out? I haven't gotten
a card yet...  In the meantime, could someone run this benchmark
on a beta release for comparison:

	#include <stdio.h>

	main()
	{
		int num;
		double a;	/* or float -- time both */

		a = 1.;
		printf("%.0f\n", a);
		for (num = 1; num <= 100000; ++num)
			a = a * 1.0001;
		printf("%.0f\n", a);
	}

I just timed it with a stopwatch.
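
If you'd rather let the program time itself, something like this ought
to work with any of these compilers, assuming their time() follows the
usual Unix convention of returning seconds (one-second resolution, but
the loop runs long enough for that not to matter):

	#include <stdio.h>

	main()
	{
		long start, stop;
		long time();
		int num;
		double a;

		a = 1.;
		start = time(0L);
		for (num = 1; num <= 100000; ++num)
			a = a * 1.0001;
		stop = time(0L);
		printf("%.0f in %ld seconds\n", a, stop - start);
	}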
 
			Thanks!


===============================================================================
	Without physics, computers themselves would be impossible.
	Without computers, physics itself would be impossible.

	(An example of dichotomy in nature)

		   Lyle E. Levine

                   Paths ->   ihnp4!wuphys!lel
			      seismo!wucs!wuphys!lel

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

ingoldsby@calgary.UUCP (Terry Ingoldsby) (10/22/86)

In article <432@wuphys.UUCP>, lel@wuphys.UUCP writes:
> 
> I just got my new Absoft F77 compiler last week and have been
> familiarizing myself with it and running benchmarks.  The first
> tests I did were of the included IEEE floating point libraries;
> they compare Lettuce (Lattice), AmigaBasic, and Fortran on a simple
> multiplication loop:
>           a=1
>           loop i=1 to 100000
>                a=a*1.0001
>           end loop
> Results: (from memory...)
> 
>             Lattice double:  ~120 sec
>             Lattice float:   ~180 sec
>             Basic (double):   140 sec
>             Basic (float):    115 sec
>             Fortran float:    11.5 sec
>             Fortran double:   ~26 sec
> 
> The Fortran math library is IEEE compatible.
> Lattice double is faster than Lattice single because single-precision
> operands must be padded out to double and the results truncated back.
> What surprised me was the agonizing slowness of their floating point
> calculations . . .

If I understand the situation correctly, all floating point
operations in C are performed using double precision arithmetic
(at least unless you have a compiler that conforms to the new ANSI
standard, which allows single-precision arithmetic on float
operands).  I don't have an Amiga, but I presume that Lattice C
observes this convention, thus explaining the anomaly.
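
You can see the convention directly with sizeof: under the old rules
arithmetic on two floats yields a double, so a pre-ANSI compiler
should print 4 and 8 here (assuming the usual 68000 sizes):

	main()
	{
		float a, b;

		a = 1.0;
		b = 3.0;
		/* pre-ANSI: both operands are widened, so a / b has
		   type double; an ANSI compiler may keep it float */
		printf("%d %d\n", (int)sizeof(a), (int)sizeof(a / b));
	}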

             Terry Ingoldsby

....!ihnp4!alberta!calgary!ingoldsby

stever@videovax.UUCP (Steven E. Rice, P.E.) (10/24/86)

In article <432@wuphys.UUCP>, Lyle E. Levine (lel@wuphys.UUCP) writes:

> . . .
>             Lattice double:  ~120 sec
>             Lattice float:   ~180 sec
>             Basic (double):   140 sec
>             Basic (float):    115 sec
>             Fortran float:    11.5 sec
>             Fortran double:   ~26 sec
> 
> The Fortran math library is IEEE compatible.
> Lattice double is faster than Lattice single because single-precision
> operands must be padded out to double and the results truncated back.
> What surprised me was the agonizing slowness of their floating point
> calculations.  Absoft, on the other hand, clearly did a good job of
> optimizing their floating point libraries.  . . .

There may be more here than meets the eye!  It is my recollection that
the IEEE Standard specifies that all operands are to be extended to the
maximum precision of the implementation, the operation(s) performed, and
the result(s) rounded to the final precision using the specified rounding
mode.  Although Lattice seems to take it to an extreme (50% overhead
to extend the operand and round the result afterward seems awfully
steep!), the fact that the Fortran double precision takes more than
twice as long as the single indicates that Absoft is not following
this procedure.

Probably what is meant by "IEEE compatible" is that the bits are in
the same place as they are in the IEEE specification.  Although it takes
longer when all computations are performed to the maximum precision of
the implementation, it produces better results.

Cheer up!  The 68881 does this (maximum precision is 80 bits, consisting
of sign, 15-bit exponent, and 64-bit mantissa), and it's blindingly fast.
Any day, now, we'll have 68020s and 68881s in our Amigas, and all these
problems will go away!  (I can dream, can't I?)
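
For reference, here is the extended format as the '881 stores it in
memory -- 96 bits in all, the 80 significant bits plus 16 bits of pad.
This is my reading of the data sheet, so treat the exact layout as
unverified:

	struct ext96 {			/* MC68881 extended, memory form */
		unsigned long high;	/* bit 31: sign; bits 30-16:
					   15-bit exponent; bits 15-0:
					   zero pad */
		unsigned long mid;	/* mantissa bits 63-32; bit 63 is
					   an explicit integer bit (no
					   hidden bit in extended) */
		unsigned long low;	/* mantissa bits 31-0 */
	};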

					Steve Rice

----------------------------------------------------------------------------
{decvax | hplabs | ihnp4 | uw-beaver}!tektronix!videovax!stever

alverson@decwrl.DEC.COM (Robert Alverson) (10/25/86)

In article <2019@videovax.UUCP> stever@videovax.UUCP (Steven E. Rice, P.E.) writes:
>There may be more here than meets the eye!  It is my recollection that
>the IEEE Standard specifies that all operands are to be extended to the
>maximum precision of the implementation, the operation(s) performed, and
>the result(s) rounded to the final precision using the specified rounding
>mode.  Although Lattice seems to take it to an extreme (50% overhead
>to extend the operand and round the result afterward seems awfully
>steep!), the fact that the Fortran double precision takes more than
>twice as long as the single indicates that Absoft is not following
>this procedure.

Hold on a sec!  For a single operation, IEEE says that the operation
is to be performed as if it were done with infinite precision; only
a rounded result is stored.  Careful analysis will show that this can
be accomplished with only a few bits more than the result precision.
There is no need to always switch to double-extended!  I'm not sure
about intermediate results, but my feeling is that the standard only
guarantees that operations be carried out at least at the result
precision--keeping intermediate results in double-extended format
would be a bonus.
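
Single-precision multiply makes the point nicely: two 24-bit mantissas
produce at most a 48-bit product, which double's 53-bit mantissa holds
exactly, so one widened multiply plus one rounded store already gives
the correctly rounded IEEE single result -- no extended precision
needed:

	main()
	{
		float x, y, p;

		x = 1.1;
		y = 2.2;
		/* the double product of two singles is exact (48 bits
		   fit in 53), so the store back to float performs the
		   single rounding the standard calls for */
		p = (float)((double)x * (double)y);
		printf("%.9f\n", p);
	}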

Bob