[net.micro.68k] Nothing New Here / Re: VAX Floating Point

phipps@fortune.UUCP (Clay Phipps) (09/11/84)

> ... the floating point on the VAX ... makes perfect sense 
> when you look at what you can do with it. 
> The 32 and 64 bit floating point formats 
> are identical in the first 32 bits.
> The 64 bit format just adds more precision bits. 
> This allows a user to reference a 64 bit floating point value 
> as if it were a 32 bit one without any conversion. 
> It also allows a user to reference a 32 bit floating point value 
> as if it were a 64 bit floating point one without any conversion. 
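
The shared-prefix property described in the quote can be sketched in
Python. This is only an illustration under stated assumptions: the
helper names (pack, unpack) are made up for this example, the fields
are laid out in logical order (sign, 8-bit excess-128 exponent, stored
fraction) rather than in the VAX's actual 16-bit word order in memory,
and narrowing by dropping bits truncates instead of rounding.

```python
import math

F_BITS, D_BITS = 23, 55   # stored fraction bits (hidden leading 1 not counted)

def pack(x, frac_bits):
    """Pack x as sign | 8-bit excess-128 exponent | stored fraction (truncated)."""
    if x == 0:
        return 0
    m, e = math.frexp(abs(x))          # abs(x) = m * 2**e with 0.5 <= m < 1
    if not 1 <= e + 128 <= 255:
        raise ValueError("exponent out of range for an 8-bit field")
    frac = int(m * 2 ** (frac_bits + 1)) - 2 ** frac_bits   # drop the hidden 1
    sign = 1 if x < 0 else 0
    return (sign << (8 + frac_bits)) | ((e + 128) << frac_bits) | frac

def unpack(bits, frac_bits):
    """Inverse of pack: rebuild the value from the logical bit pattern."""
    exp = (bits >> frac_bits) & 0xFF
    if exp == 0:
        return 0.0
    frac = bits & (2 ** frac_bits - 1)
    sign = bits >> (8 + frac_bits)
    m = (frac + 2 ** frac_bits) / 2 ** (frac_bits + 1)      # restore the hidden 1
    value = math.ldexp(m, exp - 128)
    return -value if sign else value

x = 3.14159
d = pack(x, D_BITS)         # 64-bit pattern
f = pack(x, F_BITS)         # 32-bit pattern
narrowed = d >> 32          # keep only the first 32 bits of the 64-bit pattern
widened = f << 32           # pad the 32-bit pattern with 32 zero bits
```

Here narrowed equals f, and unpack(widened, D_BITS) gives the same
value as unpack(f, F_BITS), which is the "no conversion" property the
quoted poster describes.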

Strange, this sounds like a description of the floating point formats
on the IBM 360, an architecture that preceded the VAX by a few years
(about 14 years, actually: 1964 versus 1978).
Many numerical analysis types think that the IBM 360 floating-point
is one of the worst.  Among other things, I think many people believe
that they are entitled to a few more exponent bits when they *double*
the size of each data value (from 32 to 64 bits).
The hexadecimal (i.e., 4 bits instead of 1 bit) normalization on the 360
reduces the effective precision of floating-point numbers
and is another source of irritation to numerical analysis people;
I'm not sure what the VAX does on this one.
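
To make the precision loss concrete, here is a rough Python sketch of
hexadecimal normalization on a 24-bit fraction (the width of the 360's
short format). The function name hex_normalize is made up for this
example, and it handles only positive values, with truncation. The
point is that a hex-normalized fraction may carry up to three leading
zero bits, so the number of significant bits varies between 21 and 24.

```python
import math

def hex_normalize(x):
    """Return (exp16, frac24, leading_zero_bits) for positive x, where
    x is approximately frac24 / 2**24 * 16**exp16 and the top hex digit
    of frac24 is nonzero."""
    if x <= 0:
        raise ValueError("sketch handles positive values only")
    m, e2 = math.frexp(x)                 # x = m * 2**e2, 0.5 <= m < 1
    exp16 = -(-e2 // 4)                   # ceil(e2 / 4): power of 16
    shift = 4 * exp16 - e2                # 0..3 leading zero bits
    frac24 = int(m * 2 ** 24) >> shift    # truncate to 24 bits
    return exp16, frac24, shift

for value in (0.5, 1.0, 2.0, 8.0):
    exp16, frac24, zeros = hex_normalize(value)
    print(value, exp16, hex(frac24), "significant bits:", 24 - zeros)
```

For example, 1.0 comes out as 16**1 times the fraction 0x100000, which
leaves only 21 significant bits, while 0.5 keeps all 24.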

-- Clay Phipps

-- 
            { amd  hplabs!hpda  sri-unix  ucbvax!amd }          
                                                      !fortune!phipps
   { ihnp4  cbosgd  decvax!decwrl!amd  harpo  allegra}

haapanen@watdcsu.UUCP (Tom Haapanen [DCS]) (09/12/84)

The 360/370/30XX/43XX series IBM machines use hexadecimal
normalization.  This is the pits.

The VAX uses binary normalization.  This is MUCH better.

Tom Haapanen
University of Waterloo
{allegra,decvax,ihnp4,utzoo}!watmath!watdcsu!haapanen

...thanks to CS375 (Numerical Analysis)...

dmmartindale@watcgl.UUCP (Dave Martindale) (09/12/84)

The VAX (whose F and D floating formats were copied from the PDP-11) has
quite reasonable floating point - normalization is binary and the leading
'1' bit is not stored, so it gets several bits more precision than the
IBM three-sickly.  And it takes some care to round properly.
It has a relatively small exponent range, fixed in the G and H floating
point types introduced a few years ago.  I believe that it was submitted
as one of the competing proposals for the IEEE 754 floating-point standard.

If you want to see a really interesting floating-point format, read up
on what was eventually adopted for the IEEE 754 standard.  It starts
out similar to VAX floating point in representation, but adds
gradual underflow of numbers too small to normalize, and representations
for infinity and NaN (Not a Number - basically an undefined result).
There is an extended-precision format defined for partial results
in internal registers.  Rounding can be done to the nearest representable
value, towards 0 (truncate), or towards plus or minus infinity.  All the
special cases of doing arithmetic on operands of zero, infinity,
and NaN are specified, as are the actions if a result overflows or
underflows or an operation is invalid.  Lots of neat stuff.
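
A few of these behaviours can be observed from Python, on the
assumption that its float type is an IEEE 754 64-bit double (true on
common platforms, but not guaranteed by the language). The function
name ieee_special_cases is made up for this sketch.

```python
import math
import sys

def ieee_special_cases():
    """Collect a few results that IEEE 754 arithmetic defines explicitly."""
    smallest_normal = sys.float_info.min       # 2**-1022 for a 64-bit double
    inf = float("inf")
    return {
        # Too small to normalize, but still nonzero: gradual underflow.
        "subnormal": smallest_normal / 4,
        # Half of the smallest subnormal finally rounds to zero.
        "underflow_to_zero": math.ldexp(1.0, -1074) / 2,
        # A result too large to represent becomes infinity.
        "overflow": 1e308 * 10,
        # Arithmetic on infinity is defined.
        "inf_plus_one": inf + 1,
        # An invalid operation yields NaN.
        "inf_minus_inf": inf - inf,
    }

for name, value in ieee_special_cases().items():
    print(name, value)
```

Note that NaN compares unequal to everything, itself included, so
math.isnan is the reliable way to test for it.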

It is nice to see someone make a hardware design more complex in order
to make the software simpler and more straightforward (and more likely
to be correct).

rcb@rti-sel.UUCP (09/14/84)

	The IBM (Itty Bitty Machines) floating point is NOT the
same as the VAX floating point format. The IBM is hex based floating point
and must be normalized such that the highest digit is not zero. The VAX
floating point is binary based, so the highest bit of a normalized
fraction must not be zero, which means that it must be one.
VAX takes advantage of this fact by not storing that bit at all.
Also, any number that is a power of 2, positive or negative (e.g. 1, 2, 4,
.5, .25, etc.), has no bits set in the stored fraction, and the floating
point hardware
can take advantage of this. A benchmark shows that 100 million floating
multiplications takes 45 seconds when these special values are used and
takes 2.5 minutes when any old numbers are used.
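
	The hidden bit and the all-zero fraction of powers of 2 can be
checked with a small Python sketch. It assumes a 23-bit stored fraction
with the fraction normalized to the range 0.5 to 1 (the same convention
math.frexp uses); the name stored_fraction is made up for this example,
and it says nothing about how any hardware exploits the zero fraction.

```python
import math

def stored_fraction(x, frac_bits=23):
    """Fraction bits actually stored once the leading 1 is dropped."""
    m, _ = math.frexp(abs(x))      # abs(x) = m * 2**e with 0.5 <= m < 1
    # m * 2**(frac_bits + 1) has its top bit set; subtracting
    # 2**frac_bits removes that hidden bit.
    return int(m * 2 ** (frac_bits + 1)) - 2 ** frac_bits

for value in (1, 2, 4, .5, .25, 3, 1.5, 0.1):
    print(value, hex(stored_fraction(value)))
```

The first five values print a fraction of 0x0; the others do not.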

	And, for the numerical analysis types that want greater range,
they have 2 options on the VAX. G format floating point uses 64 bits
and gives a range of .56*10**-308 to .9*10**308 with 15 digits precision.
The standard double floating gives .29*10**-38 to 1.7*10**38 with 16 digits
precision. And for you people who really like big numbers, there is H format
floating point which uses 128 bits to give a range of .84*10**-4932 to
.59*10**4932 with a mighty 33 digits of precision. Enough to satisfy
even the most maniacal numerical analyst.
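
	The ranges quoted above follow from the exponent field widths.
The sketch below assumes widths of 8 bits for D, 11 for G, and 15 for
H, an excess-2**(n-1) bias, a fraction between 0.5 and 1, and an
exponent field of 0 reserved. The names decimal_range and post_notation
are made up for this example. It works with logarithms because the H
range does not fit in a Python float.

```python
import math

# Assumed exponent field widths in bits for each format.
EXP_BITS = {"D": 8, "G": 11, "H": 15}

def post_notation(log10_value):
    """Split 10**log10_value into (mantissa, exp10) with 0.1 <= mantissa < 1."""
    exp10 = math.floor(log10_value) + 1
    return 10 ** (log10_value - exp10), exp10

def decimal_range(exp_bits):
    """Smallest and (approximate) largest magnitudes as (mantissa, exp10)."""
    bias = 2 ** (exp_bits - 1)
    log2_min = -bias                       # 0.5 * 2**(1 - bias)
    log2_max = 2 ** exp_bits - 1 - bias    # fraction just under 1 times 2**this
    k = math.log10(2)
    return post_notation(log2_min * k), post_notation(log2_max * k)

for name, bits in EXP_BITS.items():
    (lo_m, lo_e), (hi_m, hi_e) = decimal_range(bits)
    print(f"{name}: {lo_m:.2f}*10**{lo_e} to {hi_m:.2f}*10**{hi_e}")
```

The D upper bound prints in the form .17*10**39, which is the same
number as the 1.7*10**38 given above.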

					Randy Buckland
					Research Triangle Institute
					...!mcnc!rti!rcb