[comp.lang.smalltalk] heaps of numbers

wilson@carcoar.Stanford.EDU (Paul Wilson) (03/06/90)

[ continuing a thread from comp.arch ]

FLAME-RETARDANT NOTE:

Despite some positive and (very) negative responses about non-IEEE 
low-precision floats, that's not really what I'm talking about here.  
I'm *not* sacrificing any precision or range, I'm just special-casing 
the representation and handling of the most common case, for performance
reasons.

>-->    [ I wrote: ]
>--> I'm looking for info that would be useful in implementing short
>--> floating point numbers for Scheme/Lisp/Smalltalk/Icon etc.
>--> 
>--> The basic idea is that I want short (1-word) floating point numbers to
>--> fit in one 32-bit word, *with* a runtime tag included.  My plan is
>--> [...shoehorning floats within the usual range into one word, and heap-
>-->     allocating those that won't fit into the 1-word format... ]
>--> Of course, this means I have to sacrifice a few bits of a standard 32-bit
>--> float to make it fit.
>
>Standard 32-bit floats have 1 sign bit, 8 exponent bits and 23 (+1) bits of
>mantissa.  This gives an exponent range of 10**-38 to 10**+38, and a
>precision of about 7 decimal digits.
>
>For each bit of exponent you drop you cut the range in half -- 4 bit exponents
>only leave a range of .004 to 511.  Each four bits of mantissa you drop will
>lose you slightly more than 1 decimal digit.

I don't get this.  It seems to me that if 8 bits gives you a range of 10**-38
to 10**+38, then 4 bits ought to give you half as many orders of magnitude,
or about 10**-19 to 10**+19.  That's a LOT smaller/bigger than just
.004 to 511.  Maybe you lose a little because of the sign bit or something,
but I'd guess you could express the large majority of numbers that occur
in *most* programs.  (There would certainly be a significant minority of
programs for which this would break down, though, as a couple of astro guys
have pointed out to me.)

>Furthermore, anything you do will mean that the FP hardware won't work with
>your number format.  And, no, you couldn't fake it, truncate it or trick it
>to get "approximately right" results from a chain of calculations "most of
>the time".  Your users would shoot you.

I'm not planning to use approximations, ever, except in the sense that IEEE
formats are themselves approximations.  (In particular, I wouldn't try to
truncate mantissas _at_all_.) I'm just doing a little compression
of common ranges of data.  As with Lisp "infinite precision"
integers, numbers that wouldn't fit into the shoehorned
one-word format would be represented differently -- as pointers to
full-sized objects (with headers added) on the heap.  They could overflow
into doubles if they need the extra range.  You could also use doubles
for your temporaries to keep from losing any precision or causing unnecessary
over/underflows, and then squish things down to short floats before storing
them into the heap (iff they'll fit).
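
To make the shoehorning concrete, here's roughly what the pack/unpack
path could look like in C.  This is only a sketch of one possible
layout (steal the two low-order exponent bits, keep the sign and the
whole 23-bit mantissa, put a 2-bit tag in the low bits); the names and
the tag value are made up:

    /* One way to shoehorn an IEEE single into a 32-bit tagged word,
     * stealing the two low-order exponent bits:
     *
     *     [ sign:1 | exp:6 | mantissa:23 | tag:2 ]
     *
     * TAG_FLOAT and the bias values are invented for illustration. */
    #include <stdint.h>
    #include <string.h>

    #define TAG_BITS   2
    #define TAG_FLOAT  0x2u      /* hypothetical low-bit tag for floats */
    #define IEEE_BIAS  127
    #define SMALL_BIAS 31        /* bias for the 6-bit exponent field  */

    /* Try to pack f as an immediate float; 0 means "heap-allocate". */
    int pack_short_float(float f, uint32_t *out)
    {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);            /* raw IEEE bits */

        uint32_t sign = bits >> 31;
        int      e    = (int)((bits >> 23) & 0xff) - IEEE_BIAS;
        uint32_t mant = bits & 0x7fffffu;

        /* Zero, denormals, Inf and NaN fall outside this range and go
           to the heap (or get their own special case); field values 0
           and 63 are left reserved, by analogy with IEEE itself. */
        if (e < -30 || e > 31)
            return 0;

        *out = (sign << 31)
             | ((uint32_t)(e + SMALL_BIAS) << 25)
             | (mant << TAG_BITS)
             | TAG_FLOAT;
        return 1;
    }

    float unpack_short_float(uint32_t w)
    {
        uint32_t e    = ((w >> 25) & 0x3fu) + (IEEE_BIAS - SMALL_BIAS);
        uint32_t bits = (w & 0x80000000u)          /* sign stays put */
                      | (e << 23)
                      | ((w >> TAG_BITS) & 0x7fffffu);
        float f;
        memcpy(&f, &bits, sizeof f);
        return f;
    }

The store path would just call pack_short_float and fall back to heap
allocation whenever it returns 0.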

The idea is to make the shoehorned format transparent to the user.  The
potential hitch is that if numbers frequently overflow, you get a
performance hit.  (You not only have to spend the space and time of
allocating several words on the heap, but with a generational gc you 
have to spend at least a handful of extra instructions on each store, 
to check whether you're creating a pointer into a younger generation 
than the one you're storing into.  
Nonpointer objects are therefore much nicer.)
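
For concreteness, the store check in question might look like the
following -- a minimal sketch assuming a simple two-generation heap
with a remembered set; every name here is hypothetical:

    #include <stdint.h>

    typedef uint32_t Value;          /* tagged word, as sketched above */

    typedef struct Object {
        uint8_t gen;                 /* generation; 0 = youngest        */
        Value   slots[8];            /* variable-sized in a real system */
    } Object;

    #define TAG_MASK    0x3u
    #define TAG_POINTER 0x0u    /* hypothetical: low bits 00 = pointer */

    static Object *remembered[1024]; /* toy remembered set; a real one
                                        would de-duplicate entries     */
    static int     n_remembered;

    /* Storing a pointer into an older object may create an old->young
       reference, which the scavenger must be able to find without
       scanning the whole old generation -- hence the extra work.
       Stores of nonpointer (immediate) values skip all of it. */
    void store_slot(Object *obj, int i, Value v)
    {
        obj->slots[i] = v;
        if ((v & TAG_MASK) == TAG_POINTER) {
            Object *target = (Object *)(uintptr_t)v;  /* 32-bit world */
            if (target->gen < obj->gen && n_remembered < 1024)
                remembered[n_remembered++] = obj;
        }
    }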

My users may kill me for the performance hit, if they have programs that
allocate lots of floats that end up on the heap.  (This is the bane of
generation scavenging.)  In particular, non-temporary double floats are 
always going to have to be allocated on the heap.  I've got ideas for 
dealing with them, too, but that would require a better compiler than I
feel like writing at the moment, or the use of ugly declarations.

It's been pointed out to me that a lot of crunch-intensive programs use
normalization for really big/small numbers to keep from overflowing the IEEE 
float format.  I'm wondering whether this would work to keep things within a 
4-bit exponent range.  If you're normalizing anyway, do you lose big by
being constrained to an even smaller range?  (If necessary, I can sacrifice
as few as two bits of my floats, though four would be easier.  Assuming 8-bit
exponents are standard, that leaves a 4- to 6-bit exponent before the 
shoehorned format overflows.)

If this worked out well in practice, a guideline "small float"
format could be established, and libraries could be written that were
careful to avoid this performance hit.  (Implementations that didn't use
immediate floats -- or whose immediate floats didn't have enough range -- 
could still run the same code _and_get_the_exact_same_results_, but might
not perform as well.  You could translate your FORTRAN programs to Scheme
without losing big because of precision problems.)

The question is this -- how many bits should be in the "minimal" range
for immediate floats?  It would be nice to leave several bits for the
tag scheme, so that you could distinguish between more immediate types
(e.g., you might also have immediate complex numbers and ratios, whose
components were two 13-bit ints packed into the word with a 6-bit tag.)
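
As an illustration of that last case, packing and unpacking a 13/13/6
ratio might look like this (a sketch of my own; the field order and the
tag value are invented):

    #include <stdint.h>

    #define TAG6_RATIO 0x2bu    /* made-up 6-bit tag for immediate ratios */

    /* Pack a ratio whose parts are 13-bit signed ints:
     *     [ num:13 | den:13 | tag:6 ]
     * Returns 0 if either part won't fit (heap-allocate instead). */
    int pack_ratio(int32_t num, int32_t den, uint32_t *out)
    {
        if (num < -4096 || num > 4095 || den < -4096 || den > 4095)
            return 0;
        *out = ((uint32_t)(num & 0x1fff) << 19)
             | ((uint32_t)(den & 0x1fff) << 6)
             | TAG6_RATIO;
        return 1;
    }

    void unpack_ratio(uint32_t w, int32_t *num, int32_t *den)
    {
        /* shift left, then shift right to sign-extend the 13-bit
           fields; assumes >> on a negative int is arithmetic, which
           is true of essentially every compiler */
        *num = (int32_t)w >> 19;              /* top 13 bits    */
        *den = (int32_t)(w << 13) >> 19;      /* middle 13 bits */
    }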

John Levine sez that the VAX has a tiny little immediate float format,
with a three-bit mantissa and three-bit exponent, which suffices for a lot
of small constants.  (e.g., 1, 15, 1/2, 3/4)  He suggests similarly optimizing
a few special cases of heap floats (perhaps just zero).  Apparently these 
little VAX floats account for a big fraction of compiled constants.  (John 
also suggested a flat table-lookup technique for packing and unpacking 
floats, combining the tag checks & range overflow checks, and even
the checks for special cases.)
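
Here's my guess (mine, not John's actual code) at the table-lookup
trick for the packing direction: index a 256-entry table by the IEEE
exponent byte, so the range check, the special-case check, and the
re-biasing collapse into a single load.  Same hypothetical layout as
the sketch earlier in this post:

    #include <stdint.h>
    #include <string.h>

    #define TAG_FLOAT 0x2u           /* as in the earlier sketch      */
    #define HEAP_IT   0xffffffffu    /* sentinel: not representable   */

    static uint32_t exp_table[256];

    void init_exp_table(void)
    {
        for (int e = 0; e < 256; e++)
            /* IEEE exponent bytes 97..158 are unbiased -30..31; the
               entry is the 6-bit field already shifted into place.
               Everything else (including zero, Inf and NaN, which
               live at e = 0 and e = 255) maps to the heap. */
            exp_table[e] = (e >= 97 && e <= 158)
                         ? (uint32_t)(e - 96) << 25
                         : HEAP_IT;
    }

    int pack_short_float_tbl(float f, uint32_t *out)
    {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);
        uint32_t t = exp_table[(bits >> 23) & 0xff];
        if (t == HEAP_IT)
            return 0;                          /* heap-allocate */
        *out = (bits & 0x80000000u)            /* sign          */
             | t                               /* exponent      */
             | ((bits & 0x7fffffu) << 2)       /* mantissa      */
             | TAG_FLOAT;
        return 1;
    }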

Anybody got any *dynamic* float distribution data?  I'm more worried 
about what gets stored into heap objects than what occurs in arithmetic 
expressions.  (I assume a decent compiler will just use normal 
representations for intermediate results that the gc never 
sees, and that compile-time literals can be stored somewhere in standard 
IEEE format.)  I assume that the dynamic distribution will not be nearly 
as friendly as the static one, but I'd guess that it's still pretty modal. 

Actually, I'd be interested in any kind of empirical information on float
frequencies -- static, quasi-static, or dynamic.  (By "static" I mean 
compile-time, by "dynamic" I mean what gets operated on the most, and by
"quasi-static" I mean what the distribution of floats _in_memory_ is if 
you make a snapshot of memory at random intervals.)  Data for 
languages like C and FORTRAN is fine, since I wouldn't want to assume
people will keep writing the same kinds of programs in Lisp/Smalltalk,
etc. that they write now -- tomorrow's Lisp code may well look like
today's FORTRAN.


Thanks,

   Paul


Paul R. Wilson                         
Software Systems Laboratory               lab ph.: (312) 996-9216
U. of Illin. at C. EECS Dept. (M/C 154)   wilson@bert.eecs.uic.edu
Box 4348   Chicago,IL 60680 

jlg@lambda.UUCP (Jim Giles) (03/06/90)

From article <1990Mar5.220724.3718@Neon.Stanford.EDU>, by wilson@carcoar.Stanford.EDU (Paul Wilson):
> [...]
> I don't get this.  It seems to me that if 8 bits gives you a range of 10**-38
> to 10**+38, then 4 bits ought to give you half as many orders of magnitude,
> or about 10**-19 to 10**+19.  [...]

Exponent range scales _exponentially_ with the number of bits in the
exponent -- every bit you drop cuts the number of orders of magnitude
in half, and you're dropping four bits, not one.  So, 8 bits gives an
exponent range of (about) 2^-127 thru 2^127 - that is, about 10^-38
thru 10^38.  A 4 bit exponent only gives a range of about 2^-8 thru
2^8 - that is, a couple of orders of magnitude either side of 1, which
is where the .004 to 511 figure came from.  The 10^-19 thru 10^+19 you
computed is what dropping just _one_ bit would leave.
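
The scaling is easy to tabulate -- a quick back-of-the-envelope
program (assuming the exponent field's values split about evenly
between negative and positive powers, as with the usual bias):

    #include <stdio.h>
    #include <math.h>

    /* An n-bit binary exponent spans about -2^(n-1) .. +2^(n-1),
       and each power of 2 is log10(2) (about 0.3) decimal orders
       of magnitude.  Prints 38.5, 19.3, 9.6, 4.8, 2.4. */
    int main(void)
    {
        for (int n = 8; n >= 4; n--)
            printf("%d exponent bits: about 10^+-%.1f\n",
                   n, (1 << (n - 1)) * log10(2.0));
        return 0;
    }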

J. Giles

dave@tygra.UUCP (David Conrad) (03/06/90)

>
>Exponent range scales _exponentially_ with the number of bits in the
>exponent -- every bit you drop cuts the number of orders of magnitude
>in half.  [...]  A 4 bit exponent only gives a range of about 2^-8
>thru 2^8 - that is, a couple of orders of magnitude either side of 1.
>
>J. Giles

Well, he said he could sacrifice as few as 2 bits, so with the 6 bits
of exponent that leaves, the range should be about 2^-31 thru 2^31, or
roughly 10^-9 thru 10^9 -- probably sufficient for his purposes.
--
David R. Conrad
dconrad%wayne-mts@um.cc.umich.edu
dave%tygra.uucp@sharkey.cc.umich.edu
