[comp.lang.smalltalk] CORRECTION to: heaps of numbers

wilson@carcoar.Stanford.EDU (Paul Wilson) (03/06/90)

Thanks to Carl Lowenstein, Jim Giles and Herman Rubin for pointing out
my misunderstanding of the IEEE float format.  I had not realized that
the exponent is really interpreted with a double exponentiation -- that
changes things a bit.  (Or maybe two bits.)

(I seem to recall that the only floating point format I ever had to learn
used a power of a fixed number, and I thought IEEE would be the same.)

So it looks like each marginal bit would be more important than I thought,
favoring stealing _very_few_ bits from the exponent. 2**(2**4) would
only give you a range of 1/32K to 32K.  But 2**(2**5) would give you
a range from 1/2nano- to 2giga- which seems pretty reasonable.  And 2**(2**6)
goes from pretty seriously small to pretty seriously big (by my standards).

I interpret this to mean that I can steal 2 bits from the 8-bit exponent, 
leaving 6, or maybe 3, leaving 5.

It would be awkward to use fewer than two bits for primary tags, given that
you probably want separate tags for pointers, immediate ints, and other
immediates.  So it looks like the question comes down to this:  is the
1/2nano- to 2giga- range enough for the large majority of floats that
get stored into memory, or should I go with the less convenient scheme 
of only stealing 2 bits?  In the latter case I'd have to use up one of our 
four primary (2-bit low-) tags, but we could live with it.
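
For concreteness, here is a minimal sketch in C of the kind of 2-bit
low-tag scheme under discussion.  The particular tag assignments are
made up for illustration; only the masking idiom matters.

    #include <stdint.h>

    /* Hypothetical assignment of the four primary 2-bit low tags. */
    enum tag {
        TAG_FIXNUM  = 0,  /* immediate integer, value in upper 30 bits */
        TAG_POINTER = 1,  /* heap pointer, assumed 4-byte aligned      */
        TAG_FLOAT   = 2,  /* immediate short float (the scheme at issue) */
        TAG_OTHER   = 3   /* characters, booleans, other immediates    */
    };

    static inline enum tag word_tag(uint32_t w) { return (enum tag)(w & 3u); }

    /* Fixnums live in the upper 30 bits; arithmetic shift restores them. */
    static inline int32_t untag_fixnum(uint32_t w) { return (int32_t)w >> 2; }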

Any empirical info relevant to this tradeoff would be greatly appreciated.

And if I've got it wrong somehow, please point it out.  (As I see it, my
real assumption is this:  the upper 2 or 3 non-sign bits of an IEEE short
are usually zero.  If I've misunderstood FP representation or
distributions, and this is not true, let me know.)

   -- Paul

Paul R. Wilson                         
Software Systems Laboratory               lab ph.: (312) 996-9216
U. of Illin. at C. EECS Dept. (M/C 154)   wilson@bert.eecs.uic.edu
Box 4348   Chicago, IL 60680

jkenton@pinocchio.encore.com (Jeff Kenton) (03/06/90)

From article <1990Mar6.015230.20068@Neon.Stanford.EDU>, by wilson@carcoar.Stanford.EDU (Paul Wilson):
> 
> And if I've got it wrong somehow, please point it out.  (As I see it, my
> real assumption is this:  the upper 2 or 3 non-sign bits of an IEEE short
> are usually zero.  If I've misunderstood FP representation or
> distributions, and this is not true, let me know.)
> 

Unfortunately for your purposes, the exponent is "biased" by 127.  This means
that the stored exponents of normalized numbers run from 1 to 254 (smallest to
largest; 0 is reserved for zero and denormals, 255 for infinities and NaNs).
All floats >= 2.0 have the high bit of the exponent field set.  Yet another
gotcha.
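
A quick way to see the bias on any machine at hand (a standalone C
sketch; the sample values are arbitrary):

    #include <stdio.h>
    #include <string.h>
    #include <stdint.h>

    /* Print the stored (biased) exponent field of a few IEEE singles.
       Stored value = true exponent + 127, so everything >= 2.0 has the
       field's high bit set. */
    int main(void) {
        float samples[] = { 0.5f, 1.0f, 2.0f, 3.0e9f };
        for (int i = 0; i < 4; i++) {
            uint32_t bits;
            memcpy(&bits, &samples[i], sizeof bits);  /* portable type pun */
            unsigned field = (bits >> 23) & 0xFFu;    /* bits 30..23 */
            printf("%12g: exponent field %3u (true exponent %d)\n",
                   samples[i], field, (int)field - 127);
        }
        return 0;
    }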

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
      jeff kenton  ---	temporarily at jkenton@pinocchio.encore.com	 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

moore%cdr.utah.edu@cs.utah.edu (Tim Moore) (03/06/90)

In article <1990Mar6.015230.20068@Neon.Stanford.EDU> wilson@carcoar.Stanford.EDU (Paul Wilson) writes:

>So it looks like each marginal bit would be more important than I thought,
>favoring stealing _very_few_ bits from the exponent. 2**(2**4) would
>only give you a range of 1/32K to 32K.  But 2**(2**5) would give you
>a range from 1/2nano- to 2giga- which seems pretty reasonable.  And 2**(2**6)
>goes from pretty seriously small to pretty seriously big (by my standards).
>
>I interpret this to mean that I can steal 2 bits from the 8-bit exponent, 
>leaving 6, or maybe 3, leaving 5.
>
>It would be awkward to use less than two bits for primary tags, given that
>you probably want separate tags for pointers, immediate ints, and other
>immediates.  So it looks like the question comes down to this:  is the
>1/2nano- to 2giga- range enough for the large majority of floats that
>get stored into memory, or should I go with the less convenient scheme 
>of only stealing 2 bits?  In the latter case I'd have to use up one of our 
>four primary (2-bit low-) tags, but we could live with it.
>

I've been thinking about implementing short floats for Utah Common
Lisp. I think it would be better to steal bits from the mantissa than
the exponent. With our low-tag scheme, a simple mask operation would
restore a short float to an IEEE 32-bit float with the least
significant bits of the mantissa zeroed, allowing the floating-point
hardware to do its thing.
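
A sketch of that masking idea, with assumed details (2-bit low tags,
the tag overwriting the two least significant mantissa bits):

    #include <stdint.h>
    #include <string.h>

    #define FLOAT_TAG 2u  /* hypothetical tag value for short floats */

    /* Store: overwrite the two low mantissa bits with the tag. */
    static uint32_t make_short_float(float f) {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);
        return (bits & ~3u) | FLOAT_TAG;
    }

    /* Load: one mask restores a valid IEEE single (low mantissa bits
       zeroed, so 21 of the 23 mantissa bits survive) for the FPU. */
    static float short_float_value(uint32_t w) {
        uint32_t bits = w & ~3u;
        float f;
        memcpy(&f, &bits, sizeof f);
        return f;
    }

The appeal is that untagging is a single AND, and the full exponent
range is preserved at the cost of two bits of precision.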

The original post by Paul Wilson discussed the problem of deciding
when to use short floats vs. when to use long floats to avoid losing
precision. This isn't an issue in Common Lisp (I'm reading this in
comp.lang.lisp); the rules of floating-point contagion (promotion) are
defined by the language, and the user can choose the reader's default
float type. 

Tim Moore                     moore@cs.utah.edu {bellcore,hplabs}!utah-cs!moore
"Ah, youth. Ah, statute of limitations."
		-John Waters

schaerer@gorgo.ifi.unizh.ch (03/08/90)

jlg@lambda.UUCP (Jim Giles) writes:
> ...A 4 bit exponent only gives an exponent range of about 2^-15 thru
> 2^15 - that is, a little more than 10^-5 thru 10^5.

wilson@carcoar.Stanford.EDU (Paul Wilson) writes:
> Thanks to Carl Lowenstein, Jim Giles and Herman Rubin for pointing out
> my misunderstanding of the IEEE float format.  I had not realized that
> the exponent is really interpreted with a double exponentiation -- that
> changes things a bit.  (Or maybe two bits.)
>
> (I seem to recall that the only floating point format I ever had to learn
> used a power of a fixed number, and I thought IEEE would be the same.)
>
> So it looks like each marginal bit would be more important than I thought,
> favoring stealing _very_few_ bits from the exponent. 2**(2**4) would
> only give you a range of 1/32K to 32K.  But 2**(2**5) would give you
> a range from 1/2nano- to 2giga- which seems pretty reasonable.  And 2**(2**6)
> goes from pretty seriously small to pretty seriously big (by my standards).
>
> I interpret this to mean that I can steal 2 bits from the 8-bit exponent,
> leaving 6, or maybe 3, leaving 5.

dave@tygra.UUCP (David Conrad) writes:
> Well, he said he could use as little as 2 bits, so with 6 bits of
> exponent the range should be 2^63 thru 2^-63 or roughly 10^19 thru 10^-19,
> probably sufficient for his purposes.

Please, folks. It's so simple.

- With four bits you can represent sixteen values, e.g. -8 to 7.

- An exponent between -8 and 7, together with a mantissa between 0.5 and
  (almost) 1, covers the range 0.5 * 2^-8 to (almost) 2^7. (Assuming that
  you don't represent denormalized numbers, they don't make sense for the
  application being discussed.)

- Alternatively, an exponent between -7 and 8 would cover 0.5 * 2^-7 to
  (almost) 2^8. You can fine-tune the range by defining how you interpret
  the sixteen four-bit combinations.

So, five bits gives approximately 1/64K to 64K. Six bits (not five) gives
the nano to giga range. Seven bits gives approximately 10^-19 to 10^19.
And eight bits approximately 10^-38 to 10^38, as we all know.
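
Those figures are easy to verify mechanically.  A standalone C check,
assuming the second interpretation above (an e-bit field covering
-2^(e-1)+1 through 2^(e-1)):

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        for (int e = 4; e <= 8; e++) {
            int max = 1 << (e - 1);       /* e.g. e = 6: -31 .. 32 */
            int min = -max + 1;
            /* smallest normal is 0.5 * 2^min; largest just under 2^max */
            printf("%d exponent bits: %.3g to %.3g\n",
                   e, 0.5 * pow(2.0, min), pow(2.0, max));
        }
        return 0;
    }

This prints roughly 1.53e-05 to 65536 for five bits and 2.33e-10 to
4.29e+09 for six, matching the figures above.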

Besides, there is no "double exponentiation" involved. The constant "two"
is simply raised to the power indicated by the exponent. But having one
more exponent bit gives you twice as many possible exponent values: The
exponent range is DOUBLED, which means the floating-point number range is
SQUARED.

But I like Paul Wilson's idea of special-casing the common values.
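
That idea isn't spelled out in this part of the thread, but one
illustrative reading (the table and names are guesses) is to reserve a
few immediate codes for the float constants that dominate in practice:

    /* Purely illustrative: map the most common float constants to small
       immediate codes, boxing everything else. */
    static const float common_floats[] = { 0.0f, 1.0f, -1.0f, 0.5f, 2.0f };

    /* Returns the code for f, or -1 if f must be represented some other
       way.  (Note: -0.0f compares equal to 0.0f here.) */
    static int common_float_code(float f) {
        for (size_t i = 0; i < sizeof common_floats / sizeof common_floats[0]; i++)
            if (common_floats[i] == f)
                return (int)i;
        return -1;
    }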

---
Daniel Schaerer, University of Zurich/Switzerland
schaerer@ifi.unizh.ch