wilson@carcoar.Stanford.EDU (Paul Wilson) (03/05/90)
I'm looking for info that would be useful in implementing short floating point numbers for Lisp/Smalltalk/etc.

The basic idea is that I want short (1-word) floating point numbers to fit in one 32-bit word, *with* a runtime tag included. My plan is to store the number shifted left a few bits, with a tag in the lower bits. Having the number fit in a word is a big advantage, especially if you've got a generational garbage collector. (They have to be careful about pointer creation, in ways that can needlessly slow down programs that allocate a lot of floats.)

Of course, this means I have to sacrifice a few bits of a standard 32-bit float to make it fit. I figure that I can just lop a couple of bits off of the exponent part, and leave the fraction part alone. That way, I sacrifice range of floats but not precision. If a float result goes out of that range, I'll make a full-sized floating point object on the heap, and use a tagged pointer to it. This is analogous to the "fixnum" (immediate short int) / "bignum" (long, heap-allocated) strategy used for integers.

Now the question: how far can I carry this? Can I take five bits out of the exponent and still express most numbers most people would use? (How many bits do IEEE 32-bit floats use for the exponent in the first place?) If not 5, can I do 4, 3, or 2?

Has anybody done any studies of the distribution of floating point values for real programs? My guess is that most programs don't generate *any* floating point values outside of a fairly narrow range, so that most programs would not require *any* big floats. (I'm not looking for Lisp program data here -- I wouldn't want to assume that people only want to run the kinds of programs traditionally written in Lisp. C and Fortran data would be fine.)

And if I've made any stupid assumptions, such that this is a bad idea anyway, please gently clue me in. (I know close to nothing about computer arithmetic.)

Thanks,
Paul

Paul R.
Wilson
Software Systems Laboratory          lab ph.: (312) 996-9216
U. of Illin. at C.  EECS Dept. (M/C 154)
Box 4348, Chicago, IL 60680          wilson@bert.eecs.uic.edu
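[A minimal sketch in C of the scheme described above, under invented assumptions: a 2-bit low tag, a 6-bit re-biased exponent, and the full 23-bit fraction. The field layout, tag value, and function names are illustrative only -- one way the idea could look, not the poster's design.]

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: sign(1) | exponent(6) | fraction(23) | tag(2).
 * Two bits are lopped off the IEEE single exponent; the fraction is
 * kept whole, so precision is untouched and only range is sacrificed. */
#define SF_TAG    0x2u    /* made-up tag value for short floats        */
#define SF_EXP_LO 96u     /* biased single exponents 96..159 fit in... */
#define SF_EXP_HI 159u    /* ...a 6-bit re-biased exponent field       */

/* Returns 1 and fills *out on success; 0 means the value is out of the
 * reduced range and must be boxed on the heap behind a tagged pointer.
 * (Zero, Inf, and NaN also fall through to boxing in this sketch.)    */
int sf_pack(float f, uint32_t *out) {
    uint32_t b; memcpy(&b, &f, sizeof b);
    uint32_t exp = (b >> 23) & 0xffu;
    if (exp < SF_EXP_LO || exp > SF_EXP_HI)
        return 0;
    uint32_t sign = b >> 31;
    uint32_t frac = b & 0x7fffffu;
    *out = (sign << 31) | ((exp - SF_EXP_LO) << 25) | (frac << 2) | SF_TAG;
    return 1;
}

/* Inverse: widen a tagged short float back to an IEEE single. */
float sf_unpack(uint32_t w) {
    uint32_t sign = w >> 31;
    uint32_t exp  = ((w >> 25) & 0x3fu) + SF_EXP_LO;
    uint32_t frac = (w >> 2) & 0x7fffffu;
    uint32_t b = (sign << 31) | (exp << 23) | frac;
    float f; memcpy(&f, &b, sizeof f);
    return f;
}
```

Note the fixnum/bignum analogy carries through: `sf_pack` failing is exactly the "overflow into a heap object" case.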
jkenton@pinocchio.encore.com (Jeff Kenton) (03/05/90)
From article <1990Mar5.031003.12107@Neon.Stanford.EDU>, by wilson@carcoar.Stanford.EDU (Paul Wilson):
> I'm looking for info that would be useful in implementing short
> floating point numbers for Lisp/Smalltalk/etc.
>
> The basic idea is that I want short (1-word) floating point numbers to
> fit in one 32-bit words, *with* a runtime tag included. My plan is
> . . .
>
> Of course, this means I have to sacrifice a few bits of a standard 32-bit
> float to make it fit.

Standard 32-bit floats have 1 sign bit, 8 exponent bits and 23 (+1) bits of mantissa. This gives an exponent range of 10**-38 to 10**+38, and a precision of about 7 decimal digits. For each bit of exponent you drop you cut the exponent range in half -- 4-bit exponents only leave a range of .004 to 511. Each four bits of mantissa you drop will lose you slightly more than 1 decimal digit.

Furthermore, anything you do will mean that the FP hardware won't work with your number format. And, no, you couldn't fake it, truncate it or trick it to get "approximately right" results from a chain of calculations "most of the time". Your users would shoot you.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
jeff kenton  ---  temporarily at jkenton@pinocchio.encore.com
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
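[The range arithmetic above can be checked with a small sketch. It assumes an IEEE-style bias of 2**(k-1)-1 with the top exponent code reserved and ignores denormals; the figures in the post use a slightly different convention, so they differ by roughly a factor of two.]

```c
#include <math.h>

/* Approximate magnitude range of a binary float format with k exponent
 * bits, assuming IEEE-style biasing: bias = 2^(k-1) - 1, smallest
 * normal = 2^(1-bias), largest just under 2^(bias+1). */
double min_normal(int k) {
    int bias = (1 << (k - 1)) - 1;
    return ldexp(1.0, 1 - bias);      /* e.g. k=8 -> ~1.18e-38 */
}

double max_normal(int k) {
    int bias = (1 << (k - 1)) - 1;
    return ldexp(1.0, bias + 1);      /* e.g. k=8 -> ~3.4e38 */
}
```

For k=8 this reproduces the familiar 10**-38..10**+38 single-precision range; for k=4 it gives roughly 0.016 to 256, the same ballpark as the .004-to-511 figure above.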
aglew@oberon.csg.uiuc.edu (Andy Glew) (03/06/90)
I don't have any references on distribution of floating point numbers (and hope you'll be able to point us to them). However, a few thoughts on "smaller-than-word" floating point:

- My first reaction was that you should use one of the 20 or 24 bit floating point formats that are used by a number of DSPs -- say the AT&T chip.

- Second, it occurred to me that you could sacrifice the low order bit(s) of the mantissa, in much the same way that many LISP systems use the least significant bit of the integer format as a tag. If you used LSB=0 as the tag for your <32 bit tagged FP value, you wouldn't need any masking operations on the operands before dispatching the FP operation. However, you would need a cleanup afterwards, probably implementing correct rounding in software.

  Trouble is, the LSB=0 tag would conflict with the most convenient tag value for integers. A little bit of masking for <32 bit FP operations might be acceptable if it permitted the dynamically more frequent integer operations to have no tag manipulation overhead. You could still use the native FP hardware.

A machine like the 88K, where FP and integers share the same register file, might be preferable for you. A separate FP register set will usually require a bit more footwork in order to perform the "integer" tag masking operations on the FP value.
--
Andy Glew, aglew@uiuc.edu
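[The LSB=0 trick can be sketched in C: with the tag bits all zero, a tagged word *is* a valid single (its low fraction bits merely pinned to zero), so the operands reach the FP hardware unmasked and only the result needs cleanup. The 2-bit tag width is an assumption, and the cleanup here just truncates -- correct rounding in software, as the post notes, would take more work.]

```c
#include <stdint.h>
#include <string.h>

#define FTAG_MASK 0x3u   /* low two bits; tag value 0 marks short floats */

/* Add two tagged short floats.  No pre-masking is needed because the
 * tag is zero; the cleanup re-clears the low result bits (truncation,
 * not correct rounding) to restore the tag. */
uint32_t tagged_fadd(uint32_t a, uint32_t b) {
    float fa, fb, fr;
    memcpy(&fa, &a, sizeof fa);       /* operands used as-is */
    memcpy(&fb, &b, sizeof fb);
    fr = fa + fb;                     /* native FP hardware */
    uint32_t r; memcpy(&r, &fr, sizeof r);
    return r & ~FTAG_MASK;            /* cleanup: pin tag bits to 0 */
}
```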
khoult@bbn.com (Kent Hoult) (03/06/90)
In article <11298@encore.Encore.COM> jkenton@pinocchio.encore.com (Jeff Kenton) writes:
>From article <1990Mar5.031003.12107@Neon.Stanford.EDU>, by wilson@carcoar.Stanford.EDU (Paul Wilson):
>> I'm looking for info that would be useful in implementing short
>> floating point numbers for Lisp/Smalltalk/etc.
>>
>> The basic idea is that I want short (1-word) floating point numbers to
>> fit in one 32-bit words, *with* a runtime tag included. My plan is
>
>For each bit of exponent you drop you cut the range in half -- 4 bit exponents
>only leave a range of .004 to 511. Each four bits of mantissa you drop will
>lose you slightly more than 1 decimal digit.
>
>Furthermore, anything you do will mean that the FP hardware won't work with
>your number format. And, no, you couldn't fake it, truncate it or trick it
>to get "approximately right" results from a chain of calculations "most of
>the time". Your users would shoot you.

Not quite true. I've built a Lisp machine where we did exactly this for short floats. Even in a truncated format, they can be very useful for certain things (graphics is one of the best examples). As long as it is clearly documented what you lose and gain by using the short format (which is definitely not IEEE anymore) there isn't a problem. Users that really want true IEEE will select the IEEE format. But the shorts are good for lots of other stuff.

The format we used was to truncate the bottom 6 bits of mantissa and shift the word right 6 bits. This left room for a 6-bit tag while still having just over 5 decimal digits of accuracy (2^17). Conversion was then trivial in either direction (short->single is a 6-bit left shift, and single->short is a 6-bit right shift and OR in the datatype, plus beware of NaNs).

Kent Hoult
TEL: (617) 873-4385    ARPA: khoult@bbn.com
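[The conversions described above can be sketched in C. The right shift frees the top 6 bits of the word for the tag; the 6-bit tag value used here is a made-up placeholder, and as the post says, NaNs would need separate handling.]

```c
#include <stdint.h>
#include <string.h>

#define SHORT_FLOAT_TAG 0x2Au   /* hypothetical 6-bit datatype tag */

/* single -> short: 6-bit right shift (drops the low 6 fraction bits,
 * losing precision) and OR in the datatype tag at the top. */
uint32_t single_to_short(float f) {
    uint32_t b; memcpy(&b, &f, sizeof b);
    return (b >> 6) | (SHORT_FLOAT_TAG << 26);
}

/* short -> single: 6-bit left shift, which discards the tag and
 * restores the float's bits to their IEEE positions (low fraction
 * bits come back as zeros). */
float short_to_single(uint32_t s) {
    uint32_t b = s << 6;
    float f; memcpy(&f, &b, sizeof f);
    return f;
}
```

Values whose bottom 6 fraction bits are already zero round-trip exactly; anything else is truncated on the way in, which is the documented trade-off.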
ok@goanna.oz.au (Richard O'keefe) (03/06/90)
In article <1990Mar5.031003.12107@Neon.Stanford.EDU>, wilson@carcoar.Stanford.EDU (Paul Wilson) writes:
> [ wants to implement short-floats; thinks it would be cool to take bits
>   out of the exponent; wants to know how many ]
> I figure that I can just lop a couple of bits off of the exponent part, and
> leave the fraction part alone. That way, I sacrifice range of floats
> but not precision.

The snag with that is that you sacrifice *all* the precision at the ends of your ranges. The IEEE-754 formats are very carefully balanced; there are actually rules that tell you what would be good sizes for the fields. Obviously, a short-float cannot be IEEE-754, but you *could* try to see how many of the constraints listed in IEEE-854 you can satisfy.

To give you some idea of the kind of thinking that is involved: suppose you have N effective mantissa bits (counting the hidden bit). You would like
	nextafter(1.0, /*in the direction of*/ 2.0) - 1.0
(that is, the next number just bigger than 1.0, minus 1.0) to be representable; that's your epsilon. So you want 2**(1-N) to be representable, and it's a general rule that the exponent range should be roughly symmetric, so you want the range to be at least 1-N..N-1, which means you need approx ceil(log2(N)+1) exponent bits. For N=24, that means an absolute minimum of 6 exponent bits. More than that, you'd really like an epsilon's worth of epsilon, so you want at least ceil(log2(N)+2) exponent bits, and now we're just 1 short of 754's 8 exponent bits.

It is much better to steal bits out of the significand field. There are several Prolog systems that do it; however, that approach is losing its popularity. It was ok when you were just entering co-ordinates for graphics, but for serious calculation anything less than the usual 32 bits is unsafe in unskilled hands.
If you do decide to steal bits out of the significand field, make a conscious design decision about whether you want to preserve the "graceful underflow" property of the IEEE standards or not. If you do, you'd have quite a reasonable system, but the conversions are not as simple. If you don't, the conversions are simpler, but the resulting arithmetic system isn't much like IEEE.

That's what most people have talked about. But it's *not* what Paul Wilson *asked* about. His proposal is:
> If a float result goes out of that range, I'll make
> a full-sized floating point object on the heap, and use a tagged
> pointer to it.

This is a really neat idea. I like it a LOT. He's not actually proposing to change the properties of the arithmetic system at all, merely the representation in memory. (Rather like the VAX tiny-float immediate operands.) Presumably DEC picked the range of their tiny-floats based on some expectation of the range of constants in programs; if your coding covers that, you're doing at least as well as the VAX.

A suggestion: why not scan through CALGO (ACM Collected Algorithms) and issues of JRSS Series C and such things to see what some typical ranges are like? I've had ranges of 10**9 or more in my code, but so what?
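[The exponent-width rule of thumb in the post above can be checked with a tiny function: with N effective significand bits, the symmetric range 1-N..N-1 needs roughly ceil(log2(N))+1 exponent bits. This is a sketch of the minimum-bits calculation only, not of the further epsilon-of-epsilon headroom.]

```c
/* Minimum exponent bits for N effective significand bits, per the
 * rule that the exponent range should cover at least 1-N .. N-1
 * symmetrically: that's about 2N values, i.e. ceil(log2(N)) + 1 bits. */
int min_exp_bits(int n_significand) {
    int e = 0;
    while ((1 << e) < n_significand)  /* e = ceil(log2(N)) */
        e++;
    return e + 1;
}
```

For N=24 this gives 6, matching the "absolute minimum" in the post; adding one more bit for headroom gives 7, one short of the single format's 8.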
weaver@weitek.WEITEK.COM (03/07/90)
In article <1990Mar5.031003.12107@Neon.Stanford.EDU> wilson@carcoar.Stanford.EDU (Paul Wilson) writes:
>I'm looking for info that would be useful in implementing short
>floating point numbers for Lisp/Smalltalk/etc.
>
>The basic idea is that I want short (1-word) floating point numbers to
>fit in one 32-bit words, *with* a runtime tag included. My plan is
>to store the number shifted left a few bits, with a tag in the lower bits.

Someone has already mentioned the standard IEEE-854. This is a useful document that describes how to implement a floating point system, with your own selection of the number of bits used for exponent and fraction. (In contrast, IEEE-754 describes actual formats as well as aspects such as rounding covered in IEEE-854.)

I believe that if, rather than reducing the size of the exponent, you reduce the size of the fraction, you can more easily use the existing floating point support. Supposing you use the least significant bits for the tag, to execute a floating point operation you would:

1. copy operands.
2. clear least significant bits that were used for tags.
3. add, etc., with pre-existing hardware/software.
4. check for exceptions and round to your precision (lots of work).
5. add new tag.

It seems to me (without a great deal of study) that if your base system already does IEEE-754, you could use this as I have set out to implement IEEE-854 with your choice of fraction size.
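[The five steps above, sketched for addition. The 3-bit tag width and tag value are invented for illustration, and step 4 here merely truncates back to the short precision -- a real system would do correct rounding and exception checks, which is the "lots of work" noted above.]

```c
#include <stdint.h>
#include <string.h>

#define TAG_BITS  3u
#define TAG_MASK  ((1u << TAG_BITS) - 1)
#define FLOAT_TAG 0x5u                      /* made-up tag value */

uint32_t tagged_add(uint32_t x, uint32_t y) {
    uint32_t bx = x & ~TAG_MASK;            /* 1-2: copy operands, clear */
    uint32_t by = y & ~TAG_MASK;            /*      the low tag bits     */
    float fx, fy, fr;
    memcpy(&fx, &bx, sizeof fx);
    memcpy(&fy, &by, sizeof fy);
    fr = fx + fy;                           /* 3: pre-existing FP hardware */
    uint32_t br; memcpy(&br, &fr, sizeof br);
    br &= ~TAG_MASK;                        /* 4: "round" by truncating   */
    return br | FLOAT_TAG;                  /* 5: add the new tag         */
}
```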