johnl@ima.UUCP (07/15/85)
A while ago I said I'd send out a note on how the design of the floating point unit on the System/360 was botched. They ended up with a design which is quite fast but produces poor answers, losing 3 bits of accuracy on each operation. Here's a sketch of what IBM did.

They seem to have thought that they could get faster floating point using hex rather than binary without losing precision, but they were wrong. The analysis went sort of like this: look at the high order digit of a hex floating point number. Assuming that leading digits are uniformly distributed from 0 to F, on the average there will be one leading zero bit. But since we use hex rather than binary exponents, we can make the exponent one bit smaller than a binary exponent and the fraction one bit bigger; you don't lose any precision, you get better exponent range, and it's faster since you need fewer normalization steps to normalize hex rather than binary numbers.

It turns out that leading digits are geometrically distributed, so that on the average there are two leading zeros rather than one, and you lose one bit that way. Also, when you use binary floating point you know that the high order bit of every normalized number is a 1, so you don't have to store it; in hex this "hidden bit" trick doesn't work. Third, the 360's floating point unit truncates results rather than rounding (again in the interest of speed), which is equivalent to keeping one less bit than if you rounded. This means that, compared to a machine like the VAX, a 360 gets 3 bits less precision on each floating point operation at the same word size.

The points about faster normalization were true, so the 360's floating point unit produces inaccurate results very quickly. Originally, the unit computed results only to the fraction size, keeping no guard digits, which produced incredibly wrong results. IBM retrofitted guard digits shortly after the first 360s shipped and customers started to complain.
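[Editor's note: the leading-zero claim is easy to check with a short simulation of my own, assuming significands follow the logarithmic (Benford) distribution that makes leading digits geometric rather than uniform:]

```python
import random

# Simulate the leading-zero-bit count of normalized hex fractions,
# assuming significands are logarithmically distributed (the model
# under which leading digits come out geometric, not uniform).
random.seed(1)

def leading_zero_bits(x):
    # Normalize x into [1/16, 1) as a hex fraction.
    while x >= 1.0:
        x /= 16.0
    while x < 1.0 / 16.0:
        x *= 16.0
    digit = int(x * 16)            # leading hex digit, 1..15
    return 4 - digit.bit_length()  # zero bits before the first 1 bit

# 16**u with u uniform in [0,1) gives a logarithmic significand.
samples = [16.0 ** random.random() for _ in range(100_000)]
avg = sum(leading_zero_bits(s) for s in samples) / len(samples)
print(round(avg, 2))  # averages about 1.5 zero bits under this model
```

Under this model the average waste works out to about a bit and a half more than the zero bits a binary format would spend, versus the single bit the uniform-digit analysis predicted.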
Their eventual precision fix has been to add quadruple precision floating point, since it's too late to change floating point formats now.

John Levine, Javelin Software, Cambridge MA 617-494-1400
{ decvax!cca | think | ihnp4 | cbosgd }!ima!johnl, Levine@YALE.ARPA
ark@alice.UUCP (Andrew Koenig) (07/18/85)
John Levine argues in a rather lengthy article that the IBM floating point format loses 3 bits compared with a binary format. This isn't quite true; he neglects to mention the gain of two bits because the exponent is a power of 16 rather than a power of 2. Of course, compared to the VAX, the IBM format chooses to realize that gain by allowing numbers up to 7e75 rather than 1e38, but it's still a gain.

And chopped arithmetic has its advantages too. For instance, conversion from double to single cannot overflow, as it can on the VAX.
franka@mmintl.UUCP (Frank Adams) (07/18/85)
In article <36900010@ima.UUCP> johnl@ima.UUCP writes:
>Assuming that leading digits are uniformly distributed from 0 to F, on the
>average there will be one leading zero bit. But since we use hex rather than
>binary exponents, we can make the exponent one bit smaller than if we had a
>binary exponent and the fraction one bit bigger, and you don't lose any
>precision, get better exponent range, and it's faster since you need fewer
>normalization steps to normalize hex rather than binary numbers.
>
>It turns out that leading digits are geometrically distributed so that on the
>average there are two leading zeros rather than one, so you lose one bit that
>way.

Actually, it's worse than that. The proper measure of precision is not the average precision but the worst-case precision -- the precision is what you can count on in your results. Thus you lose two bits, not one.
carter@masscomp.UUCP (Jeff Carter) (07/20/85)
In article <4006@alice.UUCP> ark@alice.UUCP (Andrew Koenig) writes:
>John Levine argues in a rather lengthy article that the IBM floating
 ....etc..
>allowing numbers up to 7e75 rather than 1e38, but it's still a gain.
>
>And chopped arithmetic has its advantages too. For instance, conversion
>from double to single cannot overflow, as it can on the VAX.

There seems to be a little confusion here concerning chopped arithmetic, IBM format, and VAX format for floating point. The reason that you can't overflow from double -> single on some hardware architectures relates to the width of the exponent field in the floating point number. In the IEEE format, 8 bits are used for the biased exponent in single precision and 11 bits in double precision. It does NOT matter whether you round/chop/clear/set the low order bits; the number will overflow if the exponent won't fit in the space allotted.

It's been a long time since I used 360/370 family machines, but I seem to remember that there are indeed more exponent bits in double precision than in single. (I could be wrong on the IBM details; a lot of water's passed under the bridge since I last used an IBM. I prefer computers that accurately reflect the calculations I intended to perform.) This means that the IBM *will* overflow on conversions, totally unrelated to the rounding mode.

The last machine I worked on that used the same number of exponent bits in both SP and DP was a DG MV/8000 (and I presume the 4, 6, & 10). The 3 saved bits are used as part of the mantissa, resulting in much more accurate (but smaller dynamic range) calculations.

Jeff Carter
MASSCOMP, 1 Technology Park, Westford, MA 01886
UUCP: ....!{ihnp4|allegra|decvax}!masscomp!carter
ark@alice.UUCP (Andrew Koenig) (07/20/85)
Jeff Carter tried to correct a point I made with too little detail:
> The reason that you can't overflow from double -> single on some
> hardware architectures relates to the width of the exponent field in
> the floating point number.

This may well be true, but it is unrelated to the point I made. Let me be more specific. The VAX floating-point format is identical in single and double precision except for the number of mantissa bits. However, overflow is still possible on double->single conversion because the VAX rounds rather than chops. All you need to do is convert some number with the maximum possible exponent, all the high-order mantissa bits (i.e. the ones that correspond to short format) turned on, and the next bit also turned on. Then the rounding that accompanies the conversion will cause an overflow.

Jeff goes on to say:
> It's been a long time since I used 360/370 family machines, but I seem
> to remember that there are indeed more exponent bits in double
> precision than single. (I could be wrong on the IBM details; a lot of
> water's passed under the bridge since I last used an IBM. I prefer
> computers that accurately reflect the calculations I intended to
> perform.)

He is indeed wrong; the 370 has 7 exponent bits in both single and double. Since the exponent represents a power of 16, the largest magnitude is about 7.237e75.
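[Editor's note: a toy integer model of mine, using the 24- and 56-bit fraction widths these machines share, shows the carry-out mechanism Koenig describes:]

```python
# Toy model of double -> single narrowing: shorten a 56-bit fraction
# to 24 bits.  With every fraction bit set, rounding carries out of
# the top of the fraction and must bump the exponent -- an overflow
# if the exponent was already at its maximum.  Chopping never does.
FRAC_DOUBLE = 56
FRAC_SINGLE = 24

def narrow(frac, rounding):
    """Return (narrowed fraction, exponent increment)."""
    shift = FRAC_DOUBLE - FRAC_SINGLE
    result = frac >> shift
    if rounding and (frac >> (shift - 1)) & 1:  # guard bit set: round up
        result += 1
    if result >> FRAC_SINGLE:                   # carry out of the top bit
        return result >> 1, 1                   # renormalize, exponent + 1
    return result, 0

worst = (1 << FRAC_DOUBLE) - 1                  # every fraction bit set
print(narrow(worst, rounding=True))             # exponent incremented
print(narrow(worst, rounding=False))            # chopped: exponent unchanged
```

The chopped result simply discards the low 32 bits, so the exponent can never grow; the rounded result is more accurate but needs one more exponent value than the source number had.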
herbie@watdcsu.UUCP (Herb Chong [DCS]) (07/20/85)
In article <741@masscomp.UUCP> carter@masscomp.UUCP (Jeff Carter) writes:
>Its been a long time since I used 360/370 family machines, but I seem
>to remember that there are indeed more exponent bits in double
>precision than single. (I could be wrong on the IBM details; a lot of
>water's passed under the bridge since I last used an IBM. I prefer
>computers that accurately reflect the calculations I intended to
>perform.) This means that the IBM *will* overflow on conversions,
>totally unrelated to the rounding mode.

The first eight bits are a sign bit and a 7-bit exponent in excess-64 notation to the base 16, in all three floating point formats on 360/370 machines. I have the Principles of Operation open right in front of me as I write this. You get a maximum of 24, 56, and 112 bits of precision in short, long, and extended floating point.

Herb Chong... I'm user-friendly -- I don't byte, I nybble....
UUCP: {decvax|utzoo|ihnp4|allegra|clyde}!watmath!water!watdcsu!herbie
CSNET: herbie%watdcsu@waterloo.csnet
ARPA: herbie%watdcsu%waterloo.csnet@csnet-relay.arpa
NETNORTH, BITNET, EARN: herbie@watdcs, herbie@watdcsu
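[Editor's note: the short format Herb describes can be decoded with a few shifts. The field layout below follows his description; the example word is my own illustration:]

```python
# Decode an S/360 short float: 1 sign bit, a 7-bit excess-64 exponent
# that is a power of 16, and a 24-bit hex fraction in [0, 1).
def decode_s360_short(word):
    sign = -1.0 if word >> 31 else 1.0
    exponent = ((word >> 24) & 0x7F) - 64   # excess-64, base 16
    fraction = (word & 0xFFFFFF) / float(1 << 24)
    return sign * fraction * 16.0 ** exponent

# 0x41100000: exponent 0x41 - 64 = 1, fraction 1/16, so 16**1 / 16 = 1.0
print(decode_s360_short(0x41100000))
```

Note the fraction, not the significand, is what's stored: there is no hidden bit, which is one of the bits of precision the thread's opening article said the format gives up.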
hes@ecsvax.UUCP (Henry Schaffer) (07/22/85)
Really Re: IBM 360 exponent length

I believe that the exponent stays the same for single, double (and quadruple) precision. That's a plus when changing precisions, but a loss in that the higher precisions could well spare a few bits to greatly extend the range of possible magnitudes. An extra two bits in the exponent would increase the range from roughly 10^75 to 10^300 or so. In the types of scientific computing I've done, that would make a substantial improvement.

--henry schaffer
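[Editor's note: those figures are quick to verify. With a 7-bit excess-64 exponent the ceiling is 16^63; a hypothetical 9-bit excess-256 field would reach 16^255:]

```python
import math

# Magnitude ceiling of a base-16 exponent field: 7 bits (excess-64)
# top out at 16**63; a hypothetical 9-bit field (excess-256) would
# top out at 16**255.
print(f"{16.0 ** 63:.1e}")           # about 7.2e+75
print(round(255 * math.log10(16)))   # decimal exponent for 9 bits
```

So two extra exponent bits would indeed push the range past 10^300, at the cost of two fraction bits per word.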
carter@masscomp.UUCP (Jeff Carter) (07/23/85)
My apologies to all those who noticed my error concerning 360 floating point format. That should teach me to trust my memory. But I still wonder what the great advantage of chopped arithmetic is -- that was the original subject of the article I replied to.

First the correction: the IBM floating point format (as with several other vendors') uses the same number of exponent bits for all precisions.

On to the other issue: one (alleged) advantage of chopped (as opposed to rounded) arithmetic is that you can never overflow on a conversion of double -> single precision, given that the number of exponent bits is the same. The only time this could occur is at the very edge of the dynamic range of representable numbers. What you give up for this is one bit of precision on *all* calculations. Is this a fair trade? I'm not entirely convinced. If there are others who firmly believe that this is the best way to do it, I'm always glad to listen.

I'm a firm believer in the IEEE representation. As Henry Schaffer pointed out, 10^300 is sometimes a realistic number, so it needs to be provided for.
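[Editor's note: the one-bit cost of chopping can be seen numerically in a toy model of mine (not actual 360 hardware): chopping averages twice the absolute error of rounding, half an ulp versus a quarter ulp:]

```python
import random

# Compare the average absolute error of chopping vs rounding random
# values to 24 fraction bits (a toy model, not S/360 arithmetic).
random.seed(2)
SCALE = 1 << 24

def chop(x):
    return int(x * SCALE) / SCALE    # discard low bits

def rnd(x):
    return round(x * SCALE) / SCALE  # round to nearest

xs = [random.random() for _ in range(100_000)]
chop_err = sum(abs(x - chop(x)) for x in xs) / len(xs)
rnd_err = sum(abs(x - rnd(x)) for x in xs) / len(xs)
print(round(chop_err / rnd_err, 1))  # chopping averages about twice the error
```

A factor of two in average error is exactly the "one less bit" the opening article charged against truncation.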
thomson@uthub.UUCP (Brian Thomson) (07/25/85)
> One (alleged) advantage of chopped (as opposed to rounded) arithmetic
> is that you can never overflow on a conversion of double -> single
> precision, given that the number of exponent bits is the same. The
> only time this could occur is at the very edges of the dynamic range
> of representable numbers.

Since the rounded result is more accurate, it could be argued that in such a situation overflow is the preferred result. The defence of "it's not a bug, it's a feature" was probably invented by a marketing department.
-- 
Brian Thomson, CSRI Univ. of Toronto
{linus,ihnp4,uw-beaver,floyd,utzoo}!utcsrgv!uthub!thomson