schooler@inmet.UUCP (10/25/84)
What About Floating Point Representation for CD's? Currently, a CD waveform sample is represented as a 16-bit integer, giving a dynamic range of approx. 2^16 = 10^4.8 = 96 dB (roughly). This integer representation has extremely high (relative) precision at the upper end of the scale, and low precision at the lower end of the scale. Consider a 16-bit floating point representation: say 6 bits of exponent (base 2) and 10 bits of fraction. Using the normal implicit-first-bit-is-1 representation for the fraction, the smallest representable number is .5, and the largest is 2^63. The dynamic range is thus 2^64 = 10^19 = 385 dB (roughly). The precision is .1% of a "dynamic octave". This sounds like a big win. Is there something wrong with my math? Are there grave difficulties with floating point (or equivalently, logarithmic) A-D/D-A devices? -- Richard Schooler
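The arithmetic in the post above can be checked with a short sketch. The helper name `range_db` and the variable names are made up for illustration; the exponent range (0..63) and the fraction interval [0.5, 1) are taken from the post, and the amplitude convention of 20*log10 is assumed.

```python
import math

def range_db(ratio):
    """Dynamic range in dB for a largest/smallest amplitude ratio."""
    return 20 * math.log10(ratio)

# 16-bit integer samples: 2^16 distinguishable levels.
int16_db = range_db(2 ** 16)

# 16-bit float from the post: 6-bit exponent (0..63), fraction in [0.5, 1).
smallest = 0.5 * 2 ** 0
largest = 1.0 * 2 ** 63      # upper bound; the true maximum is just below this
float_db = range_db(largest / smallest)

# Worst-case relative step between adjacent values (at fraction = 0.5).
step = 2 ** -10

print(f"int16: {int16_db:.1f} dB, float: {float_db:.1f} dB, step: {step:.2%}")
```

Under these assumptions the figures agree with the post: roughly 96 dB for the integer format, roughly 385 dB for the floating point one, and a relative step of about 0.1%.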
lutton@inmet.UUCP (10/28/84)
<> There were some experiments with floating-point representation of digitized sound back in the 70's. The results indicated that you gained something with the exponent bits but lost something by having only 12 or fewer mantissa bits; net result was that 16 bit floating point and 16 bit fixed point sounded just about the same. Fixed point is easier to work with, so floating point was forgotten about. HOWEVER: the test signals were computer-generated, i.e. known beforehand not to exceed a certain fixed level. Perhaps it's time for a fresh look at floating-point.
schooler@inmet.UUCP (10/28/84)
As someone pointed out, a 16-bit floating-point representation is overkill. (Who needs > 150 dB?!) Consider, however, an 8-bit logarithmic representation, as proposed by Edgar and Lee in "FOCUS Microcomputer Number System", Comm. ACM, Vol. 22, Num. 3, March 1979. This representation has one sign bit and a seven-bit exponent in fixed-point format with three fraction bits. The sign of the exponent is encoded by an offset, e.g. 0 1000.000 = +2^0 = 1. The authors claim a range of 96 dB, an absolute S/N of 93 dB, and an instantaneous S/N (precision, roughly) of 32 dB. I quote, "In audio applications the noise level of even the 8-bit FOCUS compares favorably to the highest quality cassette recordings as a means of signal handling." Furthermore, they suggest a simple circuit for logarithmic A-D's and exponential D-A's. While addition and subtraction are to be avoided, multiplication, division and exponentiation are fast and exact. If all this is really valid, we can cram twice as much sound per bit onto our favorite digital medium. -- Richard Schooler
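A minimal sketch of the 8-bit layout as described in the post above (one sign bit, a seven-bit offset exponent with three fraction bits). It is built only from that description, not from the Edgar and Lee paper: the function names are hypothetical, and how the real FOCUS format treats zero or the extreme codes is not covered here.

```python
import math

EXP_BITS = 7                    # exponent field width, per the post
FRAC_BITS = 3                   # fraction bits inside the exponent
OFFSET = 1 << (EXP_BITS - 1)    # code 1000.000 stands for exponent 0

def focus_decode(byte):
    """Decode sign + log-magnitude code into a float (assumed layout)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = ((byte & 0x7F) - OFFSET) / (1 << FRAC_BITS)
    return sign * 2.0 ** exponent

def focus_encode(x):
    """Encode a nonzero value to the nearest code, clamping at the ends."""
    sign = 0x80 if x < 0 else 0
    code = round(math.log2(abs(x)) * (1 << FRAC_BITS)) + OFFSET
    return sign | max(0, min(0x7F, code))

def focus_mul(a, b):
    """Multiply two codes: XOR the signs, add the exponents."""
    sign = (a ^ b) & 0x80
    code = (a & 0x7F) + (b & 0x7F) - OFFSET
    return sign | max(0, min(0x7F, code))

# Ratio of largest to smallest magnitude, in dB.
span_db = 20 * math.log10(focus_decode(0x7F) / focus_decode(0x00))
print(focus_decode(0x40), round(span_db, 1))
```

With this layout the span comes out a little under 96 dB, in line with the quoted claim, and multiplication is a single integer addition on the exponent fields, which is why it introduces no rounding short of saturation.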
herbie@watdcsu.UUCP (Herb Chong, Computing Services) (10/28/84)
I think the main problem is the extra hardware that has to be added to do these things. Aside from that, I can see no major problem. However, logarithmic compression has almost exactly the same effect and has been used in most of the 14 bit digital systems. The other problem is that with the exponent and mantissa representation of the number, at least 16 bits have to be used for the mantissa and about 8 bits for the exponent. Unless you use a hexadecimal coding scheme for floating point numbers as IBM does for their computers, you may even need more than that. The net result is at least 50% higher data rate. When the original specs for CD's were drawn up, 16 bits was considered barely achievable on a commercial scale. Herb... I'm user-friendly -- I don't byte, I nybble.... UUCP: {decvax|utzoo|ihnp4|allegra|clyde}!watmath!watdcsu!herbie CSNET: herbie%watdcsu@waterloo.csnet ARPA: herbie%watdcsu%waterloo.csnet@csnet-relay.arpa BITNET: herbie at watdcs,herbie at watdcsu
dmmartindale@watcgl.UUCP (Dave Martindale) (10/29/84)
Arghh! Using hexadecimal-base floating point, as in the IBM S/360 and its successors, is the least efficient way to use the bits in a word. The reason is that, on the average, two of the mantissa bits will be zero and carrying no useful information at all. Binary-base floating point, particularly if the always-1 leading bit is not stored, is most efficient. If this seems unclear, talk to any numerical analyst. And you need 8 bits of exponent only if you want to cover a fairly wide "dynamic range" of numbers, about 10^77. An audio dynamic range of 120dB is a 10^6 range of voltage levels, which needs somewhere between 4 and 5 bits of exponent. Four might be enough.
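The exponent-width estimate in the post above can be reproduced in a few lines. This is only a sketch of the arithmetic; the function name `exponent_bits_for` is invented for illustration, and the range is treated as a voltage ratio (20*log10).

```python
import math

def exponent_bits_for(range_db):
    """Octaves spanned by a dynamic range in dB, and the bits needed to count them."""
    ratio = 10 ** (range_db / 20)    # voltage ratio, e.g. 120 dB -> 10^6
    octaves = math.log2(ratio)       # number of factor-of-two steps
    return octaves, math.log2(octaves)

octaves, bits = exponent_bits_for(120)
print(f"{octaves:.2f} octaves, {bits:.2f} exponent bits")

# An 8-bit binary exponent spans 2^256, i.e. roughly 10^77.
decades_8bit = math.log10(2 ** 256)
print(f"8-bit exponent: about 10^{decades_8bit:.0f}")
```

A 120 dB range is just under 20 octaves, so the exponent needs a bit more than 4 bits to count them, which is where "between 4 and 5" comes from.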
mwm@ea.UUCP (10/30/84)
There is a floating point format known as FOCUS developed by a grad student at the University of Oklahoma. It was designed for electronic sampling instrumentation, and avoids the "hole around zero" problem with normal floating point representations. Perhaps this would be a better choice than either a normal floating point or integer representation? I don't have the article anywhere near me, but will gladly dig it out for anyone sending mail asking about it. <mike
jlg@lanl.ARPA (11/02/84)
I've been thinking about floating point music representation for some time. The best format seems to be 1 sign bit, 3 (or 4) exponent bits, and 13 (or 12) significand bits. The 3-bit exponent with an IEEE floating point type of gradual underflow and a hidden normalization bit gives a dynamic range of about 120 dB. The signal to noise ratio is always about 78 dB, that is, 78 dB below the present signal. (If I'm listening to a sound at 90 dB, I can't hear noises at 12 dB, and I don't think anyone else can either.) The 4-bit exponent format has a dynamic range of over 160 dB, with a signal to noise ratio of about 70 dB. This is beyond the dynamic range of human hearing, so it should be sufficient. It is also beyond the dynamic range of current recording techniques: the widest A/D converters I've seen that can run at the speeds required for audio recording were 18 bits (several thousand $). The 4-bit exponent format is suitable for compression of 28-bit data; I don't think we'll ever see that! Both the 3-bit and 4-bit exponent formats given here require 16 bits of space on the recording medium. That makes them compatible with the current data encoding and error reduction schemes. The only thing required is circuitry to convert such floating point numbers into integers for the D/A conversion.
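The conversion step mentioned at the end of the post above (floating point word to integer for the D/A) might look like the sketch below for the 3-bit exponent variant. The bit order (sign, then exponent, then fraction), the treatment of exponent code 0 as the gradual-underflow case, and the name `decode_to_int` are all assumptions for illustration; the post does not fix these details.

```python
import math

EXP_BITS = 3
FRAC_BITS = 12   # stored bits; the hidden bit makes 13 significant bits

def decode_to_int(word):
    """Expand a 16-bit sign/exponent/fraction word to a signed integer."""
    sign = -1 if (word >> 15) & 1 else 1
    exp = (word >> FRAC_BITS) & ((1 << EXP_BITS) - 1)
    frac = word & ((1 << FRAC_BITS) - 1)
    if exp == 0:
        # Gradual underflow: no hidden bit, same scale as exp == 1.
        magnitude = frac
    else:
        # Restore the hidden leading 1, then shift by the exponent.
        magnitude = ((1 << FRAC_BITS) | frac) << (exp - 1)
    return sign * magnitude

# Signal-to-noise implied by 13 significant bits.
snr_db = 20 * math.log10(2 ** (FRAC_BITS + 1))

print(decode_to_int(0x7FFF), decode_to_int(0x0001), round(snr_db, 1))
```

With 13 significant bits the signal-to-noise figure comes out near 78 dB, matching the post. Under this particular exponent assignment the decoded magnitude fits in 19 bits; a different choice of exponent codes would shift the total range somewhat.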
jlg@lanl.ARPA (11/07/84)
> There is a floating point format known as FOCUS developed by a grad student
> at the University of Oklahoma. It was designed for electronic sampling
> instrumentation, and avoids the "hole around zero" problem with normal
> floating point representations. Perhaps this would be a better choice than
> either a normal floating point or integer representation?
>
> I don't have the article anywhere near me, but will gladly dig it out for
> anyone sending mail asking about it.
>
> <mike

If I remember the FOCUS design correctly, it used a scheme for varying the size of the exponent field to prevent underflow (or overflow for that matter). Unfortunately, this required the use of more exponent bits to begin with, thus making the significand smaller. The 'hole around zero' is not much of a problem for audio purposes since a signal that small would be 120 to 150 dB below the maximum representable signal (maybe more, depending on which floating point format you chose to use). This would probably be below the threshold of hearing for any reasonable volume level.