rog@speech.kth.se (Roger Lindell) (04/03/91)
Hello, I would like to know if there are any good and moderately fast compression programs that will compress 16-bit sound files by a large amount. These files are stored for archiving purposes, so I need lossless compression. I have tried compress, freeze and yabba; I get the best compression with freeze, but it takes the longest time both to compress and decompress. What I would like is a program with equal or better compression than freeze and faster (preferably much faster) decompression. By the way, this is for a Unix machine.
--
Roger Lindell   rog@speech.kth.se
Phone: +46 8 790 75 73   Fax: +46 8 790 78 54
Dept. of Speech Communication and Music Acoustics
Royal Institute of Technology, Sweden
lance@motcsd.csd.mot.com (lance.norskog) (04/04/91)
rog@speech.kth.se (Roger Lindell) writes:
>Hello, I would like to know if there are any good and moderately fast
>compression programs that will compress 16-bit sound files by a large
>amount. These files are stored for archiving purposes, so I need
>lossless compression.
> ...

[ standard compressors are slow and don't do very well ]

The standard naive sound compressor just saves the deltas between
samples. You save the deltas as a stream of 2-, 3-, 4-, ..., N-bit
records. What I have noticed playing around with voice files is that
they have lots of "flip" deltas at zero-crossings; that is, runs like
"100, -101, 104, -99". It might work well to save the deltas AND note
which deltas cross zero, so that a stream like "100, +4, +6, -7, C-9"
comes out as "100, 104, 110, 103, -112".

Lance Norskog
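A minimal sketch of the scheme Lance describes (a reconstruction of the
idea, not his code; the flip-flag layout is an assumption). Each sample
becomes a flag saying whether the sign flipped, plus a delta against the
previous sample reflected to the current sign, which reproduces his
"100, +4, +6, -7, C-9" example:

/* delta coding with an explicit zero-crossing "flip" flag */
#include <stdio.h>

typedef struct { int flip; int delta; } Code;

static void encode(const int *x, int n, Code *c)  /* n samples -> n-1 codes */
{
    int i;
    for (i = 1; i < n; i++) {
        c[i-1].flip  = (x[i-1] < 0) != (x[i] < 0);   /* zero-crossing? */
        c[i-1].delta = x[i] - (c[i-1].flip ? -x[i-1] : x[i-1]);
    }
}

static void decode(int first, const Code *c, int n, int *x)
{
    int i;
    x[0] = first;
    for (i = 1; i < n; i++)
        x[i] = (c[i-1].flip ? -x[i-1] : x[i-1]) + c[i-1].delta;
}

int main(void)
{
    int x[5] = {100, 104, 110, 103, -112};   /* Lance's example stream */
    int y[5], i;
    Code c[4];
    encode(x, 5, c);
    for (i = 0; i < 4; i++)
        printf("%s%+d ", c[i].flip ? "C" : "", c[i].delta); /* +4 +6 -7 C-9 */
    putchar('\n');
    decode(x[0], c, 5, y);
    for (i = 0; i < 5; i++)
        printf("%d ", y[i]);                 /* 100 104 110 103 -112 */
    putchar('\n');
    return 0;
}

The point is that the deltas stay small across zero-crossings, so a
back-end that packs small deltas into short records never pays for the
big sign swing.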
phillips@sparky.mit.edu (Mike Phillips) (04/11/91)
rog@speech.kth.se (Roger Lindell) writes:
>I would like to know if there are any good and moderately fast
>compression programs that will compress 16-bit sound files by a large
>amount. These files are stored for archiving purposes, so I need
>lossless compression. I have tried compress, freeze and yabba; I get
>the best compression with freeze, but it takes the longest time both to
>compress and decompress. What I would like is a program with equal or
>better compression than freeze and faster (preferably much faster)
>decompression. By the way, this is for a Unix machine.

We found that we have been able to get fairly good compression (about
50-60% for most of our data) by simply packing the numbers into fewer
bits. So the compression program writes out a little header for each
chunk of speech saying "here come X words packed into Y bits per word".
This works well because much of speech is pretty quiet. It's simple,
it's fast, it seems to work pretty well, and it's free. You can get it
via anonymous ftp from lightning.lcs.mit.edu (18.27.0.147); it's all in
pub/compression/tarfile.Z

Mike (phillips@goldilocks.lcs.mit.edu)
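A rough sketch of that kind of chunked bit-packing (an illustration of
the idea only; the real code is in the tarfile Mike mentions, and the
block size and header layout below are made up):

/* pack a block of 16-bit samples into the fewest bits that hold them all */
#include <stdio.h>

#define BLOCK 256

static unsigned char buf[1 << 16];   /* output bitstream, zero-initialized */
static long nbits = 0;

static void put_bits(unsigned int v, int n)   /* append n bits, MSB first */
{
    while (n--) {
        if (v & (1u << n))
            buf[nbits >> 3] |= (unsigned char)(0x80 >> (nbits & 7));
        nbits++;
    }
}

static int bits_for(short v)          /* two's-complement width of v */
{
    int b = 1;                        /* sign bit */
    unsigned int m = (v < 0) ? ~(unsigned int)v : (unsigned int)v;
    while (m) { b++; m >>= 1; }
    return b;
}

static void pack_block(const short *x, int n)
{
    int bits = 1, i;
    for (i = 0; i < n; i++) {
        int b = bits_for(x[i]);
        if (b > bits) bits = b;
    }
    put_bits((unsigned int)bits, 5);  /* header: "n words at `bits` bits" */
    for (i = 0; i < n; i++)
        put_bits((unsigned int)x[i] & ((1u << bits) - 1), bits);
}

int main(void)
{
    short quiet[BLOCK];
    int i;
    for (i = 0; i < BLOCK; i++)       /* a quiet stretch: small samples */
        quiet[i] = (short)((i % 7) - 3);
    pack_block(quiet, BLOCK);
    printf("%d samples -> %ld bits (%.1f bits/sample)\n",
           BLOCK, nbits, (double)nbits / BLOCK);
    return 0;
}

A quiet block costs only as many bits per sample as its loudest sample
needs, which is where the 50-60% on typical speech comes from.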
agulbra@siri.unit.no (Arnt Gulbrandsen) (04/15/91)
Most sound data doesn't contain much in the top octave, so you can
compress the sound data to a bit over 50% by adding together adjacent
pairs of samples. The result needs one bit more, has half as many items,
and sounds much the same. 16-bit 44kHz becomes 17-bit 22kHz (or 18-bit
11kHz, but then the signal starts to deteriorate).

Arnt Gulbrandsen, University of Trondheim, Norway
agulbra@siri.unit.no
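In code the whole trick is a few lines (a sketch; note that, the extra
bit notwithstanding, this is lossy -- the top octave cannot be
recovered):

/* halve the sample rate by summing adjacent pairs; a 17-bit result
 * needs a wider type than the 16-bit inputs */
#include <stdio.h>

static void downsample_by_sum(const short *in, int n, int *out)
{
    int i;
    for (i = 0; i + 1 < n; i += 2)    /* n should be even */
        out[i / 2] = (int)in[i] + (int)in[i + 1];
}

int main(void)
{
    short x[8] = {1000, -2000, 30000, 30000, -5, 7, 0, 100};
    int y[4], i;
    downsample_by_sum(x, 8, y);
    for (i = 0; i < 4; i++)
        printf("%d ", y[i]);          /* -1000 60000 2 100 */
    putchar('\n');
    return 0;
}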
daemon@felix.UUCP (The devil himself) (04/15/91)
In article <rog.670690472@color> rog@speech.kth.se (Roger Lindell) writes:
>Hello, I would like to know if there are any good and moderately fast
>compression programs that will compress 16-bit sound files by a large
>amount.

From: peterg@felix.UUCP (Peter Gruenbeck)

I did some electronics tinkering about 10 years ago with an OKI
semiconductor chip set for digital sound capture and synthesis. It was
essentially an A/D converter and digital signal processor which
compressed the sounds on the fly using a method called ADPCM. ADPCM,
which stands for Adaptive Differential Pulse Code Modulation, basically
coded each sound sample as a 3- or 4-bit difference (+-4 or +-8
amplitude levels) from the previous sample. This is based on the
assumption that the sounds being recorded (speech & music) are more or
less continuous. This technology is now widely available in digital
telephone answering machines. Using this method, 15 seconds of speech at
an 8K sample rate can be compressed into about 64KB of memory.
--
Pete Gruenbeck
The poor grammar, spelling errors, and errors    o o
in usage are included with a purpose: I write     ^
something for everybody.                        (---)
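A toy version of the technique, for the curious (a sketch only: the OKI
and IMA ADPCM schemes use a fixed step-size table, and the adaptation
rule below is invented for illustration). The key point is that the
encoder predicts from its own *reconstructed* samples, so the decoder
can track it exactly:

/* toy ADPCM: 4-bit codes (sign + 3-bit magnitude), adaptive step */
#include <stdio.h>

static int step = 16;   /* quantizer step, adapted as we go */
static int pred = 0;    /* previous reconstructed sample */

static void reset(void) { step = 16; pred = 0; }

static int clamp16(int v)
{
    if (v >  32767) return  32767;
    if (v < -32768) return -32768;
    return v;
}

static int adpcm_encode(int sample)
{
    int diff = sample - pred;
    int sign = (diff < 0) ? 8 : 0;
    int mag  = (diff < 0) ? -diff : diff;
    int code = mag / step;
    if (code > 7) code = 7;
    pred = clamp16(pred + (sign ? -code * step : code * step));
    if (code >= 6)                  step *= 2;  /* big jump: open up */
    else if (code <= 1 && step > 1) step /= 2;  /* small: tighten */
    return sign | code;
}

static int adpcm_decode(int nibble)   /* mirrors the encoder exactly */
{
    int code = nibble & 7, sign = nibble & 8;
    pred = clamp16(pred + (sign ? -code * step : code * step));
    if (code >= 6)                  step *= 2;
    else if (code <= 1 && step > 1) step /= 2;
    return pred;
}

int main(void)
{
    int x[8] = {0, 100, 250, 300, 200, -50, -400, -380};
    int code[8], i;
    reset();
    for (i = 0; i < 8; i++) code[i] = adpcm_encode(x[i]);
    reset();
    for (i = 0; i < 8; i++)
        printf("in %5d  code 0x%X  out %5d\n",
               x[i], code[i], adpcm_decode(code[i]));
    return 0;
}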
campbell@wookumz.gnu.ai.mit.edu (Paul Campbell) (04/16/91)
As far as sound compression goes, why not do a Fast Fourier Transform or
a Discrete Cosine Transform to get the data into a frequency spectrum,
quantize it, and then compress that? It should end up much smaller after
compression, because on anything but a noise spectrum the frequency
conversion should introduce lots of zeroes. It's not that hard to do the
reverse transform in either case. You do lose a significant amount of
speed, but a lot of sound editing is easier in the frequency domain, and
the original goal was compression of the data, after all.
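A sketch of that pipeline on a single block (naive O(N^2) transform for
clarity; the quantizer step is an arbitrary choice, and a real coder
would entropy-code the quantized coefficients rather than just count the
zeroes):

/* DCT a block, quantize the coefficients (the lossy step), transform back */
#include <stdio.h>
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define N 32

static void dct(const double *x, double *X)        /* DCT-II */
{
    int k, n;
    for (k = 0; k < N; k++) {
        X[k] = 0.0;
        for (n = 0; n < N; n++)
            X[k] += x[n] * cos(M_PI / N * (n + 0.5) * k);
    }
}

static void idct(const double *X, double *x)       /* inverse */
{
    int k, n;
    for (n = 0; n < N; n++) {
        x[n] = X[0] / 2.0;
        for (k = 1; k < N; k++)
            x[n] += X[k] * cos(M_PI / N * (n + 0.5) * k);
        x[n] *= 2.0 / N;
    }
}

int main(void)
{
    double x[N], X[N], y[N], q = 50.0;   /* q: quantizer step (arbitrary) */
    int i, zeros = 0;
    for (i = 0; i < N; i++)              /* low-frequency test tone */
        x[i] = 1000.0 * sin(2.0 * M_PI * i / N);
    dct(x, X);
    for (i = 0; i < N; i++) {
        X[i] = q * floor(X[i] / q + 0.5);
        if (X[i] == 0.0) zeros++;
    }
    idct(X, y);
    printf("%d of %d coefficients quantized to zero\n", zeros, N);
    printf("x[3] = %.1f, reconstructed = %.1f\n", x[3], y[3]);
    return 0;
}

The runs of zero coefficients are exactly what a back-end like compress
or freeze chews up nicely.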
rogerc@thebox.rain.com (Roger Conley) (04/17/91)
From what I've read, the standard compression for sound files is ADPCM.
2X and 4X compression is standard. I believe the method uses relative
amplitude changes instead of absolute values, since the difference
between two adjacent samples should usually be small. I think CDs use
this technique.
jj@alice.att.com (jj, like it or not) (04/18/91)
Those of you who care about the subject should read Johnston's paper in JSAC 1988 about audio coding, Stoll and Dehery's ICC 1990 paper, and Brandenburg's 1989 paper in ICASSP for starters. You're oversimplifying a whole body of recent work. -- -------->From the pyrolagnic keyboard of jj@alice.att.com<-------- Copyright alice!jj 1991, all rights reserved, except transmission by USENET and like free facilities granted. Said permission is granted only for complete copies that include this notice. Use on pay-for-read services specifically disallowed.
aipdc@castle.ed.ac.uk (Paul Crowley) (04/18/91)
In article <14938@life.ai.mit.edu> campbell@wookumz.gnu.ai.mit.edu (Paul Campbell) writes:
>As far as sound compression goes, why not do a Fast Fourier Transform or
>a Discrete Cosine Transform to get the data into a frequency spectrum,
>quantize it, and then compress that?

You could probably fit a curve to the noise values and send that!
(meaningful noise-like sound does crop up a lot)
 ____
 \/ o\  Paul Crowley  aipdc@uk.ac.ed.castle
 \ /
 /\__/  Part straight. Part gay. All queer.
 \/
d88-jwa@byse.nada.kth.se (Jon W{tte) (04/21/91)
In article <> rogerc@thebox.rain.com (Roger Conley) writes:
>From what I've read, the standard compression for sound files is ADPCM.
>2X and 4X compression is standard. I believe the method uses relative
>I think CDs use this technique.
I'm quite sure that CDs use a standard, "straight" storage method.
Otherwise you would lose just about the one advantage that CDs have over
LPs, sound-wise: good response to transients (spikes) and other
high-energy, high-frequency sound.
Furthermore, CDs use absolute encoding, which is a shame, since
relative coding (still 16 bit) would give a lot better dynamics -
at a little loss in transients, of course ...
No, CDs aren't compressed. Rather, they're expanded for error
correction (added redundancy). Whether this is Hamming codes or some
sort of parity / checksum / CRCs, I do not know.
(Please. No comments about CD / LP, Okay ? At least put them in
rec.audio...)
--
I remain: h+@nada.kth.se (Jon W{tte) (Yes, a brace !)
"It's not entirely useless. It came in this great cardboard box !" - Calvin
"Life should be more like TV. I think all women should wear tight clothes,
and all men should carry powerful handguns" - Calvin, again
madler@nntp-server.caltech.edu (Mark Adler) (04/21/91)
Jon W{tte (no, your modem is fine--it's a brace) writes:
>> I'm quite sure that CDs use a standard, "straight" storage method.
>> Otherwise you would lose just about the one advantage that CDs have over
>> LPs, sound-wise: good response to transients (spikes) and other
>> high-energy, high-frequency sound.

Yes, they store 16-bit samples (one per channel) with no compression.
However, they could have done a lossless compression, using differential
methods, and gotten about twice the time (well over two hours) on a CD.

>> No, CDs aren't compressed. Rather, they're expanded for error
>> correction (added redundancy). Whether this is Hamming codes or some
>> sort of parity / checksum / CRCs, I do not know.

They use Reed-Solomon codes. As I recall, it is a 3/4 rate (which means
one-fourth of the bits are for error correction) 8-bit symbol code, with
some amount of interleaving. These codes have excellent burst correction
capabilities, so macroscopic scratches in the CD result in no loss of
data.

Mark Adler
madler@pooh.caltech.edu
myhui@bnr.ca (Michael Hui) (04/21/91)
In article <1991Apr21.002203.4414@nntp-server.caltech.edu> madler@nntp-server.caltech.edu (Mark Adler) writes:
[...]
>Yes, they store 16-bit samples (one per channel) with no compression.
>However, they could have done a lossless compression, using differential
>methods, and gotten about twice the time (well over two hours) on a CD.

I wonder why no compression was used? Certainly the IC technology at that
time was advanced enough to have made it a cheap proposition. The FIR (I
guess...) filters used to interpolate between samples in most CD players
must take up at least as much silicon as a delta-modulation decoder.

Michael MY Hui   Ottawa Canada   myhui@bnr.ca
abl@thevenin.ece.cmu.edu (Antonio Leal) (04/21/91)
myhui@bnr.ca (Michael Hui) writes:
>I wonder why no compression was used? Certainly the IC technology at that
>time was advanced enough to have made it a cheap proposition. The FIR (I
>guess...) filters used to interpolate between samples in most CD players
>must take up at least as much silicon as a delta-modulation decoder.

Let's go easy here. These are distinct issues:

1 - Use of delta modulation - the converter is _much_ cheaper than the
16-bit D/As, which are finicky as hell. But, unless you jack up the data
rate to equivalent levels, it's a lossy compression - transients may get
it in the neck. Never mind honest audio engineer objections, can you
imagine what the baboons in the mystic audio business would have made of
it? We got enough of a circus with a bullet-proof PCM scheme as it was...

Incidentally, the "1-bit converter" business is delta mod. Do some
manipulation on the 16-bit samples, and feed a delta converter at a rate
high enough to sound good. Sell as a major improvement (well, it _is_
guaranteed to be monotonic, which 16-bit DACs should, but may not, be).

2 - Lossless compression, e.g. Huffman or LZW. Even assuming that
compression and error correction didn't get in each other's hair, can
you say "enough memory and computational power to decode a data stream
at 2*16*44.1 kbit/s"?

Probably this should move over to rec.audio ...
--
Antonio B. Leal            Dept. of Electrical and Computer Engineering
Bell: [412] 268-2937       Carnegie Mellon University
Net: abl@ece.cmu.edu       Pittsburgh, PA. 15213 U.S.A.
dce@smsc.sony.com (David Elliott) (04/21/91)
In article <1991Apr21.002203.4414@nntp-server.caltech.edu> madler@nntp-server.caltech.edu (Mark Adler) writes:
>They use Reed-Solomon codes. As I recall, it is a 3/4 rate (which means
>one-fourth of the bits are for error correction) 8-bit symbol code, with
>some amount of interleaving. These codes have excellent burst correction
>capabilities, so macroscopic scratches in the CD result in no loss of
>data.

Correct. According to Pohlmann, all of the interleaving, the multiple
error-correction sets (before and after interleaving), and the
eight-to-fourteen modulation (EFM) used to keep the difference between
the number of 1's and 0's small make the actual audio data on an audio
CD about 1/4 the possible data space on a CD.

The results are that a CD player that handles all of the levels of error
correction can correct a bad burst of 3874 bits (2.5mm) or conceal a bad
burst of 13,282 bits (7.7mm).

On the compression side, I suspect that when the audio CD was being
designed, processor technology wasn't advanced enough. The extra costs
of adding a processor to do the decompression were probably just too
high. The same types of problems befell MIDI.
dce@smsc.sony.com (David Elliott) (04/21/91)
In article <1991Apr21.020231.8109@bmerh408.bnr.ca> myhui@bnr.ca (Michael Hui) writes:
>I wonder why no compression was used? Certainly the IC technology at that
>time was advanced enough to have made it a cheap proposition. The FIR (I
>guess...) filters used to interpolate between samples in most CD players
>must take up at least as much silicon as a delta-modulation decoder.

Advanced enough, yes, but I doubt it was cheap and rugged enough. As it
was, many people didn't buy CD players for years because of high cost and
questionable reliability.

Also, the goal wasn't to come up with a storage mechanism that would hold
tons of data, but to come up with a good replacement for a vinyl record.
All they had to do was to create a medium that would hold 35 minutes
worth of music, and they went much further than that. As it stands, very
few artists fill up CDs. At best, they add a few B-sides or EP songs to
help justify the added cost.

I haven't studied the data headers, but I suspect that there's no reason
that a compressed audio CD standard couldn't be developed. It's just a
question of whether there's a need.
dce@smsc.sony.com (David Elliott) (04/21/91)
In article <1991Apr21.002203.4414@nntp-server.caltech.edu> madler@nntp-server.caltech.edu (Mark Adler) writes:
>However, they could have done a lossless compression, using differential
>methods, and gotten about twice the time (well over two hours) on a CD.

Are you sure about that? Can you do lossless compression of sound in
general with differential methods? How do you handle big zero-crossings
of fairly high-frequency square-ish waves efficiently?
tmb@ai.mit.edu (Thomas M. Breuel) (04/22/91)
[why don't CDs use compression]

   2 - Lossless compression, e.g. Huffman or LZW. Even assuming that
   compression and error correction didn't get in each other's hair,
   can you say "enough memory and computational power to decode a
   data stream at 2*16*44.1 kbit/s"?

With fixed codes, decoding requires little more hardware than a ROM and
a register. To achieve some additional robustness to errors, you
probably want to include a synchronization code every few hundred bits.
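A sketch of why fixed-code decoding is so cheap: the decoder is a small
state table (the "ROM") plus the current state (the "register"). The
4-symbol prefix code below is invented for illustration:

/* decode the fixed prefix code A=0, B=10, C=110, D=111 */
#include <stdio.h>

/* node[s][bit]: next state, or -1-symbol at a leaf */
static const int node[3][2] = {
    { -1, 1 },    /* state 0: bit 0 emits A, bit 1 goes to state 1 */
    { -2, 2 },    /* state 1: bit 0 emits B */
    { -3, -4 }    /* state 2: bit 0 emits C, bit 1 emits D */
};

int main(void)
{
    const int bits[10] = {0,1,0,0,1,1,1,1,1,0};  /* "ABADC" */
    int state = 0, i;
    for (i = 0; i < 10; i++) {
        int next = node[state][bits[i]];
        if (next < 0) {              /* leaf: emit symbol, restart */
            putchar('A' - 1 - next);
            state = 0;
        } else
            state = next;
    }
    putchar('\n');                   /* prints ABADC */
    return 0;
}

One table lookup per channel bit, and no buffering to speak of; the
synchronization codes Thomas mentions would just force the state back to
0 at known points in the stream.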
madler@nntp-server.caltech.edu (Mark Adler) (04/22/91)
In article <1991Apr21.163913.2249@smsc.sony.com> dce@smsc.sony.com (David Elliott) writes:
>In article <1991Apr21.002203.4414@nntp-server.caltech.edu> madler@nntp-server.caltech.edu (Mark Adler) writes:
>>However, they could have done a lossless compression, using differential
>>methods, and gotten about twice the time (well over two hours) on a CD.
>
>Are you sure about that? Can you do lossless compression of sound in
>general with differential methods? How do you handle big zero-crossings
>of fairly high-frequency square-ish waves efficiently?

I'm sure that you can get about 2:1 compression losslessly on the
average. It would vary with the source material, of course. There is no
way to get 2:1 all the time, over short pieces of sound. But some
regions will be less, some more.

CD players can already handle some buffering to vary the rate of data
off the disc while keeping an exactly constant rate to the D/A
converters. Come to think of it, this may explain why it wasn't done--it
would require more RAM in the decoder. As was pointed out, the
processing requirements for decompression would be minimal. But the
buffering may not be.

Mark Adler
madler@pooh.caltech.edu
aipdc@castle.ed.ac.uk (Paul Crowley) (04/22/91)
In article <1991Apr21.020231.8109@bmerh408.bnr.ca> myhui@bnr.ca (Michael Hui) writes:
>I wonder why no compression was used?

One possibility: some parts of the music could be more compressible than
others. That means that you might get 50k of data one second and 20k the
next. But you can't change the speed the disc spins at quickly: you have
to get the same amount of data in one second as the next. Therefore no
compression.
 ____
 \/ o\  Paul Crowley  aipdc@castle.ed.ac.uk
 \ /
 /\__/  Part straight. Part gay. All queer.
 \/
tmb@ai.mit.edu (Thomas M. Breuel) (04/22/91)
   One possibility: some parts of the music could be more compressible
   than others. That means that you might get 50k of data one second
   and 20k the next. But you can't change the speed the disc spins at
   quickly: you have to get the same amount of data in one second as
   the next. Therefore no compression.

Data compression only speeds up transfers (for regions that compress
poorly, you simply turn it off), so data compression never causes the
disc reads to fall behind. On the other hand, you can start/stop reading
from a CD easily when data is coming in too fast. Maybe data compression
would increase the amount of buffering needed somewhat, but it is
unlikely that the effect would be large enough to make the use of data
compression impractical even on low-end systems.
nbvs@cl.cam.ac.uk (Nicko van Someren) (04/22/91)
As I see it there are three reasons why the CD standard people (Philips/Sony)
did not put in compression:
1) They would need extra hardware to decode it and the cost was high enough
already.
2) The data rate would end up uneven because some passages would compress
better than others. Do you remember how much RAM cost in 1983?
3) Using compression would make it harder to gloss over the errors that the
correction hardware could not fix. If you lose raw data it is probably
easier to guess what it was.
The whole idea of 'lossless' or 'lossy' compression of signals that are to be
put into your ear seems a bit silly to me anyway. The fact is it has a lot to
do with the person in question. If you look at the data that gets stored,
most of it is only about 13 bits, as most music spends its time at least 10dB
down from the peak level. 16-bit linear is too high a resolution at high
amplitudes and too low a resolution at low ones. A 12 or 13 bit log system
would give better quality for the dynamic ranges that music has and take up
only 3/4 of the space (or give you 1/3 more time). While log ADCs cost more,
you only have to make a very few of them, and log DACs are pretty simple.
So, to go back to the original topic of the thread: if you want to compress
your sound files, try storing 10-bit log values. It gives you 1.6:1
compression, and I doubt you will notice the 'loss'.
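A sketch of the 10-bit log idea using a mu-law style curve (the exact
log format Nicko has in mind may differ; mu = 255 is borrowed from
telephony):

/* 16-bit linear <-> 10-bit log (1 sign bit + 9-bit log magnitude) */
#include <stdio.h>
#include <math.h>

#define MU 255.0

static int lin_to_logcode(int x)
{
    int sign = (x < 0);
    double m = (sign ? -(double)x : (double)x) / 32768.0;
    int mag = (int)(511.0 * log(1.0 + MU * m) / log(1.0 + MU) + 0.5);
    return (sign << 9) | mag;
}

static int logcode_to_lin(int c)
{
    double y = (c & 511) / 511.0;
    double m = (pow(1.0 + MU, y) - 1.0) / MU;
    int x = (int)(m * 32768.0 + 0.5);
    return (c & 512) ? -x : x;
}

int main(void)
{
    int test[5] = {5, 100, 1000, 10000, -32000}, i;
    for (i = 0; i < 5; i++) {
        int c = lin_to_logcode(test[i]);
        printf("%6d -> 0x%03X -> %6d\n", test[i], c, logcode_to_lin(c));
    }
    return 0;
}

The round-trip error grows with the signal, i.e. the resolution is spent
at low amplitudes, which is where Nicko argues it matters.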
Nicko
+-----------------------------------------------------------------------------+
| Nicko van Someren, nbvs@cl.cam.ac.uk, (44) 223 358707 or (44) 860 498903 |
+-----------------------------------------------------------------------------+
ge@dbf.kun.nl (Ge' Weijers) (04/23/91)
abl@thevenin.ece.cmu.edu (Antonio Leal) writes:
>Incidentally, the "1-bit converter" business is delta mod. Do some
>manipulation on the 16-bit samples, and feed a delta converter at a rate
>high enough to sound good. Sell as a major improvement (well, it _is_
>guaranteed to be monotonic, which 16-bit DACs should, but may not, be).

The "do some manipulation" makes the result less than perfectly
monotonic. A straightforward implementation of delta modulation would
need a terribly high sampling frequency (2^16 * 44 kHz), so the digital
data is preprocessed, and this process adds noise. Care is taken to put
most of the noise above 22 kHz by a process called 'noise shaping'. I
don't know how that works, as the publications I've seen are less than
clear.

Ge'
--
Ge' Weijers                            Internet/UUCP: ge@cs.kun.nl
Faculty of Mathematics and Computer Science, (uunet.uu.net!cs.kun.nl!ge)
University of Nijmegen, Toernooiveld 1
6525 ED Nijmegen, the Netherlands      tel. +3180652483 (UTC-2)
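For what it's worth, the simplest form of noise shaping is first-order
error feedback: quantize coarsely, but add each sample's quantization
error back in before quantizing the next. The error then sees a
(1 - z^-1) response -- little power at low frequencies, most of it
pushed up toward half the sample rate. A toy sketch (the coarse 8-bit
quantizer and the signal are invented for illustration):

/* first-order noise shaping by error feedback */
#include <stdio.h>

#define QSTEP 256    /* keep only the top 8 of 16 bits */

int main(void)
{
    int x[16], q, err = 0, i;
    for (i = 0; i < 16; i++)          /* a slowly rising "signal" */
        x[i] = 1000 + 13 * i;
    for (i = 0; i < 16; i++) {
        int v = x[i] + err;           /* feed back previous error */
        q = (v >= 0 ? v + QSTEP / 2 : v - QSTEP / 2) / QSTEP * QSTEP;
        err = v - q;                  /* error pushed into next sample */
        printf("x %5d  q %5d  err %4d\n", x[i], q, err);
    }
    /* the output dithers between adjacent coarse levels so that its
     * local average tracks x; the rapid dithering is the shaped noise,
     * parked at high frequency where a filter (or the ear) loses it */
    return 0;
}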
jk87377@cc.tut.fi (Juhana Kouhia) (04/24/91)
Has anyone researched the possibility of using fractal compression (IFS)
on music? I can see that music contains a lot more self-similarity than
pictures; for example, listen to disco hit music. The same goes for some
symphonies. It might be possible to compress a file by finding those
'same' waveforms, saving one of them, and storing the others relative to
the saved one -- because the differences between the saved waveform and
the others are minor, only a few bits are needed to store the relative
waveforms. (Huh, got that? My English is not good; well, you might have
noticed that already. :-) Maybe a 1:2 compression ratio... or worse.

Juhana Kouhia
mskuhn@faui09.informatik.uni-erlangen.de (Markus Kuhn) (04/25/91)
Scientists at the University of Erlangen have (as far as I know)
developed a lossy sound compression method for voice, music, etc. You
can select the compression ratio freely. With 64 kbit/s you get the hifi
quality you are used to receiving from radio. At 128 kbit/s even experts
have _very_ big problems hearing any difference from the original CD
data. The algorithm can be implemented in real time on a DSP and is
under discussion for use in ISDN telephones.

The human ear has, for each frequency, a certain perception threshold.
If a sound is weaker than this threshold, you have no chance of hearing
it. A loud frequency component raises the perception threshold for
nearby frequencies. The algorithm uses this effect in order to throw
away the data you can't hear.

If you are interested in this, I can ask the people here whether there
are any English publications.

Markus
--
Markus Kuhn, Computer Science student -- University of Erlangen, Germany
E-mail: G=Markus;S=Kuhn;OU1=rrze;OU2=cnve;P=uni-erlangen;A=dbp;C=de
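A crude numerical illustration of that effect (all constants invented; a
real coder uses measured psychoacoustic curves and works in critical
bands, not single FFT bins):

/* mark which spectral components survive a simple masking threshold */
#include <stdio.h>

#define BINS 16

int main(void)
{
    /* made-up magnitude spectrum: two loud tones plus weak detail */
    double mag[BINS] = {2,1,3,900,40,12,2,1,1,2,1,1,300,9,2,1};
    double thresh[BINS];
    int i, j, kept = 0;
    for (j = 0; j < BINS; j++) {
        thresh[j] = 4.0;              /* threshold in quiet (invented) */
        for (i = 0; i < BINS; i++) {
            int d = (i > j) ? i - j : j - i;
            double m = 0.05 * mag[i]; /* masking strength (invented) */
            while (d--) m *= 0.5;     /* fades ~6 dB per bin (invented) */
            if (m > thresh[j]) thresh[j] = m;
        }
    }
    for (j = 0; j < BINS; j++) {
        int audible = mag[j] > thresh[j];
        if (audible) kept++;
        printf("bin %2d: mag %6.1f  thresh %6.1f  %s\n",
               j, mag[j], thresh[j], audible ? "keep" : "drop");
    }
    printf("%d of %d components get bits; the rest are inaudible\n",
           kept, BINS);
    return 0;
}

The weak components sitting next to the loud tones get dropped; the same
weak components in a quiet stretch would be kept.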
weigl@sibelius.inria.fr (Konrad Weigl) (05/05/91)
In article <1991Apr23.221537.21108@cc.tut.fi>, jk87377@cc.tut.fi (Juhana Kouhia) writes:
>
> Has anyone researched the possibility of using fractal compression
> (IFS) on music?
> I can see that music contains a lot more self-similarity than
> pictures; for example, listen to disco hit music.
> The same goes for some symphonies.
>
> It might be possible to compress a file by finding those 'same'
> waveforms, saving one of them, and storing the others relative to
> the saved one --

Look at the newsgroup comp.fractals. Otherwise, there are quite a few
papers on IFS and time series out there, I believe; you just have to
look them up. As far as compressing music via "waveforms" goes, look at
Gabor transforms or wavelet decomposition: some French groups are
heavily into this; they call them "ondelettes", which means wavelets.

Konrad Weigl                       Tel. (France) 93 65 78 63
Projet Pastis                      Fax (France) 93 65 78 58
INRIA-Sophia Antipolis             email weigl@mirsa.inria.fr
2004 Route des Lucioles
B.P. 109
06561 Valbonne Cedex
France