[comp.compression] Compressing SOUND files

warwick@cs.uq.oz.au (Warwick Allison) (03/25/91)

The sound files I use are in a simple, common format: a series
of amplitude samples, taken at regular time intervals (at about
11 kHz, or something).

Does anyone have any information on compressing such a class of
information?  The class no doubt includes many other real-world
samples.

Some obvious things are:

	If samples are taken at very high frequencies, there
	is a high correlation between successive samples, and
	a run-length encoding of DIFFERENCES between successive
	samples would be good.
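
[A minimal sketch of the idea above, not taken from any particular
tool: difference successive samples, then run-length encode the
differences.  The function names and the [value, count] pair layout
are assumptions made for illustration.  It only pays off when long
runs of equal differences actually occur.]

```python
def delta_encode(samples):
    """Return the first sample followed by successive differences."""
    if not samples:
        return []
    deltas = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        deltas.append(cur - prev)
    return deltas


def run_length_encode(values):
    """Collapse runs of equal values into [value, count] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def run_length_decode(runs):
    """Expand [value, count] pairs back into a flat list."""
    values = []
    for v, count in runs:
        values.extend([v] * count)
    return values


def delta_decode(deltas):
    """Rebuild the samples by accumulating the differences."""
    samples = []
    total = 0
    for d in deltas:
        total += d
        samples.append(total)
    return samples


samples = [10, 12, 14, 16, 18, 18, 18, 17]
runs = run_length_encode(delta_encode(samples))
restored = delta_decode(run_length_decode(runs))
```

[This round trip is lossless: `restored` equals `samples`.]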

Unfortunately, most samples are taken at a rate which is JUST
sufficient to capture the sound (twice the highest frequency
present, and all that stuff).

My first guess is that a lossy compression method would perhaps
work well (eg. one which first removes all the `noise' from the
channel).

Any ideas/references?

Warwick.
--
  _--_|\   	warwick@cs.uq.oz.au
 /      *  <--	Computer Science Department,
 \_.--._/ 	University of Queensland,
       v    	AUSTRALIA.

<LEEK@QUCDN.QueensU.CA> (03/26/91)

I am not into this area, but I came across a little bit of material on the
subject...  If you don't mind the loss of sound quality, then there is a
technique called "Fibonacci Delta Compression" (1)

This is like traditional delta encoding, where each value is stored as a
delta from the previous one, but only a fixed set of discrete delta values
is used.  Small deltas are encoded precisely, and large ones are
approximated.  From time to time, some minor adjustments have to be made to
correct for all those approximations.  This is more of a post-processing
thing unless your hardware happens to be able to capture the incoming
digitized samples without much help from the CPU (eg. DMA)

The values he uses are: { -34,-21,-13,-8,-5,-3,-2,-1,0,1,2,3,5,8,13,21}
(These numbers are part of the Fibonacci series, hence the name.)  Note
that with only 16 values, each delta can be encoded in 4 bits, at the cost
of some minor distortion.
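
[A minimal sketch of the scheme as described above, not the code from
the reference.  Assumptions made for illustration: samples are signed
8-bit values, the first sample is stored verbatim, the encoder picks
the table entry that lands closest to the target, and the
reconstructed value is kept within -128..127.  Function names are
hypothetical.]

```python
# The 16-entry delta table quoted above; a code is an index 0..15 (4 bits).
CODE_TO_DELTA = [-34, -21, -13, -8, -5, -3, -2, -1, 0, 1, 2, 3, 5, 8, 13, 21]


def fib_delta_encode(samples):
    """Return (first_sample, codes) for a non-empty list of 8-bit samples."""
    first = samples[0]
    codes = []
    current = first  # the value the decoder will have reconstructed so far
    for target in samples[1:]:
        # Pick the code whose delta lands closest to the target while
        # staying in range.  Code 8 (delta 0) is always a valid choice.
        best = min(
            (c for c in range(16)
             if -128 <= current + CODE_TO_DELTA[c] <= 127),
            key=lambda c: abs(current + CODE_TO_DELTA[c] - target),
        )
        codes.append(best)
        current += CODE_TO_DELTA[best]
    return first, codes


def fib_delta_decode(first, codes):
    """Rebuild an approximation of the samples from the 4-bit codes."""
    samples = [first]
    for c in codes:
        samples.append(samples[-1] + CODE_TO_DELTA[c])
    return samples


first, codes = fib_delta_encode([0, 1, 3, 6, 40, 41])
decoded = fib_delta_decode(first, codes)
```

[Because the encoder tracks the decoder's reconstructed value rather
than the true previous sample, an approximation error on a large jump
is corrected over the following samples instead of accumulating.]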

K. C. Lee
Elec. Eng. Grad. Student

Reference
1: Algorithm by Steve Hayes at Electronic Arts.  It is outlined, with a C
code example of the decoding algorithm, in "Amiga ROM Kernal Reference Manual:
EXEC" by Commodore Business, published by Addison Wesley (ISBN 0-201-11099-7),
at Appendix B-68.  This is part of the list of program examples for a common
file format (sound, graphics and lots of things) developed for (but not
restricted to) the Commodore Amiga Computer.  Most of this material is in the
public domain.

golds@fjc.GOV (Rich Goldschmidt) (03/27/91)

In article <394@uqcspe.cs.uq.oz.au>, warwick@cs.uq.oz.au (Warwick Allison) writes:
> 
> Sound files I use are of a simple, common format where a series
> of amplitude samples, taken at regular time intervals (at about
> 11KHz, or something).
> 
> Warwick.
> --
>   _--_|\   	warwick@cs.uq.oz.au
>  /      *  <--	Computer Science Department,
>  \_.--._/ 	University of Queensland,
>        v    	AUSTRALIA.

This may not match your needs, but may be of general interest.

I saw a recent announcement for a Walkman sized box that plugs into a 
parallel port on one side, and audio input from a mike or amplifier on
the other.  It uses a 16-bit adaptive differential pulse code modulation
(ADPCM) method that takes about a quarter of the space PCM coding needs.  It is
available now, costs about $250, does not take a slot, and complies with 
Microsoft's multimedia extensions to Windows 3.0.  Contact Meridian Data,
(408) 438-3100 and ask about "SoundByte".  

-- 
Rich Goldschmidt: uunet!fjcp60!golds or golds@fjc.gov
Commercialization of space is the best way to escape the zero-sum economy.
Disclaimer: I don't speak for the government, and it doesn't speak for me...