[comp.sys.next] Soundfiles

kasdan@cunixa.cc.columbia.edu (John Kasdan) (03/25/91)

I realize that this is a very basic question, but I did look in the
FAQ and couldn't find the answer, so ...

What is a reference for the format of the Soundfiles?  From the
on-line documentation, I get the impression that data is stored in the
form of 1 byte which is interpretted by something called mulaw.  How
do I find out what all this means?  To be more specific, I would like
to be able to get my hands on data in pure amplitude form, so as to be
able to do FFT work on it.  Like, is there some form of conversion to
16 bit format, and would that do what I want?

Again, I apologize if this is too basic for this group, but I didn't
know where else to ask, and our local "gurus" were not helpful on this.
/KAS            
John Kasdan,  Columbia University School of Law
"If there weren't so many people righting wrongs, there wouldn't be so
many wrongs to right". H.J. Simon, Bridge in the Menagerie.

cpenrose@sdcc13.ucsd.edu (Christopher Penrose) (03/26/91)

In article <1991Mar24.195924.3238@cunixf.cc.columbia.edu> kasdan@cunixa.cc.columbia.edu (John Kasdan) writes:
>
>What is a reference for the format of the Soundfiles?  From the
>on-line documentation, I get the impression that data is stored in the
>form of 1 byte which is interpretted by something called mulaw.  How
>do I find out what all this means?  To be more specific, I would like
>to be able to get my hands on data in pure amplitude form, so as to be
>able to do FFT work on it.  Like, is there some form of conversion to
>16 bit format, and would that do what I want?

look at the header files in /usr/include/sound.  Here is an excerpt from
soundconvert.h:

unsigned char SNDMulaw(short n);
short SNDiMulaw(unsigned char m);
/*
 * Routines to convert from 8 bit mulaw sound to/from 16 linear sound.
 * SNDMulaw returns the mulaw value for the given 16 bit linear value,
 * and SNDiMulaw returns the 16 bit linear value for the given mulaw value.
 */

soundstruct.h will give you some info on the header that is prepended to 
each NeXT soundfile.  Look at the on-line documentation again.  It is
very helpful. 
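
For anyone without the headers at hand, here is a rough sketch of reading
that prepended header from a byte buffer.  The field order (magic number,
data offset, data size, format code, sampling rate, channel count, each a
32-bit big-endian integer), the ".snd" magic value, and the format codes
1 (8-bit mu-law) and 3 (16-bit linear) are written from memory of the
NeXT/Sun layout and should be checked against soundstruct.h.  The names
SoundHeader, read_be32 and parse_sound_header are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values; verify against soundstruct.h before relying on them. */
#define SND_MAGIC_ASSUMED 0x2e736e64u   /* the four bytes ".snd" */
#define FORMAT_MULAW_8    1u            /* assumed code for 8-bit mu-law */
#define FORMAT_LINEAR_16  3u            /* assumed code for 16-bit linear */

/* Hypothetical in-memory copy of the six fixed header fields. */
typedef struct {
    uint32_t magic;
    uint32_t data_location;   /* byte offset of the first sample */
    uint32_t data_size;       /* number of bytes of sample data */
    uint32_t data_format;     /* one of the format codes above */
    uint32_t sampling_rate;   /* samples per second */
    uint32_t channel_count;
} SoundHeader;

/* Read a 32-bit big-endian integer regardless of host byte order. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is too short or the magic is wrong. */
int parse_sound_header(const unsigned char *buf, size_t len, SoundHeader *out)
{
    if (len < 24)
        return -1;
    if (read_be32(buf) != SND_MAGIC_ASSUMED)
        return -1;
    out->magic         = read_be32(buf);
    out->data_location = read_be32(buf + 4);
    out->data_size     = read_be32(buf + 8);
    out->data_format   = read_be32(buf + 12);
    out->sampling_rate = read_be32(buf + 16);
    out->channel_count = read_be32(buf + 20);
    return 0;
}
```

Once data_format is known, the samples start data_location bytes into the
file; mu-law bytes would then go through SNDiMulaw one at a time to get
16 bit linear values suitable for an FFT.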

Christopher Penrose
jesus!penrose

minich@unx2.ucc.okstate.edu (Robert Minich) (03/26/91)

kasdan@cunixa.cc.columbia.edu (John Kasdan) writes:
! What is a reference for the format of the Soundfiles?  From the
! on-line documentation, I get the impression that data is stored in the
! form of 1 byte which is interpretted [sic] by something called mulaw. How
! do I find out what all this means?  To be more specific, I would like
! to be able to get my hands on data in pure amplitude form, so as to be
! able to do FFT work on it.  Like, is there some form of conversion to
! 16 bit format, and would that do what I want?
 
by cpenrose@sdcc13.ucsd.edu (Christopher Penrose):
| look at the header files in /usr/include/sound.  Here is an excerpt from
| soundconvert.h:
| 
| unsigned char SNDMulaw(short n);
| short SNDiMulaw(unsigned char m);
| /*
|  * Routines to convert from 8 bit mulaw sound to/from 16 linear sound.
|  * SNDMulaw returns the mulaw value for the given 16 bit linear value,
|  * and SNDiMulaw returns the 16 bit linear value for the given mulaw value.
|  */
| 
| soundstruct.h will give you some info on the header that is prepended to 
| each NeXT soundfile.  Look at the on-line documentation again.  It is
| very helpful. 
| 
| Christopher Penrose
| jesus!penrose  

How about a reference that explains mulaw encoding for those of us
without these info sources. I don't have access to a NeXT (still too
expensive at this time...price is right, wallet is wrong...) but I want
to know how it works.

Thanks.
-- 
|_    /| | Robert Minich            |
|\'o.O'  | Oklahoma State University| "I'm not discouraging others from using
|=(___)= | minich@d.cs.okstate.edu  |  their power of the pen, but mine will
|   U    | - "Ackphtth"             |  continue to do the crossword."  M. Ho

eps@toaster.SFSU.EDU (Eric P. Scott) (03/31/91)

In article <1991Mar26.004133.4930@unx2.ucc.okstate.edu>
	minich@unx2.ucc.okstate.edu (Robert Minich) writes:
>How about a reference that explains mulaw encoding for those of us
>without these info sources.

Mu-law encoding is a form of nonuniform companding used for
digitized voice in U.S. telephony.  8 bit mu-law provides a
small-signal S/N ratio and dynamic range roughly equivalent to a
12-bit linear representation--thus it can be thought of as a data
compression technique.  [In practice, it *is* losing information
(and the difference between the original and the encoded version
is digital noise); it's designed to get 50% more "value" out of a
communications channel without sacrificing too much speech
intelligibility.  It's not something you'd want to use for
arbitrary audio.]  One property it does share with compressed
data is that "you can't do anything useful with it" without
expanding it first.

mu-law converts normalized real amplitudes according to

		input: 0 <= |x| <= 1

		    log ( 1 + {mu} |x| )
		y = --------------------
		      log ( 1 + {mu} )

		parameter {mu} = 255

[8-bit mu-law is a fixed-point representation, and something I
read suggests that it may actually be a 15-segment linear
approximation to the above.  I confess to not quite doing my
homework on this one; I need to track down a copy of CCITT
Recommendation G.711 next.]
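
For what it's worth, here is a sketch of the segment-based form as it is
commonly circulated in public C code: each byte, once inverted, holds a
sign bit, a 3-bit segment number and a 4-bit step within the segment.
This is written from memory, not from G.711 itself, so treat the
constants (bias 132, clip 32635) and the 16-bit output scaling as
assumptions.  The names mulaw_decode and mulaw_encode are made up, and
the NeXT SNDiMulaw/SNDMulaw routines may scale their values differently.

```c
#define MULAW_BIAS 0x84   /* 132, added before the segment is located */
#define MULAW_CLIP 32635  /* largest magnitude that survives the bias */

/* Expand one 8-bit mu-law code to a 16-bit linear sample. */
short mulaw_decode(unsigned char code)
{
    unsigned u = ~code & 0xffu;          /* codes are stored inverted */
    int t = (int)((u & 0x0fu) << 3) + MULAW_BIAS;
    t <<= (u & 0x70u) >> 4;              /* scale by the segment number */
    return (short)((u & 0x80u) ? (MULAW_BIAS - t) : (t - MULAW_BIAS));
}

/* Compress one 16-bit linear sample to an 8-bit mu-law code. */
unsigned char mulaw_encode(short pcm)
{
    int sample = pcm;
    int sign = 0;
    int exponent = 7;
    int mask, mantissa;

    if (sample < 0) {
        sample = -sample;
        sign = 0x80;
    }
    if (sample > MULAW_CLIP)
        sample = MULAW_CLIP;
    sample += MULAW_BIAS;

    /* Find the segment: position of the highest set bit, from bit 14 down. */
    for (mask = 0x4000; (sample & mask) == 0 && exponent > 0; mask >>= 1)
        exponent--;

    mantissa = (sample >> (exponent + 3)) & 0x0f;
    return (unsigned char)(~(sign | (exponent << 4) | mantissa) & 0xff);
}
```

With this scaling, adjacent codes in the lowest segment decode to values
8 apart, while adjacent codes in the highest segment are 1024 apart.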

Anyway, since mu-law quantization levels are more closely spaced
at low amplitudes, the S/N ratio for quiet signals improves, while
at higher amplitudes the signal itself tends to mask the coarser
quantization noise.  Low frequencies with changing amplitudes would
have the most perceptible problems, but the nominal bandpass
filtering used with 8 kHz sampling attenuates frequencies below
300 Hz ... so we stay out of trouble.

					-=EPS=-