[comp.sys.next] VoiceMail format

bhine@pioneer.arc.nasa.gov.arpa (Butler Hine sst) (12/09/88)

[gobble]

I kept the sample of NeXT voice mail sent out a while ago, but I never
saw any explanation of the format.  Does anyone know what the format is?
I saw a note from someone saying they had decoded it, but I can't find
their address.  Help!  Thanks in advance.

			Butler Hine
			NASA Ames Research Center
			hine@galileo.arc.nasa.gov

desnoyer@Apple.COM (Peter Desnoyers) (12/09/88)

In article <19355@ames.arc.nasa.gov> bhine@nike.UUCP (Butler Hine  sst) writes:
>[gobble]
>
>I kept the sample of NeXT voice mail sent out a while ago, but I never
>saw any explanation of the format.  Does anyone know what the format is?
>I saw a note from someone saying they had decoded it, but I can't find
>their address.  Help!  Thanks in advance.
>
>			Butler Hine
>			NASA Ames Research Center
>			hine@galileo.arc.nasa.gov

I posted that message. The format is simple, like uuencoded data but
just a hair different. The binary data is standard mu-law PCM voice at
8000 samples/second, with 8-bit samples.

The ASCII format is lines of the form:
<space>abcdabcd...(64 chars)

where each group of 4 characters ('abcd') decodes to 3 bytes as
follows: subtract 33 (decimal) from each character to get a 6-bit
value. Concatenate the four 6-bit values into 24 bits and split them
into three 8-bit bytes, so a full 64-character line carries 48 bytes.

				Peter Desnoyers
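
The decoding rule described above can be sketched in Python. This is only
an illustration of the post's description, not NeXT's own code: the name
decode_line is made up here, and the bit order (first character holds the
most significant bits, as in uuencode) is an assumption the post does not
confirm.

```python
def decode_line(line: str) -> bytes:
    """Decode one text line: a leading space, then groups of 4 characters."""
    body = line.rstrip("\r\n")
    if body.startswith(" "):
        body = body[1:]          # drop the leading <space>
    if len(body) % 4 != 0:
        raise ValueError("expected a multiple of 4 characters")

    out = bytearray()
    for i in range(0, len(body), 4):
        # Subtract 33 from each character to get a 6-bit value.
        vals = [ord(c) - 33 for c in body[i:i + 4]]
        if any(v < 0 or v > 63 for v in vals):
            raise ValueError("character outside the 6-bit range")
        # ASSUMPTION: most significant bits first, as in uuencode.
        word = (vals[0] << 18) | (vals[1] << 12) | (vals[2] << 6) | vals[3]
        out += word.to_bytes(3, "big")   # 24 bits -> 3 bytes
    return bytes(out)
```

Under that bit-order assumption, decode_line(" 15*$") gives the three
bytes of "ABC"; the result is raw mu-law sample data, one byte per sample.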

rminnich@super.ORG (Ronald G Minnich) (12/13/88)

In article <21949@apple.Apple.COM> desnoyer@Apple.COM (Peter Desnoyers) writes:
>just a hair different. The binary data is standard mu-law PCM voice at
                                                    ^^^^^^^^^^
Is it possible to explain this in, say, less than 500 words. 
An equation maybe? This is outside of my domain ...
Thanks,
ron

wcs@skep2.ATT.COM (Bill.Stewart.[ho95c]) (12/14/88)

In article <2062@super.ORG> rminnich@duper.UUCP (Ronald G Minnich) writes:
:In article <21949@apple.Apple.COM> desnoyer@Apple.COM (Peter Desnoyers) writes:
:>just a hair different. The binary data is standard mu-law PCM voice at
:                                                    ^^^^^^^^^^
:Is it possible to explain this in, say, less than 500 words. 

Well, to start with, PCM is Pulse Code Modulation: you sample the analog
waveform at N samples per second (the telephone business uses 8000
samples/sec) and send a digital code representing the amplitude of each
sample.  There are two standard encodings around: the USA and Japan use
mu-law, and Europe uses A-law.  Both take basically the same approach: 8
bits of data with a non-linear representation.  Because human hearing is
non-linear, you preserve the most sound fidelity if you represent low
amplitudes more precisely and higher ones less precisely.  So the change
in sound level between byte values 2 and 3 is much smaller than the change
between 126 and 127.  I don't have my formulas handy, but the basic
difference between mu-law and A-law is whether there's a code to represent
0, or whether the encoding is symmetric, with +/- epsilon represented by
codes 0 and -1.
-- 
#				Thanks;
# Bill Stewart, AT&T Bell Labs 2G218 Holmdel NJ 201-949-0705 ho95c.att.com!wcs
#
#	News.  Don't ask me about News.
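
For readers who want the equation asked for above, the usual textbook
forms of the two companding curves are given here. They are the smooth
idealizations with x normalized to [-1, 1]; the encodings used in
telephony are piecewise-linear approximations to them, so treat this as
a sketch of the idea rather than the exact code tables.

```latex
% mu-law (USA, Japan), with \mu = 255
F_{\mu}(x) = \operatorname{sgn}(x)\,
             \frac{\ln\bigl(1 + \mu\,|x|\bigr)}{\ln(1 + \mu)},
             \qquad -1 \le x \le 1

% A-law (Europe), with A = 87.6
F_{A}(x) = \operatorname{sgn}(x)
  \begin{cases}
    \dfrac{A\,|x|}{1 + \ln A},             & |x| < \dfrac{1}{A} \\[2ex]
    \dfrac{1 + \ln\bigl(A\,|x|\bigr)}{1 + \ln A}, & \dfrac{1}{A} \le |x| \le 1
  \end{cases}
```

Both curves are steep near x = 0 and flat near |x| = 1, which is what
gives quiet signals finer resolution than loud ones.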

desnoyer@Apple.COM (Peter Desnoyers) (12/14/88)

In article <2062@super.ORG> rminnich@duper.UUCP (Ronald G Minnich) writes:
>In article <21949@apple.Apple.COM> desnoyer@Apple.COM (Peter Desnoyers) writes:
>>just a hair different. The binary data is standard mu-law PCM voice at
>                                                    ^^^^^^^^^^
>Is it possible to explain this in, say, less than 500 words. 
>An equation maybe? This is outside of my domain ...
>Thanks,
>ron

Sure. Mu-law encoding is what everyone uses (e.g. Sprint) unless
they're really cheap. (Note that almost all telephone transmission
except the local loop to your phone is now done digitally, usually in
this format.)

The way it works can be viewed thus: Sample voice at 13 bits per
sample, 8000 times per second. Apply a sort-of-logarithmic function to
squeeze these 13 bits into 8: +1 -> +1, while +4096 -> +128. Unsqueeze
it at the destination. Thus your sampling noise is very small for
quiet signals at the cost of increased sampling noise for loud
signals. The trade-off must be optimal - Bell Labs spent years
researching it. (1/2 :-)

The function is actually a piece-wise linear approximation to
something sort of logarithmic, most likely because it was easier to
do back in the 60's when they first started using digital
transmission. (Yup, even then it was cheaper to spend _lots_ of money
on digital hardware than it was to find space to string more wires
underground in Manhattan. I think it was the first application of
transistors in the public network.)

[I apologize for any technical inaccuracies or vagueness. I couldn't
find any of my references on mu-law. I have a mu-law to linear table
on line, but the machine is down. Oh well...]

				Peter Desnoyers
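
The squeeze/unsqueeze idea described above can be sketched with the
smooth textbook mu-law curve (mu = 255). This is an illustration only:
the real telephone encoding is the piecewise-linear approximation the
post mentions and packs its bits differently, and the function names,
the 4096 full-scale value (taken from the post's +4096 figure), and the
signed -127..127 code range are all assumptions made for the example.

```python
import math

MU = 255  # standard mu-law parameter


def compress(x: float) -> float:
    """Smooth mu-law curve: maps x in [-1, 1] onto [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y: float) -> float:
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode(sample: int, full_scale: int = 4096) -> int:
    """Squeeze a linear sample into a signed code in -127..127 (assumed range)."""
    x = max(-1.0, min(1.0, sample / full_scale))   # normalize and clip
    return round(compress(x) * 127)


def decode(code: int, full_scale: int = 4096) -> int:
    """Unsqueeze a code back into an approximate linear sample."""
    return round(expand(code / 127) * full_scale)


if __name__ == "__main__":
    # Compare the round-trip error for a quiet and a loud sample.
    for s in (10, 3000):
        print(s, encode(s), decode(encode(s)))
```

Running it shows the trade-off: the quiet sample comes back essentially
unchanged, while the loud one comes back with a noticeably larger
absolute error.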

keith@prism.gatech.EDU (Keith Edwards) (11/21/89)

Forgive me if this has already gone by, but is there any documentation
available (from NeXT or otherwise) which describes the VoiceMail format?  I've
dug through the online documentation but can't find anything.

Thanks,
Keith


-- 
keith edwards -- the software engineering research center  georgia tech
  internet:  keith@gatech.edu                              atlanta, ga 
    uucp:  {the_known_world}!gatech!keith                  30332-0280