[comp.compression] Voice Standards

sean@hazel.eos.scg.hac.com (05/29/91)

I'm looking into multimedia standards and compression methods.  Can someone 
tell me about voice standards and compression techniques?  Are there any 
standards for handling voice data?  Are there any standard compression 
techniques for voice data?  Would this be covered by MPEG, or is there
a separate voice compression standard?

Thanks for any info.

glenn@zeus.ocs.com (Glenn Ford) (06/03/91)

In article <15060@hacgate.UUCP> sean@hazel.eos.scg.hac.com () writes:
>I'm looking into multimedia standards and compression methods.  Can someone 
>tell me about voice standards and compression techniques?  Are there any 
>standards for handling voice data?  Are there any standard compression 
>techniques for voice data?  Would this be covered by MPEG, or is there
>a separate voice compression standard?
>
>Thanks for any info.

What I do know, albeit not a lot (yet, I hope), is that multimedia voice
data in digital form must be in either PCM (Pulse Code Modulation) or
ADPCM: the former for the IFF standards, and ADPCM for the RTF (Real Time
Files) format.  This, however, is a Philips standard; I do not think there
is an International Standard, and there probably will not be one for a long
time.  As for compression, we do not compress our data, due to the lack of
processor speed.  Our multimedia platform (Philips CD-I) is meant to be a
simple, inexpensive machine for CONSUMERS.  That's the whole waahoo over
multimedia.  So I can't help you there, sorry!

Glenn Ford
glenn@ocsmd.ocs.com
..uunet!ocsmd!glenn

campbell@sol.cs.wmich.edu (Paul Campbell) (06/04/91)

Look again. There are some Internet standards (although shaky at best) that
use far better compression than ADPCM. Almost all of them are based on
LPC coding. In LPC coding, the best numbers I have seen were from Dr.
Markhoul (did I get the name right? I don't have my notes pile nearby), in
which he got the bit rate down to a more reasonable 150 bps or less.

kornai@csli.Stanford.EDU (Andras Kornai) (06/04/91)

In <1991Jun3.203049.7349@sol.cs.wmich.edu> campbell@sol.cs.wmich.edu (Paul Campbell) writes:

>In LPC coding, the best numbers I have seen were from Dr.
>Markhoul (did I get the name right? I don't have my notes pile nearby), in
>which he got the bit rate down to a more reasonable 150 bps or less.

Dream on! Your average language has some 40 to 60 phonemes, so you
need 5 to 6 bits/phoneme. In moderately fast speech there are 15
phonemes/sec -- this gives an absolute lower limit around 80 bps.  If
you want to code intonation/prosody at all, you will need another 4
bits/phoneme, bringing it up to 100-200bps, depending on speech rate.
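
The arithmetic above can be checked with a rough Python sketch. It assumes
a fixed-length code of log2(N) bits for an inventory of N phonemes and a
speech rate of 10 to 20 phonemes/sec around the "moderately fast" 15; the
function names and the 10-20 range are illustrative assumptions, not taken
from any standard or from the posts.

```python
import math

def bits_per_phoneme(inventory_size):
    # Bits needed to pick one phoneme out of the inventory (assumes a
    # uniform, fixed-length code; no entropy coding).
    return math.log2(inventory_size)

def phonemic_bit_rate(inventory_size, phonemes_per_sec, prosody_bits=0):
    # Bits per second = (phoneme bits + optional prosody bits) * rate.
    return (bits_per_phoneme(inventory_size) + prosody_bits) * phonemes_per_sec

if __name__ == "__main__":
    for size in (40, 60):
        bare = phonemic_bit_rate(size, 15)
        print(f"{size} phonemes, 15/sec, no prosody: {bare:.0f} bps")
    # Assumed slow and fast speech rates, with 4 prosody bits per phoneme.
    low = phonemic_bit_rate(40, 10, prosody_bits=4)
    high = phonemic_bit_rate(60, 20, prosody_bits=4)
    print(f"with prosody: roughly {low:.0f} to {high:.0f} bps")
```

Under these assumptions the bare phoneme stream comes out near 80 to 90
bps, and adding 4 bits/phoneme for prosody lands in the 100-200 bps range
quoted above.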

John Makhoul's work at BBN is at the cutting edge of research -- it is
very far from a product, even farther from a standard. I think he is
at 300bps (using some very sophisticated vector quantization (VQ) and
linear predictive coding (LPC) techniques) which is pretty impressive
compared to the 9.6kbps now more or less standard for voice
compression, but makes decompressed speech sound pretty synthetic.
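
To put the bit rates mentioned in this thread side by side, here is a
small Python sketch that converts each rate into bytes per minute of
speech and into a ratio against the 9.6 kbps figure. The labels and helper
names are hypothetical; only the three rates come from the posts.

```python
# Bit rates quoted in the thread, in bits per second.
RATES_BPS = {
    "claimed LPC figure": 150,
    "VQ + LPC research": 300,
    "common voice coder": 9600,
}

def bytes_per_minute(bps):
    # 60 seconds of speech, 8 bits per byte.
    return bps * 60 // 8

def ratio_vs(reference_bps, bps):
    # How many times smaller the stream is than the reference rate.
    return reference_bps / bps

if __name__ == "__main__":
    for label, bps in RATES_BPS.items():
        print(f"{label}: {bps} bps, {bytes_per_minute(bps)} bytes/min, "
              f"{ratio_vs(9600, bps):.0f}x smaller than 9.6 kbps")
```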

Andras Kornai (kornai@csli.stanford.edu)