[comp.dsp] US Gov. sends real-time audio at 2400 baud!

sl35746@uxa.cso.uiuc.edu (By-Tor) (11/09/90)

Hi.  I heard that the US government uses 2400 baud modems to send real-time
audio, to get a secure line anywhere in the world.  Does anyone know what
kind of algorithm they are using for compression?  I think it would be great
if we could make a real-time audio irc for people with sampling/playback
capability.  Anyways, consider that 2400 baud=2400 bits per second, meaning
without compression, you could get 2400 Hz sampling at 1-bit resolution.
This is totally unintelligible, so how the heck could they compress it
so much?  Obviously some kind of destructive method must be used.  I thought
maybe they look for duplicate waves or something.  Any ideas?

hammond@cod.NOSC.MIL (John A. Hammond) (11/09/90)

I suspect that there is some misinformation here.  The article
is probably referring to STU-IIIs, which have a data port which
currently runs at 2400 baud.  The instrument samples and sends
voice at a higher bit rate than 2400.  Some of the original
glossies on the beast promised 9600 baud data but that is
still a dream as far as I know.

jbuck@galileo.berkeley.edu (Joe Buck) (11/09/90)

In article <2449@cod.NOSC.MIL>, hammond@cod.NOSC.MIL (John A. Hammond) writes:
|> I suspect that there is some misinformation here.  The article
|> is probably referring to STU-IIIs, which have a data port which
|> currently runs at 2400 baud.  The instrument samples and sends
|> voice at a higher bit rate than 2400.  Some of the original
|> glossies on the beast promised 9600 baud data but that is
|> still a dream as far as I know.

There's misinformation, but only in article <2449@cod.NOSC.MIL>.  Yes,
the STU-III boxes are 2400 bps speech codecs; they use linear predictive
coding (LPC) to achieve this rate.  It doesn't sound all that great,
but it's 2400 bps, it's understandable, and you can recognize who is
at the other end (at least if there isn't background noise).

Commercial and military 2400 bps speech codecs have been around for
quite a while.  Conceptually they are very simple: break up the speech
into frames averaging about 20 msec; for each frame, estimate the pitch,
do LPC spectrum analysis, and send the pitch, the LPC parameters
(reflection coefficients or line spectrum frequencies), and the gain.
Then you do lots of tricks to improve the speech quality.
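
The per-frame analysis step can be sketched in a few lines.  What follows
is a minimal illustration in Python, not the LPC-10 algorithm itself: it
computes reflection coefficients and a gain for one frame using the
autocorrelation method and the Levinson-Durbin recursion.  The function
names, the default order of 10, and the gain definition are assumptions
made for the example; pitch estimation, quantization, and the quality
"tricks" are left out.

```python
import math

def autocorrelation(frame, max_lag):
    """Return r[0..max_lag] for one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations for the given autocorrelation.

    Convention: prediction error e[n] = x[n] + sum(a[j] * x[n - j]).
    Returns (a, k, err): predictor coefficients (a[0] == 1),
    reflection coefficients, and residual energy.
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            # Silent or degenerate frame: pad with zeros and stop.
            k.extend([0.0] * (order - len(k)))
            break
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        km = -acc / err
        k.append(km)
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + km * prev[m - j]
        a[m] = km
        err *= (1.0 - km * km)
    return a, k, err

def analyze_frame(frame, order=10):
    """Parameters a vocoder might send for one frame (pitch omitted)."""
    r = autocorrelation(frame, order)
    _, k, err = levinson_durbin(r, order)
    gain = math.sqrt(max(err, 0.0) / len(frame))  # assumed gain definition
    return {"reflection": k, "gain": gain}
```

At an 8 kHz sampling rate a 20 msec frame is 160 samples, so calling
analyze_frame on each 160-sample slice yields ten reflection coefficients
and one gain value per frame, which is far less data than the samples
themselves.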

There are now speech coders running at 9.6 Kbps that sound nearly as
good as a telephone line for voice; they typically use CELP (code
excited linear prediction) or a related algorithm.


--
Joe Buck
jbuck@galileo.berkeley.edu	 {uunet,ucbvax}!galileo.berkeley.edu!jbuck

pitaro@rocket.uucp (Michael Pitaro) (11/09/90)

In article <1990Nov8.210640.2893@ux1.cso.uiuc.edu> sl35746@uxa.cso.uiuc.edu (By-Tor) writes:

   From: sl35746@uxa.cso.uiuc.edu (By-Tor)
   Subject: US Gov. sends real-time audio at 2400 baud!
   Date: 8 Nov 90 21:06:40 GMT
   Sender: news@ux1.cso.uiuc.edu (News)

   Hi.  I heard that the US government uses 2400 baud modems to send real-time
   audio, to get a secure line anywhere in the world.  Does anyone know what
   kind of algorithm they are using for compression?  I think it would be great
   if we could make a real-time audio irc for people with sampling/playback
   capability.  Anyways, consider that 2400 baud=2400 bits per second, meaning
   without compression, you could get 2400 Hz sampling at 1-bit resolution.
   This is totally unintelligible, so how the heck could they compress it
   so much?  Obviously some kind of destructive method must be used.  I thought
   maybe they look for duplicate waves or something.  Any ideas?

An example of a secure telephone is a STU III.  It uses 1200 bps
LPC10-E for voice compression followed by encryption and a
Reed-Solomon block code for error recovery.  The voice quality is poor
but intelligible.  I understand the government is going to upgrade the
STU III to use 2400 bps CELP voice compression.  Motorola sells one
version of the STU III.  I hear AT&T sells another version.  The
encryption keys come from the government.  I don't know if either
company offers a commercial encryption version.  The source code for
both of these voice compression algorithms (without encryption) can be
ordered from NSA for a few hundred dollars.
--
-------------------------------------------------------------------------
Michael Pitaro                                   Lockheed Sanders, Inc.
Technical Initiative Manager                     PO Box 2034, MER24-1583
Voice Communications Initiative                  Nashua, NH 03061-2034
Signal Processing Center of Technology           (603)885-9794 (office)
                                                 (603)885-0631 (fax)
e-mail: pitaro@rocket.sanders.com

elliott@optilink.UUCP (Paul Elliott x225) (11/10/90)

In article <1990Nov8.210640.2893@ux1.cso.uiuc.edu>, sl35746@uxa.cso.uiuc.edu (By-Tor) writes:
> Hi.  I heard that the US government uses 2400 baud modems to send real-time
> audio, to get a secure line anywhere in the world.  Does anyone know what
> kind of algorithm they are using for compression?  [...]

The following might be obsolete information...

The Gov't has used voder / vocoder technology (are those terms correct?) to
send very low bit-rate voice communications.  This technique breaks up the
speech input into multiple frequency bands, measures the energy level in
each band, and sends the level info at a fairly slow rate (approx. once
every 10 ms?).  At the receiving end, the data is used to control the
amplitude of multiple fixed-frequency sources.

This was originally done with an all analog approach, but obviously FFT
and digital synthesis could be used.  The trade-offs are the number (and
placement) of the frequency bands, and the sampling rate.  The signal
quality is poor, but understandable.  Individual voice characteristics
are pretty much lost; the voice usually sounds robot-like.
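
A digital version of the band-energy scheme described above might look
roughly like the following Python sketch.  It is an illustration under
stated assumptions, not any fielded design: the band edges, the frame
length, and the amplitude scaling are arbitrary choices, the function
names are made up for the example, and a plain DFT stands in for an FFT
so that only the standard library is needed.

```python
import cmath
import math

def band_energies(frame, sample_rate, band_edges_hz):
    """Analysis side: energy of one frame in each frequency band."""
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):          # bins from 0 Hz up to Nyquist
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame))
        power.append(abs(s) ** 2)
    energies = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        # Sum the power of every bin whose frequency lies in [lo, hi).
        energies.append(sum(p for k, p in enumerate(power)
                            if lo <= k * sample_rate / n < hi))
    return energies

def synthesize(energies, sample_rate, band_edges_hz, num_samples):
    """Synthesis side: one fixed-frequency sine per band, scaled by level."""
    centers = [(lo + hi) / 2.0
               for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:])]
    amps = [math.sqrt(e) for e in energies]   # arbitrary scaling
    return [sum(a * math.sin(2 * math.pi * f * i / sample_rate)
                for a, f in zip(amps, centers))
            for i in range(num_samples)]
```

Only the list of band levels crosses the link for each frame, which is
where the bit-rate saving comes from; the fixed-frequency sources at the
receiver are also why the result tends to sound robot-like.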

I've been told that on the Pink Floyd album "Animals", on the
song "Sheep", they use a vocoder for the "Through the valley of death"
spoken portion.  To my ear, it sounds more like a DSB (ring) modulator,
but on the other hand I've never heard a vocoder in person, so it could be.

The Heinlein novel _The Moon Is A Harsh Mistress_ has a nice section where
a vocoder is connected to "Mike", the self-aware computer.

And since my technical education on the subject of vocoder technology
is derived largely from Heinlein novels and Pink Floyd records, you are free
to take this with a grain of salt ;-)

-- 
      Paul M. Elliott      Optilink Corporation     (707) 795-9444
            {uunet, pyramid, pixar, tekbspa}!optilink!elliott
                          "Think Blue, Count Two."

jbuck@galileo.berkeley.edu (Joe Buck) (11/10/90)

In article <PITARO.90Nov9085439@sleepy.uucp>, pitaro@rocket.uucp (Michael Pitaro) writes:
|> An example of a secure telephone is a STU III.  It uses 1200 bps
|> LPC10-E for voice compression followed by encryption and a
|> Reed-Solomon block code for error recovery.  The voice quality is poor
|> but intelligible.  I understand the government is going to upgrade the
|> STU III to use 2400 bps CELP voice compression.  Motorola sells one
|> version of the STU III.  I hear AT&T sells another version.

Correct except that your rates are off by a factor of two.  The LPC10-E
algorithm is 2400 bps; the federal standard CELP algorithm is 4800bps.
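
The two rates follow directly from the frame sizes.  The short Python
check below works through the arithmetic; the frame lengths and bit
allocations are quoted from memory of the published standards (54 bits
per 22.5 msec frame for LPC-10, 144 bits per 30 msec frame for the
proposed CELP standard) and should be treated as assumptions to verify
against the documents themselves.  The names are made up for the example.

```python
def bit_rate_bps(bits_per_frame, frame_ms):
    """Bits per second for a coder sending bits_per_frame every frame_ms."""
    return bits_per_frame * 1000.0 / frame_ms

# Assumed LPC-10 frame layout: 54 bits every 22.5 msec.
LPC10_FRAME_MS = 22.5
LPC10_BITS = {
    "reflection coefficients": 41,
    "pitch and voicing": 7,
    "gain": 5,
    "sync": 1,
}

# Assumed CELP frame: 144 bits every 30 msec.
CELP_FRAME_MS = 30.0
CELP_BITS_PER_FRAME = 144

# Uncompressed telephone PCM for comparison: 8000 samples/s, 8 bits each.
PCM_BPS = 8000 * 8

lpc10_bps = bit_rate_bps(sum(LPC10_BITS.values()), LPC10_FRAME_MS)
celp_bps = bit_rate_bps(CELP_BITS_PER_FRAME, CELP_FRAME_MS)
print("LPC-10:", lpc10_bps, "bps")
print("CELP:  ", celp_bps, "bps")
print("PCM to LPC-10 ratio:", PCM_BPS / lpc10_bps)
```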

--
Joe Buck
jbuck@galileo.berkeley.edu	 {uunet,ucbvax}!galileo.berkeley.edu!jbuck

jean@mcgill-vision.uucp (Pierre Racz) (11/10/90)

It is possible to send intelligible voice at bit rates as low as 1 kbps.
However, the quality suffers.  Speech quality can fall into one of the
following classes:

1 - Broadcast
2 - Toll
3 - Communication
4 - Synthetic

Currently, broadcast quality requires 64 kbps and is your typical AM radio
station quality.  Toll quality is the standard telephone fare; with complex
coding its bit rate can be reduced to about 10 kbps.  Communication quality
is highly intelligible but has noticeable distortions, as you would expect
from shortwave radio; this level can be achieved at 4.8 kbps.  Synthetic
quality, while 80% to 90% of the speech is intelligible, sounds "machine
like" and "buzzy", and the identity of the speaker can no longer be
recognized.  Bit rates below 4.8 kbps (and as low as 1 kbps) can be
achieved at this level of quality.

As time goes on, these bit rates will be reduced further.

pitaro@rocket.uucp (Michael Pitaro) (11/12/90)

In article <39484@ucbvax.BERKELEY.EDU> jbuck@galileo.berkeley.edu (Joe Buck) writes:

>   Correct except that your rates are off by a factor of two.  The LPC10-E
>   algorithm is 2400 bps; the federal standard CELP algorithm is 4800bps.
>
>   --
>   Joe Buck
>   jbuck@galileo.berkeley.edu {uunet,ucbvax}!galileo.berkeley.edu!jbuck

You are, of course, correct on the vocoder bps rates.  We recently
completed the implementation of a 4800 bps Multi-Band Excitation (MBE)
vocoder here so I should have caught that error.

For anyone who's interested I've listed a few references for vocoders.
It's not meant to represent anything but some papers I had in a
vocoder folder sitting on my desk.

------------------------------
Linear Predictive Coding (LPC)
------------------------------
Thomas E. Tremain, "The Government Standard Linear Predictive Coding
Algorithm: LPC-10", Speech Technology, April 1982, pp. 40-49.

"Telecommunications: Analog to Digital Conversion of Voice by 2,400
Bit/Second Linear Predictive Coding",  FED-STD-1015 November 28, 1984

George S. Kang and Stephanie S. Everett, "Improvement of the
Excitation Source in the Narrow-Band Linear Prediction Vocoder", IEEE
Trans on ASSP, Vol. ASSP-33, No. 2, April 1985, pp. 377-386.

J.R. Blakley, et al., "An Efficient Implementation of the LPC-10e
Speech Coding Algorithm", MILCOM-87, pp. 24.6.1-24.6.5

John C. Nagengast, "The STU-III, Narrowband Speech Comes of Age",
Military Speech Tech-87, pp. 44-47.

---------------------------
Multi-Band Excitation (MBE)
---------------------------

Daniel W. Griffin, "Multi-Band Excitation Vocoder," PhD Thesis, EECS
Department, MIT, 1987.

John C. Hardwick and Jae S. Lim, "A 4.8 kbps Multi-Band Excitation
Speech Coder", ICASSP-88 (??), pp. 374-377.

-------------------------------------
Code Excited Linear Prediction (CELP)
-------------------------------------

David P. Kemp, "Department of Defense Requirements for a 4800 bps
Voice Coder", Military Speech Tech-87, pp. 48-50.

Daniel Lin, "High-Quality 4800 bps Speech Coding for Real-Time
Applications", Military Speech Tech-87, pp. 51-53.

--
Michael Pitaro
pitaro@rocket.sanders.com

dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) (11/12/90)

In <1990Nov8.210640.2893@ux1.cso.uiuc.edu> sl35746@uxa.cso.uiuc.edu
(By-Tor) writes:

>...consider that 2400 baud=2400 bits per second...

2400 baud may or may not equal 2400 bits per second, depending on the
modulation technique used.  More typically, 2400 baud equals 9600 or
more bits per second (i.e., 4 or more bits per baud).
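
The relationship is just baud times bits per symbol, where a symbol that
can take M distinct values carries log2(M) bits.  A minimal Python
sketch of that arithmetic follows; the function and parameter names are
made up for the example, and real modems add framing and coding overhead
that this ignores.

```python
import math

def bits_per_second(baud, symbol_values):
    """Bit rate of a line sending `baud` symbols per second, where each
    symbol takes one of `symbol_values` distinct values."""
    return baud * math.log2(symbol_values)

for m in (2, 4, 8, 16):
    print(m, "values per symbol at 2400 baud:",
          bits_per_second(2400, m), "bits per second")
```

Only the two-valued case gives the one-to-one equivalence between baud
and bits per second that the original question assumed.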

We now return you to your regularly scheduled misinformation.
--
Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
UUCP:  oliveb!cirrusl!dhesi

duerr@motcid.UUCP (Michael L. Duerr) (11/13/90)

From article <1990Nov8.210640.2893@ux1.cso.uiuc.edu>, by sl35746@uxa.cso.uiuc.edu (By-Tor):
> Hi.  I heard that the US government uses 2400 baud modems to send real-time
> audio, to get a secure line anywhere in the world.  Does anyone know what
> kind of algorithm they are using for compression? 
> Anyways, consider that 2400 baud=2400 bits per second, meaning
> without compression, you could get 2400 Hz sampling at 1-bit resolution.
> This is totally unintelligible, so how the heck could they compress it
> so much?  

From my experience, I would guess that they are using linear predictive coding
of some sort.  I also know that years ago the US Gov't had some systems that
used subband coding.  However, LPC seems to be the scheme most preferred,
probably for cryptological reasons that make the encryption hard to break.
LPC works by modeling the vocal tract.  No attempt is made to compress the
samples themselves, so this method works only on speech.

Also note that baud is defined as the number of symbols (signaling
transitions) per second.  If each symbol has 4 or 16 possible values, for
instance, the bit rate at 2400 baud would be 4800 or 9600 bps.  Modulations
like QPSK, QAM, MFSK, etc. have these types of symbols.

As a last note, I would guess that whatever comes out of the spook
boxes is intelligible, but not necessarily what would be considered
quality speech.

A lot of good work on speech compression appears in back issues of the
Bell System Technical Journal; also, I would recommend the book by
Oppenheim and Lim.

raoul@eplunix.UUCP (Nico Garcia) (11/14/90)

In article <1990Nov8.210640.2893@ux1.cso.uiuc.edu>, sl35746@uxa.cso.uiuc.edu (By-Tor) writes:
> Hi.  I heard that the US government uses 2400 baud modems to send real-time
> audio, to get a secure line anywhere in the world.  Does anyone know what
> kind of algorithm they are using for compression?  I think it would be great
> if we could make a real-time audio irc for people with sampling/playback
> capability.  Anyways, consider that 2400 baud=2400 bits per second, meaning
> without compression, you could get 2400 Hz sampling at 1-bit resolution.
> This is totally unintelligible, so how the heck could they compress it
> so much?  Obviously some kind of destructive method must be used.  I thought
> maybe they look for duplicate waves or something.  Any ideas?

One respondent pointed out that 2400 baud may actually equal 9600 bits/second.
But a more fundamental question: who says 2400 bps is unintelligible? I had
to look at some work on speech comprehension with clipped waveforms: a guy
named Licklider documented that people could understand something over 80%
of speech, even with absolute clipping (equivalent to one bit data). I don't
know if he ever tried reducing the bandwidth, but humans have a marvelous
ability to make speech out of the most corrupted signals. And partial
clipping has been used to turn a 60-dB range input to a 20-dB range
transmission for equipment like transmitters and hearing aids for years.
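
Absolute clipping keeps only the sign of each sample, which is why it is
equivalent to one bit of data per sample.  The Python sketch below is a
minimal illustration; the function names and the test tone are
assumptions made for the example, and it says nothing about how
intelligible the result is.  It only shows what survives the clipping,
namely the positions of the zero crossings.

```python
import math

def infinite_clip(samples):
    """Keep only the sign of each sample: +1 or -1, i.e. one bit each."""
    return [1 if x >= 0 else -1 for x in samples]

def zero_crossings(samples):
    """Count sign changes between consecutive samples."""
    signs = [x >= 0 for x in samples]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def one_bit_rate_bps(sample_rate_hz):
    """Bit rate of a one-bit-per-sample stream."""
    return sample_rate_hz * 1

# A made-up test signal: two sine components sampled at 8 kHz for 0.1 s.
rate = 8000
tone = [0.3 * math.sin(2 * math.pi * 440 * i / rate + 0.1)
        + 0.1 * math.sin(2 * math.pi * 1320 * i / rate)
        for i in range(rate // 10)]
clipped = infinite_clip(tone)
print("zero crossings before:", zero_crossings(tone))
print("zero crossings after: ", zero_crossings(clipped))
print("bit rate at 2400 samples/s:", one_bit_rate_bps(2400), "bps")
```

Note that a one-bit stream at 2400 bits per second implies a 2400 Hz
sampling rate and therefore at most 1200 Hz of audio bandwidth, which is
the bandwidth reduction the clipping studies may not have covered.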

-- 
			Nico Garcia
			Designs by Geniuses for use by Idiots
			eplunix!cirl!raoul@eddie.mit.edu

prism@chinet.chi.il.us (PRISM Interactive Products) (11/14/90)

Any knowledgeable souls out there who could tell me where to get a copy of the
US Federal Standard for 4800bps CELP?  (And any other Federal Standards related 
to audio/voice coding).   Thanks for any help.

Bob Dyas
PRISM Interactive Products

cfreese@super.ORG (Craig F. Reese) (11/16/90)

In article <1990Nov13.201631.464@chinet.chi.il.us> prism@chinet.chi.il.us (PRISM Interactive Products) writes:
>
>Any knowledgeable souls out there who could tell me where to get a 
>copy of the US Federal Standard for 4800bps CELP?  (And any other 
>Federal Standards related to audio/voice coding).
>

I posted this a while back, but maybe it needs to be posted again:

There is a pretty good article that describes CELP in the Apr/May '90
issue of Speech Technology Magazine.  The title is "The Proposed Federal
Standard 1016 4800bps Voice Coder: CELP."  It would be a good place
to start to satisfy one's curiosity.

In that article they give the place to send for CELP:

     Bob Fenichel
     National Communications System,
     Office of Technology and Standards
     Washington, DC  20305-2010

     Phone: 202-692-2124
     Email: (none that I know of)

I also heard of (but have not seen) a paper that compares CVSD, LPC, CELP.
In light of some of the earlier discussions in this thread some readers
might be interested.

    "Comparison of U.S. Government Std. Voice Coders"
    Welch, V.C, et al.
    1989 IEEE Military Communications (MilCom) Conference.

Have fun,
craig

*** The opinions expressed are my own and do not necessarily reflect 
*** those of any other land dwelling mammals (including my management)....
-----------------
Craig F. Reese                           Email: cfreese@super.org
Institute for Defense Analyses/
Supercomputing Research Center
17100 Science Dr.
Bowie, MD  20715-4300