[comp.dcom.telecom] 7kHz Voice and ISDN

jackson%sdcsvax@ucsd.edu (12/08/89)

As I understand it, work is afoot to implement a standard for the ISDN
defined 7kHz voice service, wherein audio sampled (presumably) at 16
ksamples per second is encoded (using cunning modern techniques) at
the ISDN bearer channel rate (64 kbps).

I envisage the appearance of "hi-fi" telephones capable of using this
service. Voice would be clearer and music could be carried (with
fidelity equivalent to that of a.m. radio). Further, digital
technology could enable superior echo cancelling allowing speakerphone
use without the "in-a-tomb" effect.  Clearly, the new phones would
have to be compatible with POTS phones, but Q.931 and SS7 know enough
for the service to be negotiated automatically on call set-up.

Such phones might become the next great consumer electronics fad,
following compact discs and cellular phones.  Once people heard the
higher quality, they might feel they had to have one, to keep up with
their yuppie friends.

Any comments? From those who know how the technology is progressing?
From potential owners?

Oh yes, if these things caught on, they would drive the market for
ISDN lines to residential as well as business premises. Just what the
local carriers need!

Dick Jackson

goldstein@delni.enet.dec.com (12/09/89)

In article <1933@accuvax.nwu.edu>, ttidca.TTI.COM!jackson%sdcsvax@ucsd.edu 
(Dick Jackson) writes...
 
>As I understand it, work is afoot to implement a standard for the ISDN
>defined 7kHz voice service, wherein audio sampled (presumably) at 16
>ksamples per second is encoded (using cunning modern techniques) at
>the ISDN bearer channel rate (64 kbps).
 
>Any comments? From those who know how the technology is progressing?
>From potential owners?

Funny you should ask.  Yes, there's a new ISDN 7 kHz audio bearer
service.  It makes use of 64 kbps ADPCM encoding.  (Digression:
Standard PCM uses 64 kbps to do 3.1 kHz audio.  ADPCM is more
efficient, so 32 kbps is essentially adequate for 3.1 kHz audio, with
only minimal distortion (modems might complain, humans won't).  So if
you use the ADPCM principle on the usual 64 kbps bandwidth, you can
get better audio.)
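
The arithmetic behind that digression can be sketched roughly as follows. The sampling rates are assumptions, not figures from the standard: 8 ksamples per second for 3.1 kHz telephony, and the 16 ksamples per second guessed at in the original question for 7 kHz audio. The function name is made up for illustration.

```python
# Back-of-envelope bit budget for the rates quoted in the post.
# ASSUMPTION: 8 ksamples/s for 3.1 kHz audio, 16 ksamples/s for 7 kHz audio.

def bits_per_sample(bitrate_bps: int, sample_rate_hz: int) -> float:
    """Average number of bits available to encode each audio sample."""
    return bitrate_bps / sample_rate_hz

pcm_narrowband = bits_per_sample(64_000, 8_000)    # standard PCM, 3.1 kHz
adpcm_narrowband = bits_per_sample(32_000, 8_000)  # 32 kbps ADPCM, 3.1 kHz
adpcm_wideband = bits_per_sample(64_000, 16_000)   # 64 kbps ADPCM, 7 kHz

print(pcm_narrowband, adpcm_narrowband, adpcm_wideband)
```

Under those assumptions the 7 kHz service gets about the same bits per sample as 32 kbps ADPCM does for ordinary telephony, which is why the same channel can carry roughly twice the audio bandwidth.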

The network uses PCM to generate tones and announcements for ISDN
telephones in the telephony bearer service.  The 7 kHz standard says
that you begin all calls in standard 3.1 kHz PCM mode, specifying that
it's really a 7 kHz call.  Once the two ends are connected to each
other, they do a handshake to confirm that they're ready to switch
into 7 kHz mode.  That way the terminals are in 3.1 kHz PCM mode when
doing call setup (talking to the network) and in 7 kHz ADPCM mode when
actually communicating with each other.
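
A minimal sketch of that mode switch, just to make the sequence concrete. The class and function names are hypothetical, and the "handshake" here is reduced to both ends saying yes; the real end-to-end procedure is whatever the standard defines.

```python
from enum import Enum

class Mode(Enum):
    PCM_3K1 = "3.1 kHz PCM"
    ADPCM_7K = "7 kHz ADPCM"

class Terminal:
    """Hypothetical ISDN terminal; only tracks its current audio mode."""
    def __init__(self, supports_7khz: bool):
        self.supports_7khz = supports_7khz
        self.mode = Mode.PCM_3K1  # every call starts in PCM mode

def connect(a: Terminal, b: Terminal) -> Mode:
    # Call setup: both ends stay in PCM so network tones and
    # announcements are heard correctly.
    a.mode = b.mode = Mode.PCM_3K1
    # End-to-end handshake (simplified): switch only if both ends confirm.
    if a.supports_7khz and b.supports_7khz:
        a.mode = b.mode = Mode.ADPCM_7K
    return a.mode

print(connect(Terminal(True), Terminal(True)).value)
print(connect(Terminal(True), Terminal(False)).value)
```

The point of the design is that a call to a terminal that cannot confirm simply stays in PCM mode, so the network itself never has to know about the 7 kHz coding.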

This hack makes it essentially transparent to the network, which will
speed implementation.  You just need the chips in your telephones.  I
don't personally see much use for it in "handsets", given their cruddy
mic/speaker combos, but it could be very nice for speakerphones, audio
dial-up program services, remote broadcast feeds, etc.

     fred  (member, ANSI T1S1, speaking for himself)

stodol@diku.dk (David Stodolsky) (12/20/89)

goldstein@delni.enet.dec.com in <2002@accuvax.nwu.edu> writes

>Funny you should ask.  Yes, there's a new ISDN 7 kHz audio bearer
>service.  It makes use of 64 kbps ADPCM encoding. 

Mermelstein, P. (1988), G.722, A New CCITT Coding Standard for
Digital Transmission of Wideband Audio Signals (IEEE Communications
Magazine, v. 26, n. 1) describes a way to split audio input into two 4
kHz bands using ADPCM coders. Audio data can be transmitted at 64, 56,
or 48 kbps, the lower rates leaving room on the 64 kbps channel for
simultaneous transmission of other data.
The system is targeted toward "audio-visual conferencing applications
where one would like to approach the quality of face-to-face
communication (p. 8)."
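
The capacity left over for data at each audio rate is simple subtraction, sketched below. The constant and function names are made up for illustration, and the 8 kHz sub-band sampling rate is an assumption (the usual rate for a 4 kHz band), not a figure taken from the article.

```python
CHANNEL_BPS = 64_000                        # one ISDN bearer channel
AUDIO_RATES_BPS = (64_000, 56_000, 48_000)  # audio rates named in the article
SUBBAND_RATE_HZ = 8_000                     # ASSUMPTION: each 4 kHz band sampled at 8 kHz

def spare_data_bps(audio_bps: int) -> int:
    """Channel capacity left for other data at a given audio rate."""
    if audio_bps not in AUDIO_RATES_BPS:
        raise ValueError(f"unsupported audio rate: {audio_bps}")
    return CHANNEL_BPS - audio_bps

def bits_per_sample_pair(audio_bps: int) -> int:
    """Bits shared by one low-band plus one high-band sample."""
    return audio_bps // SUBBAND_RATE_HZ

for rate in AUDIO_RATES_BPS:
    print(rate, spare_data_bps(rate), bits_per_sample_pair(rate))
```

So dropping the audio to 56 or 48 kbps frees 8 or 16 kbps, which is the capacity available for something like speaker identification data.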

My interest is not the improvement in audio quality, but the use of
data-speech multiplexing. The article projects this for
speaker identification or fax on the established connection. One of
the major problems in teleconferencing is speaker selection: how to
decide on the next speaker without using the normal cues one has when
face-to-face. The Danish Telecommunication Research Labs. produced a
pre-ISDN prototype with separate lines for audio, and speaker id and
queuing data via modem, some years back.  It turned out to be too
complex for practical use. A version of my equal-time resolution rule
was programmed into that system (Stodolsky, D. (1987).  Dialogue
management program for the Apple II computer. _Behavior Research
Methods, Instruments, & Computers_, _19_, 483-484.). This rule has been
shown to yield benefits in both emotional tone and group performance in
controlled experiments.

I would like to see the rule applied in one of these new ISDN
conferencing systems, but it's hard to get the attention of the
equipment suppliers on this point. They typically resort to
centralized control by a chairman, without even the ability to run on
"auto pilot", where people queue themselves up by pressing a "request"
button or just by starting to talk with a voice-operated switch
"pressing" the button for them.
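
A rough sketch of such an "auto pilot" queue follows. It rests on a loud assumption: that an equal-time rule can be approximated by giving the floor to the waiting requester with the least accumulated talk time. The cited paper defines the actual rule; the class and method names here are hypothetical.

```python
class FloorQueue:
    """Hypothetical self-service speaker queue with no chairman."""

    def __init__(self):
        self.talk_time = {}  # seconds each participant has held the floor
        self.waiting = []    # pending requests, in arrival order

    def request(self, who: str) -> None:
        """Called when a participant presses "request" or starts talking."""
        self.talk_time.setdefault(who, 0.0)
        if who not in self.waiting:
            self.waiting.append(who)

    def next_speaker(self):
        """ASSUMED rule: least talk time so far wins; ties go to the
        earliest request, since min() returns the first minimum."""
        if not self.waiting:
            return None
        who = min(self.waiting, key=lambda p: self.talk_time[p])
        self.waiting.remove(who)
        return who

    def release(self, who: str, seconds: float) -> None:
        """Record how long the speaker held the floor."""
        self.talk_time[who] = self.talk_time.get(who, 0.0) + seconds
```

Because the choice depends only on the request list and the talk-time totals, any unit holding the same data would pick the same next speaker.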

Central control of speakers was strongly disliked in the prototype
system. In fact, all units were eventually rebuilt, so each one could
be the "master" in a multi-unit conference. Management by a chairman
seemed a bit clumsy, even when the queuing was automatic and the chair
just announced the name of the next speaker. From a psychological
standpoint, fully distributed control is the only way to go, and it is
quite feasible with ISDN. Any takers?


David S. Stodolsky, PhD      Routing: <@uunet.uu.net:stodol@diku.dk>
Department of Psychology                  Internet: <stodol@diku.dk>
Copenhagen Univ., Njalsg. 88                  Voice + 45 31 58 48 86
DK-2300 Copenhagen S, Denmark                  Fax. + 45 31 54 32 11