[comp.dsp] Audio Compression - sidebar on pitch shifting

jj) (09/25/89)

In article <7814@microsoft.UUCP> brianw@microsoft.UUCP (Brian Willoughby) writes:
>Basically, I'm saying that it's too easy to avoid the distortion from
>dropping samples, so why do it?
Um, we don't agree on this. You might want to look at what
dropping samples, or inserting samples, does to the spectrum.
It's a rather peculiar effect, and it's rather striking in practice.
There are SOME pitch-shifting algorithms I know of, most specifically
one by David Malah and others called "Time Domain Harmonic Scaling",
that work for small ratios, e.g. 1/2, 2/1, 3/2, 2/3 and so on.
It's HARD to do a 44/45 pitch shift, in my experience, except by
such methods as lengthening or shortening micro-silences.  Some signals
aren't compatible with this sort of processing, so it's hardly
a universal technique.
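
A minimal sketch of what dropping samples does to a signal, for anyone
who wants to try it. It is NOT Time Domain Harmonic Scaling, just the
crude "delete every 45th sample" approach under discussion. The 1 kHz
tone, the 44.1 kHz rate and all the function names are assumptions made
for illustration. It relies on the fact that a pure sinusoid satisfies
x[n+1] = 2*cos(w)*x[n] - x[n-1], so any place where that fails is a
break in the waveform.

```python
import math

RATE_HZ = 44_100   # assumed sample rate
TONE_HZ = 1_000    # assumed test tone

def sine(freq_hz, rate_hz, count):
    """Return `count` samples of a unit-amplitude sinusoid."""
    w = 2 * math.pi * freq_hz / rate_hz
    return [math.sin(w * n) for n in range(count)]

def drop_every(samples, period):
    """Delete every `period`-th sample (the crude shortening method)."""
    return [s for i, s in enumerate(samples) if (i + 1) % period != 0]

def splice_errors(samples, freq_hz, rate_hz, tol=1e-9):
    """Indices where x[n+1] - 2cos(w)x[n] + x[n-1] is not (nearly) zero."""
    c = 2 * math.cos(2 * math.pi * freq_hz / rate_hz)
    return [n for n in range(1, len(samples) - 1)
            if abs(samples[n + 1] - c * samples[n] + samples[n - 1]) > tol]

original = sine(TONE_HZ, RATE_HZ, 4410)
shortened = drop_every(original, 45)   # 45 samples in, 44 samples out

print("samples before/after:", len(original), len(shortened))
print("breaks in original:   ", len(splice_errors(original, TONE_HZ, RATE_HZ)))
print("breaks after dropping:", len(splice_errors(shortened, TONE_HZ, RATE_HZ)))
```

The original tone shows no breaks; the shortened one shows a regular
string of them, one cluster per deleted sample. Each break is a small
phase jump, and a periodic train of phase jumps is what puts energy
where the original signal had none.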

>
>On a side note, I have heard that someone has developed a compression
>scheme to fit sixteen times as much data on a CD as is currently done.
Not one, but many people, although I'm not aware of anyone who
has claimed a 16/1 compression that is TRANSPARENT.  Some references
on compression schemes that work (in no particular order) follow,
with comments.

1)  Johnston, J. D., "Transform Coding of Audio Signals Using
Perceptual Noise Criteria", IEEE Journal on Selected Areas
in Communications, Vol 6 (1988), pp 314-323. (February issue)
(This discusses 128 kb/s compression techniques. It's old
work at this point).
<Newer> Johnston, J. D., "Perceptual Transform Coding of Wideband
Stereo Signals" (or something close to that, I can't remember
the title exactly) in ICASSP-89 Conference proceedings. (There are
only two J. D. Johnston papers in the conference, so it's easy
to find in the author index.)  This is more recent (but not
the newest) work, and talks about 1.45 bits/sample coders,
among others.

2) Brandenburg, K, & Seitzer, D., "OCF: Coding High Quality
Audio with Data Rates of 64 kbit/sec", 85th AES Convention,
Los Angeles, ??preprint??. <newer>

(older) Brandenburg, K, "OCF - A new coding algorithm for high quality
sound signals", ICASSP 1987, pp 141-144.
(similar to 1, in results.  Authors 1 and 2 seem to have converged
to very similar systems from different starting points.)

3) Theile, G, Link, M, & Stoll, G, "Low bit rate coding of high
quality audio signals", 82nd AES Convention, London 1987,
Preprint 2432 (C-1).
(a different style of coding algorithm than 1 and 2)

>If you think about the typical audio waveform, you'll understand that
>it is easy to compress.  Just by storing the *difference* between
>adjacent samples, and assuming that there are no impulses, a great
 but, but -------------------------------------^^-^^^^^^^^
there ARE impulses, and doublets in audio signals.  When you
assume that, your oversimplification falls apart.  Yes,
in general, simple DPCM works, but it fails miserably
at low bit rates because of noise-shape problems.
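
A rough sketch of the impulse/doublet point, assuming 16-bit integer
samples and a plain first-difference coder with no quantizer (so it
says nothing about the noise-shape problem, only about the range of
the differences). The function names and the test signals are made up
for illustration.

```python
import math

def first_differences(samples):
    """d[0] = x[0], d[n] = x[n] - x[n-1]: first-order DPCM, no quantizer."""
    prev = 0
    out = []
    for s in samples:
        out.append(s - prev)
        prev = s
    return out

def bits_needed(values):
    """Sign bit plus enough magnitude bits to hold the largest value."""
    peak = max(abs(v) for v in values)
    return peak.bit_length() + 1

# Smooth case: one period of a 100 Hz tone at 44.1 kHz, near full scale.
tone = [int(round(30000 * math.sin(2 * math.pi * 100 * n / 44100)))
        for n in range(441)]

# Nasty case: a doublet, i.e. a full-scale swing between adjacent samples.
doublet = [0, 0, 30000, -30000, 0, 0]

for name, sig in (("tone", tone), ("doublet", doublet)):
    print(name, "samples:", bits_needed(sig), "bits,",
          "differences:", bits_needed(first_differences(sig)), "bits")
```

For the smooth tone the differences fit in far fewer bits than the
samples. For the doublet the difference between adjacent samples is
larger than either sample, so the "compressed" stream needs MORE bits
than the original, or else it has to clip or slew.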


In general, 64 kb/s systems aren't quite transparent for
the toughest signals.  96 kb/s systems seem to be transparent.
128 kb/s is MORE than enough.
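
To put those rates next to the 16/1 claim, here is the arithmetic as a
small sketch. It ASSUMES the rates are per channel, that the source is
CD-style 16-bit PCM at 44.1 kHz, and that 1 kb = 1000 bits; none of
that is stated above, and the function names are invented here.

```python
CD_RATE_HZ = 44_100   # assumed sample rate
CD_BITS = 16          # assumed bits per sample of the uncompressed source

def bits_per_sample(kbit_per_s, rate_hz=CD_RATE_HZ):
    """Average coded bits per sample at a given bit rate."""
    return kbit_per_s * 1000 / rate_hz

def compression_ratio(kbit_per_s, rate_hz=CD_RATE_HZ, bits=CD_BITS):
    """Uncompressed bits per sample divided by coded bits per sample."""
    return bits / bits_per_sample(kbit_per_s, rate_hz)

def rate_for_ratio(ratio, rate_hz=CD_RATE_HZ, bits=CD_BITS):
    """Bit rate in kb/s that a given compression ratio would require."""
    return bits / ratio * rate_hz / 1000

for kbps in (64, 96, 128):
    print(f"{kbps:3d} kb/s: {bits_per_sample(kbps):.2f} bits/sample, "
          f"{compression_ratio(kbps):.1f}/1")

print(f"16/1 would need {rate_for_ratio(16):.1f} kb/s per channel")
```

Under those assumptions 64 kb/s works out to about 1.45 bits/sample, or
roughly 11/1, while a true 16/1 would mean 1 bit/sample, which is well
below the rates called transparent above.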
-- 
To the Lords of   *Mail to jj@alice.att.com  or alice!jj
Convention        *HASA, Atheist Curmudgeon Division
'Twas Claverhouse *Copyright alice!jj 1989, all rights reserved, except
Spoke             *transmission by USENET and like free facilities granted.