[net.audio] Sheffield CDs and why they sound bad

herbie@watdcsu.UUCP (Herb Chong, Computing Services) (10/02/84)

Some people have been complaining about the sound of Sheffield Lab's CD
releases.  One of the problems with digital recording as attempted in
the manner that Sheffield does is that the digital recorder does not
have enough dynamic range.  We have people in our physics audio lab
designing and building a dynamic range compression device for digital
recorders because of that.  One theory has it that a digital recorder
with at least 120 dB SNR is required to provide adequate dynamic range
for live digital recording.  If you record to capture the peaks of
something like LAB-2 (I've Got the Music in Me by Thelma Houston), the
lowest levels will be about 40 dB below that.  This leaves you with 50
dB SNR, which is fine for analog systems, though not too listenable,
but not for digital systems.  This ratio corresponds to a voltage ratio of
only about 300.  This means that the lowest range is quantized
with only about 8 bits (2^8 = 256).  What does 8 bit quantization sound
like?  I've heard demos at 10, 8, and 4 bits quantization.  Something
sounds wrong at 10 bits; at 8 bits, everything sounds harsh; at 4 bits,
the sound is barely recognizable.  A compressor can be used to place
the signal up where more bits are being used and quantization error is
not as noticeable.  This is a problem with all digital recording
systems handling a high dynamic range signal.  More bits are required,
or a signal compression system, to get around this.  Of course, 120 dB
SNR is hard to maintain in analog equipment, and a digital system using
linear encoding would require 20 bits to get that much SNR, and
finally, the number of bits generated would be 1.5 times as great.  Any
comments?
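
The arithmetic above can be checked with a few lines of Python.  This is
only a sketch: it assumes the ideal figure of about 6.02 dB per bit for
linear PCM, and the function names are made up for illustration.

```python
import math

def db_to_voltage_ratio(db):
    """Convert a level difference in dB to a voltage (amplitude) ratio."""
    return 10 ** (db / 20)

def db_to_bits(db):
    """Bits of linear PCM spanning `db`, at 20*log10(2) ~ 6.02 dB per bit."""
    return db / (20 * math.log10(2))

ratio_50 = db_to_voltage_ratio(50)      # the "about 300 times" in the post
bits_50 = db_to_bits(50)                # a little over 8 bits
bits_120 = math.ceil(db_to_bits(120))   # whole bits needed for 120 dB

print(round(ratio_50), round(bits_50, 1), bits_120)
```

Under these assumptions 50 dB comes out near a ratio of 316 and a little
over 8 bits, and 120 dB needs 20 bits.  Note that 20-bit words against
16-bit words is a factor of 1.25 in raw data; a factor of 1.5 would
correspond to 24-bit words.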

Herb Chong, BASc
Computer Consultant 
Department of Computing Services
University of Waterloo

shauns@vice.UUCP (Shaun Simpkins) (10/10/84)

Something bothers me about this dynamic range business.  Let's assume that
we have a recording medium with 120dB dynamic range.  Further, let's assume
that MOL (maximum output level) of the recorder will produce a SPL of 130dB,
the threshold of pain to the human ear.  This means that the noise level of
the recording medium will be at 10dB, 10dB above the threshold of hearing in
an anechoic chamber and 30dB below the noise floor of a quiet listening room.

We now substitute a 16-bit digital recorder for our hypothetical system, placing
its MOL at the same point.  Its noise floor will now be at 34dB SPL, still
below the noise floor of a quiet listening room by 6dB.  Put another way, the
difference between ambient noise and a music signal is one bit.
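
The SPL bookkeeping in this example can be reproduced as a short sketch.
The 6.02 dB-per-bit figure is the usual approximation for linear PCM, and
the 40 dB quiet-room level is inferred from the "30dB below" remark above;
the names are hypothetical.

```python
DB_PER_BIT = 6.02   # approximate dynamic range contributed by each PCM bit

def noise_floor_spl(mol_spl, bits):
    """SPL of the quantization floor when full scale plays at mol_spl."""
    return mol_spl - bits * DB_PER_BIT

mol = 130           # dB SPL at maximum output level (from the post)
quiet_room = 40     # dB SPL, inferred: 30 dB above the 10 dB floor

floor_16 = noise_floor_spl(mol, 16)
margin_bits = (quiet_room - floor_16) / DB_PER_BIT

print(round(floor_16), round(margin_bits))
```

With these numbers the 16-bit floor lands at about 34 dB SPL, roughly one
bit's worth below the room noise.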

Will a music signal sufficient to twiddle only the LSB of the digital recorder
sound bad?  Maybe, but I claim that you won't notice it, since you'll be
straining so hard to filter out all the ambient noise.

I suspect that the digitization study cited used signals far above the noise
floor of the listening chamber.  What happens when the signals digitized at
4, 6, or 8 bits are played at the SPLs where they would normally occur?  I
suspect that distortion components would be lost in the noise floor.  Even 10
bit quantization noise is lost in the environment in my example - and LSB
dithering would make it even more indistinguishable from random noise.  The
only time such errors become perceptible is in a highly unlikely environment,
the anechoic chamber.

Given this, I find it hard to believe that LPs suffer less from dynamic range
problems than CDs.  Good vinyl, I am led to believe, has a dynamic range of
about 70dB.  Allowing for the ear's adaptive filtering (5dB? 10dB?) this is
still 16-20dB less than 16 bit PCM.  Is the ear really that bothered by PCM's
inverted distortion vs amplitude characteristics?

Yes, professional recorders need to be better than the consumer environment for
many reasons - Multitrack mixdowns and foresight being two.  It does not mean
that the consumer product must be as good.  I challenge you to make a 20-bit
digitizer with a 10 volt full code output.  The LSB would be 10uV!  Try that on
contemporary IC processes over the commercial temperature range.
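
The LSB figure is easy to verify.  A minimal sketch, assuming an ideal
converter whose full-scale span is divided into 2^N equal steps (the
function name is made up for illustration):

```python
def lsb_volts(full_scale_volts, bits):
    """Size of one quantization step for an ideal linear converter."""
    return full_scale_volts / (2 ** bits)

lsb_20 = lsb_volts(10.0, 20)
lsb_16 = lsb_volts(10.0, 16)

print(f"{lsb_20 * 1e6:.2f} uV, {lsb_16 * 1e6:.1f} uV")
```

For 10 volts full scale this gives a step of about 9.5uV at 20 bits,
against roughly 153uV at 16 bits.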

A final comment - I like the idea of compressing CDs, but more for stereo
system protection than for ultimate dynamic range.  If this is done, the
compression should be multiband and the level of compression indicated by an
explicit control track instead of by sensing the level of the compressed output
and supposedly ``undoing'' the change, as is done now.  This would reduce
system distortion to levels approaching -80dB(MOL) rather than the -40 to
-60dB(MOL) possible with today's systems (such as dbx), and could be easily
implemented by an extra control word in the CD bit stream with appropriate
sorting in the player.  Of course, this would entail perhaps a 30% reduction in playing
time, about the same as 20 bit PCM but a whole lot more manufacturable.

The wandering squash,

-- 
				Shaun Simpkins

uucp:	{ucbvax,decvax,chico,pur-ee,cbosg,ihnss}!teklabs!tekcad!vice!shauns
CSnet:	shauns@tek
ARPAnet:shauns.tek@rand-relay

karn@mouton.UUCP (10/10/84)

On every CD I own, both those recorded from analog masters and from
digital masters, the recorded-in noise level from microphone amplifiers,
mixers, ventilating systems, etc, exceeds the quantizing noise level
by a considerable margin.  This provides a "self-dithering" mechanism
that completely masks quantizing noise.  The only time I've been able
to hear quantizing noise on a CD is during a silent passage at the
end of a recording where the recording mixer must have ramped the master
gain down. The recorded-in noise breaks up and becomes irregular
for an instant just before it disappears. Mind you, this is occurring
at extremely low signal levels -- my volume control had to be all the way
up to hear this.
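
A small experiment illustrates the self-dithering effect described above.
It is a sketch only: the tone level, the fixed seed, and the use of
triangular (TPDF) noise as a stand-in for real microphone and mixer noise
are all assumptions made for illustration.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer step (1.0 == one LSB)."""
    return round(x)

rng = random.Random(0)   # fixed seed so runs are repeatable
n = 20000
# A tone whose peak is only 0.4 LSB, i.e. smaller than one step.
signal = [0.4 * math.sin(2 * math.pi * 50 * i / n) for i in range(n)]

plain = [quantize(s) for s in signal]
# TPDF noise: the difference of two uniform values, spanning +/- 1 LSB.
dithered = [quantize(s + rng.random() - rng.random()) for s in signal]

def gain_estimate(out):
    """Least-squares gain of `signal` found in `out` (1.0 = fully preserved)."""
    return sum(o * s for o, s in zip(out, signal)) / sum(s * s for s in signal)

print(set(plain))                # without noise the tone quantizes away
print(gain_estimate(dithered))   # with noise the tone survives, buried in hiss
```

Without added noise every sample rounds to zero; with it, the tone is
still present in the output at close to its original level, at the cost
of a steady noise floor.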

Phil

herbie@watdcsu.UUCP (Herb Chong, Computing Services) (10/11/84)

Which is all true, but remember that quantization noise is like IM
distortion: it is audible at levels at which harmonic distortion is not
audible at all.  I was attempting to point out that there is more
to the sound of "digital" than meets the ear.  For all intents and
purposes, dithering the digital signal itself removes all the problems
with quantization noise by randomizing the components of the noise.

As for dynamic range offered by a 16-bit linear system being inadequate,
it has nothing to do with the continuous output power.  One can
easily generate a pulse of about 100us that can require
thousands of watts to reproduce accurately, and yet not
sound very loud compared to the threshold of pain.  Imagine a bolt of
lightning striking about 500m away.  It has a peak SPL somewhere
around 140+dB.  You aren't immediately deafened and you don't fall to
the ground in pain because the pulse didn't last long enough to transmit
significant energy to your eardrum.  The oft-quoted SPLs for pain are for
a signal that is relatively long-lived by comparison, say one or two
seconds.  Bob Carver has recognized this and has made his amplifiers to
have tremendous reserve power, but only for the short time it is really
needed.  It works because music is, by nature, transient.  He has also
observed that, if it were practical to do so, an amplifier capable of peak
power more than 2000W/ch for about a ms is needed for accurate peak
reproduction with typical speaker efficiencies.  Make of it what you will.
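
One way a figure of that order can arise is sketched below.  The speaker
sensitivity (87 dB SPL at 1 W / 1 m) and the 120 dB peak target are
assumptions chosen for illustration, not numbers from Carver or from this
post, and room effects and speaker compression are ignored.

```python
def watts_for_spl(target_spl, sensitivity_db=87.0):
    """Power for target_spl at 1 m, given sensitivity in dB SPL at 1 W / 1 m."""
    return 10 ** ((target_spl - sensitivity_db) / 10)

print(round(watts_for_spl(100)))   # a loud average level
print(round(watts_for_spl(120)))   # a brief peak 20 dB higher
```

Every 10 dB of extra peak level costs a factor of ten in power, which is
why a 20 dB transient moves the requirement from tens of watts to the
neighborhood of 2000 W under these assumptions.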


Herb...
Once a hack, always a hack...

UUCP:  {decvax|utzoo|ihnp4|allegra|clyde}!watmath!watdcsu!herbie
CSNET: herbie%watdcsu@waterloo.csnet
ARPA:  herbie%watdcsu%waterloo.csnet@csnet-relay.arpa
BITNET: herbie at watdcs

mat@hou4b.UUCP (Mark Terribile) (10/11/84)

>A final comment - I like the idea of compressing CDs, but more for stereo
>system protection than for ultimate dynamic range.  If this is done, the
>compression should be multiband and the level of compression indicated by an
>explicit control track instead of by sensing the level of the compressed
>output and supposedly ``undoing'' the change, as is done now.  This would
>reduce system distortion to levels approaching -80dB(MOL) rather than the
>-40 to -60dB(MOL) possible with today's systems (such as dbx), and could be
>easily implemented by an extra control word in the CD bit stream with
>appropriate sorting in the player.  Of course, this would entail perhaps a
>30% reduction in playing time, about the same as 20 bit PCM but a whole lot more
>manufacturable.

I don't think it would have to cost 30% in play time; it would seem that the
envelope used for compression would need a bandwidth of only 100 Hz or so
and even if you used 16 bits for the compression amount, you would be taking
less than 1/2 of 1% off the usable bandwidth.
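
The estimate can be checked against the CD audio data rate (44.1 kHz,
16 bits, two channels).  This is a rough sketch: it assumes a single
control channel sampled at the Nyquist rate for its bandwidth, and it
ignores error-correction and framing overhead.

```python
SAMPLE_RATE = 44_100   # CD samples per second per channel
BITS = 16
CHANNELS = 2
audio_bps = SAMPLE_RATE * BITS * CHANNELS   # 1,411,200 bit/s of audio data

def control_overhead(bandwidth_hz, bits_per_word=16):
    """Fraction of the audio bit rate taken by one control channel."""
    control_bps = 2 * bandwidth_hz * bits_per_word   # Nyquist-rate sampling
    return control_bps / audio_bps

print(f"{control_overhead(100):.2%}")
```

A 100 Hz envelope at 16 bits per word comes to 3200 bit/s, or roughly a
quarter of one percent of the audio stream.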

But for listenability reasons, especially in cars, you might very well
want to be able to NOT decompress the signal.  This would be most useful
if the mastering engineers and the performers/writers had a say in what
amount of compression was used.

I like this idea, mainly because IF compression is needed, it ought to be
in the hands of the artist.  Let the artist decide how he will ``compose''
for the noisy environment.
-- 

	from Mole End			Mark Terribile
		(scrape .. dig )	hou4b!mat
    ,..      .,,       ,,,   ..,***_*.

jj@rabbit.UUCP (10/11/84)

Uh, oh!  JJ's been listening to a CD player.

Last night I was listening to some Strauss and a few chamber
pieces on a CD player (Magnavox, of all kinds).  I was strongly
impressed by the Strauss, and disgusted by the miking on the
chamber pieces.   The alleged "digital sound" was quite evident
on the chamber works, and totally missing on the two-miked
Strauss.

Needless to say, the chamber music was dreadful.  The Strauss
(well, the reproduction thereof, at least) was absolutely
wonderful, with a good soundstage, a VERY low background,
and a dynamic range that let my speakers do a bit of what they
were designed to do.


This experience clearly shows that at least some of the
demons of digital are recording demons that aren't even
digital related.

Who knows, I might even BUY one of these Magnavox players,
they work nicely.  <I tried the shock test, a blot test, and
general abuse, and got nothing but music.>

I've listened quite a bit to the Sony CDP101, and I do believe
(although I did NOT have a chance to AB) that the Magnavox
sounds better in some unclearly defined way that my ears
perceive as transient handling.
 
Now, I think I HAVE to buy that new cartridge! :-)

Anyone have any feeling for the reliability of the
Philips/Magnavox CD players?  I don't, this one was brand
new.
-- 
BE KIND TO SOFT FURRY CREATURES, THE 
LIFE YOU SAVE MAY BE YOUR OWN!
"The car was stalled, that fateful night, ..."

(allegra,ihnp4,ulysses)!rabbit!jj

emrath@uiucdcsb.UUCP (10/12/84)

Regarding Shaun's point about compression:
It sounds like what you are saying is that a CD could be made
where the data is 16 bit samples, but every so often a control sample
is read which (digitally) controls the gain of the analog (de)amplifier
used on the output of a 16-bit D/A.  Assuming you could make the
original recording using a 20 or 24 bit A/D (maybe someday), the
samples could be converted to 16 bits by shifting small ones left
and setting the control words to decrease the gain of the playback
amplifier by 6dB for each bit the following samples are left shifted.
Naturally, the 20 -> 16 bit conversion looks ahead to make sure that
no MSBs are lost, only the LSBs of numerically large samples.
The recorded signal is compressed, and proper playback entails downward
expansion.  It would probably sound bad without proper playback.
One possible problem with this may be the inability to switch the
gain of the PB amp fast enough and at the exact times, thus causing
distortion and noise.  (also, as you pointed out, it's of dubious value)
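
The scheme described here resembles block floating point, and can be
sketched as below.  The block size, the 0..4 range of the control word,
and the function names are assumptions for illustration; a real player
would apply the gain change in the analog stage rather than by shifting
integers back.

```python
def fits16(v):
    """True if v fits in a signed 16-bit word."""
    return -32768 <= v <= 32767

def encode_block(block20):
    """Return (shift, words16) for a block of signed 20-bit samples.

    `shift` is the control word: how many bits the block was shifted left
    relative to plain truncation.  Playback cuts gain ~6 dB per bit.
    """
    for drop in range(5):                   # LSBs discarded: 0 (quiet) .. 4 (loud)
        words = [s >> drop for s in block20]
        if all(fits16(w) for w in words):   # look ahead: the whole block must fit
            return 4 - drop, words
    raise ValueError("sample outside the 20-bit range")

def decode_block(shift, words16):
    """Restore 20-bit scale; discarded LSBs come back as zeros."""
    return [w << (4 - shift) for w in words16]

quiet = [100, -200, 32767, -32768]
loud = [500000, -524288, 17]

print(encode_block(quiet)[0], decode_block(*encode_block(quiet)) == quiet)
print(encode_block(loud)[0], decode_block(*encode_block(loud)))
```

The quiet block survives exactly with a control word of 4 (24 dB of
playback attenuation), while the loud block loses only its four least
significant bits.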

Anyway, the whole idea raises the following question.
Do current CDs have data format version information stored
on the disk, allowing future format changes?    (they must, I suspect)
A drastic change, such as the one above, might mean that version n
players would reject version n+1 disks, but a version n+1
player should be able to play any disk of version 1..n+1.
Recalls the evolution of the cassette, with CrO2, Dolby B,
Dolby C, dbx, Metal....

Perry Emrath	...pur-ee!uiucdcs!emrath
DCS, U of IL	emrath%uiuc@csnet-relay.arpa

	"Life is a bitch, and then you die"

shauns@vice.UUCP (Shaun Simpkins) (10/15/84)

If I remember my dbx specs right the vca has an attack time of less than a
millisecond and a much longer decay time.  They give some reasons relating to
audibility of the gain riding for these times and suggest mightily that they
are black magic.  Who am I to argue?  Anyway, this implies a control track
bandwidth of at least 1KHz, perhaps 2 or 3KHz.  Roughly speaking, every tenth
word will be a control word if the control track is interleaved with the
normal CD data stream.  So, perhaps 10% play time reduction, not 30-50%.
100Hz compansion controller BW sounds way too slow - in particular for isolated
percussive sounds, the bane of compression systems.
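
The "every tenth word" figure can be related to control bandwidth with a
little arithmetic.  A sketch only, assuming the control words simply take
every Nth slot of a 44.1 kHz word stream and that usable bandwidth is half
the control-word rate:

```python
SAMPLE_RATE = 44_100   # CD words per second per channel

def control_channel(every_nth):
    """Bandwidth (Hz) and stream overhead when every Nth word is control."""
    words_per_s = SAMPLE_RATE / every_nth
    bandwidth_hz = words_per_s / 2    # Nyquist limit for the control track
    overhead = 1 / every_nth
    return bandwidth_hz, overhead

bw, overhead = control_channel(10)
print(bw, overhead)
```

One control word in ten gives a control bandwidth a little over 2 kHz for
10% of the stream.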

The wandering squash,

-- 
				Shaun Simpkins

uucp:	{ucbvax,decvax,chico,pur-ee,cbosg,ihnss}!teklabs!tekcad!vice!shauns
CSnet:	shauns@tek
ARPAnet:shauns.tek@rand-relay

newton2@ucbtopaz.CC.Berkeley.ARPA (10/22/84)

	I reiterate that a narrow-bandwidth gain-control channel suffices.

	First of all, even if the notion of attack time sufficed to define
the required bandwidth, there is no reason to resolve the gain-control
information to sixteen bits (the implication of "every tenth word a control
*word*"). That's like a floating-point representation with a sixteen-bit
mantissa *and* a sixteen-bit exponent. All this to solve the alleged
problem of "only" sixteen bits of resolution in the present rather impressive
system. If you must have a compander "track", surely gain changes of
a few multiples of 6 dB (i.e., a few bits per control word) should satisfy
the yearning for more than 90-odd dB of dynamic range (personally I think
the present system is all one could reasonably ask for noise/dynamic range
wise, when implemented properly).
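
	To put rough numbers on "a few bits per control word", here is a
sketch that treats the control word as a small exponent, each step worth
about 6.02 dB on top of a 16-bit mantissa.  The step size and the names
are assumptions for illustration.

```python
DB_PER_BIT = 6.02   # approximate dB per bit of linear PCM

def companded_range_db(mantissa_bits, control_bits):
    """Total range when a control word selects 0..(2^k - 1) gain steps of ~6 dB."""
    steps = 2 ** control_bits - 1
    return (mantissa_bits + steps) * DB_PER_BIT

for control_bits in (0, 2, 3):
    print(control_bits, round(companded_range_db(16, control_bits)))
```

	Under these assumptions two control bits already take 16-bit PCM
from about 96 dB to about 114 dB, and three bits to about 138 dB.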

	Secondly, on the matter of bandwidth: 

	Regardless of the required instantaneous rate-of-change of channel 
gain, the measure of control-track bit rate is the average, not the peak,
rate of change to be accommodated by the compander. Since the master
encoder that cuts the CD-of-the-future cum control track can easily
survey the signal to be compander-encoded throughout the program's time
history (or at least sufficiently "in advance" of the transient to be
encoded), the gain-control bits that are interleaved with the program 
can appear sufficiently in advance of the gain-change they command to
allow the reproducer channel gain to respond exactly on time.

	I'm not sure I've been clear on this: the recorder (being digital
already) includes a digital program delay line, which allows the gain-
control bits, calculated in real time with respect to the incoming program,
to be interleaved with the delayed program as it emerges. In the
reproducer, the gain control information is extracted and decoded before
the program data to which it applies (and which gave rise to it) arrives.
Thus the level-detectors and VCA's can compensate rather exactly for
the tardiness of response inherent in the low bandwidth of the gain-
control channel.
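
	The look-ahead idea can be sketched in a few lines.  This is an
illustration under simplifying assumptions, not the patented scheme: gain
is computed per sample from the peak of the upcoming window, and the
window length, the maximum gain and the names are all hypothetical.

```python
def lookahead_gains(samples, lookahead, max_gain=8.0, full_scale=1.0):
    """Gain per sample, chosen from the peak of the next `lookahead` samples.

    Because the encoder sees the transient before it is emitted, the gain
    is already reduced when the transient arrives.
    """
    gains = []
    for i in range(len(samples)):
        window = samples[i:i + lookahead + 1]
        peak = max(abs(s) for s in window)
        gain = max_gain if peak == 0 else min(max_gain, full_scale / peak)
        gains.append(gain)
    return gains

# Quiet passage with one full-scale transient at index 10.
program = [0.01] * 10 + [1.0] + [0.01] * 10
gains = lookahead_gains(program, lookahead=4)
compressed = [s * g for s, g in zip(program, gains)]

print(gains[5], gains[6], gains[10], gains[11])
print(max(abs(c) for c in compressed))
```

	The gain drops four samples ahead of the transient and recovers
right after it, so the compressed program never exceeds full scale even
though the quiet passage is boosted.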

	This principle is the basis for a patented low-frequency pilot
tone compander compatible with conventional FM stereo broadcasting, and 
this same notion of delaying the program to allow the slow-responding
level-control detector to "catch up" is adapted by Dolby to an 
adaptive delta-mod digital transmission scheme, where the step-size
information is coded within a bandwidth on the order of 100 Hz. Both
these applications take advantage of the economies made possible by the
central transmission nature of their media, which allows the expense of the
delay-line and A/D's at the encoder. CD manufacture offers the same
distribution configuration and thus the same economic practicality.

	Thus, to recapitulate, I don't think you need 30% overhead for
	compander information, I don't think you need 10%, and in fact
	I still don't see why you need an explicit control-track at all,
	since in an instantaneously compandered digital system tracking
	errors are not a problem as with analog noise-reduction systems.
	And I'm unconvinced that 16-bit linear PCM isn't already a
	miraculously generous and ambitious format to offer the mass
	market- after all, this is essentially (except for sample
	rate) the scheme that virtually everyone was/is thrilled to
	use commercially.

	Anyway, I remain

		Doug Maisel

herbie@watdcsu.UUCP (Herb Chong, Computing Services) (10/23/84)

The key point you mentioned was "properly implemented".  The problem with
most recordings on CD (and on vinyl, for that matter) is the initial
recording and the microphone setup.  Admittedly, a digital recording
system is pretty impressive as far as the numbers go.  However, those
numbers assume a lot of things, like the recording engineer is familiar with
the technical side of his recorder and the consequences of what he does.
The digital systems are much less forgiving of errors and techniques of
analog systems don't cut it, most of the time, if you want to realize the full
potential of the medium.  A lot of manufacturers of CD's have admitted to
compressing the initial signal so that the dynamic range is "acceptable
for the consumer".  Where have we heard that one before?  With the
compression, where do you think all the original noise goes?  I would prefer
to make my own decision about compression and keep the noise where it belongs.
Some companies don't compress, but they are the ones that were originally in the
audiophile record business with digital recordings of their own.  One
final point, at $1-2k per hour just for the recording time, wouldn't you
rather have a safety blanket if you set the recording levels wrong?  And
wouldn't you rather have a few extra bits of dynamic range when you know
the signal is going to be compressed to the final dynamic range before
mastering?


Herb...

I'm user-friendly -- I don't byte, I nybble....

UUCP:  {decvax|utzoo|ihnp4|allegra|clyde}!watmath!watdcsu!herbie
CSNET: herbie%watdcsu@waterloo.csnet
ARPA:  herbie%watdcsu%waterloo.csnet@csnet-relay.arpa
BITNET: herbie at watdcs,herbie at watdcsu

wmartin@brl-tgr.ARPA (Will Martin ) (10/23/84)

It's interesting... All these discussions of interleaving level-control
info with the sound-determining data on a CD sound an awful lot like
what is done in a reproducing piano. This is a player piano where the 
holes in the central portion of the roll determine the notes played
(as in an ordinary player), plus there are holes on both roll edges that
control volume and other factors determining expression.

There is nothing new...

Will Martin

USENET: seismo!brl-bmd!wmartin     or   ARPA/MILNET: wmartin@almsa-1.ARPA