[comp.ivideodisc] CD Summary, Part 2

poggio@apple.com (Andy Poggio) (03/01/90)

CD Data Hierarchy

Storing data on a CD may be thought of as occurring through a data 
encoding hierarchy with each level built upon the previous one.  At the 
lowest level, data is physically stored as pits on the disc.  It is 
actually encoded by several low-level mechanisms to provide high storage 
density and reliable data recovery.  At the next level, it is organized 
into tracks, which may be digital audio or CD-ROM.  The High Sierra 
specification then defines a file system built on CD-ROM tracks.  Finally, 
applications like HyperCard specify a content format for files.

The Physical Medium

The Compact Disc itself is a thin plastic disc some 12 cm in diameter.  
Information is encoded in a plastic-encased spiral track contained on the 
top of the disc.  The spiral track is read optically by a noncontact head 
which scans approximately radially as the disc spins just above it.  The 
spiral is scanned at a constant linear velocity, thus assuring a constant 
data rate.  This requires the disc to rotate at a decreasing rate as the 
spiral is scanned from its beginning near the center of the disc to its 
end near the disc circumference.
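
As a rough illustration of why constant linear velocity implies a falling 
rotation rate, the sketch below computes the RPM needed at a given radius.  
The scanning speed of 1.3 m/s and the radii of 25 mm and 58 mm are 
assumptions chosen for illustration; they are not figures from this summary.

```python
import math

def rpm_at_radius(radius_m, linear_velocity_m_s=1.3):
    """Rotation rate (RPM) that keeps the given linear velocity at a radius."""
    circumference_m = 2 * math.pi * radius_m
    revolutions_per_second = linear_velocity_m_s / circumference_m
    return revolutions_per_second * 60

# Assumed radii for the start and end of the spiral (illustrative only).
inner_rpm = rpm_at_radius(0.025)
outer_rpm = rpm_at_radius(0.058)
print(round(inner_rpm), round(outer_rpm))
```

Because the circumference grows with radius, the same linear speed needs 
fewer revolutions per minute toward the outer edge.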

The spiral track contains shallow depressions, called pits, in a 
reflective layer.  Binary information is encoded by the lengths of these 
pits and the lengths of the areas between them, called land.  During 
reading, a low power laser beam from the optical head is focused on the 
spiral layer and is reflected back into the head.  Due to the optical 
characteristics of the plastic disc and the wavelength of light used, the 
quantity of reflected light varies depending on whether the beam is on 
land or on a pit.  The modulated, reflected light is converted to a radio 
frequency, raw data signal by a photodetector in the optical head.

Low-level Data Encoding

To ensure accurate recovery, the disc data must be encoded to optimize the 
analog-to-digital conversion process that the radio frequency signal must 
undergo.  Goals of the low-level data encoding include:

1.  High information density.  This requires encoding that makes the best 
possible use of the high, but limited, resolution of the laser beam and 
read head optics.

2.  Minimum intersymbol interference.  This requires making the minimum 
run length, i.e. the minimum number of consecutive zero bits or one bits, 
as large as possible.

3.  Self-clocking.  To avoid a separate timing track, the data should be 
encoded so as to allow the clock signal to be regenerated from the data 
signal.  This requires limiting the maximum run length of the data so that 
data transitions will regenerate the clock.  

4.  Low digital sum value (the number of one bits minus the number of zero 
bits).  This minimizes the low frequency and DC content of the data signal 
which permits optimal servo system operation.
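
To make goals (2) through (4) concrete, here is a minimal sketch that 
measures the run lengths and the digital sum value of a bit string, using 
the definitions given in the list above.  The function names and the sample 
string are made up for illustration.

```python
from itertools import groupby

def run_lengths(bits):
    """Lengths of each run of identical consecutive bits, in order."""
    return [len(list(group)) for _, group in groupby(bits)]

def digital_sum_value(bits):
    """Number of one bits minus number of zero bits, as defined above."""
    return bits.count("1") - bits.count("0")

sample = "1110001111000"  # arbitrary illustrative bit string
runs = run_lengths(sample)
print(runs, min(runs), max(runs), digital_sum_value(sample))
```

A large minimum run helps goal (2), a bounded maximum run helps goal (3), 
and a digital sum value near zero helps goal (4).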

A straightforward encoding would be simply to encode zero bits as land 
and one bits as pits.  However, this does not meet goal (1) as well as the 
encoding scheme actually used.  The current CD scheme encodes one bits as 
transitions from pit to land or land to pit and zero bits as constant pit 
or constant land.
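
The transition scheme just described can be sketched as follows.  The 
convention that a one bit flips the surface at that bit position, the 
starting surface, and the 'P'/'L' notation are assumptions of this sketch.

```python
def channel_bits_to_surface(channel_bits, start="L"):
    """Map channel bits to pit (P) / land (L): a 1 toggles, a 0 holds."""
    surface = []
    current = start
    for bit in channel_bits:
        if bit == "1":
            # A one bit is a transition: pit to land or land to pit.
            current = "P" if current == "L" else "L"
        surface.append(current)
    return "".join(surface)

print(channel_bits_to_surface("0100100010000"))
```

Note that the length of each pit or land equals the spacing between 
successive one bits, so the information is carried by the edges rather 
than by the surface type itself.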

To meet goals (2) to (4), it is not possible to encode arbitrary binary 
data.  For example, the integer 0 expressed as thirty-two bits of zero 
would have too long a run length to satisfy goal (3).  To accommodate 
these goals, each eight-bit byte of actual data is encoded as fourteen 
bits of channel data.  There are many more combinations of fourteen bits 
(16,384) than there are of eight bits (256).  To encode the eight-bit 
combinations, 256 combinations of fourteen bits are chosen that meet the 
goals.  This encoding is referred to as Eight-to-Fourteen Modulation (EFM) 
coding.
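
As a sanity check that enough fourteen-bit patterns exist, the sketch below 
counts those that satisfy a pair of run length limits.  The specific limits 
(at least two and at most ten zero bits between successive one bits) are 
the ones commonly quoted for EFM; they are an assumption here, since this 
summary does not state them.

```python
def meets_run_length_limits(word, width=14, min_zeros=2, max_zeros=10):
    """True if the word obeys the assumed zero-run limits."""
    # Splitting on "1" yields the runs of zeros: leading, interior, trailing.
    zero_runs = format(word, f"0{width}b").split("1")
    interior_ok = all(len(run) >= min_zeros for run in zero_runs[1:-1])
    longest_ok = all(len(run) <= max_zeros for run in zero_runs)
    return interior_ok and longest_ok

candidates = [w for w in range(2 ** 14) if meets_run_length_limits(w)]
print(len(candidates), len(candidates) >= 256)
```

Under these assumed limits the count comes to 267, comfortably enough to 
pick 256 patterns for the eight-bit values.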

If fourteen channel bits were concatenated with another set of fourteen 
channel bits, once again the above goals might not be met.  To avoid this 
possibility, three merging bits are included between each set of fourteen 
channel bits.  These merging bits carry no information but are chosen to 
limit run length, keep data signal DC content low, etc.  Thus, an 
eight-bit byte of actual data is encoded into a total of seventeen channel 
bits: fourteen EFM bits and three merging bits.
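
A simplified sketch of how merging bits might be chosen: try each 
candidate three-bit pattern and keep those for which the joined sequence 
still obeys the run length limits.  The limits (two to ten zero bits 
between one bits) and the example words are assumptions for illustration, 
and a real encoder would also weigh DC content when several patterns 
qualify.

```python
def within_limits(bits, min_zeros=2, max_zeros=10):
    """True if the bit string obeys the assumed zero-run limits."""
    zero_runs = bits.split("1")
    return (all(len(run) >= min_zeros for run in zero_runs[1:-1])
            and all(len(run) <= max_zeros for run in zero_runs))

def usable_merging_bits(left_word, right_word):
    """Merging patterns that keep left + merge + right within the limits."""
    # Patterns with two one bits cannot have two zeros between them.
    options = ("000", "001", "010", "100")
    return [m for m in options if within_limits(left_word + m + right_word)]

# Hypothetical fourteen-bit words, written as bit strings.
print(usable_merging_bits("00100100000001", "10010000100000"))
print(usable_merging_bits("10010000100000", "00000100100001"))
```

In the first case only "000" keeps the one bits far enough apart; in the 
second, "000" would make the zero run too long, so a pattern containing a 
one bit is needed.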

To achieve a reliable self-clocking system, periodic synchronization is 
necessary.  Thus, data is broken up into individual frames, each beginning 
with a synchronization pattern.  Each frame also contains twenty-four data 
bytes, eight error correction bytes, a control and display byte (carrying 
the subcoding channels), and merging bits separating them all.  Each frame 
is arranged as follows:

Sync Pattern                 24 + 3          channel bits
Control and Display byte     14 + 3
Data bytes                   12 * (14 + 3)
Error Correction bytes        4 * (14 + 3)
Data bytes                   12 * (14 + 3)
Error Correction bytes        4 * (14 + 3)

TOTAL                        588             channel bits

Thus, 192 actual data bits (24 bytes) are encoded as 588 channel bits.
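
The frame arithmetic can be checked with a few lines.  The variable names 
are just for this sketch; the field sizes are the ones listed above.

```python
EFM_BITS = 14
MERGING_BITS = 3
SYMBOL_BITS = EFM_BITS + MERGING_BITS  # seventeen channel bits per byte

frame_fields = [
    ("sync pattern", 24 + MERGING_BITS),
    ("control and display byte", 1 * SYMBOL_BITS),
    ("data bytes", 12 * SYMBOL_BITS),
    ("error correction bytes", 4 * SYMBOL_BITS),
    ("data bytes", 12 * SYMBOL_BITS),
    ("error correction bytes", 4 * SYMBOL_BITS),
]

total_channel_bits = sum(bits for _, bits in frame_fields)
data_bits = 24 * 8
print(total_channel_bits, data_bits, round(data_bits / total_channel_bits, 3))
```

So roughly a third of the channel bits in a frame carry actual data; the 
rest go to modulation, merging, synchronization, subcode, and error 
correction.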

Editorial:  A CD physically has a single spiral track about 3 miles long.  
CDs spin at about 500 RPM when reading near the center down to about 250 
RPM when reading near the circumference.

Disc with a 'c' or disk with a 'k'?  A usage has emerged for these terms:  
disk is used for erasable disks (e.g. magnetic disks) while disc is used 
for read-only (e.g. CD-ROM discs).  One would presumably call a frisbee a 
disc.

--andy