[comp.ivideodisc] CD Summary, Part 5

poggio@apple.com (Andy Poggio) (03/03/90)

CD Summary Part 5

The High Sierra File System Standard

Built on top of the addressable 2K blocks that the CD-ROM specification 
defines, the next higher level of data encoding is a file system that 
permits logical organization of the data on the CD.  This can be a native 
file system like the Macintosh Hierarchical File System (HFS).  Another 
alternative is the High Sierra (also known as ISO 9660) file standard, 
recently approved by the National Information Standards Organization 
(NISO) and the International Organization for Standardization (ISO), which 
defines a file system carefully tuned to CD characteristics.  In particular:

1.  CDs have relatively slow seek times and high capacity.  As a result, 
the High Sierra standard makes tradeoffs that reduce the number of seeks 
needed to read a file at the expense of space efficiency.

2.  CDs are read-only.  Thus, concerns like space allocation, file 
deletion, and the like are not addressed in the specification.

For High Sierra file systems, each individual CD is a volume.  Several CDs 
may be grouped together in a volume set and there is a mechanism for 
subsequent volumes in a set to update preceding ones.  Volumes can contain 
standard file structures, coded character set file structures for 
character encoding other than ASCII, or boot records.  Boot records can 
contain either data or program code that may be needed by systems or 
applications.

High Sierra Directories and Files

The file system is a hierarchical one in which directories may contain 
files or other directories.  Each volume has a root directory which serves 
as an ancestor to all other directories or files in the volume.  This 
dictates an overall tree structure for the volume.

A typical disadvantage in hierarchical systems is that to read a file 
(which must be a leaf of the hierarchy tree) given its full path name, it 
is necessary to begin at the root directory and search through each of its 
ancestral directories until the entry for the file is found.  For example, 
given the path name "Wine Regions:America:California:Mendocino", three 
directories (the first three components of the path name) would need to be 
searched.  Typically, a separate seek would be required for each 
directory.  This would result in relatively poor performance.

To avoid this, High Sierra specifies that each volume contain a path table 
in addition to its directories and files.  The path table describes the 
directory hierarchy in a compact form that may be cached in computer 
memory for optimum performance.  The path table contains entries for the 
volume's directories in a breadth-first order; directories with a common 
parent are listed in lexicographic order.  Each entry contains only the 
location of the directory it describes, its name, and the location in the 
path table of its parent.  This mechanism allows any directory to be 
accessed with only a single CD seek.
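
To make the idea concrete, here is a minimal sketch of a path table held in 
memory.  It is not the on-disc format: the PathEntry fields, the 
build_path_table and find_directory helpers, the block numbers, and the 
sample directory tree are all made up for illustration, and Python's 
sorted() stands in for the standard's lexicographic ordering.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class PathEntry:
    name: str      # directory name
    location: int  # block address of the directory (made-up numbers here)
    parent: int    # 1-based path table index of the parent; root points to itself


def build_path_table(root_name, tree, first_block=100):
    """List directories breadth-first, siblings in lexicographic order."""
    table = [PathEntry(root_name, first_block, 1)]
    queue = deque([(1, tree)])  # (path table index, child directories)
    while queue:
        parent_index, children = queue.popleft()
        for name in sorted(children):
            table.append(PathEntry(name, first_block + len(table), parent_index))
            queue.append((len(table), children[name]))
    return table


def find_directory(table, components):
    """Walk the cached table in memory and return the directory's location."""
    if components[0] != table[0].name:
        raise KeyError(components[0])
    index = 1  # start at the root entry
    for name in components[1:]:
        for i, entry in enumerate(table, start=1):
            if i != index and entry.parent == index and entry.name == name:
                index = i
                break
        else:
            raise KeyError(name)
    return table[index - 1].location


# Hypothetical directory tree for the example path in the text.
tree = {"America": {"California": {}, "Oregon": {}}, "Europe": {"France": {}}}
table = build_path_table("Wine Regions", tree)

path = "Wine Regions:America:California:Mendocino"
*directories, filename = path.split(":")
location = find_directory(table, directories)
print(location)  # block of the California directory, found without any seeks
```

All of the searching happens in the cached table, so the drive only has to 
seek once, to the California directory, to find the entry for Mendocino.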

Directories contain more detailed information than the path table.  Each 
directory entry contains:

Directory or file location.
File length.
Date and time of creation.
Name of the file.
Flags:
    Whether the entry is for a file or a directory.
    Whether or not it is an associated file.
    Whether or not it has records.
    Whether or not it has read protection.
    Whether or not it has subsequent extents.
Interleave structure of the file.
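
As a rough picture of the fields listed above, a directory entry could be 
modeled in memory as follows.  This is only a sketch: the class and field 
names, the flag bit values, and the split of the interleave structure into 
a unit size and a gap size are assumptions for illustration, not the byte 
layout the standard defines.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import IntFlag, auto


class EntryFlags(IntFlag):
    # Bit values are illustrative only, not the on-disc encoding.
    DIRECTORY = auto()       # entry describes a directory rather than a file
    ASSOCIATED = auto()      # entry is an associated file
    HAS_RECORDS = auto()     # file has a record structure
    READ_PROTECTED = auto()  # file has read protection
    MORE_EXTENTS = auto()    # further extents of this file follow


@dataclass
class DirectoryEntry:
    location: int            # block address of the file or directory
    length: int              # length in bytes
    created: datetime        # date and time of creation
    name: str
    flags: EntryFlags = EntryFlags(0)
    interleave_unit: int = 0  # assumed field: blocks per interleaved run
    interleave_gap: int = 0   # assumed field: blocks skipped between runs

    def is_directory(self):
        return bool(self.flags & EntryFlags.DIRECTORY)


# Hypothetical entry for the California directory.
entry = DirectoryEntry(
    location=103,
    length=2048,
    created=datetime(1990, 3, 3, 12, 0),
    name="California",
    flags=EntryFlags.DIRECTORY,
)
print(entry.is_directory())
```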

Interleaving may be used, for example, to meet realtime requirements for 
multiple files whose contents must be presented simultaneously.  This 
would happen if a file containing graphic images were interleaved with a 
file containing compressed sound that describes the images.
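
A toy sketch of the idea, assuming a simple fixed pattern (a run of blocks 
from one file, then a run from the other).  The interleave function, the 
run lengths, and the block labels are hypothetical; the standard describes 
interleaving through fields in the directory entry rather than through 
anything like this helper.

```python
def interleave(file_a, file_b, unit_a, unit_b):
    """Lay out two files as alternating runs of unit_a and unit_b blocks."""
    layout = []
    a = b = 0
    while a < len(file_a) or b < len(file_b):
        layout.extend(file_a[a:a + unit_a])  # next run from the first file
        a += unit_a
        layout.extend(file_b[b:b + unit_b])  # next run from the second file
        b += unit_b
    return layout


images = ["img0", "img1", "img2", "img3"]
sound = ["snd0", "snd1"]
disc_order = interleave(images, sound, unit_a=2, unit_b=1)
print(disc_order)
# ['img0', 'img1', 'snd0', 'img2', 'img3', 'snd1']
```

Reading the disc straight through delivers each pair of images together 
with its sound, with no seeks between the two files.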

Files themselves are recorded in contiguous (or interleaved) blocks on the 
disc.  The read-only nature of CD permits this contiguous recording in a 
straightforward manner.  A file may also be recorded in a series of 
noncontiguous extents with a directory entry for each extent.

The specification does not favor any particular computer architecture.  In 
particular, all significant multibyte numbers are recorded twice, once 
with the most significant byte first and once with the least significant 
byte first.
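
Here is a short sketch of how a reader might handle such a doubly recorded 
32-bit number.  The function names are made up, and the order of the two 
copies within the field (least significant byte first, then most 
significant byte first) is an assumption of this example.

```python
import struct


def encode_both_u32(value):
    """Record a 32-bit number twice: little-endian copy, then big-endian copy."""
    return struct.pack("<I", value) + struct.pack(">I", value)


def decode_both_u32(field):
    """Read both copies and check that they agree."""
    (little,) = struct.unpack("<I", field[:4])
    (big,) = struct.unpack(">I", field[4:8])
    if little != big:
        raise ValueError("the two byte orders disagree")
    return little


field = encode_both_u32(2048)
print(field.hex())             # 0008000000000800
print(decode_both_u32(field))  # 2048
```

A real reader would simply pick the copy that matches its own processor 
and skip the other; comparing the two, as done here, is just a sanity 
check.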

Multimedia Information

Applications that create and present multimedia information sit on top of 
the file system.  While it is true that a CD can store anything that a 
magnetic disk can store (and usually much more of it), CDs will be used 
more for storing information than for storing programs.  It is the very 
large storage capacity of CDs coupled with their low cost that opens up 
the possibilities for interactive, multimedia information to be used in a 
multitude of ways.

Programs like HyperCard, with its ease of authoring and broad 
extensibility, are very useful for this purpose.  HyperCard stacks, with 
related information such as color images and sound, can be easily and 
inexpensively stored on CDs despite their possibly very large size.

Editorial:  The High Sierra file system gets its name from the location of 
the first meeting on it:  the High Sierra Hotel at Lake Tahoe.  It is much 
more commonly referred to as ISO 9660, though the two specifications are 
slightly different.

It has gotten very easy and inexpensive to make a CD-ROM disc (or audio 
CD).  For example, you can now take a Macintosh hard disk and send it with 
$1500 to one of several CD pressers.  They will send you back your hard 
disk and 100 CDs with exactly the same content as what's on your disk.  
This is the easy way to make CDs with capacity up to the size of your hard 
disk (Apple's go up to 160 megabytes).  True, this is not a full CD but 
CDs don't need to be full.  If you have just 10 megabytes and need 100 
copies, CDs may be the best way to go.

If you are buying a CD-ROM drive, there are several factors you might 
consider in making your choice.  Two factors NOT to consider are capacity 
and data rate.  The capacity of all CD-ROM drives is determined solely by 
the CD they are reading.  Though you will see a range of numbers in 
manufacturers' specs (e.g. 540, 550, 600, and 650 Mbytes), any drive can 
read any disc and so they are all fundamentally the same.  All CD-ROM 
drives read data at a net 150 Kbytes/sec for CD-ROM data.  Other data 
rates quoted in specs may include error correction data (not included in 
the net rate) or may be a mode 2 data rate (faster than mode 1).  All drives 
will be the same in all of these specs.
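
The arithmetic behind these numbers can be sketched as follows.  The 
figures used are not stated in this part of the summary and are supplied 
here as assumptions: 75 blocks read per second, 2048 user bytes per mode 1 
block, 2336 per mode 2 block, and 2352 raw bytes per block.  The playing 
times of 60 and 74 minutes are just example disc lengths.

```python
BLOCKS_PER_SECOND = 75   # assumed standard read rate
MODE1_BYTES = 2048       # user data per mode 1 block (the "2K blocks")
MODE2_BYTES = 2336       # user data per mode 2 block (assumed)
RAW_BYTES = 2352         # raw block including error correction data (assumed)


def rate_kbytes_per_sec(bytes_per_block):
    """Data rate in Kbytes/sec, with 1 Kbyte = 1024 bytes."""
    return bytes_per_block * BLOCKS_PER_SECOND / 1024


def capacity_mbytes(minutes, bytes_per_block=MODE1_BYTES):
    """Capacity in Mbytes (1 Mbyte = 1024 * 1024 bytes) for a given playing time."""
    blocks = minutes * 60 * BLOCKS_PER_SECOND
    return blocks * bytes_per_block / (1024 * 1024)


print(rate_kbytes_per_sec(MODE1_BYTES))  # 150.0
print(round(rate_kbytes_per_sec(MODE2_BYTES), 1))
print(round(rate_kbytes_per_sec(RAW_BYTES), 1))
print(round(capacity_mbytes(60)), round(capacity_mbytes(74)))
```

Under these assumptions the differences in quoted capacity come from the 
playing time of the disc (and from how a megabyte is counted), not from 
the drive.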

--andy