[bionet.molbio.bio-matrix] CD-ROM vs paper

mckee@corwin (George McKee) (08/10/90)

Robert J. Robbins <rrobins@nsf.gov> gets close to the point
when he asks about the life of the *format* of an electronic
medium, rather than the medium itself.  I was hit by this
long ago when I put my master's thesis on PDP-10 DECtape.
Even if I had used standard magtape, I'd still have been at
the mercy of the 36-bit word of the machine.  Nowadays you'll
want to start worrying about 8-bit character codes for
European languages and 16-bit charsets for Japanese and Chinese.
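
	To make the 36-bit problem concrete: text on the PDP-10 was
commonly packed five 7-bit ASCII characters to a word, with the low
bit left over, so a byte-oriented machine can't simply read the
characters off the tape.  Here is a rough sketch of the unpacking,
assuming that packing convention and assuming each word has already
been recovered as an integer; the function names are made up for
illustration.

```python
WORD_BITS = 36
CHARS_PER_WORD = 5   # assumed packing: five 7-bit ASCII characters per word
CHAR_BITS = 7


def unpack_word(word: int) -> str:
    """Return the five 7-bit characters held in one 36-bit word."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("value does not fit in 36 bits")
    chars = []
    for i in range(CHARS_PER_WORD):
        # The first character sits in the top 7 bits; the lowest bit is unused.
        shift = WORD_BITS - CHAR_BITS * (i + 1)
        chars.append(chr((word >> shift) & 0x7F))
    return "".join(chars)


def pack_word(text: str) -> int:
    """Inverse of unpack_word, padding short text with NUL characters."""
    if len(text) > CHARS_PER_WORD:
        raise ValueError("at most five characters fit in one word")
    word = 0
    for i, ch in enumerate(text.ljust(CHARS_PER_WORD, "\0")):
        shift = WORD_BITS - CHAR_BITS * (i + 1)
        word |= (ord(ch) & 0x7F) << shift
    return word
```

Even this much only helps once you have a drive that can still read
the tape, which is the larger point.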

	This trend doesn't look like it's going to stop.
If you've looked at tape media recently, it seems like there's
a new format every six months or so.  And already you can get
read/write optical disks with almost twice the capacity of a
CD-ROM in the same form factor.

	When talking about comparisons with print, you have
to take a really long view; the longest view with respect to
useful longevity of artifacts that I've seen is in Arthur C.
Clarke's novel "The City and the Stars", which has an underlying
concern with the immortal technology to go with immortal people.
One of the characters points out that to last forever,
	"no device shall have any moving parts".
As long as you have media that need readers to play them, you'll
have problems with obsolescence.  The fact that it doesn't need
any special equipment to be accessible by an ordinary person is
paper's greatest characteristic as an archival medium.

	I think it's better to think of CD-ROM as a low-cost
distribution medium rather than an archival format.  This means
that for cost comparisons, you should compare the cost per bit of
the CD-ROM, not to paper in units of dollars and years, but to the
network in units of dollars, miles, and hours.

	In the network, the data is always online at some archive
site and it doesn't really matter what hardware the archive site is
using, as long as the material is accessible from the net in a
standard way.

	In other words, I don't think it's worthwhile worrying
about the longevity of physical media for storing knowledge; paper
will be the best at this for longer than *I* can foresee.  What's
important is to make sure that the electronic data structures are
powerful enough to carry the original information without loss
through many generations of transmission and translation.

	It would be a tragedy if all the care and insight in
the figures of a classic work in (for example) developmental
biology were digitized at 72 or 300 dpi, simply because the
people doing the project didn't think beyond the screen of the
Macintosh or the page of the LaserWriter they're using today.

	One of the important things electronically-oriented
libraries can do is to study how much information *really* is
in the figures and typography in archival works, so that we
can be confident that we're not losing important details when
we exchange electronic copies of them.  Then once we know
how much information is there, we can devise standard ways
of representing this information that will guarantee loss-free
transmission.  My knowledge of standards is mainly in systems,
languages, and networks, so it's not unexpected that I haven't
seen one that I feel is adequate for archival representation of
biological literature.  If someone out there can recommend
a standard, I'd be interested in knowing about it.
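
	To put the 72 and 300 dpi figures mentioned above in physical
terms, here is a back-of-the-envelope sketch.  It only converts a
scanning resolution into a sample spacing and a raw sample count; the
4 x 5 inch figure size and the function names are my own assumptions
for illustration, and it says nothing about how much information the
original figure actually contains.

```python
MM_PER_INCH = 25.4


def pixel_pitch_mm(dpi: float) -> float:
    """Spacing between samples, in millimetres, at a given resolution."""
    return MM_PER_INCH / dpi


def raw_scan_bits(width_in: float, height_in: float, dpi: int,
                  bits_per_pixel: int = 1) -> int:
    """Uncompressed size in bits of a scan (1 bit per pixel = bilevel)."""
    return round(width_in * dpi) * round(height_in * dpi) * bits_per_pixel


if __name__ == "__main__":
    for dpi in (72, 300):
        pitch = pixel_pitch_mm(dpi)
        bits = raw_scan_bits(4, 5, dpi)   # hypothetical 4 x 5 inch figure
        print(f"{dpi} dpi: samples every {pitch:.3f} mm, {bits} bits bilevel")
```

At 72 dpi the samples are about 0.35 mm apart, and at 300 dpi about
0.085 mm, so any detail in the original drawn finer than that spacing
is not recorded in the scan at all.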

	- George McKee