[bionet.molbio.bio-matrix] electronic publishing

rrobbins@NOTE.NSF.GOV ("Robert J. Robbins") (08/07/90)

A recent posting has raised some issues with regard to electronic 
publishing and has made particular reference to the presentation by Patricia
Morgan at the last BioMatrix meeting.  The posting recalls two arguments
against electronic publishing, half tones and cost to libraries, and notes 
that less than 5% of computer science journal articles contain photographs.
The posting also notes that, 'when I mentioned the second argument to a 
friend they commented, "who needs libraries anyway?".  That is, we could 
have direct distribution from publishers to readers.'

In response to these comments, I offer some observations:

  -  The posting is correct that the use of halftones is discipline 
     dependent.  Computer science, mathematics, logic, philosophy, 
     and other abstract fields don't need them.  Biology, especially
     molecular biology, does.  Halftones are not the only problem, 
     however.  Other graphics make for difficulty, too.  The argument,
     raised at the meeting, that PostScript can be used as the galactic 
     standard for page layout just doesn't hold water.  First, the 
     standard isn't as standard as standards ought to be.  I have had 
     people send me PostScript reports via e-mail and when I print them 
     out using a PostScript printer at NSF, occasionally a page 
     comes out with a screwed up format.  The source claims 
     the file printed fine at home.  This leads me to believe that 
     not all PS files print identically on all PS printers.  Second,
     not everyone has access to PS printers.  Third, even when folks 
     do have access to PS printers, they often would prefer to receive 
     a printed article in the mail rather than fire up the printer 
     and produce (at, say, 4 cents a page) a 100-page printout 
     that then must be stapled or punched or whatever so that it 
     can be carried around and read conveniently.

  -  The doing-away-with-libraries notion sounds intriguing at 
     first, but appears more naive and silly upon further thought.
     Libraries do not exist just to provide full employment 
     opportunities for parasitic librarians.  They provide real 
     functions, such as allowing researchers access to many more 
     journals than they could possibly afford individually.  They 
     also provide an archival function, which is essential in 
     disciplines that are less ephemeral than computer science.
     Quick, give five computer-science citations that are over 
     fifty years old that can still be read with profit.  However, 
     there are many biological manuscripts that are over a 
     hundred years old that are still important and useful.  Many 
     of these involve detailed illustrations (engravings, usually)
     of anatomical studies that have not been redone, since the 
     original 19th century work is considered definitive.  Converting
     these to PostScript format would be costly and tedious, and 
     would greatly reduce the value of the work.  The archival 
     problem is perhaps one of the more acute arguments against 
     electronic publishing.  Quick, give five examples of material
     written electronically over forty years ago that can still 
     be read easily with current equipment.  For electronic 
     publishing to be considered even as a candidate for archival 
     publishing, there will have to be some possibility that 
     material published electronically will continue to be 
     readable for decades, preferably centuries, without need for 
     periodic copying onto new media in new formats.  

  -  The issue of cost is real, as well, and cannot be dismissed 
     by doing away with libraries.  Publishers publish to make a 
     buck, not to provide a free service to the world.  Serious 
     electronic publishing involves user fees independently of 
     whether the user is a library or an individual.  Serious 
     publishing also involves a concern for the rights, economic
     and otherwise, of the author and publisher.  Sure, you can 
     always make 50 xerox copies of the latest issue of Cell and 
     distribute them to your friends, but that will involve a lot
     of work and expense and you are not likely to do it very 
     often.  Therefore, the publisher of Cell can set subscription 
     fees on the assumption that most readers will be looking at 
     a paid-for copy, not a xeroxed rip off.  With an electronically 
     delivered journal, say that arrives via email, you can, with 
     a simple forward command, send 50 or 100 or 1000 copies off 
     to many friends almost effortlessly.  Therefore, a publisher 
     working in this medium must assume that most of the readers 
     will be using non-paid-for copies and the few fools (libraries?)
     who obtain their copies legitimately must be expected to carry 
     the full cost for the subscription.

I could add a few more observations, but I suspect my position is 
fairly clear.  I believe that the idea that electronic publishing 
will replace print publishing is about as accurate and astute a 
prediction as the one made frequently in the fifties that private
helicopters would replace private automobiles.  Helicopters play 
many important roles in our society, but providing routine 
individual transportation is not one of them.  Likewise, computers and 
electronic communication play important roles in our society, but
replacing the printed word as the primary medium for scientific 
communication is not one of them.

At the same time, there are certain kinds of scientific publishing
that cry out for an electronic medium.  Database materials are 
obviously one of these.  How many people prefer to use GenBank 
in the hard-copy, multi-volume form?  I think that those of us 
who believe that computers have an increasingly important role 
to play in the practice of biology should take care to avoid
making exaggerated claims, either through naivete or excess 
enthusiasm.  Nothing undercuts a good case more than a patently 
false assertion.

pkarp@NCBI.NLM.NIH.GOV (Peter Karp) (08/08/90)

I must thank Bob for pointing out some ambiguities in my posting
regarding electronic publishing.  Actually Bob, I don't think my
message contained any false assertions, only a lack of detail that
when interpreted in particular ways could easily yield false
propositions.  Believe it or not, I as a computer scientist actually
prefer to read scientific articles in paper form (for aesthetic reasons
that I can't quite put my finger on), and my reason for asking others
to summarize Morgan's arguments is that I found it very interesting to
hear the perspective of a publishing professional who had actually
looked into the problem.

To elaborate on the question of postscript as a graphics standard, I
took Morgan's comments at the meeting to mean that she didn't believe
that there existed a *technical* solution to the problem of specifying
graphics within documents -- Postscript is clearly such a solution.  I
think the question of whether Postscript constitutes an adequate
*standard* for specifying graphics is an open question.  Sure, I've
also had trouble FTPing Postscript documents around the Internet, but
this is not proof that someone can't sit down and write a standard for
a "Universal Postscript" that most printers should be able to handle.
I simply don't believe that there is a major problem in specifying
graphics within documents.  As I wrote earlier, halftones do appear
to represent an important problem for many disciplines.

Yes, it's true that not everyone has a Postscript printer; I 
certainly wouldn't argue that we can install worldwide electronic
publishing tomorrow.  When the first television stations started
broadcasting not everyone owned a TV; we should expect that it
will take time for new technology to come into widespread use.

My "who needs libraries" comment was not meant to imply that we should
close all libraries tomorrow -- sorry for being so flip here.  I was
simply implying that we should expect new technologies to create new
patterns of usage.  Morgan seemed to expect that libraries would
serve as the distribution point for electronically-published
documents, but in the future this may not be the case.  The person I
talked to was implying that right now the computer science
community has adequate technology in place to simply circumvent
libraries for the distribution phase -- and perhaps even for the
archival of future computer science publications if they will all fit
on a few dozen CD ROMs.  I quite agree that the biology community does
not have this technology in place now, and that conversion of existing
documents would be problematic.

My overall conclusion is not that we should expect electronic
publishing to be feasible for all types of documents in all
disciplines tomorrow, nor that we should close down all University
libraries tomorrow.  That is, I'm not the boundless optimist that my
first message may have implied.  However, I'm also not the bounded
pessimist that I concluded Morgan is.  Although I believe Morgan and
Robbins have both identified many important problems that are
impossible for us to solve in the short term, I believe that there are
a substantial number of problems that we can solve in the short term;
we shouldn't let the existence of some hard problems deter us from
forging ahead on the easy ones.

Peter

rrobbins@NOTE.NSF.GOV ("Robert J. Robbins") (08/08/90)

My comments regarding electronic publishing have produced a quick 
rejoinder that, presumably inadvertently, was directed only to me
and not to the net.  In the interest of stimulating even more 
commentary, I forward this response to the group...


------- Forwarded Message

Received: from life.ai.mit.edu by Note.NSF.GOV id aa24849; 7 Aug 90 13:27 EDT
Received: from rice-chex (rice-chex.ai.mit.edu) by life.ai.mit.edu (4.1/AI-4.10) id AA02326; Tue, 7 Aug 90 13:28:46 EDT
From: tmb@ai.mit.edu (Thomas M. Breuel)
Received: by rice-chex (4.1/AI-4.10) id AA11806; Tue, 7 Aug 90 13:28:38 EDT
Date: Tue, 7 Aug 90 13:28:38 EDT
Message-Id: <9008071728.AA11806@rice-chex>
To: rrobbins@note.nsf.gov
Subject: Re:  electronic publishing

|  -  The posting is correct that the use of halftones is discipline 
|     dependent.  Computer science, mathematics, logic, philosophy, 
|     and other abstract fields don't need them.  Biology, especially
|     molecular biology, does.  Halftones are not the only problem, 
|     however.  Other graphics make for difficulty, too.  The argument,
|     raised at the meeting, that PostScript can be used as the galactic 
|     standard for page layout just doesn't hold water.  First, the 
|     standard isn't as standard as standards ought to be.  I have had 
|     people send me PostScript reports via e-mail and when I print them 
|     out using a PostScript printer at NSF, occasionally a page 
|     comes out with a screwed up format.  The source claims 
|     the file printed fine at home.  This leads me to believe that 
|     not all PS files print identically on all PS printers.  Second,
|     not everyone has access to PS printers.  Third, even when folks 
|     do have access to PS printers, they often would prefer to receive 
|     a printed article in the mail rather than fire up the printer 
|     and produce (at, say, 4 cents a page) a 100-page printout 
|     that then must be stapled or punched or whatever so that it 
|     can be carried around and read conveniently.

What is the problem with halftones? There are lots of standards for
transmitting graphics, B/W, and color images. CGM and GIF might
make a reasonable standard, so might HPGL and TIFF, or UNIX PLOT
and UNIX VIS format. For all of these, there are public domain
or free previewers for X windows, Macs, and PCs. With only a minor
amount of effort these could be integrated directly into a document
reader. 

I think it would be silly to expect that one can start electronic
publishing without any software development, but, on the other hand,
most of the code for doing it already exists--it just needs to
be put together.

There are also free postscript previewers available (try ghostscript
from prep.ai.mit.edu:/u/emacs, or ralpage (sp?) from expo.lcs.mit.edu).
These aren't very user friendly yet, but they get the job
done.

(Incidentally, the CMU Andrew system is one system that gives you
multi-media mail. Something like that might be nice to use--it does
currently require a UNIX workstation and quite a bit of memory, though).

|  -  The doing-away-with-libraries notion sounds intriguing at 
|     first, but appears more naive and silly upon further thought.
|     Libraries do not exist just to provide full employment 
|     opportunities for parasitic librarians.  They provide real 
|     functions, such as allowing researchers access to many more 
|     journals than they could possibly afford individually.  They 
|     also provide an archival function, which is essential in 
|     disciplines that are less ephemeral than computer science.
|     Quick, give five computer-science citations that are over 
|     fifty years old that can still be read with profit.  However, 
|     there are many biological manuscripts that are over a 
|     hundred years old that are still important and useful.  Many 
|     of these involve detailed illustrations (engravings, usually)
|     of anatomical studies that have not been redone, since the 
|     original 19th century work is considered definitive.  Converting
|     these to PostScript format would be costly and tedious, and 
|     would greatly reduce the value of the work.  The archival 
|     problem is perhaps one of the more acute arguments against 
|     electronic publishing.  Quick, give five examples of material
|     written electronically over forty years ago that can still 
|     be read easily with current equipment.  For electronic 
|     publishing to be considered even as a candidate for archival 
|     publishing, there will have to be some possibility that 
|     material published electronically will continue to be 
|     readable for decades, preferably centuries, without need for 
|     periodic copying onto new media in new formats.  

Archival and advice is precisely what you pay electronic publishers
for. It's not an argument against electronic publishing.

I fail to see why an electronic version of a 19th century work
would be any worse than the original. You can always store the
pages as images (with yellowing and all). It won't be long until
everyone can afford 150 or 300 dpi greyscale or color screens
together with the network bandwidth and local storage necessary
to receive and store such images.

|  -  The issue of cost is real, as well, and cannot be dismissed 
|     by doing away with libraries.  Publishers publish to make a 
|     buck, not to provide a free service to the world.  Serious 
|     electronic publishing involves user fees independently of 
|     whether the user is a library or an individual.  Serious 
|     publishing also involves a concern for the rights, economic
|     and otherwise, of the author and publisher.  Sure, you can 
|     always make 50 xerox copies of the latest issue of Cell and 
|     distribute them to your friends, but that will involve a lot
|     of work and expense and you are not likely to do it very 
|     often.  Therefore, the publisher of Cell can set subscription 
|     fees on the assumption that most readers will be looking at 
|     a paid-for copy, not a xeroxed rip off.  With an electronically 
|     delivered journal, say that arrives via email, you can, with 
|     a simple forward command, send 50 or 100 or 1000 copies off 
|     to many friends almost effortlessly.  Therefore, a publisher 
|     working in this medium must assume that most of the readers 
|     will be using non-paid-for copies and the few fools (libraries?)
|     who obtain their copies legitimately must be expected to carry 
|     the full cost for the subscription.

But the costs for electronic publishing are completely different.
Users usually pay for the transmission costs themselves. The costs
to the publisher are editorial and archival. But we know from
USENET that editors and moderators often volunteer or are sponsored
by some organization. Archival services, on the other hand, are not
made unnecessary if users can pass information freely.

Of course, if publishers approach the electronic publishing medium
with the same expectations as publishing on paper, they'll be
disappointed: readers will not pay a premium price for something
they receive electronically. But if the existing publishers don't
understand that, others will move in.

Consider, for example, DIALOG. DIALOG actually has serious restrictions
on re-distribution of data obtained from its data bases, but
I would bet they make very little difference to DIALOG's bottom line.
The reason why people (including myself) are using DIALOG is because
it provides a useful service. It would make little difference to
me if you subscribed to DIALOG and forwarded the result of your
searches to me (on the other hand, if you conducted a search for me,
DIALOG would get its money).

|I could add a few more observations, but I suspect my position is 
|fairly clear.  I believe that the idea that electronic publishing 
|will replace print publishing is about as accurate and astute a 
|prediction as the one made frequently in the fifties that private
|helicopters would replace private automobiles.  Helicopters play 
|many important roles in our society, but providing routine 
|individual transportation is not one of them.  Likewise, computers and 
|electronic communication play important roles in our society, but
|replacing the printed word as the primary medium for scientific 
|communication is not one of them.

I think your prediction is near-sighted. It may take 30 years, it may
take 50 years, but electronic publishing is already replacing printed
publishing. Many significant articles have been made and are being made
available as "pre-prints" electronically.  In some disciplines and/or
areas, paper publishing now often only serves archival purposes and is
for the benefit of those who have no access to the Internet.  Likewise,
electronic publishing standards are emerging slowly.  PostScript and
DVI are a first step. Sure, paper won't go away for a very long time,
but many people will switch over to electronic media in the not-too-distant
future.

------- End of Forwarded Message

tmb@AI.MIT.EDU (Thomas M. Breuel) (08/08/90)

|My comments regarding electronic publishing have produced a quick 
|rejoinder that, presumably inadvertently, was directed only to me
|and not to the net.  In the interest of stimulating even more 
|commentary, I forward this response to the group...

The forwarded message was not intended for the mailing list as a
whole--otherwise, I would have edited it more carefully. Keeping
that in mind, feel free to read it and respond to it, though.

					Thomas.

|[forwarded message deleted]

gilbertd@silver.ucs.indiana.edu (Don Gilbert) (08/08/90)

My two cents on electronic publication:

*  We had a bit of discussion on this a few months back.  The 
Drosophila Information Service journal may still be looking into the
idea.  Graphics are still a hang-up -- there are several "standards" 
but none is standard enough to reach everyone.  I think that any try 
to duplicate a printed journal will need to make the data available 
in several formats and let the user choose which (if any) he can 
read.  Plain ascii text is still the common denominator, but then 
there are many biologists who don't have (or use) computers.   The 
time/cost for setting up such an e-publication is large because of 
the multiple format problem. Contributors will not be able to send 
in standard formats either.  I am used to reading info on video 
monitors rather than paper, and I much prefer formatted/typeset to 
plain text.  Average joe biologist who spends less time in front of 
a computer will require printed or printable copy (this decade 
anyway).  Postscript or Fax printers would suit some as output 
options.  An e-pub for any science discipline that mimics paper pubs 
would have a big problem with the mechanics of formatting, and would 
have a small readership.  But I also think it would be worth a try.

*  I think e-publication should _not_ try to mimic paper 
publication, but look to the currently successful electronic info 
distribution media:  e-mail, netnews and archives.  The combination 
of these three media allows scientists much more free and rapid 
exchange of hypotheses/data/results/discussion than any month-lagged 
paper media.   I've been publishing my software works first and 
mostly exclusively via this ether network for about 5 years now, 
first thru Compuserve and recently Internet. It makes sense for 
software.  I publish a new work by placing it in a public archive or 
two, and sending out notices (abstracts) via public bulletin board 
or network newsgroup.  Typically I will get responses from users 
within a few days which help solve most of the problems that I
missed (the review period).  Then it is easy as pie to re-distribute 
the corrected information (sometimes too easy, leading to hiccup 
updates).  The software (article) sits in the archive and propagates 
itself through the computer net at speeds depending on its 
popularity and usefulness.  Noise or useless articles are self-
restricting.  As people pick the software (article) up from servers, 
some will fire back questions by e-mail, which I dutifully reply to 
(if the program hasn't reached obsolescence yet).

Maybe this method will make sense for dissemination of other science
research as people look at it more.  If someone wants to try now, 
please feel free to drop off articles at Iubio archive, directory 
[archive.receive].  Choice of formats is your own headache (but if 
it's good, someone else will translate it as needed).


Don Gilbert  biocomputing office   / archive for 
gilbertd@iubio.bio.indiana.edu    / molecular & general biology 
biology dept., indiana univ.,    / ftp iubio.bio.indiana.edu  
bloomington, in  47405, usa     / (129.79.1.101) user anonymous 

Don.Gilbert@Iubio.Bio.Indiana.Edu
biocomputing office, indiana univ., bloomington, in 47405, usa

ODONNELL@arcb.afrc.ac.uk (08/08/90)

An interesting discussion - it's a pity that a quirk in the distribution
of bbmessages to the UK sends them in a different order. We see some replies
BEFORE the original....


The archiving problem is of great significance. Electronic backing up
procedures are more expensive and time consuming than storing a printed copy
on a shelf. If you have an admin computer on your network, you will know what
I mean: everything grinds to a halt during their backup.

Electronic media fade more rapidly than other sorts - Indeed some of the
older printed works are outliving modern paper. The latter self-destructs
because of the chemical processes used in its production. The lifetime of
magnetic media is less than 20 years (according to an article in New Scientist
some time back). Backups of LARGE archives will be essential. Look at the
sequence databases - it frightens my network administrators when I tell them
how fast they are growing.

So what about published works that are not appreciated at their time of
publication? Currently they can languish in a library, to be discovered many
years later (eg : Mendel's work in genetics). The fast pace of computing often
says "use it or lose it"; a message we received recently about our network
software library. It has been in operation for only two years.

As one contributor said - electronic publishing has its place. Appropriate
solutions for appropriate data.

Disclaimer: Only my personal views expressed.
*****************************************************************************
Cary O'Donnell			Tel: (+44) 582 762271 ext 226
AFRC Computing Centre		Fax: (+44) 582 761710
West Common			email: ODONNELL@UK.AC.AFRC.ARCB
Harpenden			(Molecular biology support at AFRCCC)
Herts AL5 2JE
U.K.    			(AFRC = Agricultural & Food Research Council)

gilbertd@silver.ucs.indiana.edu (Don Gilbert) (08/08/90)

It's occurred to me that I've just recently seen a good example of how 
electronic publishing works now.  This recent note in bionet.molbio.genbank, 
from Rainer Fuchs of EMBL Data Library, is the abstract and pointer to a full 
article, possibly containing graphics, available in multiple formats, in an 
archive that interested readers can fetch:

(from Rainer Fuchs) -----------------{
 In addition to David Benton's recent posting:
 The features table format description is also available in electronic form via
 the EMBL File Server. It can be obtained by sending a mail message to
 NETSERV@EMBL.BITNET containing one of the following commands:
 GET DOC:FT_DEFINITION.HQX       /* to get a Macintosh Word 4.0 document */
 GET DOC:FT_DEFINITION.RTF       /* to get it in RTF format              */
 GET DOC:FT_DEFINITION.PS        /* to get it in PostScript format       */
 
 (The PostScript document may not print on every laserprinter, due to some
 peculiarities of the Macintosh laserwriter dictionary :-( )
 
 Rainer Fuchs, Ph.D.
 EMBL Data Library
 fuchs@embl.bitnet
-------------------------------------}
Don.Gilbert@Iubio.Bio.Indiana.Edu
biocomputing office, indiana univ., bloomington, in 47405, usa
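
As a rough illustration of the mail-based retrieval described in the
forwarded note, the request could be composed like this. It is only a
sketch: the server address and the GET commands come from the note
itself, while the function name, the format keys, and the sender address
are made up for the example, and whether the file server expects
anything beyond the command in the message body is not stated in the
note. The message is built but not sent.

```python
from email.message import EmailMessage

# Address and commands as quoted in the note above.
FILE_SERVER = "NETSERV@EMBL.BITNET"
FORMATS = {
    "word": "GET DOC:FT_DEFINITION.HQX",        # Macintosh Word 4.0 document
    "rtf": "GET DOC:FT_DEFINITION.RTF",         # RTF format
    "postscript": "GET DOC:FT_DEFINITION.PS",   # PostScript format
}


def build_request(fmt, sender):
    """Return a mail message asking the file server for one format.

    `fmt` is one of the hypothetical keys in FORMATS; `sender` is the
    address the document should be mailed back to.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = FILE_SERVER
    msg.set_content(FORMATS[fmt] + "\n")
    return msg


# "reader@example.edu" is a placeholder, not a real address.
request = build_request("rtf", "reader@example.edu")
print(request)
```

The reader picks whichever format the local equipment can handle, which
is the multiple-format approach suggested earlier in the thread.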

jgsmith@watson.bcm.tmc.edu (James G. Smith) (08/08/90)

This is how I hope/expect to interact with electronic publishing over the
usenet:

As I read thru the bionet.immunology.abstracts newsgroup, I come across one
that I would like the whole text from.  I type "r" (which allows me to email
a response to the poster of the current article).  I then mail the message

send mac

and then I continue looking through the abstracts, knowing that at some time
in the future I will receive the (HyperCard, I expect) Macintosh version of
that paper in email.

The various things that go on in the background (perhaps including the checking
to make sure I have a valid paid subscription to that server) I would be
unaware of.

How difficult would it be to set up a moderated newsgroup and server to do 
this?  I think the toughest part would be some sort of review process for
submitted papers.  
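
To give a feel for the server side of the exchange described above, here
is a minimal sketch of the decision such a server might make when a
"send mac" reply arrives. Everything in it is assumed for illustration:
the subscriber list, the article identifier, the stored paper, and the
function name are hypothetical, and a real service would also need mail
handling, billing, and the review process mentioned above.

```python
# Hypothetical data: paid subscribers, and papers keyed by (article id, format).
SUBSCRIBERS = {"reader@example.edu"}
PAPERS = {("abstract-17", "mac"): "<Macintosh version of the paper>"}


def handle_reply(sender, article_id, body):
    """Interpret a reply to a posted abstract and decide what to mail back."""
    words = body.strip().lower().split()
    if len(words) != 2 or words[0] != "send":
        return "Unrecognized request; expected 'send <format>'."
    # The subscription check the reader never sees.
    if sender not in SUBSCRIBERS:
        return "No valid subscription found for " + sender + "."
    paper = PAPERS.get((article_id, words[1]))
    if paper is None:
        return "Format '" + words[1] + "' is not available for " + article_id + "."
    return paper


print(handle_reply("reader@example.edu", "abstract-17", "send mac"))
```

The lookup itself is small; as noted above, the harder parts are likely
to be the review process and keeping the subscription records honest.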

Finally, a comment on archiving.  What's the half life of information on CD-ROM,
or WORM drives?

*

kristoff@genbank.BIO.NET (David Kristofferson) (08/09/90)

Glad to see that some interest has been generated on the newsgroup.
Don't forget that much more than electronic publishing (only one
session) was discussed at the MATRIX meeting, so I hope that other
participants will add their comments on their topic.

> An interesting discussion - it's a pity that a quirk in the distribution
> of bbmessages to the UK sends them in a different order. We see some replies
> BEFORE the original....

I believe that some mail queues operate on the principle of letting
smaller mail messages jump ahead in the queue.  That may be why the
order is sometimes altered.
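
The reordering effect can be reproduced with a toy queue that always
releases the smallest waiting message first. This is only a sketch of
the principle suggested above, not a description of any particular
mailer; the class name and the sample messages are invented.

```python
import heapq
from itertools import count


class SizeOrderedQueue:
    """Toy mail queue: the smallest waiting message is delivered first."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker so equal sizes keep arrival order

    def put(self, message):
        heapq.heappush(self._heap, (len(message), next(self._arrival), message))

    def get(self):
        return heapq.heappop(self._heap)[2]


queue = SizeOrderedQueue()
original = "A long original posting. " * 40
reply = "A short reply."
queue.put(original)   # queued first
queue.put(reply)      # queued second
delivered = [queue.get(), queue.get()]
print(delivered[0])   # the short reply comes out ahead of the original
```

If both messages are waiting at the same time, the reply is delivered
before the posting it answers, which matches what the UK readers saw.
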
-- 
				Sincerely,

				Dave Kristofferson
				GenBank On-line Service Manager

				kristoff@genbank.bio.net

rrobbins@NSF.GOV ("Robert J. Robbins") (08/09/90)

With regard to archiving, James G. Smith asks: 

> What's the half life of information on CD-ROM, or WORM drives?

A more relevant question is, "What's the half life of a particular 
format for a CD-ROM?"  At the moment, the answer seems to lie 
somewhere between months and years, and certainly does not exceed 
a decade.  The problem with archiving is twofold: (1) storing 
the information on a medium with a long shelf life, and (2) storing 
the information on a medium, in a format that will be readable 
by current hardware n years in the future.  To compete with the 
archiving capabilities of print, the value of n must reach into 
the centuries.