[bionet.molbio.genbank] Distributing GenBank with CD ROM

phil@WUBIOS.WUSTL.EDU (J. Philip Miller) (12/12/89)

> >they are much slower than magnetic hard disks.  Also, I'm not sure that
> >CD-ROM is really practical yet.  Maybe in a couple of years, but it's still
> >pretty much of a specialty item today.
> 
> No argument about getting the physical object from point A to point B,
> but Release 62 (in the works) is going to require 5 reels @ 1600bpi
> (yes, there are sites that want GenBank and cannot receive 6250) and
> Rel 63 (March, the next release on floppies) will probably require
> something like 125 360-kb floppies.  Since we are distributing more
> than 100 copies in the 360-kb density, I don't think we can
> morally stop the pain of mastering, packaging, and shipping those
> floppy releases until we provide a viable alternative.  (And even
> if our morals would let us, the GenBank Project Officer would remind
> us of our contractual obligation to release on floppies.  We're hoping
> the floppy release (at least on the LD disks) will die a natural
> death after the CD ROMs are available.)
> 
I think it may be useful to throw some numbers in here that I have been
quoted.  First the drives to read the CD ROMS - they are available for the PC
platform under $1,000 including controller, software etc.  For Unix boxes they
are available for $1,000-$1,500 including software.  I do not recall the
prices for VAXen, but DEC is now distributing much of its software on CD
ROM so is pushing their use.  Mastering (including the first 100 copies) runs
from $5,000 - $10,000 depending on how much indexing and other preprocessing
is needed.  Pressings are in the $1-$2/copy range and mailing is MUCH less
than any other media.  Thus if there are only 50-100 copies to be distributed,
we are already in the same ballpark as with 1600 bpi tapes and 360k floppies.
The costs can only become more favorable as the volume goes up.  Even if folks
want their own private copies only for status reasons, they will go for the CD
ROM approach since that will give them a way of justifying having a CD player
in their office to listen to their favorite music :-)
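
To make the arithmetic behind these figures concrete, here is a rough
sketch of the per-copy cost of a CD ROM release.  It uses only the quoted
ranges above (mastering $5,000-$10,000 including the first 100 copies,
further pressings at $1-$2 each).  The function name and the sample copy
counts are illustrative assumptions, and drive and mailing costs are left
out.

```python
def cd_cost_per_copy(copies, mastering, pressing, included=100):
    """Per-copy cost when the mastering fee covers the first `included` copies.

    `mastering` and `pressing` are dollar figures taken from the quoted
    ranges; everything else (names, defaults) is hypothetical.
    """
    if copies <= 0:
        raise ValueError("copies must be positive")
    extra = max(0, copies - included)  # copies beyond those bundled with mastering
    return (mastering + extra * pressing) / copies


if __name__ == "__main__":
    # Sample run sizes are assumptions chosen only to show the trend.
    for copies in (50, 100, 500, 1000):
        low = cd_cost_per_copy(copies, 5_000, 1.0)
        high = cd_cost_per_copy(copies, 10_000, 2.0)
        print(f"{copies:5d} copies: ${low:,.2f} - ${high:,.2f} per copy")
```

Under these assumptions the cost per copy falls quickly once the run
exceeds the 100 copies bundled with mastering, which is the sense in which
the costs become more favorable as the volume goes up.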

-phil

-- 
     J. Philip Miller, Professor, Division of Biostatistics, Box 8067
	 Washington University Medical School, St. Louis MO 63110
phil@wubios.WUstl.edu - Internet  (314) 362-3617   phil@wubios.wustl - bitnet
uunet!wucs1!wubios!phil - UUCP              C90562JM@WUVMD - alternate bitnet

roy@phri.nyu.edu (Roy Smith) (12/13/89)

	The numbers David Benton gives regarding the genbank distributions
are pretty horrifying.  The whole concept of getting a stack of 125 floppies
twice a year is beyond belief.  Not to mention cutting 12,500 of them!
There is no doubt that this is a totally untenable way to distribute this
information.  Five 1600 BPI tapes aren't much better.  Not to mention that if
it's up to 5 reels at 1600, how much longer can I, with my 6250 drive, be
so smug about only having to deal with a single reel?  March 90?  June 90
at the latest?  David hopes that CD will make the 360k floppy distribution
obsolete, but is there really much hope that people who haven't managed to
get 1.4 Meg (or even 720k) floppy drives yet will shell out $800 or so for
a CD drive?  Hopefully the Project Officer can be convinced that scheduling
an end to the 360k distribution will simply be a good way to give some
folks some incentive to upgrade their hardware.  Hand in hand with this, of
course, would be a commitment from NIH to approve grants for said upgrades.
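
	As a back-of-the-envelope check on those media counts, the sketch
below redoes the arithmetic.  The 125-floppy and 100-copy figures are the
approximate ones quoted above; 720 KB and 1440 KB are the usual formatted
capacities of the higher-density floppies; and the 6250 BPI reel estimate
assumes equal-length reels and ignores inter-record gaps and blocking, so
treat it as a rough guess rather than a real tape count.

```python
FLOPPIES_PER_SET = 125   # "something like 125" 360-kb floppies per release
COPIES = 100             # "more than 100 copies", so this is a lower bound
KB_PER_LD_FLOPPY = 360


def disks_needed(release_kb, disk_kb):
    """Number of disks of size `disk_kb` needed to hold `release_kb` (ceiling)."""
    return -(-release_kb // disk_kb)


total_floppies = FLOPPIES_PER_SET * COPIES          # disks to cut per release
release_kb = FLOPPIES_PER_SET * KB_PER_LD_FLOPPY    # rough size of one release

# Assumption: capacity scales with recording density on same-length reels.
reels_at_6250 = 5 * 1600 / 6250

if __name__ == "__main__":
    print("floppies to cut per release:", total_floppies)
    print("release size (KB):", release_kb)
    print("720k disks per set:", disks_needed(release_kb, 720))
    print("1.4 Meg disks per set:", disks_needed(release_kb, 1440))
    print("approx. reels at 6250 BPI:", round(reels_at_6250, 2))
```

	Even on the denser floppies a set is still dozens of disks, and the
6250 estimate lands a little over one reel.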

	I had no idea that 1-day turnaround was possible for CDs.  I was
under the impression that it was more like several weeks; I guess I haven't
been keeping up with the technology.  Given that datum, and the pricing
figures supplied by J. Philip Miller, I have to change my opinion about CDs
as a distribution medium.  It certainly seems like it beats magtape and
absolutely puts floppies to shame.  I notice that genbank is now available
on TK-50s.  While I suppose that was a necessity, my feeling on the issue
is that it simply panders to DEC's insistence on introducing proprietary
tape formats totally incompatible with accepted industry standards.  If
anybody wants to hear more on that subject, write me privately; most of my
thoughts on the TK-50 are best not aired in public.

	I understand the need to be careful about the design of CD file
formats to work with (or is that around?) the timing characteristics of CD
drives.  I just hope that some way is found to make the raw data on the
disk available to people who want it in essentially the same form as it is
now.

	In the past few days, I guess I've thrown a lot of criticism at the
genbank folks.  I know I've certainly done so in the past.  Lest anybody
get the wrong idea, I should state that I think the IG folks are doing a
pretty good job.  Certainly, the state of genbank has improved since IG
took over the contract from BBN.  Not to mention that the database has
grown so much since then that the job is that much harder.
Everything possible should be done to make life easier on the keepers of
the sacred knowledge.  It should be a condition of accepting an NIH or NSF
grant that any sequences produced *must* be submitted to genbank in
machine-readable form.  Make it part of the stock "administrative assurances"
section of the application.  Make it a check-off box on the front page of
R01s along with the animal welfare and human subjects, etc.  Make it part
of the standard instructions to reviewers on study sections to do genbank
searches to see if the PI has been naughty or nice.  Some way should also
be found to twist journals' arms and make them refuse to accept manuscripts
if there is any sequence data that hasn't been submitted to genbank.  I
find it absolutely impossible to believe that there is a single lab in the
US (or, for that matter, most of the world) doing sequencing that doesn't
at least have a PC clone with a 360k floppy drive.  What's so hard about
copying the sequence to a floppy and dropping it in the mail?

	Quick informal poll.  How many of you have, in say, the past 6
months, requested that somebody send you a sequence and received a printout
of the sequence on paper!?  Ah, but I'm preaching to the converted.
--
Roy Smith, Public Health Research Institute
455 First Avenue, New York, NY 10016
roy@alanine.phri.nyu.edu -OR- {att,philabs,cmcl2,rutgers,hombre}!phri!roy
"My karma ran over my dogma"