[bionet.molbio.genome-program] GenBank and FASTA

kristoff@genbank.BIO.NET (David Kristofferson) (03/23/90)

Two clarifications in the database distribution debate:

I'd simply like to point out that in my original posting I
**supported** Professor Roe's suggestion of distributed computing and
distributed databases when there is a need for it.  The debate veered
into the central-vs-local discussion (term from Roy Smith's posting),
probably because of excessive focus on my suggestion that, if the
main use was for FASTA searching, users ought to consider just using
the GOS FASTA server instead of worrying about updating their local
database daily.  In fact, I still maintain that some of
the local uses suggested (to me in public and private mail) are *not*
as critically dependent on daily updates as is FASTA searching.
Nonetheless, at the present data flow rates, daily updates at local
sites are easy to do and will be done.

I am also sure that technology will continue to advance and provide
increasingly feasible ways of distributing data as the database
grows in size.  Having dealt with hundreds of biologists on BIONET,
though, I would just like for us to remember that the biological
community so far has not shown a desire to pay much for the latest
technological solutions and tends to lag behind in learning their use.
I will not be surprised if the data start growing faster than many
people's ability to handle them, even though the technical solutions
exist.  It will be the larger centers (NOTE: plural, not singular) that
will have the trained personnel to handle these issues first.  Only
later will it spread to other sites.

I'm still amused by a statement that I saw concerning a certain
central computing resource which stated that in five years everyone
would have a certain type of hardware on their desktop, so central
facilities were no longer necessary.  Unfortunately, no transition
plan was provided for how one would get from the present to the
promised land.  But I guess that my father learned how to swim when
his father (a Coast Guard officer) threw him off the end of the
Chicago pier into Lake Michigan, so perhaps transition plans aren't
needed after all.  Why not keep life exciting?  

Let's not forget the human part of the equation when we wax poetic
about technology.  Spending several years *just trying to get people
to use e-mail and newsgroups*, not to mention teaching them the more
difficult art of sequence analysis, has made that very clear to me
personally.

Neither should one conclude that I was suggesting that FASTA would be
used forever.  I am aware of many different attempts to improve the
speed of database similarity searches involving both hardware (faster
specialized processors and parallel machines) and software
improvements.  During the operation of BIONET and afterwards we have
been involved in investigating these areas ourselves.  For now,
however, FASTA is familiar to the biological community and has become
a standard technique which delivers decent results in a reasonable
time.  Advances will, of course, continue, so I'm not going to take
bets on how long this will remain true.  When something better comes
out, GOS will react accordingly if it is within our financial means.
-- 
				Sincerely,

				Dave Kristofferson
				GenBank On-line Service Manager

				kristoff@genbank.bio.net