[bionet.molbio.bio-matrix] Computerizing the literature

kristoff@NET.BIO.NET (David Kristofferson) (05/24/89)

> Formalizing the AI of biology --
> and theorizing about reasoning on biological principles -- well,
> I wouldn't want to point fingers or anything, but at this stage I
> would much rather work with a few dozen system experts than a few
> dozen expert systems.  More fun, too.

You are not the first person to express these sentiments.  I am sure
that this debate on AI will continue to rage.

Also, to respond to Dan Davison's last message, I fully appreciate that
the goal of the BIO-MATRIX project is not to reason from first
principles, etc.  My point was that in most presentations of the
matrix concept that I have seen/heard to date, the monologue starts
out by talking about how wonderful it is that physicists can reason
from first principles, etc., etc.  This tends to go on for more time
than necessary.  In my experience, as soon as you start talking about
such things to a biological audience, people start tuning you out as
being completely impractical.  Later in the matrix presentation, one
gets to the point about reasoning by analogy, improving biological
databases, etc., which are all points that biologists would agree
with.  My question is simply: How many practicing biologists are still
listening by the time this point is brought up?  Again, I believe that
the standard Matrix description should "drop the physics talk," as I
said earlier, and get to the point of the project more directly.

> So what will it take to siphon mainstream biological literature
> into useful computer form?  Perhaps there is some way I can help.

I have heard several people suggest that this should be a major focus
at the NLM.  Can someone from there comment on what is being done
toward this end?  This is an extremely important topic in my mind.

At BIONET I have taken a *very* small step in starting to introduce
journal editors to the use of electronic mail and bulletin boards and
in then getting them to submit their journals' Tables of Contents in
advance of hardcopy publication.  Perhaps next we can get the
abstracts on-line (although I keep being told that this entails more
expense for them).

Of course, this effort is NOT considered *RESEARCH* (everyone bow with
face to the ground) but is considered to be more worthy of assignment
to a corps of thousands of faceless data entry clerks.  There have
been some cynics out there who have been less than supportive of these
efforts.  If one tries to get grant money for this type of "service
work," one gets told that grants are for RESEARCH.  If one tries to get
the funding agencies to put out contracts, one finds out that this
takes years to organize!  Meanwhile, the biological community continues
to limp along.  Steve Jobs will probably not come to your rescue,
because biologists are still rather poor.  It is unlikely that a
private company is going to make the investment needed to perform
tasks of this magnitude when (a) there is a high risk that the
investment may never be recovered and (b) the typical academic
biologist, never having worked in industry, doesn't fully appreciate
the costs involved and would rather try to do a band-aid job at best
by paying slave-labor wages to grad students who will do the project
during centrifuge runs.  (Industry is that place of the "wailing and
gnashing of teeth," as it was portrayed to me before I "defected" ...
best decision of my life, by the way!)  Given this situation, ye all
shall reap what is sown.  Will the funding agencies and the community
get together on
"service topics" like this?  How long, Oh Lord, how long ...???
-- 
				Sincerely,

				Dave Kristofferson
				BIONET Resource Manager

				kristoff@net.bio.net
			     or	kristofferson@bionet-20.bio.net

kristoff@NET.BIO.NET (Dave Kristofferson) (05/26/89)

> We are not just
> after a big bag of data, we want to explain how things work.  At a
> minimum, the Bio-Matrix is a tool to facilitate discovery of biological
> principles.
> 
...
> Some of the important issues are:
> 
>   How do we enter the last couple centuries of information?  

Chris,

	If I read these two statements correctly, you do not want to go
after a "big bag of data," which is exactly what entering the last
couple of centuries of information would produce.  On the other hand,
being selective in what is entered would presuppose some higher-order
knowledge about what is "correct" in biology.  If the Matrix project
hopes to discover new knowledge by drawing analogies between disparate
databases, one runs the risk of throwing the baby out with the
bathwater by being "selective" during the entry process.  I still
think the Matrix goals are stated in a fashion that makes them
unachievable by mere mortals, although later in your message you
acknowledge the need for looking at "subdomains" in biological
knowledge, which is indeed reasonable.

.
. No immediate solutions to the monumental problems deleted here 8-)!
.

>   One place almost everyone agrees we should start is by integrating
>   existing databases.  The ISIS project, which has integrated 6 or 7
>   molecular biology databases into a monolithic whole, is one such
>   approach to this problem.  What are other approaches to integrating
>   multiple, heterogeneous databases?

More details about ISIS might be of interest to the general
readership. 

Dave Kristofferson