[bionet.molbio.bio-matrix] electronic journals, use of AI

briscoe-duke@CS.Yale.EDU (Duke Briscoe) (06/01/89)

We have recently had some very interesting articles on this
newsgroup discussing the topics of information retrieval and
applications of AI.  Here are a few of my thoughts on these
subjects.  By way of background, I am a computer science graduate
student who was an undergraduate biology major, and in between I
had several years' experience with expert systems, working on a
medical expert system myself and observing others working on
expert systems for both scientific and other domains.

I think the emphasis on development of systems to support
research (in biology, other sciences, and computer science) needs
to be on systems which blend the work of man and machine.  When
an AI system reaches some stumbling block, a person should be
able to discover the context of what the machine is doing, and
possibly choose a path for the machine to continue work on.  In
many cases, perhaps the AI capabilities should be focused on very
specific tasks, with the system as a whole depending on the
person to coordinate activities.  This requires development of
appropriate user interfaces.

Actually, the greatest benefits computers can deliver in the
next decade probably do not need any AI components at all.
I think a multi-media hypertext system distributed over the
network would have the greatest impact.  This would depend on
having a system which would allow people to filter what they read
based on the choices of respected editors and recommendations by
other chosen colleagues.  Working with hypertext would greatly
improve the efficiency of searching the literature since it would
make it easier to keep up with what other people think are
important results.  A reader could also quickly access
references; the references could directly point to relevant
paragraphs.  Also, references could be added which would point
forward to more recent work instead of being limited to pointing
at past work.  After electronically publishing a document,
corrections and clarifications could be made in response to
readers' comments.  Because hypertext is a net-like rather than
a linear style of writing, side
discussions could be attached to individual sentences,
paragraphs, figures, or whatever textual unit desired, without
interrupting the main linear thread of the paper.  Also,
hypertext documents could be active documents.  Films,
animation, and computer programs could be attached to the
texts.  If a computer program was used to produce some of the
results of the paper, that program, along with the data sets,
could be referenced by the paper.  A reader could choose to run
that program on another set of data if desired.
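
The pieces described above (filtering by trusted recommenders,
references that point at specific paragraphs, forward references,
and side discussions attached to a unit of text) can be sketched
as a small data model.  This is only an illustration under stated
assumptions: the class names, the document ids "smith88" and
"jones89", and the recommender "editor_a" are all hypothetical,
and a real distributed system would need storage, naming, and
networking that are not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """An addressable piece of a paper: a paragraph, sentence, or figure."""
    unit_id: str
    text: str
    notes: list = field(default_factory=list)  # side discussions attached here


@dataclass
class Document:
    doc_id: str
    year: int
    units: dict = field(default_factory=dict)          # unit_id -> Unit
    cites: list = field(default_factory=list)          # (doc_id, unit_id) targets
    recommended_by: set = field(default_factory=set)   # editors or colleagues


class Library:
    """Hypothetical in-memory stand-in for the networked collection."""

    def __init__(self):
        self.docs = {}

    def add(self, doc):
        self.docs[doc.doc_id] = doc

    def forward_references(self, doc_id):
        """Documents that cite doc_id, i.e. links pointing to more recent work."""
        return sorted(d.doc_id for d in self.docs.values()
                      if any(target == doc_id for target, _ in d.cites))

    def reading_list(self, trusted):
        """Documents recommended by at least one person the reader trusts."""
        return sorted(d.doc_id for d in self.docs.values()
                      if d.recommended_by & set(trusted))

    def annotate(self, doc_id, unit_id, comment):
        """Attach a side discussion to one unit without altering its text."""
        self.docs[doc_id].units[unit_id].notes.append(comment)


def build_demo():
    """Two made-up papers: the newer one cites paragraph p3 of the older."""
    lib = Library()
    lib.add(Document("smith88", 1988,
                     units={"p3": Unit("p3", "Original result.")}))
    lib.add(Document("jones89", 1989, cites=[("smith88", "p3")],
                     recommended_by={"editor_a"}))
    lib.annotate("smith88", "p3", "Corrected in jones89.")
    return lib
```

Forward references are computed here by scanning the citations of
every document, which is the simplest choice for a sketch; a
system of any size would presumably keep a reverse index instead.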

What are the obstacles to development of such systems?  Software
development would take several years, but probably even more time
would be needed to overcome societal barriers.  A few weeks ago,
kristoff@NET.BIO.NET (David Kristofferson) wrote:

> The odds of rewriting all of the biological literature or that
> of any other discipline using a standard nomenclature are
> obviously zero.  Nonetheless the National Library of Medicine
> is attempting to use medical subject headings (MESH terms) in
> their cataloging of the literature.  Searching can then be
> performed using this kind of standardized vocabulary.  However,
> one still faces the need to foot it over to the library to
> retrieve the text.  If journals began publishing
> electronically, one could simply call this up on one's computer
> screen (simple character based terminals would, of course, be
> at a loss here for lack of graphics capabilities).  One then
> gets into questions of copyright law, loss of subscription
> money to publishers, etc., etc.  This is not a trivial problem.
> The building of libraries to house ever expanding shelves of
> journals still seems to be the preferred route.  Nonetheless, I
> believe that this too will come to pass although it may take a
> few decades.  It's very useful but not glamorous work.
> Nonetheless making literature available on-line as above would
> probably do more to help the progress of biology than most
> research projects do.

Questions such as copyright, formal academic credit for
contributions to "hyperjournals", and funding for the network and
associated software would have to be resolved.  But these problems
could probably be solved if government could be convinced of the
value of such a change in the style of information retrieval.
Copyright problems could probably be solved by the government
legislating rights for itself or the authors to retain electronic
publishing rights even if the material is also published by paper
publishers.  Also, some fair system of assigning copyright fees
for electronic use of previously published items could probably
be developed.  Academic recognition for publishing in
"hyperjournals" would probably depend on the development of an
appreciation of the value of hyperjournals.

It looks like the government really has to be closely involved
with this kind of thing.  It seems like it would fit in with the
usual calls for "improving American competitiveness."  The
applicability of this technology goes far beyond biology, so it
would make sense for development of such a system to be
coordinated between disciplines, sharing costs and avoiding
development of several incompatible systems.

The next few years should show a growing appreciation of the
ideas of hypertext, and the value of networks.  At some point,
the government will have to get involved.

Duke

Kristofferson@BIONET-20.BIO.NET (David Kristofferson) (06/01/89)

Duke,

	Very interesting points!  Your idea on networked hypertext
sounds like an excellent possibility for an RFA or RFP from the
NIH/NSF.  Hope that someone there was listening 8-)!!

Dave
-------

briscoe-duke@CS.Yale.EDU (Duke Briscoe) (06/02/89)

This is a follow-up message to my own message yesterday.

Using hypertext to link related material together is similar to
the way in which semantic nets or knowledge bases are built for
use by expert systems.  A problem in developing expert systems
has been that the knowledge and inference rules have to be
specified in excruciating detail in order for a computer to be
able to reliably use them.  My view of the utility of hypertext
is that the body of knowledge would gradually become interlinked
with references, and that people could navigate through that
hypertext medium, acting as much more flexible and capable
inference engines than any AI systems currently conceived.  The
body of knowledge could evolve, with knowledge added and also
with some ideas being pushed further into the background if they
do not prove useful.  The use of networks and a distributed
hypertext medium could allow ideas to propagate widely within a
matter of hours, instead of a matter of months as is typical now.
Also the medium could serve as a more global indexing of
information.  As my previous message stated, the hypertext medium
could have active components, so that in areas of knowledge where
there was greater automation, a researcher could call on the
available programs as part of the process of navigating the
hypertext.
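
As a rough illustration of navigating such a medium, the links
can be treated as a directed graph that a reader explores a few
hops at a time, with programs optionally attached to nodes as the
"active components".  Everything named here is an assumption made
for the example: the node names in LINKS, the G+C counting
program in PROGRAMS, and the helper functions themselves.

```python
from collections import deque


def reachable(links, start, max_hops):
    """Breadth-first walk: map each node to its hop count from start."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not follow links beyond the hop limit
        for nxt in links.get(node, []):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops


def run_active(programs, node, data):
    """Run the program attached to a node, if any, on the reader's own data."""
    program = programs.get(node)
    return None if program is None else program(data)


# Hypothetical link structure and one attached program (counts G and C bases).
LINKS = {"review": ["paper_a", "paper_b"], "paper_a": ["dataset"], "paper_b": []}
PROGRAMS = {"paper_a": lambda seq: seq.count("G") + seq.count("C")}
```

The person remains the inference engine in this picture: the code
only reports what is nearby and runs what is attached, and the
reader decides which link to follow next.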

I hope this clarifies some aspects of my perhaps overly verbose
message yesterday.