briscoe-duke@CS.Yale.EDU (Duke Briscoe) (06/01/89)
We have recently had some very interesting articles on this newsgroup discussing the topics of information retrieval and applications of AI. Here are a few of my thoughts on these subjects. My background is that I am a computer science graduate student who was an undergraduate biology major, and I had several years' experience in between related to expert systems, working on a medical expert system myself and observing others working on expert systems for both scientific and other domains.

I think the emphasis on development of systems to support research (in biology, other sciences, and computer science) needs to be on systems which blend the work of man and machine. When an AI system reaches some stumbling block, a person should be able to discover the context of what the machine is doing, and possibly choose a path for the machine to continue work on. In many cases, perhaps the AI capabilities should be focused on very specific tasks, and the overall system will depend on the person to coordinate the overall activities. This requires development of appropriate user interfaces.

Actually, the greatest benefits from computers that are possible in the next decade probably do not need any AI components at all. I think a multi-media hypertext system distributed over the network would have the greatest impact. This would depend on having a system which would allow people to filter what they read based on the choices of respected editors and recommendations by other chosen colleagues. Working with hypertext would greatly improve the efficiency of searching the literature, since it would make it easier to keep up with what other people think are important results. A reader could also quickly access references; the references could point directly to relevant paragraphs. Also, references could be added which would point forward to more recent work instead of being limited to pointing at past work.
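To make the idea of paragraph-level references that point both backward and forward a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the names `Anchor`, `LinkIndex`, `references_from`, and `later_work_on`, as well as the document ids, are hypothetical and do not describe any existing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Anchor:
    """A location in the literature: a document id plus a paragraph number.

    Both fields are hypothetical; a real system might address sentences,
    figures, or any other textual unit instead.
    """
    doc_id: str
    paragraph: int


class LinkIndex:
    """Stores each reference once, but makes it followable in both directions."""

    def __init__(self):
        self._cites = {}      # source anchor -> set of anchors it refers to
        self._cited_by = {}   # target anchor -> set of anchors referring to it

    def add_reference(self, source, target):
        """Record that `source` refers to `target` (normally earlier work)."""
        self._cites.setdefault(source, set()).add(target)
        self._cited_by.setdefault(target, set()).add(source)

    def references_from(self, anchor):
        """Backward links: the passages this passage points at."""
        return sorted(self._cites.get(anchor, set()))

    def later_work_on(self, anchor):
        """Forward links: the passages that point at this one."""
        return sorted(self._cited_by.get(anchor, set()))


# Hypothetical example: a 1989 paper cites paragraph 4 of a 1987 paper.
index = LinkIndex()
old = Anchor("paper-1987", 4)
new = Anchor("paper-1989", 2)
index.add_reference(new, old)
```

With this index, a reader sitting at paragraph 4 of the older paper could ask `index.later_work_on(old)` and be led forward to the newer paper, while `index.references_from(new)` gives the usual backward citation. Keeping the reverse map up to date at the moment a reference is added is the design choice that makes forward pointers cheap to look up.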
After electronically publishing a document, corrections and clarifications could be made in response to readers' comments. Because of the use of hypertext, which is a net-like style instead of a linear style of writing, side discussions could be attached to individual sentences, paragraphs, figures, or whatever textual unit is desired, without interrupting the main linear thread of the paper.

Also, hypertext documents could be active documents. Films or animation could be attached to the texts, and also computer programs. If a computer program was used to produce some of the results of the paper, that program, along with the data sets, could be referenced by the paper. A reader could choose to run that program on another set of data if desired.

What are the obstacles to development of such systems? Software development would take several years, but probably even more time would be needed to overcome societal barriers. A few weeks ago, kristoff@NET.BIO.NET (David Kristofferson) wrote:

> The odds of rewriting all of the biological literature or that
> of any other discipline using a standard nomenclature are
> obviously zero. Nonetheless the National Library of Medicine
> is attempting to use medical subject headings (MESH terms) in
> their cataloging of the literature. Searching can then be
> performed using this kind of standardized vocabulary. However,
> one still faces the need to foot it over to the library to
> retrieve the text. If journals began publishing
> electronically, one could simply call this up on one's computer
> screen (simple character based terminals would, of course, be
> at a loss here for lack of graphics capabilities). One then
> gets into questions of copyright law, loss of subscription
> money to publishers, etc., etc. This is not a trivial problem.
> The building of libraries to house ever expanding shelves of
> journals still seems to be the preferred route. Nonetheless, I
> believe that this too will come to pass although it may take a
> few decades. It's very useful but not glamorous work.
> Nonetheless making literature available on-line as above would
> probably do more to help the progress of biology than most
> research projects do.

Questions such as copyright, formal academic credit for contributions to "hyperjournals", and funding for the network and associated software would have to be solved. But these problems could probably be solved if government could be convinced of the value of such a change in the style of information retrieval.

Copyright problems could probably be solved by the government legislating rights for itself or the authors to retain electronic publishing rights even if the material is also published by paper publishers. Also, some fair system of assigning copyright fees for electronic use of previously published items could probably be developed. Academic recognition for publishing in "hyperjournals" would probably depend on the development of an appreciation of the value of hyperjournals.

It looks like the government really has to be closely involved with this kind of thing. It seems like it would fit in with the usual calls for "improving American competitiveness." The applicability of this technology goes far beyond biology, so it would make sense for development of such a system to be coordinated between disciplines, sharing costs and avoiding development of several incompatible systems.

The next few years should show a growing appreciation of the ideas of hypertext, and the value of networks. At some point, the government will have to get involved.

Duke
Kristofferson@BIONET-20.BIO.NET (David Kristofferson) (06/01/89)
Duke,

Very interesting points! Your idea on networked hypertext sounds like an excellent possibility for an RFA or RFP from the NIH/NSF. Hope that someone there was listening 8-)!!

Dave
briscoe-duke@CS.Yale.EDU (Duke Briscoe) (06/02/89)
This is a follow-up message to my own message yesterday.

Using hypertext to link related material together is similar to the way in which semantic nets or knowledge bases are built for use by expert systems. A problem in developing expert systems has been that the knowledge and inference rules have to be specified in excruciating detail in order for a computer to be able to reliably use them. My view of the utility of hypertext is that the body of knowledge would gradually become interlinked with references, and that people could navigate through that hypertext medium, acting as much more flexible and capable inference engines than any AI systems currently conceived.

The body of knowledge could evolve, with knowledge added and also with some ideas being pushed further into the background if they do not prove useful. The use of networks and a distributed hypertext medium could allow ideas to widely propagate within a matter of hours, instead of a matter of months as is typical now. Also the medium could serve as a more global indexing of information.

As my previous message stated, the hypertext medium could have active components, so that in areas of knowledge where there was greater automation, a researcher could call on the available programs as part of the process of navigating the hypertext.

I hope this clarifies some aspects of my perhaps overly verbose message yesterday.
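As a rough illustration of an interlinked body of knowledge that a person navigates, the links can be treated as a graph and explored a few hops at a time. The sketch below is in Python and rests on assumptions: the function name `reachable_within` and the sample node names are made up for this example, and a breadth-first search is simply one plausible way to list what lies near a starting document.

```python
from collections import deque


def reachable_within(links, start, max_hops):
    """Return {node: hops} for every node reachable from `start`
    by following at most `max_hops` links (breadth-first search)."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in links.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


# Hypothetical link structure: papers pointing at other papers, and at
# "active" components such as a program or a data set.
links = {
    "review-article": ["sequence-paper", "structure-paper"],
    "sequence-paper": ["alignment-program"],
    "structure-paper": ["alignment-program", "dataset"],
    "alignment-program": [],
}

nearby = reachable_within(links, "review-article", 1)
wider = reachable_within(links, "review-article", 2)
```

Here `nearby` holds only the two papers directly linked from the review, while `wider` also reaches the program and the data set. The person, not the program, still decides which of those paths is worth following.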