[bionet.molbio.bio-matrix] Summary of current research

KARP@SUMEX-AIM.STANFORD.EDU (Peter Karp) (10/27/88)

The following summarizes my current research in Artificial Intelligence
in the Stanford Computer Science Department.  I am a PhD student working
in the areas of Machine Learning and Qualitative Reasoning.  My thesis
work involves building a computer system to reproduce some of the
reasoning used by Dr. Charles Yanofsky and other workers in discovering
the bacterial gene regulation mechanism called attenuation.  The work
has three major components.

First, I performed a historical study of Yanofsky's research to
understand what reasoning processes biologists used to make these
discoveries.  This study includes a comprehensive survey of the
literature and many interviews with Yanofsky, his co-workers, and
researchers in other laboratories.

Second, I constructed a knowledge-based simulation system which models
the E. coli tryptophan operon (the system in which attenuation was
studied).  This system embodies a theory of the trp operon that can
predict the results of experiments on it.  Its knowledge base is
currently small (roughly 200 objects), but
constructing it provided important lessons in the representation of
biological knowledge, and raised a number of issues which must be
addressed in the construction of large knowledge bases.  The system
includes models of entities such as the trp operon, and of processes such
as transcription, translation, and the biosynthesis of tryptophan.
These models are fairly abstract, e.g., they contain no DNA or protein
sequence information but represent DNA functional units such as
operators, promoters and ribosome binding sites.  The system was
constructed on a Xerox Lisp machine running Interlisp using
Intellicorp's KEE frame (object-oriented) knowledge representation
tool.
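
For readers unfamiliar with frame systems, here is a rough illustration
of the idea in Python.  It is not the KEE knowledge base: the Frame
class, the slot names, and the single repression rule are hypothetical
and were chosen only to show slot inheritance and a qualitative
prediction.  Attenuation itself is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """A minimal frame: a named object with slots and an optional parent."""
    name: str
    parent: "Frame | None" = None
    slots: dict = field(default_factory=dict)

    def get(self, slot):
        # Look for the slot locally, then walk up the parent chain.
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(slot)


# Hypothetical frames for DNA functional units (no sequence information).
dna_site = Frame("dna-site", slots={"kind": "DNA functional unit"})
promoter = Frame("trp-promoter", dna_site, {"binds": "RNA polymerase"})
operator = Frame("trp-operator", dna_site, {"binds": "trp repressor"})


def predict_transcription(tryptophan_present, repressor_functional=True):
    """Toy qualitative rule: the repressor blocks transcription only when
    it is functional and tryptophan is present."""
    repressor_bound = tryptophan_present and repressor_functional
    return not repressor_bound
```

Here operator.get("kind") is answered by the parent frame, and
predict_transcription(True) reports that the operon is not transcribed.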

Third, I am building a hypothesis generation system which takes as input
(a) a description of an experiment whose outcome is predicted
incorrectly by the simulation system above, and (b) the model of the trp
operon used by the simulation system.  Its output is a set of hypotheses
which alter both the theory and the initial conditions of the experiment
(e.g., by postulating the existence of mutations) to produce a correct
prediction.  These hypotheses are synthesized by a program which views
hypothesis generation as a design problem, and employs Artificial
Intelligence techniques from design and planning to generate hypotheses.
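
The input/output behavior described above can be illustrated with a
small Python sketch.  It uses plain generate-and-test, which is far
cruder than the design and planning techniques mentioned, and it
revises only the initial conditions, not the theory.  The toy predict
rule, the MODIFICATIONS table, and the example experiment are all
hypothetical.

```python
from itertools import combinations


def predict(conditions):
    """Toy theory: the operon is transcribed unless a functional
    repressor and tryptophan are both present."""
    repressed = conditions["tryptophan"] and conditions["repressor_functional"]
    return not repressed


# Hypothetical candidate changes to the initial conditions.
MODIFICATIONS = {
    "repressor gene is mutated": {"repressor_functional": False},
    "tryptophan is absent": {"tryptophan": False},
}


def generate_hypotheses(conditions, observed, max_changes=2):
    """Return every combination of modifications (smallest first) under
    which the toy theory predicts the observed outcome."""
    hypotheses = []
    for n in range(1, max_changes + 1):
        for names in combinations(MODIFICATIONS, n):
            revised = dict(conditions)
            for name in names:
                revised.update(MODIFICATIONS[name])
            if predict(revised) == observed:
                hypotheses.append(names)
    return hypotheses


# A made-up anomalous experiment: the toy theory predicts no
# transcription, but transcription is observed.
experiment = {"tryptophan": True, "repressor_functional": True}
hypotheses = generate_hypotheses(experiment, observed=True)
```

Each returned tuple names a set of postulated changes that would make
the prediction agree with the observation.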

Further information can be obtained from the references below.  My
dissertation will be a comprehensive description of this research, and
is in preparation.



Karp, P. "A Process-Oriented Model of Bacterial Gene Regulation",
Stanford Knowledge Systems Laboratory Technical Report KSL-88-18,
14 pages, 1988.

Friedland, P., and Kedes, L. "Discovering the secrets of DNA",
Communications of the ACM, 28(11):1164-1186, November, 1985.