[comp.ai.digest] Conference - Matrix of Biology Workshop

PHW@OZ.AI.MIT.EDU ("Patrick H. Winston") (04/23/87)
 ****************   OPPORTUNITY  FOR  PARTICIPATION   ****************

				WORKSHOP
			    ON THE MATRIX OF
			 BIOLOGICAL INFORMATION


ARTIFICIAL INTELLIGENCE, DATA BANK MANAGEMENT, COMPUTER ANALYSIS OF
MACROMOLECULES --- APPLIED TO CELLULAR BIOLOGY TO DEVELOP AN APPROACH TO
GENERALIZATIONS AND OTHER THEORETICAL INSIGHTS IN BIOLOGICAL SCIENCE.

We have today a unique opportunity to merge research at the forefront of
Artificial Intelligence with efforts to provide a new conceptual
framework for the laws, models, empirical generalizations and physical
foundations of the modern biological sciences.

The Matrix of Biological Knowledge is an attempt to use advanced
computer methods to organize the immense and growing body of
experimental data in the biological sciences, in the expectation that
there are a significant number of as yet undiscovered ordering
relations, new laws and predictive relations embedded in the mass of
existing information.  Workshop participants will attempt to define the
interrelations of the matrix of biological knowledge, and to demonstrate
its feasibility by applying the modern tools of computer science to a
small set of case studies.  This is an outgrowth of a report from the
Natl. Academy of Sciences, "Models for Biomedical Research: A New
Perspective," produced in response to a request by the Natl. Institutes
of Health (NIH).  A brief summary and description appears in "An
Omnifarious Data Bank for Biology?," SCIENCE 228(4706), 21 June 1985.

The workshop is intended to introduce a number of young scientists to
the matrix concept and to explore with these investigators the
possibilities of new theoretical developments and conceptual frameworks.
The workshop will run July 13 - August 14 at St. John's College in Santa
Fe, in the Sangre de Cristo mountains of northern New Mexico (AAAI
attendees may miss the first week).  Participants will be supported with
housing, meals and travel as necessary.  Thirty participants (graduate
students, post-doctoral fellows, and working scientists) are expected to
be selected by application from throughout the United States.

Eight groups will be directed by senior scientists:
"Artificial Intelligence," Patrick Winston, A.I. Laboratory, MIT;
"Management of Large Scale Data Bases," Robert Goldstein, U. Brit. Columbia;
"Computers Applied to Macromolecules," Peter Kollman, U. Cal. San Francisco;
"The Organization of Biological Knowledge," Harold Morowitz, Yale University;
"Cell-Cell Interactions," Hans Bode, U. of Calif., Irvine;
"Toxicology," Robert Rubin, Johns Hopkins University;
"Information Flow from DNA to Cells," Richard Dickerson, UCLA,
	 Harvey Herschman, UCLA, and Temple Smith, Harvard University;
"Peptides and Signalling Molecules," Christian Burks, Los Alamos Natl. Lab.,
	and Derek LeRoith, NIH.

A brief description of background and desire to participate, together
with two letters of recommendation, should be sent to

	Santa Fe Institute, attn. Ginger Richardson
	P.O. Box 9020
	Santa Fe, New Mexico 87504-9020
	(phone (505) 984-8800)

(Applicants should first review the NAS report or the SCIENCE article
cited above; both are available in most science libraries.)

The workshop has been previously announced in other forums and the
formal application deadline is 1 May 1987.  Applicants who will have
difficulty meeting that deadline should telephone Ginger Richardson and
notify her of their intent to submit an application, as few if any
positions will be available after that date.  Applicants are strongly
encouraged to apply expeditiously so that an early decision about
participation may be reached.

Some representative connections between Artificial Intelligence and
the Matrix Workshop follow, but the list is suggestive only.

NATURAL LANGUAGE:  What constraints on form and content must be met for
a scientific Abstract to be machine-readable?  An Abstract is generally
a single paragraph in a very restricted form of declarative prose.  If
tolerable constraints could be found, they would probably be widely
adopted.

KNOWLEDGE REPRESENTATION:  How much knowledge, and of what kinds, must
be captured, and how, to enable scientific reasoning?  Is a single unified
representation scheme possible or must each sub-field have a specialized
representation to support a specialized vocabulary and ontology?  ``In
the Knowledge lies the Power.''  How can we organize this tremendous
amount of knowledge to extract the power everyone believes is there?

ANALOGICAL MAPPING:  How can we notice when analogous biological
functions are implemented by analogous structures?  Can we discover and
validate analogical animal models of human systems?  Can we explain an
unknown response in an organism by analogy to a better-understood
system?  Given an experimental system, description or outcome, could we
index and retrieve analogous situations and/or literature references?

MACHINE LEARNING:  How can we re-structure the large existing databases
to automate induction from data?  Can we use more knowledge-intensive
forms of learning in this knowledge-intensive domain?  Can existing
learning paradigms be extended to cope with the noisy data that any real
application must face?

RULE-BASED EXPERT SYSTEMS:  How much of the expert scientist's knowledge
can be formalized explicitly as rules?  Could we produce an expert
system which, given a problem or request for information, could infer
which database contained the answer?  Could expert knowledge, say of
toxicology, be used to produce a Toxicology Advisor which knew how to
access databases to find answers to questions not covered by its rules?
Could we create expert systems which continually scanned new additions
to databases to update their rules, or at least flag areas where the new
addition conflicts with or supplants an existing rule?

TRUTH MAINTENANCE:  Suppose an Abstract always contained an explicit
statement of the proposition(s) argued for or against by the paper.
Could this be entered into a dependency network, with the paper as
justification?  Could we then query the TMS to determine, for some
proposition, whether it is generally believed, disbelieved, or
controversial; and pick out the relevant literature citations?  If a new
paper supports or contradicts a result from a neighboring field, can
this be detected reliably?
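
As a rough illustration of the query posed above, the following Python
sketch records, for each proposition, the papers that argue for or
against it, and classifies the proposition from those records.  It is a
toy under stated assumptions, not a real truth maintenance system: the
class name, the stance labels, the status rule, and the placeholder
citations and propositions are all hypothetical, and no dependencies
between propositions are propagated.

```python
from collections import defaultdict

# Hypothetical stance labels; the posting does not define any.
SUPPORTS, CONTRADICTS = "supports", "contradicts"


class LiteratureNetwork:
    """Toy record of which papers justify or dispute each proposition."""

    def __init__(self):
        # proposition -> stance -> list of citation strings
        self._justifications = defaultdict(
            lambda: {SUPPORTS: [], CONTRADICTS: []}
        )

    def add_abstract(self, citation, proposition, stance):
        """Record that the paper `citation` argues for or against `proposition`."""
        if stance not in (SUPPORTS, CONTRADICTS):
            raise ValueError(f"unknown stance: {stance!r}")
        self._justifications[proposition][stance].append(citation)

    def status(self, proposition):
        """Classify a proposition from the stances recorded for it.

        Assumed rule: support only -> believed, contradiction only ->
        disbelieved, both -> controversial, no papers -> unknown.
        """
        entry = self._justifications.get(proposition)
        if entry is None:
            return "unknown"
        if entry[SUPPORTS] and entry[CONTRADICTS]:
            return "controversial"
        return "believed" if entry[SUPPORTS] else "disbelieved"

    def citations(self, proposition):
        """Return (supporting, contradicting) citation lists for a proposition."""
        entry = self._justifications.get(proposition)
        if entry is None:
            return [], []
        return list(entry[SUPPORTS]), list(entry[CONTRADICTS])


if __name__ == "__main__":
    net = LiteratureNetwork()
    # Placeholder papers and propositions, invented for the example.
    net.add_abstract("paper-A", "P1", SUPPORTS)
    net.add_abstract("paper-B", "P1", CONTRADICTS)
    net.add_abstract("paper-C", "P2", SUPPORTS)
    for prop in ("P1", "P2", "P3"):
        print(prop, net.status(prop), net.citations(prop))
```

A real TMS would also retract conclusions that depend on a proposition
once its justification is withdrawn; the sketch only counts stances, so
it says nothing about whether contradictions across neighboring fields
can be detected reliably.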

QUALITATIVE PROCESS THEORY:  Can an organism be modeled as a cooperating
system of processes?  Can we organize this so as to find similar process
systems shared by different organisms?  Can we reliably predict the
effects of perturbing an organism's processes, e.g. in the study of
toxicology or medicine?

SCIENTIFIC REASONING AND DISCOVERY:  We have the opportunity to
structure a large, continuously-updated body of real-world scientific
knowledge.  What form of Knowledge Base would best facilitate
discovering the unexpected regularities in the data?  Could a program
(possibly using a dependency network of experimental results) suggest
crucial experiments and reason about implications of possible outcomes?

SCHEMA COMPLETION:  Can an experiment be understood in terms of a
setting which instantiates an ``experiment schema''?  Can we use this to
group results that are ``schematically close'', even if they occur in
different biological models or in related but distinct sub-fields?  Can
we fill in the default assumptions underlying a description of the
experiment and results?

DISCOURSE/STORY UNDERSTANDING:  Could a scientific article be analyzed
as a narrative describing an experimental setting, a group of
observations, and some conclusions?  Given a new story (experiment),
could we retrieve closely related or similar stories we've heard before?
Could a highly abridged summary of the story be produced?  Could several
stories be automatically merged, and an overall summary produced?

This list is obviously indicative, not exhaustive.

-------