[net.ai] CYC Project at MCC

michaelm@bcsaic.UUCP (michael maxwell) (07/17/86)

(long!)

I just finished reading the following article:

%A Doug Lenat
%A Mayank Prakash
%A Mary Shepherd
%T CYC: Using Common Sense Knowledge to Overcome Brittleness and Knowledge \
Acquisition Bottlenecks
%J The AI Magazine
%V 6
%N 4
%P 65-85
%D 1985
%X MCC's CYC project is the building, over the coming decade, of a large
knowledge base (or KB) of real world facts and heuristics and--as a part of
the KB itself--methods for efficiently reasoning over the KB.

I haven't seen much discussion of this article (there are two letters to the
editor in the next issue, but neither goes into much depth).  It seems to me
that this project has at least the potential of changing the field of AI more
than any other project now in progress (hold the flames, please!).  On the
other hand, it could be a real fiasco; I suspect that we won't know until it's
tried.  But the kind of discussion I'd like to see is not whether CYC will
succeed (for as I say, I don't think anyone can possibly *know* now), but
rather about the methodology--how it could be improved, where the weaknesses
are, i.e. substantive issues, rather than "Gee, this is great!" or flames.
So if you think there are weaknesses, please be explicit, and preferably give
a better way to do it.

By way of starting some discussion (and probably violating the standards I
just gave :-), let me suggest some questions.

1. In their typology of the analogy-space, they list (pg. 69) several ways
in which two frames might be seen to be identical.  "Two frames have
several slots with identical names and values...  Two frames can have
identically named slots whose values are not quite identical...  Both the
names of the slots and the values they contain may match but be
nonidentical [e.g. they might both be specializations of the same slot]."
A lot seems to ride here on identicalness of slot names.  Given that they
expect most entries to be done by copy-and-edit, this is at least
plausible in the case of frames that are specializations of the same system (e.g.
"irrigation" and "subway systems" might both be specializations of
"transportation systems").  But does this leave out some of the more
interesting types of analogy?  E.g. the analogy between cable cars (or
whatever it was--some sort of mass transit vehicle) and computer
architecture that is drawn in the movie "TRON"?  I don't think one is a
priori likely to copy the frame for computer hardware from that for
transportation (or am I wrong?)--and if you do, won't you miss other
interesting analogies for computer hardware?  See also their comment (pg. 70)
"that the precise way two concepts are represented can radically effect how
easy it is to find the analogy between them."
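
To make the worry in question 1 concrete, here is a minimal sketch of the
three kinds of slot match quoted above.  It is not CYC's representation or
algorithm: the frames, the slot names, and the PARENT table of generalization
links are all made up for illustration.

```python
# Hypothetical toy frames (slot name -> value); none of this is from the article.
IRRIGATION = {"carries": "water", "conduit": "canal", "kind": "transport system"}
SUBWAY = {"carries": "passengers", "conduit": "tunnel", "kind": "transport system"}
DATA_BUS = {"transfers": "bits", "medium": "wire"}
TRON_GRID = {"hauls": "programs"}

# Assumed generalization links (child -> parent), used for slot names and values.
PARENT = {
    "canal": "channel", "tunnel": "channel",
    "carries": "moves", "transfers": "moves",
    "passengers": "payload", "bits": "payload",
}

def generalizations(term):
    """Return the term together with all of its ancestors in PARENT."""
    seen = {term}
    while term in PARENT:
        term = PARENT[term]
        seen.add(term)
    return seen

def related(a, b):
    """True if a and b are equal or share some generalization."""
    return bool(generalizations(a) & generalizations(b))

def classify_matches(f1, f2):
    """Sort slot pairs of two frames into the three match types."""
    found = {"identical": [], "same_name_near_value": [], "related_name_and_value": []}
    for s1, v1 in f1.items():
        for s2, v2 in f2.items():
            if s1 == s2 and v1 == v2:
                found["identical"].append((s1, s2))
            elif s1 == s2 and related(v1, v2):
                found["same_name_near_value"].append((s1, s2))
            elif s1 != s2 and related(s1, s2) and related(v1, v2):
                found["related_name_and_value"].append((s1, s2))
    return found

print(classify_matches(IRRIGATION, SUBWAY))
print(classify_matches(SUBWAY, DATA_BUS))
print(classify_matches(SUBWAY, TRON_GRID))
```

In this sketch the subway/data-bus match is found only because someone
entered "carries" and "transfers" under a common parent; the last pair,
which shares no names and has no links, yields nothing at all.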

2. Re the discussion on endowing an expert system with common sense (pg. 71),
are they doing anything more than assigning a data type to the arguments of a
function?  How does this relate to the idea that typeless programming 
languages have certain advantages?

3. Also on pg. 71, last paragraph, discussing how new frames get added: "the
expert would discover this by arriving at *the* place where Patients should
be...and not finding it there."  [-emphasis mine, MM]  As the database gets
more complex, will it become increasingly difficult to find where a concept
should be?  Probably this violates my standards above--I should wait and see!
See also their comments on pg. 76.

4. It seems to me that the trickiest part is their steps 3 (pg. 80, "extract
and encode the implied (common sense) knowledge") and 4 (pg. 81, "extract and
encode the intersentential knowledge").  Will the encyclopedia articles
really presuppose all the knowledge we would want?  E.g. consider the
interpretation of indefinite NPs in opaque contexts:
	(1) Everyone is looking for a lost boy.
	(2) Everyone is trying to catch a fish.
The most likely interpretation of (1) is that there is a particular boy,
whereas the most likely interpretation of (2) (in the absence of information
to the contrary) is that no one has any particular fish in mind.  What kind of
information might you run across in an encyclopedia that would presuppose this
sort of distinction, and therefore enter it in the presupposed knowledge base?  
Remember that you need to make the distinction not only for boys and fish, but 
answers to arithmetic problems, solutions to Fermat's Last Theorem (of which 
there may be several), etc.  If this particular example is problematical for
CYC, is it an isolated case, or are there lots more?

Well, I have a feeling that I may regret posting this.  Let the flames
come...
-- 
Mike Maxwell
Boeing Artificial Intelligence Center
	...uw-beaver!uw-june!bcsaic!michaelm

kort@hounx.UUCP (B.KORT) (07/19/86)

Mike Maxwell has opened up for discussion a fascinating area of research
on Analogical Thinking.

To me, the deepest and most interesting analogies are the ones where
the structure or shape of the Knowledge Tree (or Mesh) is equivalent
to that found in another, seemingly unrelated section of the Knowledge
Base.  The names of the slots can be completely different.

Imagine a high-level command called Copy-with-Substitute-Symbol-Names.
This command would duplicate a section of the KB, but systematically
substitute new names for old.  When humans do this, we call it
analogy, metaphor, or parable.  Once the mapping is done, one has
transferred a large chunk of knowledge, mutatis mutandis, from one
field to another.  It is only the pattern or shape that remains
constant.  It is in the pattern that the invariant portion of the
knowledge resides.
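
A minimal sketch of what such a command might look like, assuming the KB is
just a dictionary of frames whose slots hold single symbols.  The function
names, the toy KB, and the water-to-electricity mapping are all hypothetical,
chosen only to illustrate the idea.

```python
def copy_with_substitute_symbol_names(kb, section, mapping):
    """Copy the frames named in `section`, renaming every symbol via `mapping`.

    Frame names, slot names, and slot values are all renamed; symbols that
    are not in `mapping` are carried over unchanged.
    """
    def rename(symbol):
        return mapping.get(symbol, symbol)

    return {
        rename(frame): {rename(slot): rename(value)
                        for slot, value in kb[frame].items()}
        for frame in section
    }

def shape(kb):
    """Replace each symbol by the order of its first appearance.

    What is left is the pattern of the section with the names stripped off.
    """
    ids = {}

    def canon(symbol):
        return ids.setdefault(symbol, len(ids))

    return [(canon(frame), [(canon(slot), canon(value))
                            for slot, value in slots.items()])
            for frame, slots in kb.items()]

# Hypothetical source section: a toy description of plumbing.
PLUMBING = {
    "pipe": {"carries": "water", "driven_by": "pressure"},
    "pump": {"produces": "pressure"},
}

# Hypothetical one-to-one renaming into the vocabulary of circuits.
TO_CIRCUITS = {
    "pipe": "wire", "pump": "battery", "carries": "conducts",
    "water": "current", "pressure": "voltage",
}

CIRCUITS = copy_with_substitute_symbol_names(PLUMBING, ["pipe", "pump"], TO_CIRCUITS)
print(CIRCUITS)
print(shape(PLUMBING) == shape(CIRCUITS))
```

The shape comparison holds here only because the renaming is one-to-one and
does not collide with any symbol left unrenamed; a mapping that merged two
old names into one new name would change the pattern.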


There are some interesting books which draw such maps.  Two that
I like are The Dancing Wu Li Masters by Gary Zukav and The Tao
of Physics by Fritjof Capra.  These books construct mappings
from concepts in Western Science to corresponding notions in
Eastern Mystical Philosophy.

That such a mapping exists may seem surprising at first, but
makes sense when you think about it.  Western Science (and
Physics in particular) is a map or description of the Laws
of Nature.  The human mind builds pictures (or images or
models) of the world in which the individual finds himself
embedded.  Introspection reveals those internal models,
which should be in good correspondence with external Reality
as detected by the Senses.  The only problem is that the
internal models are made up of parts without nomenclature.
The nomenclature invented to describe Mystical Philosophy
needs to be translated into the corresponding Scientific
Nomenclature.  This is what Zukav and Capra have attempted
in the above mentioned books.

The parables of Religious Literature are another rich source
of structures for Common Sense Knowledge.  Children's Fables
are another.  Lewis Carroll's contribution to this genre
is seminal.  Gulliver's Travels is in the same category,
but perhaps harder to decipher.  Metaphor is everywhere.

Barry Kort

"The Keyboard on my new Lisp Machine has a Meta Key.
  What is a Meta for?"