[net.ai] So the language analysis problem has been solved?!?

PEREIRA@SRI-AI.ARPA@sri-unix.UUCP (08/20/83)

I will also refrain from flaming, but not from taking excessive
claims to task.

    I'll refrain from flaming about traditional (including
    logic) grammars.  I'm tired of people insisting on a
    restricted view of language that claims that grammar rules
    are the ultimate description of syntax (semantics being
    irrelevant) and that idioms are irritating special cases.  I
    might note that we have basically solved the language
    analysis problem (using a version of Berkeley's Phrase
    Analysis that handles ambiguity) ...

I would love to test that "solution of the language analysis 
problem"... As for the author being "tired of people insisting on a 
restricted ...", he is just tired of his own straw people, because 
there doesn't seem to be anybody around anymore claiming that 
"semantics is irrelevant".  Formal grammars (logic or otherwise) are 
just a convenient mathematical technique for representing SOME 
regularities in language in a modular and testable form. OF COURSE, a 
formal grammar seen from the PROCEDURAL point of view can be replaced 
by any arbitrary "ball of string" with the same operational semantics.
What this replacement does to modularity, testability and 
reproducibility of results is sadly clear in the large amount of 
published "research" in natural language analysis which is untestable 
and irreproducible. The methodological failure of this approach 
becomes obvious if one considers the analogous proposal of replacing 
the principles and equations of some modern physical theory (general 
relativity, say) by a computer program which computes "solutions" to 
the equations for some unspecified subset of their domain, some of 
these solutions being approximate or plain wrong for some (again 
unspecified) set of cases. Even if such a program were "right" all the
time (in contradiction with all our experience so far), its sheer 
opacity would make it useless as a scientific explanation.
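
To make the modularity point concrete, here is a minimal sketch (a
made-up toy grammar in present-day notation, not from any actual
system) of a formal grammar as declarative data interpreted by a
single generic procedure.  Each rule can be added, removed, or tested
independently of that procedure, which is exactly what replacing the
grammar by a "ball of string" destroys.

    # Grammar rules as data: each nonterminal maps to its expansions.
    GRAMMAR = {
        "S":  [["NP", "VP"]],
        "NP": [["det", "noun"], ["noun"]],
        "VP": [["verb", "NP"], ["verb"]],
    }
    LEXICON = {"det": {"the", "a"}, "noun": {"dog", "cat"}, "verb": {"saw"}}

    def parses(symbol, words, i):
        """Yield every position j such that words[i:j] derives `symbol`."""
        if symbol in LEXICON:
            if i < len(words) and words[i] in LEXICON[symbol]:
                yield i + 1
            return
        for rhs in GRAMMAR.get(symbol, []):
            def expand(k, pos):              # match rhs[k:] starting at pos
                if k == len(rhs):
                    yield pos
                    return
                for nxt in parses(rhs[k], words, pos):
                    yield from expand(k + 1, nxt)
            yield from expand(0, i)

    words = "the dog saw a cat".split()
    print(any(j == len(words) for j in parses("S", words, 0)))   # True

Changing the coverage of this grammar means changing exactly one
entry in GRAMMAR; the interpreting procedure and the other rules are
untouched, so each change can be tested in isolation.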

Furthermore, when mentioning "semantics", one had better say which KIND of
semantics one means. For example, grammar rules fit very well with 
various kinds of truth-theoretic and model-theoretic semantics, so the
comment above cannot be about that kind of semantics. Again, a theory 
of semantics needs to be testable and reproducible, and, I would 
claim, it only qualifies if it allows the representation of a 
potential infinity of situation patterns in a finite way.
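
As a concrete instance of the kind of semantics that fits grammar
rules well, here is a toy model-theoretic sketch (my own
illustration, with an invented two-individual model): a finite set of
composition rules assigns truth conditions, relative to a model, to
unboundedly many sentences.

    # Toy model: the denotations of a few words over a tiny domain.
    MODEL = {"dog": {"rex"}, "cat": {"tom"}, "sleeps": {"rex"}}
    DOMAIN = {"rex", "tom"}

    def pred(w):               # nouns and verbs denote sets of individuals
        return lambda x: x in MODEL[w]

    def some(n, v):            # [[some N V]] is true iff N and V overlap
        return any(n(x) and v(x) for x in DOMAIN)

    print(some(pred("dog"), pred("sleeps")))   # True:  some dog sleeps
    print(some(pred("cat"), pred("sleeps")))   # False: no cat sleeps

The same finite rules interpret every sentence of the form
"some N V"; that is the sense in which a potential infinity of
situation patterns gets a finite representation.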

    I don't recall a von Neumann bottleneck in AI programs, at
    least not of the kind Backus was talking about.  The main
    bottleneck seems to be of a conceptual rather than a
    hardware nature.  After all, production systems are not
    inherently bottlenecked, but nobody really knows how to make
    them run concurrently, or exactly what to do with the
    results (I have some ideas though).

The reason nobody knows how to make production systems run
concurrently is simply that they use a global state and side
effects. This IS precisely the von Neumann bottleneck, as made clear
in Backus' article, and is a conceptual limitation with hardware 
consequences rather than a purely hardware limitation. Otherwise, why 
would Backus address the problem by proposing a new LANGUAGE (fp), 
rather than a new computer architecture?  If your AI program were
written in a language without side effects (such as PURE Prolog), the
opportunities for parallelism would be there. This would be 
particularly welcome in natural language analysis with logic (or other
formal) grammars, because dealing with more and more complex subsets 
of language needs an increasing number of grammar rules and rules of 
inference, if the results are to be accurate and predictable.  
Analysis times, even if they are polynomial in the size of the input,
may grow EXPONENTIALLY with the size of the grammar.
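
The parallelism point can be made with a small sketch (hypothetical;
this is neither fp nor anyone's actual system): when every rule is a
pure function of its inputs, the alternatives can be evaluated in any
order, or all at once, because there is no global state for them to
collide on.

    from concurrent.futures import ProcessPoolExecutor

    # Each "rule" is pure: its result depends only on its arguments,
    # never on a shared working memory, so no two rules can interfere.
    def rule_sum(xs): return ("sum", sum(xs))
    def rule_max(xs): return ("max", max(xs))
    def rule_len(xs): return ("len", len(xs))

    def apply_rule(args):
        rule, data = args
        return rule(data)

    if __name__ == "__main__":
        data = [3, 1, 2]
        tasks = [(r, data) for r in (rule_sum, rule_max, rule_len)]
        with ProcessPoolExecutor() as pool:    # OR-parallelism for free
            print(dict(pool.map(apply_rule, tasks)))
            # {'sum': 6, 'max': 3, 'len': 3}

A production system cannot be decomposed this way, because each rule
firing may read and write the very working memory that every other
rule depends on.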

                                Fernando Pereira
                                AI Center
                                SRI International
                                pereira@sri-ai

sts@ssc-vax.UUCP (Stanley T Shebs) (08/24/83)

Heh-heh.  Thought that'd raise a few hackles (my boss didn't approve
of the article; oh well, I tend to be a bit fiery around the edges).

The claim is that we have "basically" solved the problem.  Actually,
we're not the only ones - Pazzani and others from the Schank school
have done the same thing with the APE-II parser.  Our parser
can handle arbitrarily ambiguous sentences, generating *all* the
possible meanings, limited only by the size of its knowledge base.
We have the capability to do any sort of idiom, and mix any number
of natural languages.  Our problems are really concerned with the
acquisition of linguistic knowledge, either by having nonspecialists
put it in by hand (*everyone* is an expert on the native language)
or by having the machine acquire it automatically.  We can mail out
some details if anyone is interested.
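
For what it's worth, here is a minimal sketch (a toy example of my
own; it shows nothing of our parser's internals) of what "generating
all the possible meanings" involves in the classic prepositional-
phrase attachment case:

    def attachments(vp, pps):
        """Yield every way to attach trailing PPs to a verb phrase."""
        if not pps:
            yield vp
            return
        pp, rest = pps[0], pps[1:]
        # attach the PP to the verb phrase itself ...
        yield from attachments(("VP", vp, pp), rest)
        # ... or to the object noun phrase inside it
        tag, verb, obj = vp
        yield from attachments((tag, verb, ("NP", obj, pp)), rest)

    for tree in attachments(("VP", "saw", "the man"), ["with the telescope"]):
        print(tree)
    # ('VP', ('VP', 'saw', 'the man'), 'with the telescope') -> I used it
    # ('VP', 'saw', ('NP', 'the man', 'with the telescope')) -> the man had it

A real analyzer must do this for every ambiguity at once, which is
why the knowledge base, not the enumeration, is the limiting factor.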

One advantage we had was starting from ground zero: we had very few
preconceptions about how language analysis ought to be done, and we
scanned the literature.  It became apparent that since we were
required to
handle free-form input, any kind of grammar would eventually become
less than useful and possibly a hindrance to analysis.  Mr. Pereira
admits as much when he says that grammars only reflect *some* aspects
of language.  Well, that's not good enough.  Us folks in applied
research can't always afford the luxury of theorizing about the most
elegant methods.  We need something that models human cognition
closely enough to make sense to knowledge engineers and to users.
So I'm sort of in the Schank camp (folks at SRI hate 'em) although
I try to keep my thinking as independent as possible (hard when
each camp is calling the other ones charlatans; I'll post something
on that pernicious behavior eventually).

Parallel production systems I'll save for another article...

					stan the leprechaun hacker
					ssc-vax!sts (soon utah-cs)

ps I *did* read an article of Mr. Pereira's - couldn't understand
the point.  Sorry.  (perhaps he would be so good as to explain?)