shebs@utah-cs.UUCP (Stanley Shebs) (09/20/83)
Lest usenet readers think things had gotten silent all at once, here's an article by Fernando Pereira that (apparently and inexplicably) was *not* sent to usenet, and my reply. (Fortunately, I now have read-only access to Arpanet, so I was able to find out about this.)

_____________________

Date: Wed 31 Aug 83 18:42:08-PDT
From: PEREIRA@SRI-AI.ARPA
Subject: Solutions of the natural language analysis problem

Given the downhill trend of some contributions on natural language analysis in this group, this is my last comment on the topic, and is essentially an answer to Stan the leprechaun hacker (STLH for short).

I didn't "admit" that grammars only reflect some aspects of language. (Using loaded verbs such as "admit" is not conducive to the best quality of discussion.) I just STATED THE OBVIOUS. The equations of motion only reflect SOME aspects of the material world, and yet no engineer goes without them. I presented this point at greater length in my earlier note, but the substantive presentation of method seems to have gone unanswered.

Incidentally, I worked for several years in a civil engineering laboratory where ACTUAL dams and bridges were designed, and I never saw there the preference for alchemy over chemistry that STLH suggests is the necessary result of practical concerns. Elegance and reproducibility do not seem to be enemies of generality in other scientific or engineering disciplines. Claiming for AI an immunity from normal scientific standards (however flawed those standards may be) will not help: the opposition is on the defensive because of media hype for now, but will surely come back to the fray, with that weapon plus a long list of unfulfilled promises and irreproducible "results." Lack of rigor follows from lack of method.

STLH tries to bludgeon us with "generating *all* the possible meanings" of a sentence. Does he mean ALL of the INFINITY of meanings a sentence has in general? Even leaving aside model-theoretic considerations, we are all familiar with

  he wanted me to believe P, so he said P

  he wanted me to believe not P, so he said P, because he thought that I would think that he said P just for me to believe P, and not believe it

and so on ... in spy stories.

The observation that "we need something that models human cognition closely enough..." begs the question of what human cognition looks like. (Silly me, it looks like STLH's program, of course.) STLH also forgets that it is often better for a conversation partner (whether man or machine) to say "I don't understand" than to go on saying "yes, yes, yes ..." and get it all wrong, as people (and machines) that are trying to disguise their ignorance do.

It is indeed not surprising that "[his] problems are really concerned with the acquisition of linguistic knowledge." Once every grammatical framework is thrown out, it is extremely difficult to see how new linguistic knowledge can be assimilated, whether automatically or even by programming it in. As to the notion that "everyone is an expert on the native language", it is similar to the claim that everyone with working ears is an expert in acoustics.

As to "pernicious behavior", it would be better if STLH would first put his own house in order: he seems to believe that to work at SRI one needs to swear eternal hate to the "Schank camp" (whatever that is); and useful criticism of other people's papers requires at least a mention of the title and of the objections. A bit of that old battered scientific protocol would help...
Fernando Pereira
___________________

The level of discussion *has* degenerated somewhat, so let me try to bring it back up again. I was originally hoping to stimulate some debate about certain assumptions involved in NLP, but instead I seem to see a lot of dogma, which is *very* dismaying. Young idealistic me thought that AI would be the field where the most original thought was taking place, but instead everyone seems to be divided into warring factions, each of whom refuses to accept the validity of anybody else's approach. That hardly seems scientific to me, and certainly other sciences don't evidence this problem. (Perhaps there's some fundamental truth here - that the nature of epistemology and other AI activities is such that it's very difficult to prevent one's thought from being trapped into certain patterns. I know I've been caught a couple of times, and it was hard to break out of the habit - more on that later.)

As a colleague of mine put it, we seem to be suffering from a "difference in context". So let me describe the assumptions underpinning my theory (yes, I do have one):

1. Language is a very fuzzy thing. More precisely, the set of sound strings meaningful to a human is almost (if not exactly) the set of all possible sound strings. Now, before you flame, consider: humans can get at least *some* understanding out of a nonsense sequence, especially if they have any expectations about what they're hearing (this has been demonstrated experimentally), although that understanding will likely be wrong. Also, they can understand mispronounced or misspelled words, sentences with missing words, sentences with repeated words, sentences with scrambled word order, sentences with mixed languages (I used to have fun by speaking English using German syntax, and you can sometimes see signs using English syntax with "German" words), and so forth. Language is also used creatively (especially by netters!). Words are continually invented; metaphors are created and mixed in novel ways. I claim that there is no rule of grammar that cannot be violated. Note that I have said *nothing* about changes of meaning, nor have I claimed that one could get much of anything out of a random sequence of words strung together. I have only claimed that the set of linguistically valid utterances is actually a large fuzzy set (in the technical sense of "fuzzy"). If you accept this, the implications for grammar are far-reaching - in fact, it may be that classical grammar is a curious but basically irrelevant description of language (however, I'm not completely convinced of that).
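Just to make the "large fuzzy set" claim concrete, here is a minimal sketch of graded membership - mine alone, purely for illustration, with an invented word list and weighting, and nothing to do with any actual NLP system mentioned here. The only point is that an acceptability score can degrade gradually instead of flipping to "ungrammatical":

# Illustrative only: a toy fuzzy-membership function over utterances.
# The vocabulary and the weighting are made up; familiar material just
# scores higher, and no string is rejected outright.

KNOWN_WORDS = {"john", "gave", "mary", "a", "piece", "of", "his", "mind"}

def membership(utterance: str) -> float:
    """Return a graded acceptability score (higher = more language-like)."""
    words = utterance.lower().split()
    if not words:
        return 0.0
    known = sum(w in KNOWN_WORDS for w in words)
    # Unknown words lower the score but never drive it to zero:
    return 0.25 + 0.75 * known / len(words)

if __name__ == "__main__":
    print(membership("John gave Mary a piece of his mind"))     # 1.0
    print(membership("John gave Mary a glorp of his frumble"))  # partial credit
    print(membership("glorp frumble zzyx"))                     # low, but above zero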
2. Meaning and interpretation are distinct. Perhaps I should follow convention and say "s-meaning" and "s-interpretation", to avoid terminology trouble. I think it's noncontroversial that the "true meaning" of an utterance can be defined as the totality of response to that utterance. In that case, s-meaning is the individual-independent portion of meaning. (I know, that's pretty vague. But would saying that 51% of all humans must agree on a meaning make it any more precise? Or that there must be a predicate to represent that meaning? Who decides which predicate is appropriate?) Then s-interpretation is the component that depends primarily on the individual and his knowledge, etc. Let's consider an example - "John kicked the bucket." For most people, this has two s-meanings - the usual one derived directly from the words, and an idiomatic way of saying "John died". Of course, someone may not know the idiom, so they can assign only one s-meaning. But as Mr. Pereira correctly points out, there is an infinitude of s-interpretations, which vary completely from individual to individual. Most can be derived from the s-meaning, for instance the convoluted inferences about belief and intention that Mr. Pereira gave. On the other hand, I don't normally make those s-interpretations, and a "naive" person might *never* do so. Other parts of the s-interpretation could be (if the second s-meaning above was intended) that the speaker tends to be rather blunt; certainly a part of the response to the utterance, but less clearly part of a "meaning". Even s-meanings are pretty volatile, though - to use another spy-story example, the sentence might actually be a code phrase with a completely arbitrary meaning!

3. Cognitive science is relevant to NLP. Let me be the first to say that all of its results are at best suspect. However, the apparent inclination of many AI people to regard the study of human cognition as "unscientific" is inexplicable. I won't claim that my program defines human cognition, since that degree of hubris requires at least a PhD :-) . But cognitive science does have useful results, like the aforementioned result about making sense out of nonsense. Also, a lot of common-sense results can be more accurately described by doing experiments. "Don't think of a zebra for the next ten minutes" - my informal experimentation indicates that *nobody* is capable of it - and that seems to say a lot about how humans operate. Perhaps cognitive science gets a bad review because much of it is Gedanken experiments; I don't need tests on a thousand subjects to know that most kinds of ungrammaticality (such as number agreement) are noticeable, but rarely affect my understanding of a sentence. That's why I say that humans are experts at their own languages - we all (at least intuitively) understand the different parts of speech and how sentences are put together, even though we have difficulty expressing that knowledge (sounds like the knowledge engineer's problems in dealing with experts!). BTW, we *have* had a non-expert (a CS undergrad) add knowledge to our NLP system, and the folks at Berkeley have reported similar results [Wilensky81].

4. Theories should reflect reality. This is especially important because the reverse is quite pernicious - one ignores or discounts information not conforming to one's theories. The equations of motion are fine for slow-speed behavior, but fail as one approaches c (the language or the velocity? :-) ). Does this mean that Lorentz contractions are experimental anomalies? The grammar theory of language is fine for very restricted subsets of language, but is less satisfactory for explaining the phenomena mentioned in 1., nor does it suggest how organisms *learn* language. Mr. Pereira's suggestion that I do not have any kind of theoretical basis makes me wonder if he knows what Phrase Analysis *is*, let alone its justification. Wilensky and Arens of UCB have IJCAI-81 papers (and tech reports) that justify the method much better than I possibly could. My own improvement was to make it follow multiple lines of parsing. (I have to be contrite on this; I read Winograd's new book recently, and what I have is really a sort of active chart parser. I also noticed that he gives nary a mention to Phrase Analysis, which is inexcusable - that's the sort of thing I mean by "warring factions".)
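For flavor only, here is a toy sketch of pattern-to-concept matching with several readings kept alive at once, in the general spirit of phrasal pattern/concept pairs. The patterns, concept names, and the "*" wildcard convention are invented for this note - this is neither PHRAN's machinery nor my parser - but it shows how a later word can rule out all but one reading of the "kicked the bucket" example from point 2:

# Hypothetical sketch: phrase patterns paired with concepts, with several
# hypotheses pursued in parallel until the input rules them out.

PATTERNS = [
    (["*", "kicked", "the", "bucket"], "DIED"),
    (["*", "kicked", "the", "*"],      "PROPELLED-WITH-FOOT"),
]

def analyses(sentence: str):
    """Return every pattern/concept pair consistent with the whole sentence."""
    words = sentence.lower().split()
    # A hypothesis is (pattern, concept, next position within the pattern).
    live = [(pat, concept, 0) for pat, concept in PATTERNS]
    for word in words:
        advanced = []
        for pat, concept, i in live:
            if i < len(pat) and (pat[i] == "*" or pat[i] == word):
                advanced.append((pat, concept, i + 1))  # reading still viable
            # otherwise this word rules the hypothesis out; drop it
        live = advanced
    return [concept for pat, concept, i in live if i == len(pat)]

if __name__ == "__main__":
    print(analyses("John kicked the bucket"))  # ['DIED', 'PROPELLED-WITH-FOOT']
    print(analyses("John kicked the ball"))    # ['PROPELLED-WITH-FOOT']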
4a. Reflecting reality means "all of it" or (less preferably) "as much as possible". Most of the "soft sciences" get their bad reputation by disregarding this principle, and AI seems to have a problem with it as well. What good is a language theory that cannot account for language learning, creative use of language, and the incredible robustness of language understanding? The definition of language by grammar cannot properly explain these - the first because of results (again mentioned by Winograd) that children receive almost no negative examples, and that a grammar cannot be learned from positive examples alone; the third because the grammar must be extended and extended until it recognizes all strings as valid. So perhaps the classical notion of grammar is like classical mechanics - useful for simple things, but not so good for photon drives or complete NLP systems. The basic notions in NLP have been thoroughly investigated; IT'S TIME TO DEVELOP THEORIES THAT CAN EXPLAIN *ALL* ASPECTS OF LANGUAGE BEHAVIOR!

5. The existence of "infinite garden-pathing". To steal an example from [Wilensky80], "John gave Mary a piece of his.........................mind." Only the last word disambiguates the sentence. So now, what did *you* fill in before you read that last word? There are even more interesting situations. Part of my secret research agenda (don't tell Boeing!) has been the understanding of jokes, particularly word plays. Many jokes are multi-sentence versions of garden-pathing, where only the punch line disambiguates. A surprising number of crummy sitcoms can get a whole half-hour out of an ambiguous sentence interpreted differently by two people. (A random thought - where *did* this notion of the sentence as the fundamental structure come from? Why don't speeches and discourses have a "grammar" precisely defining *their* structure?) In general, language is LR(lazy eight).

Miscellaneous comments: This has gotten pretty long (a lot of accusations to respond to!), so I'll save the discussion of AI dogma, fads, etc. for another article.

When I said that "problems are really concerned with the acquisition of linguistic knowledge", that was actually an awkward way of saying that, having solved the parsing problem, my research interests switched to the implementation of full-scale error correction and language learning. (Notice that Mr. Pereira did not say "this is ambiguous - what did you mean?"; he just assumed one of the meanings and went on from there. Typical human language behavior, and inadequately explained by most existing theories...) In fact, I have a detailed plan for implementation, but grad school has interrupted that, and it may be a while before it gets done. So far as I can tell, the implementation of learning will not be unusually difficult. It will involve inductive learning, manipulation of analogical representations to acquire meanings ("an mtrans is like a ptrans, but with abstract objects"...), and other good things. The nonrestrictive nature of Phrase Analysis seems to be particularly well suited to language knowledge acquisition.

Thanks to Winograd (really quite a good book, but biased) I now know what DCG's are (the paper I referred to before was [Pereira80]). One of the first paragraphs in that paper was revealing: it said that language was *defined* by a grammar, then proceeded from there. (Different assumptions....) Since DCG's were compared only to ATN's, it was of course easy to show that they were better (almost any formalism is better than one from ten years before, so that wasn't quite fair). However, I fail to see any important distinction between a DCG and a production-rule system with backtracking. In that case, a DCG is really a special case of a Phrase Analysis parser. (I did at one time tinker with the notion of compiling phrase rules into OPS5 rules, but OPS5 couldn't manage it very well - no capacity for the parallelism that my parser needed.)
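To make that comparison concrete, here is a purely illustrative sketch - mine, in a garden-variety procedural language rather than Prolog, and not taken from [Pereira80] or my own code - of how a DCG rule such as s --> np, vp behaves like productions tried in order, with backtracking over the remaining input. The toy grammar and lexicon are invented:

# Toy sketch: DCG-like rules as productions tried with backtracking.
# Each nonterminal yields every remainder of the input it can leave
# behind; falling back from one alternative to the next is the backtracking.

LEXICON = {
    "np": [["john"], ["the", "bucket"], ["mary"]],
    "v":  [["kicked"], ["gave"]],
}

def phrase(alternatives, words):
    """Try each lexical alternative; yield the remaining words on a match."""
    for alt in alternatives:
        if words[:len(alt)] == alt:
            yield words[len(alt):]

def np(words):          # np --> (lexical noun phrases)
    yield from phrase(LEXICON["np"], words)

def vp(words):          # vp --> v, np
    for rest in phrase(LEXICON["v"], words):
        yield from np(rest)

def s(words):           # s --> np, vp
    for rest in np(words):
        yield from vp(rest)

def parses(sentence):
    return [rest for rest in s(sentence.lower().split()) if rest == []]

if __name__ == "__main__":
    print(bool(parses("John kicked the bucket")))  # True
    print(bool(parses("The bucket Mary")))         # False

A real DCG also threads arguments through the nonterminals (which is where much of its power comes from); this sketch omits that and keeps only the control structure.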
I am of course interested in being contradicted on any of this.

Mr. Pereira says he doesn't know what the "Schank camp" is. If that's so, then he's the only one in NLP who doesn't. I have heard some highly uncomplimentary comments about Schank and his students. But then that's the price for going against conventional wisdom...

Sorry for the length, but it *was* time for some light rather than heat! I have refrained from saying much of anything about my theories of language understanding, but will post details if accusations warrant :-)

Theoretically yours*,

Stan (the leprechaun hacker) Shebs
utah-cs!shebs

* love those double meanings!

[Pereira80] Pereira, F.C.N., and Warren, D.H.D. "Definite Clause Grammars for Language Analysis - A Survey of the Formalism and a Comparison with Augmented Transition Networks", Artificial Intelligence 13 (1980), pp. 231-278.

[Wilensky80] Wilensky, R., and Arens, Y. "PHRAN: A Knowledge-Based Approach to Natural Language Analysis" (Memorandum No. UCB/ERL M80/34). University of California, Berkeley, 1980.

[Wilensky81] Wilensky, R., and Morgan, M. "One Analyzer for Three Languages" (Memorandum No. UCB/ERL M81/67). University of California, Berkeley, 1981.

[Winograd83] Winograd, T. Language as a Cognitive Process, vol. 1: Syntax. Addison-Wesley, 1983.