LAWS@SRI-AI.ARPA (11/12/84)
From: AIList Moderator Kenneth Laws <AIList-REQUEST@SRI-AI>

AIList Digest            Sunday, 21 Oct 1984      Volume 2 : Issue 143

Today's Topics:
  Programming Languages - Buzzwords,
  AI Tools - LISP Machine Benchmarks,
  Linguistics - Language Evolution & Sastric Sanskrit,
  Seminar - Transformational Grammar and AI,
  PhD Oral: Theory-Driven Data Interpretation
----------------------------------------------------------------------

Date: 19 October 1984 22:52-EDT
From: Herb Lin <LIN @ MIT-MC>
Subject: buzzwords for different language types

Could someone out there please tell me the usual catch phrases for
distinguishing between languages such as C, Pascal, and Ada on the one
hand and languages such as LISP on the other?  Is it "structured" vs.
"unstructured"?  List vs. ??  Thanks.

------------------------------

Date: Fri 19 Oct 84 13:08:44-PDT
From: WYLAND@SRI-KL.ARPA
Subject: LISP machine benchmarks

A thought for the day on the AI computer benchmark controversy.

We need a single, simple measure of machine quality in order to decide
which machine to buy.  It must be simple and general because these
machines are typically intended to be used as general-purpose AI research
machines, where we cannot closely define and confine the application.  We
already have one single, simple measure: price.  If there is no *simple*
alternative number based on performance, others (i.e., those funding the
effort) will use price as the only available measure, and we will have to
struggle continually against it using secondary arguments and personal
opinion.

It should be possible to create a simple benchmark measure.  It will, of
necessity, be highly abstracted and crude.  This has been done for
conventional computer systems: the acronym MIPS is now fairly common, for
good or ill.  Yes, there are additional measures, but they are used in
addition to simple ones like MIPS.

We need good, extensive benchmarks for these machines: they will point
out the performance bugs that are unique to particular designs.  After we
do the benchmarks, however, we need to boil them down to some simple
number we can use for general-purpose comparison, to place in opposition
to price.

------------------------------

Date: 19 Oct 84 10:32 PDT
From: Schoppers.pa@XEROX.ARPA
Subject: The Future of the English Auxiliary

In response to Ken Kahn's question on language evolution, my own theory
is that the invasion of a language by foreign cultures, or vice versa,
has a lot to do with how simple a language becomes: cross-cultural
speakers tend to use only as much as is absolutely necessary for them to
consider themselves understood.  The English spoken in some communities,
e.g. "Where they goin'?" (missing an auxiliary), "Why he be leavin'?"
(levelling the auxiliary), "He ain't goin' nowhere" (ignoring double
negatives), etc., may well be indicative of our future grammar.  On the
other hand, "Hey yous" for plural "you" (in Australia) and "y'all" (here)
point towards disambiguation.  Well, there does have to be a limit to the
simplification, lest we "new-speak double-plus ungood".  Then again,
"ain't" can mean any one of "am not", "aren't", "isn't", "haven't",
"hasn't" --- effectively replacing both of the primary English
auxiliaries (to be, to have) in all their conjugations!  United States
"English", being the lingo of the melting pot, will probably change
faster than most.
Marcel Schoppers
Schoppers@XEROX

------------------------------

Date: Fri 19 Oct 84 15:23:26-MDT
From: Stan Shebs <SHEBS@UTAH-20.ARPA>
Subject: Cases & Evolution of Natural Language

Has anybody at all researched the origins of language?  I am not an
expert on the subject, but I do know that the languages of aboriginal
tribes are extraordinarily complicated, as languages go.  But they
probably don't give us much clue to what the earliest languages were
like.  If you believe that the earliest languages arose along with human
intelligence, then you can suppose that the most primitive languages had
a separate "word" for each concept to be expressed.  Such concepts might
include what would correspond to entire sentences in a modern language.
Thus the most primitive languages would be completely non-orthogonal.
When intelligence developed to a point where the necessary vocabulary was
just too complex to handle the wide range of expressible concepts, then
perhaps some individuals would start grouping primitive sounds together
in different ways (the famous chimpanzee and gorilla language experiments
suggest that other primates already have this ability), resulting in the
birth of syntactic rules.  Obvious question: can all known languages be
derived as some combination of arbitrarily bizarre syntactic/semantic
rules?  (I would guess so, based on results for mathematical languages.)
Word cases can then be explained as one of the last concepts to be
factored out of words.  In the most ancient Indo-European languages, for
instance, prepositions are relatively infrequent, although the notions of
subject, object, verb, and so forth have already been separated into
separate words.  Perhaps in the future, singular and plural numbers will
be separated out also (anyone for "dog es" instead of "dogs"?).

stan shebs

------------------------------

Date: 19 Oct 1984 15:17-PDT (Friday)
From: Rick Briggs <briggs@RIACS.ARPA>
Subject: Sastric Sanskrit

Firstly, the language is NOT artificial.  There is a LITERATURE which is
written in this language.  It is different from toy artificial languages
like Fitch's in that for three thousand years scientists communicated and
wrote texts in this language.  There are thus two aspects which are
interesting and relevant: one is that research such as I have been
describing was carried out in its peculiar context; the other is that a
natural language can function as an unambiguous, inference-generating
language without sacrificing simplicity or stylistic beauty.

The advantage of case is that (assuming it is a good case system) you
have a closed set with which a correspondence can be made to a closed set
of semantic cases, whereas prepositions can be combined in a multitude of
ways, and classifying prepositions is not easy.  Secondly, the fact that
prepositions are not attached to the word allows a possibility for
ambiguity: "a boat on the river near the tree" could be "a boat on the
(river near the tree)" or "a boat (on the river) near the tree".
Attaching affixes directly to words allows you (potentially) to express
such a sentence without ambiguity.

The Sastric approach is to allow one to express a sentence as a series of
"facts", each agreeing with "activity".  Prepositions would not allow
this.  If one hears "John was killed", some questions come to mind: who
did it, how, and why.  These are actually the semantic cases agent,
instrument, and semantic ablative (apaadaanakaaraka).  Instead of "on"
and "near" one would say "there is a proximity, having as its substratum
an instance of boatness...
etc." in Sastric Sanskrit. The real question is "How good a case system is it?". Mapping syntactic case to semantic is much easier than mapping prepositions since a direct correspondance is found automatically if you have a good case system, whereas prepositions do not lend themselves to easy classification. Again, Sanskrit is NOT long-winded, it is the english translation which is, since their vocabulary and methodology was more exact than that of English. "Caitra cooks rice in a pot" is not represented ambiguously. Since it is not specified whether the rice is boiled, steamed, or fried the correct representation should include the fact that the means of softening the rice is unspecified, and the language does have the ability to mark slots as unspecified (anabhihite). Actually, cooking is broken down even further (if-needed) and since rice is cooked by boiling in India, that fact would be explicitly stated. The question is how deep a level of detail is desired, Sanskrit maintains: as far as is necessary but "The notion 'action' cannot be applied to the solitary point reached by extreme subdivision", i.e. only to the point of semantic primitives. Sentences with ambiguity like "the man lives on the Nile" in Sastric is made up of the denotative meaning (the man actually lives on the river) and the implied meaning (the man lives on the bank of the Nile). The latter is the default meaning unless it is actually specified otherwise. There is a very complex theory of implication in the literature, but sentences with implied meanings are discouraged because: "when purport (taatparya) is present, any word may signify any meaning", thus the Sastric system where implied meanings are made explicit. I do not agree that languages need to tolerate ambiguity, in fact that is my main point. One can take a sentence like "Daddy ball" and express it as an imperative of "there is a desire of the speaker for an unspecified activity involving the ball and Daddy." By specifying what exactly is known and what is unknown, one can represent a vague mental notion as precisely as is possible. But do we really need to allow such utterances? Would something humanistic be lost if children simply were more explicit? Children in this culture are encouraged to talk this way by adults engaging in "baby talk". All this points to the fact that the language you speak has a tremendous influence on the your mental make-up. If a language more specific than english was spoken, our thoughts would be more clear and ambiguity would not be needed. I conclude with another example: Classical Sanskrit--> raama: araNye baaNena baalinam jaghaana (Rama killed Baalin in the forest with an arrow) ---> raamakartRkaa araNyaadhikaraNikaa baaNakaraNikaa praaNaviyogaanukuulaa parokSHaatiitakaalikii baalinkarmakaa bhaavanaa (There is an activity relating to the past beyond the speaker's ken, which is favourable to the separation of life, which has the agency of Rama, which has the forest as locus, Baalin as object, and which has the arrow as the implement. Note that each word represents a semantic case with its instantiation, (eg., raama-kartRkaa having as agent Rama), with the verb "kill" (jaghaana) being represented as an activity which is favourable (anukuulaa) to the separation (viyoga) of praana (life). Thus the sentence is a list of assertions with no possibility of ambiguity. Notice that Sanskrit expresses the notion in 42 syllables (7 words) and English takes 75 syllables (43 words). This ratio is fairly indicative of the general case. 
------------------------------

Date: 19 Oct 1984 15:41 EDT (Fri)
From: "Daniel S. Weld" <WELD%MIT-OZ@MIT-MC.ARPA>
Subject: Seminar - Transformational Grammar and AI

[Forwarded from the MIT bboard by SASW@MIT-MC.]

        Transformational Grammar and Artificial Intelligence:
                     A View from the Bridge

                          Robert Berwick

It has frequently been suggested that modern linguistic theory is
irreconcilably at odds with a ``computational'' view of human linguistic
abilities.  In part this is so because grammars were thought to consist
of large numbers of explicit rules.  This talk reviews recent
developments in linguistic theory showing that, in fact, current models
of grammar are quite compatible with a range of AI-based computational
models.  These newer theories avoid the use of explicit phrase structure
rules and fit quite well with such lexically-based models as ``word
expert'' parsing.

Wednesday, October 24, 4:00pm, 8th floor playroom

------------------------------

Date: 19 Oct 84 15:35 PDT
From: Dietterich.pa@XEROX.ARPA
Reply-to: DIETTERICH@SUMEX-AIM.ARPA
Subject: PhD Oral: Theory-Driven Data Interpretation

[Forwarded from the Stanford bboard by Laws@SRI-AI.]

PHD ORAL: TOM DIETTERICH
DEPARTMENT OF COMPUTER SCIENCE
2:30PM, OCTOBER 25, SKILLING AUDITORIUM

  CONSTRAINT PROPAGATION TECHNIQUES FOR THEORY-DRIVEN DATA INTERPRETATION

This talk defines the task of THEORY-DRIVEN DATA INTERPRETATION (TDDI)
and investigates the adequacy of constraint propagation techniques for
performing it.  Data interpretation is the process of applying a given
theory T (possibly a partial theory) to interpret observed facts F and
infer a set of initial conditions C such that from C and T one can infer
F.  Most existing data interpretation programs do not employ an explicit
theory T, but rather use some algorithm that embodies T.  Theory-driven
data interpretation involves performing data interpretation by working
from an explicit theory.

The method of local propagation of constraints is investigated as a
possible technique for implementing TDDI.  A model task--forming theories
of the file system commands of the UNIX operating system--is chosen for
an empirical test of constraint propagation techniques.  In the UNIX
task, the "theories" take the form of programs, and theory-driven data
interpretation involves "reverse execution" of these programs.  To test
the applicability of constraint propagation techniques, a system named EG
has been constructed for the "reverse execution" of computer programs.
The UNIX task was analyzed to develop an evaluation suite of data
interpretation problems, and these problems have been processed by EG.

The results of this empirical evaluation demonstrate that constraint
propagation techniques are adequate for the UNIX task, but only if the
representation for theories is augmented to include invariant facts about
the programs.  In general, constraint propagation is adequate for TDDI
only if the theories satisfy certain conditions: local invertibility,
lack of constraint loops, and tractable inference over propagated values.
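As a concrete, much simplified illustration of the "local propagation of
constraints" and "reverse execution" described in the abstract, the
following Python sketch runs a tiny constraint network both forwards and
backwards.  It is not the EG system and not the UNIX task; the Adder
constraint and the example network are invented here solely to show how
observed facts (F) plus an explicit theory (the network) can yield
inferred initial conditions (C) when each constraint is locally
invertible and the network has no loops.

# A toy illustration of local constraint propagation.  NOT the EG system;
# all names here are invented for illustration.  Each constraint is locally
# invertible: given any two of its three values, it can deduce the third.
# Propagation repeats until no constraint can fill in anything new.

class Adder:
    """The constraint a + b = c, usable in any direction."""
    def __init__(self, a, b, c):
        self.slots = (a, b, c)

    def propagate(self, env):
        na, nb, nc = self.slots
        a, b, c = env.get(na), env.get(nb), env.get(nc)
        if a is not None and b is not None and c is None:
            env[nc] = a + b                  # forward execution
            return True
        if c is not None and a is not None and b is None:
            env[nb] = c - a                  # reverse execution
            return True
        if c is not None and b is not None and a is None:
            env[na] = c - b                  # reverse execution
            return True
        return False

def interpret(theory, facts):
    """Fixed-point loop: fire constraints until quiescence."""
    env = dict(facts)
    changed = True
    while changed:
        changed = any(c.propagate(env) for c in theory)
    return env

# "Theory" T: a two-step program out = (x + y) + z, with intermediate t.
theory = [Adder("x", "y", "t"), Adder("t", "z", "out")]

# Observed facts F: the program's output and two of its inputs.
facts = {"out": 10, "y": 3, "z": 4}

# Inferred initial conditions C include x = 3 (and the intermediate t = 6).
print(interpret(theory, facts))

Propagating backwards from the observed output recovers the intermediate
value t = 6 and the unobserved input x = 3, which is the C-from-F-and-T
pattern the abstract describes; local invertibility and the absence of
constraint loops are what let the fixed-point loop terminate with the
missing values filled in.

------------------------------

End of AIList Digest
********************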