nl-kr-request@CS.ROCHESTER.EDU (NL-KR Moderator Brad Miller) (01/20/88)
NL-KR Digest             (1/19/88 18:36:55)            Volume 4 Number 5

Today's Topics:
        The OED story
        Machine-Readable Dictionari(es) for AI/NL
        binding: Dik Gregory?
        M. Selfridge
        ref on phonetics
        dependency grammar
        Dependency grammar / Variable word order
        Request for Stuttgart LFG code
        Linguist Bashing

Submissions: NL-KR@CS.ROCHESTER.EDU
Requests, policy: NL-KR-REQUEST@CS.ROCHESTER.EDU
----------------------------------------------------------------------

Date: Sun, 10 Jan 88 00:08 EST
From: Robert Amsler <amsler@flash.bellcore.com>
Subject: The OED story

Date: 30 November 1987, 10:54:07 EST
Reply-To: MCCARTY%UTOREPAS.BITNET@wiscvm.wisc.edu
From: MCCARTY%UTOREPAS.BITNET@wiscvm.wisc.edu
To: Robert Amsler <amsler@flash.bellcore.com>

Contributor: May Katzen <MAY@VAX.LEICESTER.AC.UK>
Subject: 1st edn. of the OED in CD-ROM and 2nd edn. in hardcopy

I have received the following information from Tim Benbow of Oxford
University Press about its publishing plans for the OED, in response
to the query from Mr Wall.

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

Oxford University Press has announced that early in 1988 it will
publish the original Oxford English Dictionary, 1884-1928, issued in
twelve printed volumes, on two CD ROM disks.

OUP states that this product is very user-friendly, much more so than
other similar products on the market. These CD ROMs can run on a PC,
XT or AT or an IBM clone with a 640 K memory with either a CGA or EGA
device. A Hitachi, Philips, or Sony disk drive is required. The
display monitor may be monochrome, but a colour monitor is preferable,
as colour is used to distinguish different types of information.

OUP also plan to make the original OED available on magnetic tape in a
fully structured version with embedded codes, written in IBM format.

In 1989, OUP will publish the Oxford English Dictionary, Second
Edition, which is the text of the original OED, plus supplements, plus
new material which has been added recently.
This will be published in a printed version of 20 volumes. The
database containing this material will be made available in a number
of electronic forms.

------------------------------

Date: Fri, 15 Jan 88 11:25 EST
From: Robert France <france@vtopus.cs.vt.edu>
Subject: Machine-Readable Dictionari(es) for AI/NL

At Virginia Tech we have been working for a few years with
dictionaries available through the Oxford Text Archive. The OTA is a
depository for machine-readable literary texts. They have assembled by
this point a considerable body of both primary and secondary
(lexicographic) material, all of which is available for research use
only at a nominal fee. Restrictions and a list of their materials can
be obtained from

        The Oxford Text Archive
        Oxford University Computing Service
        13 Banbury Rd.
        Oxford
        GREAT BRITAIN OX2 6NN
        Telephone: Oxford (0865) 56721

They are on the net but I'm afraid I've misplaced their Eaddress.

Most of OTA's material is available only in (some) typesetters'
format, and often the formatting conventions are no longer available.
They are also archiving re-formatted versions as they become
available, though, so in some cases the data is fairly directly
useable. A case in point is the following:

One of our early efforts with machine-readable dictionaries involved
translating the Collins English Dictionary from typesetters' format
into a set of files of Prolog facts. These facts include, for the
c. 80,000 headwords in the CED: syllabification, variant spellings,
abbreviations, irregular inflections and morphological variants; parts
of speech and semantic register information; "also called", "related
adjective", and "compare" cross-references; and the texts of
definitions, sample uses and usage notes. We ignored only etymology
and pronunciation.
A syntactically correct copy of these facts (i.e., a set of facts in
Edinburgh standard syntax that can be consulted without blowing up a
Prolog compiler) is now on deposit at the Archive and available under
the same terms as the raw data. We are working on a semantically
correct version (i.e., one where the data in the facts is in all cases
the data that ought to be there), and will deposit that when we have
it complete.

Currently, our group here, headed by E.A. Fox and Terry Nutter, is
coordinating with a group at the Illinois Institute of Technology
headed by Martha Evens to analyse the definition texts of this and
other M-R dictionaries and to integrate our findings into a *VERY*
large semantic net. This product will also be made available to the
community for research use only. Anyone desiring further information
on this project is invited to contact any of the principals. Believe
me, we have some stories to tell.

Good luck,

Robert France
Department of Computer Science
Virginia Tech
Blacksburg, VA 24061

france@vtopus
fox@vtopus
nutter@vtopus
csevans%iitvax

"Believing people is a very bad habit. I stopped years ago."
(Miss Marple)

------------------------------

Date: Tue, 19 Jan 88 09:48 EST
From: Bruce E. Nevin <bnevin@cch.bbn.com>
Subject: binding: Dik Gregory?

Anyone know where Dik Gregory is? He was at ARI a couple of years ago.

------------------------------

Date: Thu, 14 Jan 88 15:04 EST
From: rolandi <rolandi@gollum.Columbia.NCR.COM>
Subject: M. Selfridge

Does anyone know the email (or other mail) address of M. Selfridge of:

        Selfridge, M. 1980. A Process Model of Language Acquisition.
        Ph.D. diss., Technical Report, 172, Dept of Computer Science,
        Yale University.

?

Thanks.

walter rolandi
rolandi@gollum.UUCP ()
NCR Advanced Systems, Columbia, SC
u.s.carolina dept. of psychology and linguistics

------------------------------

Date: Tue, 12 Jan 88 14:59 EST
From: Bruce E. Nevin <bnevin@cch.bbn.com>
Subject: ref on phonetics

A while back someone asked for references on phonetics and phonetic
transcription, I think in this forum. I just turned up my copy of:

        The principles of the International Phonetic Association
        being a description of the International Phonetic Alphabet
        and the manner of using it, illustrated by texts in 51
        languages

        1949 (reprinted 1966)

        Obtainable from the Secretary of the International Phonetic
        Association
        Department of Phonetics, University College, London, W.C. 1
        price 6s. 6d.

I don't know about more recent revisions or editions. The copy I have
is a 56-page pamphlet.

Bruce
bn@cch.bbn.com
<usual_disclaimer>

------------------------------

Date: Wed, 13 Jan 88 10:21 EST
From: Bruce E. Nevin <bnevin@cch.bbn.com>
Subject: dependency grammar

> From: Michael Covington <MCOVINGT%UGA.BITNET@forsythe.stanford.edu>
> Subject: Dependency Grammar / Variable Word Order
>
> I would like to hear of any work, published or unpublished, on the
> following topics:
>
> (1) Parsing using a dependency rather than a constituency grammar,
> i.e., by establishing grammatical relations between individual words
> rather than gathering words into groups.

The operator-argument relations in the base sublanguage of Harris's
grammar may easily be described with a simple dependency grammar. The
caveat is that the reductions of redundancy that yield the other
sentences (and the structures of those sentences themselves) are not
amenable to a simple dependency description.

For a computer implementation, see Stephen Johnson's 1987 NYU
dissertation, _An Analyzer for the Information Content of Sentences_.
For the theory of language, see recent books by Harris: _Mathematical
Structures of Language_ (1968); _A Grammar of English on Mathematical
Principles_ (1982--I reviewed this in _Computational Linguistics_ in
1984); _The Form of Information in Science_ (with Gottfried, Ryckman,
and others, 1987); _A Mathematical Approach to the Theory of
Language_, forthcoming from Oxford U. Press.

The theory is based upon the argument requirements of words as they
enter in the construction of a sentence. It is thus concerned with the
construction of sentences, rather than the "generation" (misleading
term) of a set of structures that a sentence constructed by a
performance mechanism must fit into. It therefore supports the
intuition:

        first that a person has something to say, expressed somehow
        in his own conceptual terms . . . and that all his decisions
        about the syntactic form that a generated (sic) sentence is
        to take are then made in the service of this intention.

        (M. Ross Quillian, "Word Concepts: a theory and simulation
        of some basic semantic capabilities")

Bruce Nevin
bn@cch.bbn.com
<usual_disclaimer>

------------------------------

Date: Fri, 15 Jan 88 04:09 EST
From: Klaus Schubert <mcvax!dlt1!schubert@uunet.UU.NET>
Subject: Dependency grammar / Variable word order

Dear Michael Covington,

I have read your request for information on dependency grammar in the
NL-KR Digest. The DLT machine translation system at BSO/Research here
in Utrecht is based on dependency grammar. We use dependency syntax in
parsing for various languages, among them our intermediate language
Esperanto, which has a highly free word order. We found that some of
the current parsing strategies, like ATNs or DCGs, are suitable for
dependency parsing, although they were designed with constituency
syntax in mind.

May I in this connection advertise my recent book on these topics?

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
M E T A T A X I S

Contrastive dependency syntax for machine translation

by Klaus Schubert

Dordrecht / Providence: Foris, 250 pp.
December 1987

The book can be ordered from bookshops and directly from the
publisher:
    Foris Publications, Postbus 509, NL-3300 AM Dordrecht, Netherlands;
    Foris Publications USA, P.O.Box 5904, Providence, RI 02903, USA;
    agent for Japan: Sanseido Bookstore, 1-1 Jimbocho, Kanda
    Chiyoda-ku, Tokyo 101, Japan

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

CONTENTS

Foreword

1. METATAXIS BETWEEN THEORETICAL AND COMPUTATIONAL LINGUISTICS
   1.1. Metataxis - for what purpose?
   1.2. The DLT machine translation system

2. SOME FUNDAMENTALS OF GRAMMAR
   2.1. A general view of grammar
   2.2. Dependency and constituency
   2.3. The contrastive capacity of dependency and constituency

3. STREAMS OF DEVELOPMENT IN DEPENDENCY GRAMMAR
   3.1. The reception of Tesnière's work
   3.2. Leipzig and Mannheim
   3.3. Other readers of Tesnière
   3.4. Early computational applications
   3.5. Not only Tesnière

4. DEPENDENCY SYNTAX
   4.1. A definition of dependency
        4.1.1. Co-occurrence
        4.1.2. Directedness
        4.1.3. How to detect dependency
   4.2. Alternative definitions of dependency
   4.3. Word classes
   4.4. Dependency types
        4.4.1. A definition of dependency types
        4.4.2. Complements and adjuncts
        4.4.3. Valency
   4.5. Dependency trees
   4.6. The one-word principle
   4.7. The true-tree principle
   4.8. Complex verb constructions
   4.9. Subordinate clauses
   4.10. Coordination
   4.11. Ellipsis
   4.12. The generative productivity of dependency syntax
   4.13. Principles of dependency syntax: A summary

5. METATAXIS
   5.1. Metataxis as contrastive lexical redundancy rules
        5.1.1. The scope of metataxis
        5.1.2. Tree-structured dictionary entries
        5.1.3. Levels of redundancy
   5.2. Word level metataxis: features and signs
   5.3. The metataxis process
        5.3.1. Metataxis rules and dictionary entries
        5.3.2. Selecting applicable metataxis rules
        5.3.3. A hierarchy of metataxis rules
        5.3.4. Transformations and filters
        5.3.5. Metataxis step by step
   5.4. Text level metataxis
   5.5. A glance at target language synthesis
        5.5.1. Form government and agreement
        5.5.2. Morphological synthesis
        5.5.3. Tree linearisation
   5.6. Why dependency?

6. METATAXIS, SEMANTICS, PRAGMATICS
   6.1. Dependency syntax, dependency grammar and related models
   6.2. Translation - at which level?

7. DEPENDENCY SYNTAX AND METATAXIS IN COMPUTATIONAL LINGUISTICS
   7.1. The way of dependency grammar into computational linguistics
   7.2. Grammar, formalism, implementation
   7.3. Dependency parsing
   7.4. Formalising metataxis
   7.5. Metataxis and the design of machine translation systems

8. PROSPECTS

Index
References

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

I would appreciate hearing what reactions you get. I am interested in
applications of dependency grammar other than ours.

Regards,
Klaus Schubert

------------------------------

Date: Fri, 15 Jan 88 19:32 EST
From: Ann Nicholson <munnari!mulga.oz.au!annn@uunet.UU.NET>
Subject: Request for Stuttgart LFG code

I have a copy (in German) of the Stuttgart tech report "Eine
benutzerfreundliche Softwareumgebung zur Entwicklung
Lexikalisch-Funktionaler Grammatiken". Is there an English version
available? Also, does anyone have the code that is contained in the
Appendix? Or know who I should contact at Stuttgart to get a tape of
it?

Thanks in advance...

Ann Nicholson.
Comp. Sci. Dept. University of Melbourne, Parkville 3052, Australia
UUCP: {seismo,mcvax,ukc,ubc-vision,prlb2,enea,hplabs,tataelxsi}!munnari!annn
ARPA: "annn%munnari@uunet.uu.net".
CSNET: annn%munnari.oz@australia

------------------------------

Date: Sun, 17 Jan 88 04:21 EST
From: Jeffrey Goldberg <goldberg@russell.STANFORD.EDU>
Subject: Linguist Bashing

        "Michaela was sorry she had to kill him; he was a nice man
        -- for a linguist."
                -S. Elgin "Native Tongue"
                [I'm quoting from memory]

Please read "linguist" as "generative linguist" throughout this
article.

Recently, on a local Stanford bboard, someone asked why the question
"Can I help who's next" sounded bad while it was much improved with
"Can I help whoever's next". In my speculative response to this, I
included a brief discussion of differences between Headless Relative
Clauses and Embedded Questions. My discussion was imperfect, but it
included no technical material or analysis of the constructions
themselves; it only contained examples attempting to illustrate the
difference and the role of "-ever" in that difference. It is not that
which I wish to discuss. What I wish to discuss is the response that I
got.

I got flamed. Some people attacked me for lecturing to them while
others picked on me for not defining my terms. Someone went through my
posting line by line showing how I was violating many maxims of
technical writing. Some accused me of being vague, while others
accused me of picking nits.

I attempted to turn the discussion toward mail instead of posting to
the bboard. I was able to do this, and after a while I sent a note to
one individual suggesting that the vehemence of his response was out
of all proportion to what he was complaining about and that there must
be something he hadn't mentioned about my posting which irked him. He
was honest in his response. His response was a tirade against Chomsky,
TG, and all these linguists. At about the same time, a very similar
message was simply posted to the bboard, a sort of "by the way, this
is what I think of linguists" message.

Again I found that the adamance of their particular attacks on
linguistics was far out of proportion to the actual content of the
attacks, which was often irrational and understandably uninformed.
When it did actually hit upon something real it seemed that it did so
by accident.

So, I still wonder why you all hate us. I find it hard to believe that
disagreement over the competence/performance distinction could be at
the root of such animosity.
Is it the claim that constructions within language can be formally
represented? Or the thought that there are constraints on the forms of
those representations? I hardly think that any of these could generate
such feelings, although competence/performance is what most people
focus on. Believe me, there are much better grounds on which to attack
generative grammar, but such attacks would require really making an
honest effort to find out what it is that we do, how we do it, and
what we are after. One course in the subject isn't enough.

Is it the fact that we claim to have a special understanding of
language? People take language very seriously (often too seriously),
and nobody likes being told that they don't know their language as
well as someone else. (Linguists, of course, are not claiming to know
the language better than others, but people react as if we make that
claim.) After all, language belongs to everyone, to the community, and
a good writer or speaker, a poet, should be the real authority on
language. Or maybe someone who reminds us of a school teacher whom we
can be pleased to agree with and who has access to a decent
etymological dictionary is someone people are willing to accept as an
authority on language. [Linguists play a very important role in
constructing "grammatical notes" for dictionaries, but the dictionary
editors will stress the use of some "usage board" made up of noted
writers, while you hardly hear them boast of the linguists they've
used.]

So is it when some group of people claims to have a special
understanding of language, but that understanding isn't in terms that
can be explained in a short plane ride or in a couple of electronic
messages, that you get mad? When you find that we can't demonstrate
any use of that knowledge, do you feel vindicated? "I knew he was full
of shit."
Is it threatening somehow to be told that there is a lot of structure
behind such a personal and pervasive activity as talking, which you
don't understand and we do? Nyaa nyaa! Am I getting close?

Or is it that you feel betrayed by linguists? Did you expect that our
results would solve your problems in AI, CS, psychology, education,
sociology, philosophy, language learning, child development, etc? Did
you then find that we were instead a bunch of nerds interested not in
your questions, but instead in the correct formulation of the
constraints governing traces, or an autosegmental analysis of vowel
harmony in Turkish? And that when you got us started talking, we would
go on and on about stuff that was not only uninteresting to you but
useless to you as well? Does the study of language hold so much
unfulfilled promise, which is just being ignored by all these damn
linguists? Am I any closer here?

The dislike that many people have for generative linguists may be
based on things entirely irrational, or on lies, or may even be
justified. But if you are someone who does share this dislike, I would
like to know what the reason is, rational, justifiable (a reason
doesn't have to be rational to be justifiable), or not. If I post a
summary of responses, I will not use your name unless you specifically
allow me to.

--
Jeff Goldberg
Internet: goldberg@russell.stanford.edu

------------------------------

End of NL-KR Digest
*******************