larsbc@ncrsecp.Copenhagen.NCR.dk (Lars Ballieu Christensen) (09/08/89)
I am posting this for a friend who does not have direct access to the net. Please reply to this newsgroup or directly to me, and I will forward it.

"As a special-subject student in Computer Science and Danish Language at the University of Roskilde, studying the various natural language grammars used in automatic language parsing systems, I have come across (among others) the following problem.

The concrete subject of the investigation is the Danish part of the Eurotra project, whose basic idea is to use the recognized topological grammar for the Danish language developed by P. Diderichsen [1].

The problem can be stated as follows: Is the lack of theoretical linguistic description in the grammars used for automatic NL systems a necessary consequence of reaching the primary goal (= providing an applicable system), or is it possible, maybe on a longer-term basis, to provide better NL systems by developing grammars that are built with a higher degree of respect for the theoretical linguistic description?

Generally, when developing automatic NL systems, the goal seems to be building parsers that form the most correct analysis of the language structures on each level of the specific system. For Danish grammars at least, this strategy has the result that the theoretical basis of the language often seems to be neglected in the NL grammars, which have primarily been built/specified by theoretical language scientists. Instead, ad hoc grammars are built in order to meet the demands for correctness, as well as the commercial restrictions and the requirement that they can be implemented, i.e. computerized.

Viewpoints and comments on this subject will be highly appreciated. Also, I would like to share my experiences with anyone who wishes.

Thanks in advance,
Henrik Sternberg Jensen, Department of Computer Science, University of Roskilde, DK-4000 Roskilde

References:
[1] Diderichsen, P., Elementær Dansk Grammatik, Gyldendal, København, 1979.
[2] Rue, H., Diderichsen på Prolog, SAML 12, University of Copenhagen, 1986.
[3] Togeby, O., Parsing Danish Text in Eurotra, Nordic Journal of Linguistics, vol. 11, no. 1-2, Universitetsforlaget AS, Oslo, 1988."

--
Lars Ballieu Christensen
NCR Systems Engineering Copenhagen
Contract Development, Svanevej 14,
DK-2400 Copenhagen NV, Denmark
Email: Lars.B.Christensen@Copenhagen.NCR.DK
Phone: +45 38 33 00 22
Fax: +45 31 10 23 62
"Music is your only friend - till the end"
lee@uhccux.uhcc.hawaii.edu (Greg Lee) (09/08/89)
From article <1933@ncrsecp.Copenhagen.NCR.dk>, by larsbc@ncrsecp.Copenhagen.NCR.dk (Lars Ballieu Christensen):

)... is it possible, maybe on a
)longer-term basis, to provide better NL systems by developing
)grammars that are built with a higher degree of respect for the
)theoretical linguistic description?

I don't think it's possible, unless the long-term basis is long enough to permit adequate theoretical linguistic descriptions to be discovered. Say a century. One might reasonably expect linguists' descriptions now to provide a good account of the facts of a language, or a fact in a group of languages, and to give a convenient terminology for describing facts. But that's not `theoretical' in the usual sense.

)Generally, when developing automatic NL systems, the goal seems to
)be building parsers that form the most correct analysis of the
)language structures on each level of the specific system.

But who knows what "the most correct analysis" is? Should verbs in English go with their objects in a `verb phrase', or with their subjects, or with neither? Intonation sometimes suggests the second choice. Tradition and some not altogether conclusive arguments about possible idioms suggest the first. My colleague Stan Starosta has a well-developed theory that makes the third choice.

In this and other instances, it seems to me that fashion or convenience more than theory dictates what turns up in current descriptions. Some syntacticians are enthralled with binary branching trees, but for any principled reason? Not so far as I know.

)For Danish grammars at least, this strategy has the result that
)the theoretical basis of the language often seems to be neglected
)in the NL grammars, which have primarily been built/specified
)by theoretical language scientists. Instead, ad hoc
)grammars are built in order to meet the demands for correctness,

But that's what linguists are doing, too.

Greg, lee@uhccux.uhcc.hawaii.edu