larsbc@ncrsecp.Copenhagen.NCR.dk (Lars Ballieu Christensen) (09/08/89)
I am posting this for a friend, who does not have direct access
to the net. Please reply to this newsgroup or directly to me, and
I will forward it.
"Studying the various natural language grammars used for
automatic language parsing systems, as a special-subject student
in Computer Science and Danish Language at the University of
Roskilde, I have come across, among others, the following
problem:
The concrete subject of the investigation is the Danish part of
the Eurotra project, whose basic design uses the established
topological grammar of Danish developed by P.
Diderichsen [1].
The problem can be stated as follows: Is the lack of theoretical
linguistic description in the grammars used for automatic NL
systems a necessary consequence of pursuing the primary goal (=
providing an applicable system), or is it possible, perhaps in
the longer term, to provide better NL systems by developing
grammars that are built with a higher degree of respect for the
theoretical linguistic description?
Generally, when developing automatic NL systems, the goal seems
to be to build parsers that produce the most correct analysis of
the language structures at each level of the specific system.
The consequence of this strategy, at least for grammars of
Danish, is that the theoretical basis of the language often seems
to be neglected, even though the NL grammars have primarily been
built/specified by theoretical linguists. Instead, ad hoc
grammars are built to meet the demands for correctness, the
commercial restrictions, and the need to be implementable,
i.e. computerized.
Viewpoints and comments on this subject will be highly
appreciated. Also, I would be glad to share my experiences with
anyone who wishes.
Thanks in advance
Henrik Sternberg Jensen,
Department of Computer Science,
University of Roskilde,
DK-4000 Roskilde
References:
[1] Diderichsen, P., Elementær Dansk Grammatik, Gyldendal,
København, 1979.
[2] Rue, H., Diderichsen på Prolog, SAML 12, University of
Copenhagen, 1986.
[3] Togeby, O., Parsing Danish Text in Eurotra, Nordic Journal
of Linguistics, vol. 11, no. 1-2, Universitetsforlaget AS,
Oslo, 1988."
--
Lars Ballieu Christensen Email: Lars.B.Christensen@Copenhagen.NCR.DK
NCR Systems Engineering Copenhagen Phone: +45 38 33 00 22
Contract Development, Svanevej 14, Fax: +45 31 10 23 62
DK-2400 Copenhagen NV, Denmark "Music is your only friend - till the end"
lee@uhccux.uhcc.hawaii.edu (Greg Lee) (09/08/89)
From article <1933@ncrsecp.Copenhagen.NCR.dk>, by larsbc@ncrsecp.Copenhagen.NCR.dk (Lars Ballieu Christensen):
)... is it possible, maybe on a
)longer term basis, to provide better NL systems by developing
)grammars, which are built with higher degree of respect to the
)theoretical linguistic description?
I don't think it's possible, unless the long term basis is long
enough to permit adequate theoretical linguistic descriptions to
be discovered. Say a century. One might reasonably expect
linguists' descriptions now to provide a good account of the
facts of a language, or a fact in a group of languages, and to
give a convenient terminology for describing facts. But that's
not `theoretical' in the usual sense.
)Generally when developing automatic NL system, the goal seems to
)be building parsers, which form the most correct analysis of the
)language structures on each level of the specific system.
But who knows what "the most correct analysis" is? Should verbs
in English go with their objects in a `verb phrase', or with
their subjects, or with neither? Intonation sometimes suggests
the second choice. Tradition and some not altogether conclusive
arguments about possible idioms suggest the first. My colleague
Stan Starosta has a well-developed theory that makes the third
choice. In this and other instances, it seems to me that fashion
or convenience more than theory dictates what turns up in current
descriptions. Some syntacticians are enthralled with binary
branching trees, but for any principled reason? Not so far as I
know.
)This strategy causes, at least for grammars for Danish language,
)that the NL grammars, which primarily have been built/specified
)by theoretical language scientists, that the theoretical basis of
)the language often seems to be neglected. Instead, ad hoc
)grammars are built in order to meet the demands for correctness,
But that's what linguists are doing, too.
Greg, lee@uhccux.uhcc.hawaii.edu