val@wsccs.UUCP (Val Kartchner) (10/12/88)
Does someone out there have a syntactical definition of English? I would like to build English language parsers for various purposes, including adventure game authoring systems.

Thanks in advance,
-=:[ VAL ]:=-
--
Val Kartchner {UT@WSC} | #include <disclaimer.h> | 'vi' must go, this is non-negotiable.
UinTech  !ihnp4!utah-cs!utah-gr!uplherc!sp7040!obie!val
skh@hpclskh.HP.COM (10/19/88)
If you get one...post it. Should be an AMAZING grammar to see, if not good for a few laughs.
djones@megatest.UUCP (Dave Jones) (10/21/88)
I recall having seen a hardback book of a few hundred pages, filled with English language productions, nothing else. That was about ten years ago. I don't remember the name of it. It wouldn't help much in writing shells, I fear, but it might be interesting to look at again. Dave J.
dougs@hcx2.SSD.HARRIS.COM (10/24/88)
>/* Written 8:12 am Oct 22, 1988 by jkim@uhccux.uhcc.hawaii.edu */
>Clay Bond wrote:
>
>> a CFL is not going to describe English.
>
>Could you tell us a convincing evidence for this?
>If you are going to bring up 50's argument based on a long-distant
>dependency, I would recommend you to read first Gerald Gazdar (1982) Phrase
>structure grammar. In Pauline Jacobson and Geoffrey K. Pullum (eds),
>The Nature of Syntactic Representation. Dordrecht: D. Reidel, 131-186.
>/* End of text */

Context-free languages have enough trouble adequately describing programming languages. Sure, they can do a half-decent job on the written syntax as it appears in the file. But to use syntactical productions to recognize things such as various data types in expressions, or even worse, checking that the number of parameters agrees between a caller and a callee is either too exhaustive to be useful or just simply beyond a CFL. Hey, if a context-free grammer can't recognize the regular expression

     x y z y x       (note: this requires a pushdown machine with
    a b c b a        multiple stacks, more power than an
                     automata equivalent to a CFL can be)

how the hell is it going to handle English, or Spanish, or whatever? Remember, we must check proper pluralization, subject-verb agreement, all that good stuff.

For programming languages, the CFL describes the written syntax and the semantic actions fill in the context-sensitive features we need. My wild guess is that our minds use a context-sensitive grammar with hundreds of thousands of semantic checks to fill in where the CSG is inadequate for our needs.

Doug Scofield                     dougs@ssd.harris.com
Harris Computer Systems           {uunet,mit-eddie,novavax}!hcx1!dougs
Ft. Lauderdale, FL                voice: (305) 973 5340
pardo@june.cs.washington.edu (David Keppel) (10/25/88)
> somebody writes:
>>[ english grammar? ]

In article <44600003@hcx2> dougs@hcx2.SSD.HARRIS.COM writes:
> [...] But to use syntactical productions to recognize things such as
> various data types in expressions, or even worse, checking that the
> number of parameters agrees between a caller and a callee is either
> too exhaustive to be useful or just simply beyond a CFL.

Attribute grammars are a current research topic. It is possible (although "too exhaustive") to write an attribute grammar that recognizes (semantically) Ada. It runs to some thousand pages (whew!).

Here's another "goodie": somebody fed the statement "Time flies like an arrow" into a computer and the computer said:

* This is an analogy; time is a thing that moves in a way (flying) that is similar to the way that an arrow moves.
* Definition: "time flies" are some species that have characteristics much like those of an arrow.
* Command: [go get a stopwatch and] time flies the same way that you would time an arrow.

If you think that's fun, the Lojban people enumerate something like 20 different ways to understand the phrase "pretty little girl's school". Lojban is a synthetic language related to Loglan that is designed to be unambiguous and machine-parseable; there *are* parsers for Lojban, so quick, everybody run out and learn Lojban so we can have "synthetic-language query systems" :-)

    ;-D on  ( Eh? I don't grok, Mike )  Pardo
--
pardo@cs.washington.edu
{rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo
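The three readings of "Time flies like an arrow" can be reproduced with a very small chart parser. What follows is only a sketch under stated assumptions: the lexicon, the rule set, and the names LEXICON, BINARY, UNARY and count_parses are all made up for illustration and are not the program referred to in the post. It counts parse trees CYK-style, and this particular toy grammar happens to yield one tree per reading (declarative with "time" as subject, declarative with the compound noun "time flies" as subject, and the imperative).

```python
from collections import defaultdict

# Hypothetical toy lexicon: each word may carry several parts of speech.
LEXICON = {
    "time": {"N", "V"},
    "flies": {"N", "V"},
    "like": {"P", "V"},
    "an": {"Det"},
    "arrow": {"N"},
}

# Hypothetical binary rules, written (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "N", "N"),    # compound noun, e.g. "time flies"
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("PP", "P", "NP"),
]

# Unary rules, ordered so each child count is final before it is used.
UNARY = [("NP", "N"), ("VP", "V"), ("S", "VP")]  # S -> VP is the imperative


def count_parses(words, start="S"):
    """Return the number of distinct parse trees rooted at `start`."""
    n = len(words)
    chart = defaultdict(int)  # (i, j, symbol) -> number of trees over words[i:j]
    for span in range(1, n + 1):
        for i in range(n - span + 1):
            j = i + span
            if span == 1:
                for tag in LEXICON.get(words[i], ()):
                    chart[i, j, tag] += 1
            # Combine every pair of adjacent sub-spans with the binary rules.
            for k in range(i + 1, j):
                for parent, left, right in BINARY:
                    chart[i, j, parent] += chart[i, k, left] * chart[k, j, right]
            # Then promote symbols through the unary rules.
            for parent, child in UNARY:
                chart[i, j, parent] += chart[i, j, child]
    return chart[0, n, start]


if __name__ == "__main__":
    print(count_parses("time flies like an arrow".split()))
```

A richer grammar (for instance one that lets a prepositional phrase modify a noun) would report more trees; the count depends entirely on the rules chosen.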
hermit@shockeye.UUCP (Mark Buda) (10/26/88)
In article <960009@hpclskh.HP.COM> skh@hpclskh.HP.COM writes:
>If you get one...post it. Should be an AMAZING grammar to see, if not good for
>a few laughs.

    <utterance> ::= <word>*

Everything else is semantics, of course.
--
Mark Buda / Smart UUCP: hermit@shockeye.uucp / Phone(work):(717)299-5189
Dumb UUCP: ...rutgers!bpa!vu-vlsi!devon!shockeye!hermit
Entropy will get you in the end.
"A little suction does wonders." - Gary Collins
wsmith@m.cs.uiuc.edu (10/26/88)
>
>     x y z y x       (note: this requires a pushdown machine with
>    a b c b a        multiple stacks, more power than an
>                     automata equivalent to a CFL can be)
>

Don't you mean:

     x y z x y
    a b c a b

instead? (Technically, what you give is not a regular expression, either.)

The language you describe is generated by this context-free grammar:

    R -> a R a | S ;
    S -> b S b | T ;
    T -> T c | ;

Bill Smith
wsmith@cs.uiuc.edu
uiucdcs!wsmith
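The grammar with R, S and T above translates almost line for line into a backtracking recognizer. This is a minimal sketch, assuming the alphabet is just the letters a, b, c; the function names match_R, match_S and match_T are invented here to mirror the nonterminals. Each function peels one matched pair off the outside of the string or falls through to the next nonterminal, which is the nesting a single stack can track.

```python
def match_T(s: str) -> bool:
    """T -> T c | (empty): any run of c's, including none."""
    return all(ch == "c" for ch in s)


def match_S(s: str) -> bool:
    """S -> b S b | T"""
    if len(s) >= 2 and s[0] == "b" and s[-1] == "b" and match_S(s[1:-1]):
        return True
    return match_T(s)


def match_R(s: str) -> bool:
    """R -> a R a | S"""
    if len(s) >= 2 and s[0] == "a" and s[-1] == "a" and match_R(s[1:-1]):
        return True
    return match_S(s)


if __name__ == "__main__":
    for text in ["abcba", "aabccbaa", "abba", "abcab", "aba"]:
        print(text, match_R(text))
```

The recursion depth grows with the length of the input, so this is for illustration on short strings rather than a production parser.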
gvcormack@watdragon.waterloo.edu (Gordon V. Cormack) (10/26/88)
In article <44600003@hcx2>, dougs@hcx2.SSD.HARRIS.COM writes:
> is either too exhaustive to be useful or just simply beyond a CFL. Hey, if
> a context-free grammer can't recognize the regular expression
>
>     x y z y x       (note: this requires a pushdown machine with
>    a b c b a        multiple stacks, more power than an
>                     automata equivalent to a CFL can be)
>

1. the expression above is not regular

2. the expression above is easily expressed as a CFG:

       A -> B
       A -> a A a
       B -> C
       B -> b B b
       C ->
       C -> C c

3. two stacks suffice for most recognition problems

4. grammer [sic] is misspelled

5. automata is plural

6. why is everybody picking on this guy so much? All he asked for was a CFG for English. If I asked for a CFG for Pascal, would you hassle me about all the Pascal constructs that aren't context-free?

7. The UNIX command "style" contains a yacc grammar for English. A paper is included in the supplementary UNIX documentation describing "style", but the source is not supplied with the BSD distribution.
--
Gordon V. Cormack     CS Dept, University of Waterloo, Canada N2L 3G1
gvcormack@waterloo { .CSNET or .CDN or .EDU }
gvcormack@uwaterloo.CA
gvcormack@water { UUCP or BITNET }
lee@uhccux.uhcc.hawaii.edu (Greg Lee) (10/28/88)
From article <44600003@hcx2>, by dougs@hcx2.SSD.HARRIS.COM:
" ...
" is either too exhaustive to be useful or just simply beyond a CFL. Hey, if
" a context-free grammer can't recognize the regular expression
"
"     x y z y x       (note: this requires a pushdown machine with
"    a b c b a        multiple stacks, more power than an
"                     automata equivalent to a CFL can be)
"
" how the hell is it going to handle English, or Spanish, or whatever?

           x y z x y
Supposing a b c a b  was meant, then the answer is it's going to the hell handle them if they don't the hell have such constructions. Whether one does find such constructions in natural language is debatable -- there is discussion in the linguistic literature going back about a decade. At least, it seems clear that they are not common.

" Remember, we must check proper pluralization, subject-verb agreement, all
" that good stuff.

Since natural languages have grammatical agreement with respect to only a finite (and rather small) number of categories, and since the strings that separate agreeing items can be characterized by a finite number of strings of category symbols, agreement does not pose a problem in principle.

" For programming languages, the CFL describes the written
" syntax and the semantic actions fill in the context-sensitive features
" we need.

And so it may be for natural languages.

" My wild guess is that our minds use a context-sensitive grammar
" with hundreds of thousands of semantic checks to fill in where the CSG
" is inadequate for our needs.

The proposal that natural languages are context free is also a guess, at this point, but I think it's fair to say it's an educated guess. There is some evidence against the proposal, but in my opinion this evidence is rather marginal. Other linguists have other opinions.

Greg, lee@uhccux.uhcc.hawaii.edu
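The point about agreement over a finite set of categories can be made concrete: a rule whose symbols must agree on some feature can be multiplied out into one ordinary context-free rule per feature value. This is only a sketch of that idea, not anything taken from the linguistic literature mentioned above; the feature names, the symbol-suffix convention and the function expand_agreement are assumptions made up for the example.

```python
from itertools import product


def expand_agreement(lhs, rhs, agreeing, features):
    """Yield plain CFG rules, one per combination of feature values.

    lhs      -- left-hand nonterminal, e.g. "S"
    rhs      -- tuple of right-hand symbols, e.g. ("NP", "VP")
    agreeing -- set of right-hand symbols that must share the feature values
    features -- dict mapping feature name to its finite list of values
    """
    names = sorted(features)
    for values in product(*(features[name] for name in names)):
        suffix = "_" + "_".join(values)
        # Agreeing symbols get the same suffix; all others are left alone.
        yield lhs, tuple(sym + suffix if sym in agreeing else sym for sym in rhs)


if __name__ == "__main__":
    rules = expand_agreement("S", ("NP", "VP"), {"NP", "VP"}, {"num": ["sg", "pl"]})
    for lhs, rhs in rules:
        print(lhs, "->", " ".join(rhs))
```

With number alone this gives two rules (S -> NP_sg VP_sg and S -> NP_pl VP_pl); adding a three-valued person feature gives six. The grammar grows by a constant factor but stays context-free, which is the "no problem in principle" part; whether the result is pleasant to maintain is another matter.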
dougs@hcx2.SSD.HARRIS.COM (10/28/88)
>
>     x y z y x       (note: this requires a pushdown machine with
>    a b c b a        multiple stacks, more power than an
>                     automata equivalent to a CFL can be)
                      ^^^^^^^^
oops. should be automaton. a most relevant mistake.

Yeah, I know I made a few typos with this expression. It should have been

     x y z x y
    a b c a b        x,y,z >= 0

More than one stack in an automaton means that it is not equivalent to a CFL. It doesn't matter if there are only two. Two is too many.

Doug Scofield                     dougs@ssd.harris.com
Harris Computer Systems           {uunet,mit-eddie,novavax}!hcx1!dougs
Ft. Lauderdale, FL                voice: (305) 973 5340
[These are my mistakes _only_]
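For comparison with the nested version, here is a membership test for the corrected expression, where the a-count and the b-count each have to be repeated in the same order rather than mirrored. It is a deliberately naive sketch written straight from the definition with x, y, z >= 0; the function name accepts_cross_serial is made up for the example. The two counts have to be remembered independently and checked first-in, first-out, which is exactly what one last-in, first-out stack cannot do.

```python
def accepts_cross_serial(s: str) -> bool:
    """Return True if s has the form a^x b^y c^z a^x b^y with x, y, z >= 0."""
    n = len(s)
    # Try every way of splitting the length into 2x + 2y + z.
    for x in range(n // 2 + 1):
        for y in range((n - 2 * x) // 2 + 1):
            z = n - 2 * x - 2 * y
            if s == "a" * x + "b" * y + "c" * z + "a" * x + "b" * y:
                return True
    return False


if __name__ == "__main__":
    for text in ["abcab", "aabcaab", "abab", "abcba", "abba"]:
        print(text, accepts_cross_serial(text))
```

Trying every (x, y) pair sidesteps the ambiguity that arises when z is 0 (for example "aa" is x = 1 with nothing in between), at the cost of doing more work than a careful single pass would.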