[comp.ai.nlang-know-rep] NL-KR Digest Volume 4 No. 60

nl-kr-request@CS.ROCHESTER.EDU (NL-KR Moderator Brad Miller) (06/24/88)
NL-KR Digest             (6/23/88 15:23:27)            Volume 4 Number 60

Today's Topics:
        Re: Shallow Parsing
        Irregular forms [Was Re: Shallow Parsing]
        
Submissions: NL-KR@CS.ROCHESTER.EDU 
Requests, policy: NL-KR-REQUEST@CS.ROCHESTER.EDU
----------------------------------------------------------------------

Date: Wed, 15 Jun 88 17:42 EDT
From: K Watkins <kww@amethyst.ma.arizona.edu>
Subject: Re: Shallow Parsing


In article <1080@ima.ISC.COM> smryan@garth.uucp (Steven Ryan) writes:
>I once heard a suggestion humans use a different parsing strategy than
>compilers. Compilers use a deep parse using lotsa malenky rules and produce
>a very impressive parse tree.
>The suggestion was that humans memorise faster than generate and are better
>at handling large chunks in parallel than small chunks nested.
>What I take that to mean, we memorise words in each inflected form, even
>regular forms, and possibly even groups of words possibly up to words. Then
>language generation consists of inserting these chunks into a verb frame
>chunk, with each sentence form being a different chunk.
>I think I'll call this suggestion shallow parsing: the parse tree will
>only go down two or three levels before running into unanalysed (id est
>memorised) chunks. In terms of productions, this would mean having
>thousands of similar yet distinct productions instead of factoring the
>similarities to reduce the number of productions.

It might be interesting to look at the way language learning differs between
children and adults.  My observation of (a limited number of) children suggests
that they do indeed do abundant memorization as they acquire language...
though they also overgeneralize rules as they learn them ("I seed a scary dog
and I runned away").  I, on the other hand, memorize quite poorly.  When I
have something to learn, I look for--and if possible/necessary build--as many
abstract rules and matrices as I can.  This means, for example, that I know a
number of rules for French grammar which I lack the vocabulary to apply :-).

I have heard various speculations about why this shift occurs (apparently I'm
not the only one to observe it), including the idea that, once we have a
sufficiently large data base, we begin to concentrate more on rules that
organize data than on data acquisition itself.

Does this imply anything useful about differences between computer languages
and natural languages? or only about differences between computers and human
beings? or not even that?  
===========================
K Watkins (watkins@rvax.ccit.arizona.edu)
My fingers, their keyboard, my opinions.

------------------------------

Date: Thu, 16 Jun 88 04:43 EDT
From: Bart Massey <bart@reed.dec.com>
Subject: Re: Shallow Parsing


> The suggestion was that humans memorise faster than generate and are better
> at handling large chunks in parallel than small chunks nested.
> ...
> What interests me about this is the possible application to compilers:
> humans parse an ambiguous and nondeterministic language in almost always
> linear time. Most programming languages are intentionally designed with a
> deterministic context free syntax which is LL(1) or LR(1). LR(1) parsing is
> all very interesting, but the resulting language is often cumbersome: the
> programmer must write out a sufficient left context to help out the parser
> even when he/she/it/they/te/se/... can look a little to the right and know what
> is happening.
> ...
> To claim LL(1)/LR(1) is superior because of the linear time, O(n), ignores
> the fact that this is a context-free parse and must be followed by symbol
> identification. Assuming the number of distinct symbols is logarithmic in the
> program length, the time necessary for a context sensitive parse is from
> O(n log log n) to O(n**2).

I don't believe that the above argument about time is correct.  My guess is
that a human parsing *sentences of text* is much like a compiler parsing
*sequences of statements*...  Certainly such a behavior will occur in
"linear time" on the number of sequences parsed.  In fact, this is probably
part of the reason why languages have statements and texts have sentences.

I understand that statements far away from a given statement in a traditional
language are kept track of by the language semantics (e.g. symbol table)
rather than the grammar itself.  Thus, I agree it is the implementation of
the semantics of the language which will determine how parse time grows with
*program* length.  But I'm not sure what this tells me -- in particular,
both humans and compilers seem to do quite well at recovering information
about distant tokens from memory in effectively *constant time* for the
sizes of texts and programs normally compiled.  It's usually safe to assume
that a compiler will compile a 1000 line program 10 times faster than a
10000 line program, or that a person will read a 100 page book 10 times faster
than a 1000 page one.

I believe that the chief advantage of single-level context-free grammars in
the context of computer languages is to avoid the possibilities for
ambiguity the author refers to in his (omitted) last paragraph.
Context-free grammars are so simple (stupid) that there is very little
possibility for ambiguous communication of a program to the machine.  In
other words, it is the *parser* which is "helping out" the *programmer*, in
the sense that the programmer *cannot* easily write an ambiguous program.

However, it is clear that humans use multilevel grammars with great effect.
Does anyone know of work on isolating levels of recognition of text (e.g.
natural language, mathematics) in humans?  I know that HEARSAY II used a
multilevel system for speech recognition, but I'm not sure whether the
levels were modelled on human speech recognition, or just designed a
priori...  For example, there's clearly a point in recognition in which
idiomatic text is converted to its corresponding meaning.  Is this a
separate step, or part of some other?

I think maybe what the author is getting at is this:  In C, there is the
common programming idiom

	for(<var> = 0; <var> < <const>; <var>++) {
		<loop body>;
	}

Does it pay to recognize this idiom separately from other for() loops?  Why
or why not?  In particular: 

	Where does such recognition belong in the programming process?
	
	Must this separate recognition lead to ambiguity, or can one somehow
	guarantee that this for() loop still has the same semantics as other
	for loops?
  	
	Can we somehow speed parsing by recognizing these loops *before*
	doing a standard LL or LR parse?
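
One way to make the questions above concrete is a pre-pass that tags the
counting-loop idiom before the ordinary parser runs. The following is only a
minimal Python sketch under stated assumptions: the names COUNTING_LOOP and
classify_for_header are hypothetical, the loop header is assumed to be already
isolated as a string, and <const> is assumed to be a decimal literal or an
upper-case macro name. A header that does not match is simply left to the
general parser, so the tag adds information without changing what the loop
means.

```python
import re

# Hypothetical pattern for:  for (<var> = 0; <var> < <const>; <var>++)
# The backreference \1 forces the same variable in all three clauses.
COUNTING_LOOP = re.compile(
    r"for\s*\(\s*([A-Za-z_]\w*)\s*=\s*0\s*;"
    r"\s*\1\s*<\s*(\d+|[A-Z_][A-Z0-9_]*)\s*;"
    r"\s*\1\s*\+\+\s*\)"
)


def classify_for_header(header):
    """Return ('counting', var, bound) for the idiom, else ('general', None, None)."""
    match = COUNTING_LOOP.fullmatch(header.strip())
    if match:
        return ("counting", match.group(1), match.group(2))
    # Anything else falls through to the ordinary LL or LR parse.
    return ("general", None, None)


print(classify_for_header("for(i = 0; i < 10; i++)"))
print(classify_for_header("for (p = head; p != NULL; p = p->next)"))
```

Whether such a pre-pass actually saves time is exactly the open question: the
regular expression still scans every character of the header once.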

I too am very interested to hear any answers anyone has to these questions.

					Bart Massey
					UUCP: ..tektronix!reed!bart
[I always thought that computers use unambiguous context free grammars because
we know how to parse them fast, and because they seem empirically to be
convenient for humans to try to express things like computer programs.  I
also think that something like the current parsing methods is probably
time-optimal for conventional computer architectures.  Ned Irons looked at
context-free parsing on small numbers of parallel processors and decided that
it wasn't worth the effort, because each processor ends up waiting for the
one to its left to tell it something crucial about the context for its
chunk of the program.  If the number of processors approximates the number
of tokens, then using some of the processors to look for special cases might
make more sense.  Has anyone looked at highly parallel compiling, as opposed
to compiling for highly parallel machines?  -John]

------------------------------

Date: Thu, 16 Jun 88 15:05 EDT
From: Steven Ryan <smryan@garth.UUCP>
Subject: Re: Shallow Parsing


From what I understand, children first learn the correct irregular forms, use
them, but then go through a stage using incorrect regular forms before going
back to the irregular forms.

Is this true?

It would certainly explain the persistence of irregular forms across a thousand
years, and why frequently used forms are irregular and irregular forms are
frequently used.

------------------------------

Date: Fri, 17 Jun 88 10:06 EDT
From: Rob Bernardo <rob@pbhyf.PacBell.COM>
Subject: Re: Shallow Parsing


In article <744@garth.UUCP> smryan@garth.UUCP (Steven Ryan) writes:
+From what I understand, children first learn the correct irregular forms, use
+them, but then go through a stage using incorrect regular forms before going
+back to the irregular forms.
+
+Is this true?

This is what I learned in developmental linguistics, too.

+It would certainly explain the persistence of irregular forms across a thousand
+years, and why frequently used forms are irregular and irregular forms are
+frequently used.

I'm confused by what you say, as I see the causality as understandable and
likely only in the direction opposite to the one you seem to imply. The way I see it:

Historically you start off with regular forms,  but because of linguistic
change, some forms that once appeared regular now appear irregular.

The irregular forms only persist in frequently used words - "practice
makes perfect" - and infrequently used irregular words are eventually
reformed according to the new rules of regularity.

Presumably, children generally learn frequently used words first,
and since most/all irregular forms are frequently used words, children's
early vocabulary includes many irregular forms.

To put it simply: the high frequency of a word causes *both* (1) the
promotion of its irregular form and (2) its early acquisition by
children.

Now, since children use the irregular forms, and then go through an
intermediate stage of hypercorrect wrongly regularized forms, it would
appear that children first learn individual words before they learn
the rules of regularization. And when they learn rules of regularization,
they learn them first in a simpler form, i.e. universal application,
and then in a more complex form, i.e. allowing "exceptions".
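
The three stages described above can be sketched as a toy model. This is a
hedged Python illustration, not a claim about how acquisition really works: the
MEMORIZED lexicon, the stage numbers, and the crude spelling rule in
regular_past are all assumptions made for the example.

```python
# Toy lexicon of memorized whole forms; the entries are illustrative assumptions.
MEMORIZED = {"see": "saw", "run": "ran", "have": "had"}
VOWELS = "aeiou"


def regular_past(verb):
    """Apply a crude regular past-tense spelling rule (assumed, English-like)."""
    if verb.endswith("e"):
        return verb + "d"
    # Double a final consonant after a single vowel: run -> runned.
    if (len(verb) >= 3 and verb[-1] not in VOWELS + "wxy"
            and verb[-2] in VOWELS and verb[-3] not in VOWELS):
        return verb + verb[-1] + "ed"
    return verb + "ed"


def past_tense(verb, stage):
    if stage == 1:
        # Stage 1: only memorized whole words are available.
        return MEMORIZED.get(verb)
    if stage == 2:
        # Stage 2: the rule is applied universally, so irregulars are regularized.
        return regular_past(verb)
    # Stage 3: memorized exceptions are checked first, the rule is the fallback.
    return MEMORIZED.get(verb, regular_past(verb))


for stage in (1, 2, 3):
    print(stage, [past_tense(v, stage) for v in ("see", "run", "walk")])
```

In stage 2 the model produces "seed" and "runned", the same kind of
overregularized forms reported earlier in this digest.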
-- 
Rob Bernardo	[backbone]!pacbell!rob	 -OR-	rob@PacBell.COM
business:  (415) 823-2417	Pacific Bell SRVAC Room 4E750	San Ramon, CA
residence: (415) 827-4301		R Bar JB		Concord, CA

------------------------------

Date: Fri, 17 Jun 88 14:31 EDT
From: Steven Ryan <smryan@garth.UUCP>
Subject: Re: Shallow Parsing


In article <726@amethyst.ma.arizona.edu> kww@amethyst.ma.arizona.edu (K Watkins) writes:
>It might be interesting to look at the way language learning differs between
>children and adults.  My observation of (a limited number of) children suggests
>that they do indeed do abundant memorization as they acquire language...

(Disclaimer: this is what I think I heard and it was a while back, so .......)

Children under the age of 5 who suffer damage to speech areas of the brain usually
recover fully, while damage after age 7 is usually permanent. Also, it is easy
to learn to speak as a native when under 5 but near impossible after 7. This
suggests the brain rewires itself in these years, going from a stage when
language learning is most important to a stage when rapid and complicated
language usage is most important. This would mean adults and small children
learn language differently because they have different brains.

Since we are all on a computer now, I suppose an analogy would be a CISC
machine which is adaptable but slow being traded in for a RISC machine that
is hardwired to one application but is very fast.

                                        Hafa an godne daege.
                                                     sm ryan

------------------------------

Date: Fri, 17 Jun 88 15:08 EDT
From: Steven Ryan <smryan@garth.uucp>
Subject: Re: Shallow Parsing


In article <1114@ima.ISC.COM> Bart Massey <bart@reed.dec.com> writes:
>I don't believe that the above argument about time is correct.  My guess is
>that a human parsing *sentences of text* is much like a compiler parsing
>*sequences of statements*...  Certainly such a behavior will occur in
>"linear time" on the number of sequences parsed.  In fact, this is probably
>part of the reason why languages have statements and texts have sentences.

Yes, I'm thinking that we divide the stream into sentence-like entities.
In spoken language we use pauses to signify a break (and catch our breaths)
and a variety of gestures and eye movements. In written language we use endmarks
and paragraphs. Programming languages use begin/end/if/... keywords that
break out the overall structures.

Dividing the stream could be done in a context-free, deterministic fashion. It
may be the contents of each sentence are parsed differently.
 
>the semantics of the language which will determine how parse time grows with
>*program* length.  But I'm not sure what this tells me -- in particular,
>both humans and compilers seem to do quite well at recovering information
>about distant tokens from memory in effectively *constant time* for the

I don't know about humans, but most compilers use a hash table for the symbol
table. In this case the time to identify one symbol out of n distinct symbols
is O(n/m) for a hash table of size m. If m is larger than n, it is constant time.
However, if n is large enough, the hash table only divides the search time by m:
the search down any one hash chain still grows as n (usually) or log n (for a
sophisticated table).
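
The O(n/m) behaviour can be checked with a small experiment. This is a minimal
Python sketch, assuming a chained table with linear search down each chain; the
class ChainedSymbolTable, the toy hash bucket_of, and the symbol names are all
hypothetical and are not taken from any real compiler.

```python
def bucket_of(name, m):
    """Deterministic toy string hash (an assumption, not a real compiler's hash)."""
    h = 0
    for ch in name:
        h = (h * 31 + ord(ch)) % m
    return h


class ChainedSymbolTable:
    def __init__(self, m):
        # m fixed buckets, each holding a chain (list) of symbol names.
        self.buckets = [[] for _ in range(m)]

    def insert(self, name):
        chain = self.buckets[bucket_of(name, len(self.buckets))]
        if name not in chain:
            chain.append(name)

    def probes(self, name):
        """Comparisons made by a linear walk down the chain until name is found."""
        chain = self.buckets[bucket_of(name, len(self.buckets))]
        return chain.index(name) + 1


def average_probes(n, m):
    """Average lookup cost for n distinct symbols in a table with m buckets."""
    table = ChainedSymbolTable(m)
    names = [f"sym{i}" for i in range(n)]
    for name in names:
        table.insert(name)
    return sum(table.probes(name) for name in names) / n


for n in (100, 1000, 10000):
    print(n, average_probes(n, 64))
```

With m held fixed, the printed averages should grow roughly in proportion to n,
which is the point being made: the table divides the work by m but does not
change how it scales.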
 
>Does it pay to recognize this idiom seperately from other for() loops?  Why
>or why not?

Also, I'm wondering about the tradeoff of deterministic versus nondeterministic
grammars or languages or parsing. For those unfamiliar with these terms: if a
parser merrily trips through a string left to right, then given everything it has
seen so far, and looking at the current symbol and maybe k symbols to the right, does
it know exactly which production it is in? If this is true for a fixed k, the
parser is deterministic. If it requires arbitrarily large k (the lookahead),
it is nondeterministic. Or it might be ambiguous. (Deterministic languages
are not inherently ambiguous: each has an LR(1) grammar and LR(1) grammars are
not ambiguous.)
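
To illustrate the deterministic case, here is a hedged Python sketch of a
predictive parser in which the current token alone picks the production. The
toy grammar, the token spelling, and the helper names are assumptions made for
this example only:

    stmt -> 'if' ID 'then' stmt | 'begin' stmt* 'end' | ID ':=' ID

```python
KEYWORDS = {"if", "then", "begin", "end"}


def expect(tokens, pos, wanted):
    """Fail unless the token at pos is exactly the wanted one."""
    if pos >= len(tokens) or tokens[pos] != wanted:
        raise SyntaxError(f"expected {wanted!r} at token {pos}")


def expect_id(tokens, pos):
    """Return the identifier at pos, or fail."""
    if (pos >= len(tokens) or tokens[pos] in KEYWORDS
            or not tokens[pos].isidentifier()):
        raise SyntaxError(f"expected identifier at token {pos}")
    return tokens[pos]


def parse_stmt(tokens, pos=0):
    """Return (tree, next_pos); tokens[pos] alone selects the production."""
    tok = tokens[pos] if pos < len(tokens) else None
    if tok == "if":
        cond = expect_id(tokens, pos + 1)
        expect(tokens, pos + 2, "then")
        body, nxt = parse_stmt(tokens, pos + 3)
        return ("if", cond, body), nxt
    if tok == "begin":
        pos += 1
        body = []
        while pos < len(tokens) and tokens[pos] != "end":
            stmt, pos = parse_stmt(tokens, pos)
            body.append(stmt)
        expect(tokens, pos, "end")
        return ("block", body), pos + 1
    # Neither keyword: the only remaining production is an assignment.
    target = expect_id(tokens, pos)
    expect(tokens, pos + 1, ":=")
    source = expect_id(tokens, pos + 2)
    return ("assign", target, source), pos + 3


print(parse_stmt("if x then begin a := b c := d end".split()))
```

The parser never backtracks and never looks past the current token when
choosing a production, which is what makes this toy grammar deterministic in
the sense described above.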
 
>[I always thought that computers use unambiguous context free grammars because
>we know how to parse them fast, and because they seem empirically to be
>convenient for humans to try to express things like computer programs.  I
>also think that something like the current parsing methods is probably
>time-optimal for conventional computer architectures.

I suspect it has more to do with parser generators: a parser can be automatically
generated for an LR(1) grammar, but not so for a nondeterministic language.
(Personal remark:) I think Wirth has crippled a generation of programming
languages by stripping them of anything that makes life tough for a
compiler writer (my occupation if you haven't already guessed).

>                                                       Ned Irons looked at
>context-free parsing on small numbers of parallel processors and decided that

I first thought about this with respect to CDC Cyber 205 and ETA-10x vector
machines. If somebody can find a way to vectorise parsing, I think it will
open up a vast field of applications which currently appear to be scalar or
sequential.

                                      Thank you for your support,
                                                 sm ryan
[Using Earley's algorithm you can parse ambiguous grammars without trouble,
although it is much slower than LR or LALR for the grammars that the latter
two can handle.  -John]
[From smryan@garth.uucp (Steven Ryan)]
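
For readers who have not seen it, the following is a minimal sketch of an
Earley-style recognizer in Python, run on a deliberately ambiguous grammar. It
is an illustration under stated assumptions, not a tuned implementation: the
function name earley_recognize and the grammar encoding are hypothetical, empty
right-hand sides are not supported, and it only answers yes or no rather than
building parse trees.

```python
def earley_recognize(grammar, start, tokens):
    """grammar maps each nonterminal to a list of right-hand sides (tuples).

    Assumes no empty right-hand sides. An item is (lhs, rhs, dot, origin).
    """
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))

    for i in range(len(tokens) + 1):
        worklist = list(chart[i])
        while worklist:
            lhs, rhs, dot, origin = worklist.pop()
            target = i
            if dot < len(rhs):
                sym = rhs[dot]
                if sym in grammar:
                    # Predictor: expand the nonterminal after the dot.
                    new = [(sym, r, 0, i) for r in grammar[sym]]
                elif i < len(tokens) and tokens[i] == sym:
                    # Scanner: the terminal matches, advance into the next set.
                    new = [(lhs, rhs, dot + 1, origin)]
                    target = i + 1
                else:
                    new = []
            else:
                # Completer: advance every item that was waiting for lhs.
                new = [(l2, r2, d2 + 1, o2)
                       for (l2, r2, d2, o2) in chart[origin]
                       if d2 < len(r2) and r2[d2] == lhs]
            for item in new:
                if item not in chart[target]:
                    chart[target].add(item)
                    if target == i:
                        worklist.append(item)

    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[len(tokens)])


# E -> E + E | a  is ambiguous: "a + a + a" can be grouped two ways.
AMBIGUOUS = {"E": [("E", "+", "E"), ("a",)]}
print(earley_recognize(AMBIGUOUS, "E", ["a", "+", "a", "+", "a"]))
print(earley_recognize(AMBIGUOUS, "E", ["a", "+"]))
```

The ambiguity costs nothing in correctness here; the price is the extra
bookkeeping of carrying every viable item through each chart set.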

------------------------------

Date: Sun, 19 Jun 88 04:13 EDT
From: Celso Alvarez <sp299-ad@violet.berkeley.edu>
Subject: Re: Shallow Parsing


In article <748@garth.UUCP> smryan@garth.UUCP (Steven Ryan) writes:
>
>Also, it is easy to learn to speak as native when under
>5 but near impossible after 7.

If you are referring to second language acquisition, I believe
the difficulty is for older children to acquire native phonology,
not grammar.

C.A. (sp299-ad@violet.berkeley.edu.UUCP)

------------------------------

Date: Sun, 19 Jun 88 17:51 EDT
From: Steven Ryan <smryan@garth.UUCP>
Subject: Re: Shallow Parsing


>+It would certainly explain the persistence of irregular forms across a thousand
>+years, and why frequently used forms are irregular and irregular forms are
>+frequently used.
 
>I'm confused by what you say,  as I see the causality understandable and
>likely in only the opposite direction you seem to imply. The way I see it:

Poorly phrased by me. I meant the frequently used words are memorised early
and resist ongoing change. But perhaps some frequently used words are made
irregular for efficiency? That is "had" instead of "*haved"?

------------------------------

Date: Mon, 20 Jun 88 04:29 EDT
From: Celso Alvarez <sp299-ad@violet.berkeley.edu>
Subject: Irregular forms [Was Re: Shallow Parsing]


In article <759@garth.UUCP> smryan@garth.UUCP (Steven Ryan) writes:
>I meant the frequently used words are memorised early
>and resist ongoing change. But perhaps some frequently used words are made
>irregular for efficiency? That is "had" instead of "*haved"?

I would assume so. Contractions, syncopes, sound deletions,
etc. are ruled by the 'principle of linguistic economy'. But
economizing tendencies clash, for example, with the language's
resistance to ambiguity.

C.A. (sp299-ad@garnet.berkeley.edu.UUCP)

------------------------------

Date: Tue, 21 Jun 88 09:32 EDT
From: Rob Bernardo <rob@pbhyf.PacBell.COM>
Subject: Re: Irregular forms [Was Re: Shallow Parsing]


In article <759@garth.UUCP> smryan@garth.UUCP (Steven Ryan) writes:
+I meant the frequently used words are memorised early
+and resist ongoing change. But perhaps some frequently used words are made
+irregular for efficiency? That is "had" instead of "*haved"?

I doubt it. I'd be willing to bet that "had" is a consequence of phonological
change and because "had" is frequent, it wasn't regularized afterwards.

The way you put it, "some frequently used words are made irregular for
efficiency" makes it sound like you believe someone is deciding to make
words irregular. There is nothing so deliberate going on.

Things are much simpler. Let's suppose hypothetically (I don't know the
history of English - ask me about the Romance languages! :-) ), at some
point the sound "v" was lost before certain consonants so that

	have + d [past tense]  -> had	{We are talking sounds here
	have + s [3rd sing]    -> has	so the silent "e" doesn't count.}

and this happened all over the place uniformly so that this didn't seem
irregular at all, but just a consequence of unconscious phonological
rules. (Cf. the plural ending, which is pronounced "z" in some words but "s" in others
without us necessarily being aware there is such a "rule".) But much later,
let's suppose, there were situations where a vowel that used to be
pronounced was dropped, bringing together consonants
that weren't pronounced together before. This means that we now have
words where "v+d" and "v+z" occur. Now what happens is that in infrequently
used verbs that end in "v", the "v" is no longer dropped when the past
tense and 3rd singular endings are attached. But with "have", the
frequency of its use promotes its now "irregular" form.

Sorry I can't produce the real history of this irregularity, but this
sort of phonological change occurs all the time. I could present hundreds
of examples from Romance languages.
-- 
Rob Bernardo	[backbone]!pacbell!rob	 -OR-	rob@PacBell.COM
business:  (415) 823-2417	Pacific Bell SRVAC Room 4E750	San Ramon, CA
residence: (415) 827-4301		R Bar JB		Concord, CA

------------------------------

Date: Tue, 21 Jun 88 12:04 EDT
From: Rick Wojcik <rwojcik@bcsaic.UUCP>
Subject: Re: Irregular forms [Was Re: Shallow Parsing]


In article <759@garth.UUCP> smryan@garth.UUCP (Steven Ryan) writes:
>I meant the frequently used words are memorised early
>and resist ongoing change. But perhaps some frequently used words are made
>irregular for efficiency? That is "had" instead of "*haved"?

It is misleading to think that irregular forms are learned early.  One
should think of early words, whether they appear to conform to rules or not, 
as unanalyzed wholes.  After morpho-syntactic rules are learned, then the
'exceptional' words have to be relearned as exceptions.

In article <11146@agate.BERKELEY.EDU> sp299-ad@violet.berkeley.edu (Celso Alvarez) writes:
>I would assume so. Contractions, syncopes, sound deletions,
>etc. are ruled by the 'principle of linguistic economy'. But
>economizing tendencies clash, for example, with the language's
>resistance to ambiguity.

One of the best modern works on the clash between 'speaker-based' economy
of form and 'hearer-based' elaboration of form is to be found in Donegan
and Stampe's "The Study of Natural Phonology" (Dinnsen, ed. Current
Approaches to Phonological Theory.  Indiana U. Press. 1979.)  Donegan and
Stampe give examples such as the following:  'parade' reduces to /preyd/ in
casual speech; 'prayed' expands to /p@reyd/ in emphatically articulated
speech. Donegan and Stampe cite historical precedents in the study of the
conflicting forces. 
-- 
Rick Wojcik   csnet:  rwojcik@boeing.com	   
              uucp:   uw-beaver!ssc-vax!bcsaic!rwojcik 
address:  P.O. Box 24346, MS 7L-64, Seattle, WA 98124-0346
phone:    206-865-3844

------------------------------

Date: Wed, 22 Jun 88 18:43 EDT
From: Steven Ryan <smryan@garth.UUCP>
Subject: Re: Irregular forms [Was Re: Shallow Parsing]


>+and resist ongoing change. But perhaps some frequently used words are made
>+irregular for efficiency? That is "had" instead of "*haved"?
>
>I doubt it. I'd be willing to bet that "had" is a consequence of phonological
>change and because "had" is frequent, it wasn't regularized afterwards.
>
>The way you put it, "some frequently used words are made irregular for
>efficiency" makes it sound like you believe someone is deciding to make
>words irregular. There is nothing so deliberate going on.

Except for acronyms. It is hypothesised that languages change when someone
makes a mistake. If the mistake makes the language easier it spreads from
a one-time accident to a broad change.

Possibly in early (?)Anglish, Saxon, or Protogermanic, somebody tripped
over their tongue when trying to say "hafed" and ended up with "had". The
irregular past of "have" (like most other irregulars) is essentially
unchanged from Old English (600 - 1100).

------------------------------

Date: Thu, 23 Jun 88 04:42 EDT
From: Celso Alvarez <sp299-ad@violet.berkeley.edu>
Subject: Re: Irregular forms [Was Re: Shallow Parsing]


In article <775@garth.UUCP> smryan@garth.UUCP (Steven Ryan) writes:

>...It is hypothesised that languages change when someone
>makes a mistake. If the mistake makes the language easier it
>spreads from a one-time accident to a broad change.

I've always found this hard to believe, at least in the way it's
formulated. In language change you have to account for diffusion
of linguistic phenomena -- that is, the likelihood that a
given, individual mispronunciation spreads through networks of
speakers, instances of interaction, etc.

>Possibly in early (?)Anglish, Saxon, or Protogermanic, somebody tripped
>over their tongue when trying to say "hafed" and ended up with "had".

It is unlikely for pronunciation difficulties to be individual
phenomena. Usually one also finds regularity in the types of
sound combinations that are hard to pronounce according to a
community's speech habits (e.g. [sR] in Spanish, the
[R] being a trill; also [nr], that's why Old Sp. future forms
like _venre'_, _tenre'_ were *regularly* irregularized as
_venDre'_, _tenDre'_).

Celso Alvarez (sp299-ad@violet.berkeley.edu.UUCP)

------------------------------

End of NL-KR Digest
*******************