nl-kr-request@CS.ROCHESTER.EDU (NL-KR Moderator Brad Miller) (06/16/88)
NL-KR Digest             (6/15/88 17:57:05)            Volume 4 Number 58

Today's Topics:
        Info wanted on experiences with Alvey NLP Toolkit
        Please post on NL-KR. Thanks
        References needed for induction over concept explanations
        Help with NLP
        Shallow Parsing
        Spatio-Temporal metaphor
        Re: Genderless 3rd person pronoun.
        Pronouns and Generative syntax
        re: learning from positive examples

Submissions: NL-KR@CS.ROCHESTER.EDU
Requests, policy: NL-KR-REQUEST@CS.ROCHESTER.EDU
----------------------------------------------------------------------

Date: Mon, 6 Jun 88 12:18 EDT
From: LEWIS@cs.umass.edu
Subject: Info wanted on experiences with Alvey NLP Toolkit

Well, my recent query about the availability of robust natural language
parsers resulted in only one suggestion, but that one was made by several
people: the Alvey Natural Language Tools, from the Artificial Intelligence
Applications Institute.  So now I'd be interested in hearing from anyone
who has used this toolkit and has experiences to report.  This sort of
thing would be worth posting to the whole list, but if you'd rather not do
that, please send your comments to me (letting me know whether you want
them sent out to anyone else who might ask me).

Many thanks,
Dave

David D. Lewis                                  ph. 413-545-0728
Computer and Information Science (COINS) Dept.
University of Massachusetts, Amherst
Amherst, MA 01003, USA
BITNET: lewis@umass
INTERNET: lewis@cs.umass.edu

------------------------------

Date: Wed, 8 Jun 88 17:34 EDT
From: "Nahum (N.) Goldmann" <ACOUST%BNR.BITNET@CORNELLC.CCS.CORNELL.EDU>
Subject: Please post on NL-KR. Thanks

Everybody in Canada who is interested in Object-Oriented Programming (OOP)
and/or in behavioral design research as related to the development of
human-machine interfaces (however remotely connected to these subjects),
please reply to my e-mail address.  The long-term objective is the
organization of a corresponding Canadian bulletin board.

Greetings and thanks.

Nahum Goldmann
(613)763-2329
e-mail: <ACOUST@BNR.CA>

------------------------------

Date: Mon, 13 Jun 88 21:35 EDT
From: Shane Bruce <bruce@paul.rutgers.edu>
Subject: References needed for induction over concept explanations

In an interesting article in the Proceedings of the 1988 AAAI Spring
Symposium on EBL, Flann and Dietterich discuss the idea of performing
induction over multiple functional explanations of a concept (in their
case, minimax game trees) to generate a generalized explanation of the
concept.  This, of course, is done instead of the standard technique of
performing the induction on the feature-language description of the
concept.  In the article they list some other projects in which induction
over explanations was performed.

Is anyone aware of any other work which involves induction of generalized
concept explanations from multiple explanations?  I would be particularly
interested in hearing about projects in which induction is performed over
causal process explanations generated by qualitative or quantitative
domain models.

Please email any references which you might have concerning this topic to
me at bruce@paul.rutgers.edu.  I will, of course, post the results of this
query to the net if there is enough interest.

Thanks for the help.

--
Shane Bruce
HOME: (201) 613-1285            WORK: (201) 932-4714
ARPA: bruce@paul.rutgers.edu
UUCP: {ames, cbosgd, harvard, moss}!rutgers!paul.rutgers.edu!bruce
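[For readers unfamiliar with the technique being asked about, here is a
toy sketch of one classical way to induce a generalized explanation from
multiple concrete explanations: anti-unification (least general
generalization) of explanation trees.  The term encoding and the choice
of anti-unification are assumptions of this illustration, not necessarily
what Flann and Dietterich do.]

    # Toy sketch: induction over explanations via anti-unification (least
    # general generalization).  Explanations are ground terms encoded as
    # nested tuples (functor, arg, ...); mismatching subterms generalize
    # to shared variables.  All names here are invented for illustration.
    import itertools

    def lgg(t1, t2, table=None, counter=None):
        """Least general generalization of two explanation terms."""
        if table is None:
            table, counter = {}, itertools.count()
        # Same functor and arity: recurse on the arguments.
        if (isinstance(t1, tuple) and isinstance(t2, tuple)
                and t1[0] == t2[0] and len(t1) == len(t2)):
            return (t1[0],) + tuple(lgg(a, b, table, counter)
                                    for a, b in zip(t1[1:], t2[1:]))
        if t1 == t2:
            return t1
        if (t1, t2) not in table:        # same mismatch -> same variable
            table[(t1, t2)] = "?X%d" % next(counter)
        return table[(t1, t2)]

    # Two concrete explanations of "forced win" in a toy game tree:
    e1 = ("win", ("move", "a"), ("win", ("move", "b"), "leaf"))
    e2 = ("win", ("move", "c"), ("win", ("move", "d"), "leaf"))
    print(lgg(e1, e2))
    # ('win', ('move', '?X0'), ('win', ('move', '?X1'), 'leaf'))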
------------------------------

Date: Wed, 15 Jun 88 10:22 EDT
From: Florence M. Reeder <@ECL.PSU.Edu:FMR@ICF.HRB>
Subject: Help with NLP

I have been reading the NL-KR newsletters on the net and have realized how
little I know about the field of Natural Language Processing.  I am
looking for a few good reference texts to get me started.  If you have any
suggestions, I would appreciate them.  While I am not sure you can get a
reply to me, if it is posted to the digest, I am sure I will be able to
read it.  Please do try to reply, however, as our network person will be
interested to know if we can communicate with the outside world.

Thank you,
Flo Reeder

address: FMR%ICF.HRB@ECL.PSU.EDU
mail (U.S.): 530 Toftrees Ave. #140, State College, Pa. 16803
disclaimer: I have no opinions which in any way resemble those of my
employer.

[My suggestions (in no particular order, other than on my bookshelf):

 Allen, James.  _Natural Language Understanding_  Benjamin Cummings 1987
 Winograd, Terry.  _Language as a Cognitive Process_  Addison-Wesley 1983
 Tennant, Harry.  _Natural Language Processing_  Petrocelli Books 1981
 Martinich, A.P.  _The Philosophy of Language_  Oxford 1985
 Sells, Peter.  _Lectures on Contemporary Syntactic Theories_  CSLI 1985
 Pereira and Shieber.  _Prolog and Natural-Language Analysis_  CSLI 1987

Other reasonably good books exist at moderate to advanced levels; maybe
others on the net will have suggestions. - BWM]

------------------------------

Date: Mon, 13 Jun 88 19:09 EDT
From: Steven Ryan <smryan@garth.uucp>
Subject: Shallow Parsing

I once heard a suggestion that humans use a different parsing strategy
than compilers.  Compilers do a deep parse using lotsa malenky rules and
produce a very impressive parse tree.  The suggestion was that humans
memorise faster than they generate, and are better at handling large
chunks in parallel than small chunks nested.

What I take that to mean is that we memorise words in each inflected form,
even regular forms, and possibly even groups of words.  Then language
generation consists of inserting these chunks into a verb-frame chunk,
with each sentence form being a different chunk.

I think I'll call this suggestion shallow parsing: the parse tree only
goes down two or three levels before running into unanalysed (id est,
memorised) chunks.  In terms of productions, this would mean having
thousands of similar yet distinct productions instead of factoring out
the similarities to reduce the number of productions.

What interests me about this is the possible application to compilers:
humans parse an ambiguous and nondeterministic language in almost always
linear time.  Most programming languages are intentionally designed with
a deterministic context-free syntax which is LL(1) or LR(1).  LR(1)
parsing is all very interesting, but the resulting language is often
cumbersome: the programmer must write out a sufficient left context to
help out the parser even when he/she/it/they/te/se/... can look a little
to the right and know what is happening.  Example: the Pascal 'var' is
there to help the parser know this is a declaration and still remain
LL(1).

        var v1,v2,...,vn:type;

To claim LL(1)/LR(1) is superior because of the linear time, O(n),
ignores the fact that this is a context-free parse and must be followed
by symbol identification.  Assuming the number of distinct symbols is
logarithmic in the program length, the time necessary for a
context-sensitive parse is from O(n log log n) to O(n**2).

It would be interesting if we could learn something from our own minds
and make computer languages more like natural languages (but excluding
ambiguity).  Make it simple--not simplistic.

                sm ryan

From smryan@garth.uucp (Steven Ryan)

[I posted this not to generate flames, but because this *isn't* a
refereed journal, so I'm letting this guy have a say. -BWM]
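[To make the shallow-parsing idea concrete, here is a minimal sketch: a
large flat lexicon of memorised chunks feeding a handful of verb-frame
productions, so the parse never goes more than two levels deep.  The toy
chunk inventory, the frame shapes, and the greedy longest-match strategy
are all assumptions of this illustration, not part of the posting above.]

    # Minimal sketch of "shallow parsing": memorised, unanalysed chunks
    # slotted directly into verb frames; the resulting tree is two levels
    # deep.  Lexicon and frames are invented toy data.
    CHUNKS = {                      # memorised, unanalysed chunks
        ("the", "old", "man"): "NP",
        ("the", "boat"): "NP",
        ("sailed",): "V",
    }
    FRAMES = [("NP", "V", "NP"),    # verb-frame chunks, one per
              ("NP", "V")]          # sentence form

    def chunk(words):
        """Greedy longest-match segmentation into memorised chunks."""
        out, i = [], 0
        while i < len(words):
            for j in range(len(words), i, -1):  # longest match first
                piece = tuple(words[i:j])
                if piece in CHUNKS:
                    out.append((CHUNKS[piece], piece))
                    i = j
                    break
            else:
                raise ValueError("no memorised chunk at %r" % words[i:])
        return out

    def shallow_parse(sentence):
        """Two-level parse: S -> frame slots -> memorised chunks."""
        chunks = chunk(sentence.split())
        if tuple(cat for cat, _ in chunks) in FRAMES:
            return ("S", chunks)
        return None

    print(shallow_parse("the old man sailed the boat"))
    # ('S', [('NP', ('the', 'old', 'man')), ('V', ('sailed',)),
    #        ('NP', ('the', 'boat'))])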
------------------------------

Date: Wed, 8 Jun 88 16:05 EDT
From: Paul Tanenbaum <pjt@BRL.ARPA>
Subject: Spatio-Temporal metaphor

The word "before" is not alone in having both spatial and temporal
meanings.  Most native anglophones would probably feel that its opposite,
"after," has its primary meaning in the temporal domain and a subordinate
one (the nautical usage) in the spatial.  In fact, the word derives from
the comparative of "aft," clearly a spatial usage.

+++paul
Paul J. Tanenbaum <pjt@brl.arpa>
(301) 278-6691

------------------------------

Date: Fri, 3 Jun 88 19:58 EDT
From: James J. Lippard <Lippard@BCO-MULTICS.ARPA>
Subject: Re: Genderless 3rd person pronoun.

>Date: Thu, 26 May 88 12:40 EDT
>From: Robert France <france@vtopus.cs.vt.edu>
>[...]
>On a more pragmatic note, I've been noticing that "they" and "them" are
>entering spoken English as genderless singular pronouns. [...]

Actually, "they" has been used as a singular pronoun for centuries,
according to Ann Bodine's "Androcentrism in prescriptive grammar:
singular 'they', sex-indefinite 'he', and 'he or she'", Language in
Society 4 (Aug 1975): 129-146.  According to this article, all three
forms were in common use until the late 18th century (with the first
proscription of "they" found in a grammar book of 1765).  Bodine makes
the following comparison between use of "they" and use of "he" as a
singular sex-indefinite pronoun:

   If the definition of 'they' as exclusively plural is accepted, then
   'they' fails to agree with a singular, sex-indefinite antecedent by
   one feature--that of number.  Similarly, 'he' fails to agree with a
   singular, sex-indefinite antecedent by one feature--that of gender.
   A non-sexist 'correction' would have been to advocate 'he or she',
   but rather than encourage this usage the grammarians actually tried
   to eradicate it also, claiming 'he or she' is 'clumsy', 'pedantic',
   or 'unnecessary'.  Significantly, they never attacked forms such as
   'one or more' or 'person or persons', although the plural logically
   includes the singular more than the masculine includes the feminine.
   (p. 133)

Jim Lippard
Lippard at BCO-MULTICS.ARPA

[Further discussion on this topic is being dropped; it has degenerated
into a discussion unrelated to language recognition, namely one of
"appropriate" language definition, i.e. the appropriateness of sexed
pronouns, which is not appropriate for this discussion list. -BWM]

------------------------------

Date: Mon, 6 Jun 88 07:59 EDT
From: Bruce E. Nevin <bnevin@cch.bbn.com>
Subject: Re: genderless pronouns

"They" and "them" are not just now entering English as genderless 3rd sg
pronouns.  They have been used in that capacity for a century or two
anyway: the OED cites Fielding, Goldsmith, Thackeray, and others.
Webster's 9th New Collegiate Dictionary identifies this usage as "often
used with an indefinite third person singular antecedent <everyone knew
where ~ stood -- E.L. Doctorow> <nobody has to go to school if ~ don't
want to -- N.Y. Times>," and adds the usage note:

   _They_ used as an indefinite subject (sense 2) is sometimes objected
   to on the grounds that it does not have an antecedent.  Not every
   pronoun requires an antecedent, however.  The indefinite _they_ is
   used in all varieties of contexts and is standard.
This discussion seems to confuse the notions of antecedent and referent:
"they" in the examples has an antecedent (the indefinite pronouns
"everyone", "nobody"), but this antecedent (and consequently "they"
itself) is indefinite as to reference.

Despite the fulminations of dear old Fowler against it (to his ear it was
old-fashioned!), the 3rd-person plural pronoun with reference indefinite
as to number appears to be here to stay.  We can easily use it in cases
where we wish to be indefinite as to gender as well as number.  In cases
where we wish to fog only gender, but number is known to be singular, we
are in a bit of trouble.

Bruce Nevin
bn@cch.bbn.com
<usual_disclaimer>

------------------------------

Date: Sun, 5 Jun 88 19:58 EDT
From: pesetsk%UMASS.BITNET@MITVMA.MIT.EDU
Subject: Pronouns and Generative syntax

In a recent NL-KR, Wojcik (rwojcik@BOEING.COM) writes:

RW) Arild Hestvik writes:
RW)
RW) AH> The fact that you can create a context where the Binding Theory
RW) AH> doesn't work doesn't mean that the Binding Theory is wrong, it
RW) AH> simply means that you set up a lousy experiment which yielded
RW) AH> garbage as a result.  That is very easy, you can do that to
RW) AH> everything you learnt in high-school physics, simply by doing
RW) AH> the experiment wrongly.
RW)
RW) It is very difficult to set up experiments to confirm or disconfirm
RW) anything in generative theory.  Postal's satire, "Advances in
RW) Linguistic Rhetoric" (NLLT Feb. 1988), is the best explanation I
RW) have seen of why that is the case.  I did not set up a context where
RW) "Binding Theory doesn't work."  I merely showed the futility of
RW) trying to build a theory of language understanding on a syntactic
RW) theory that purports to work independently of pragmatic context.
RW) You can't take a sentence such as 'He shaves John' and claim that
RW) the NPs necessarily refer to different individuals...

Wojcik is, of course, correct as to the "necessarily", but it seems to me
that both Wojcik and Hestvik miss the point of examples of this sort.
These examples raise two questions, which can and should be distinguished
as questions--even if the answers are related.  For a pair of NPs, a
pronoun P and a name (non-pronoun) N:

1. Why, under certain discourse conditions, is it impossible to
   understand N and P as coreferent when P c-commands N, but possible
   otherwise?  E.g.,

      Let me tell you something about John(i).  *He(i) likes John(i).
   vs.
      Let me tell you something about John(i).  His(i) mother really
      likes John(i).

2. Why, under other discourse conditions, is it possible to understand N
   and P as coreferent even when P c-commands N?

I will not purport to answer either question here.  The answer to
question 1 reduces in part to something like Chomsky's "Condition C" or
various alternatives that are discussed in the literature.  Whatever the
answer to question 1 is, it very clearly does involve a structural
condition on relations between nodes in a phrase marker.  Otherwise, how
can you explain robust contrasts like that between "he" and "his mother"
in the examples above?  The investigation of such conditions is quite
rightly a stock-in-trade of "generative syntax".  To the extent to which
such conditions govern actual linguistic effects, they are of great and
obvious relevance to a theory of linguistic knowledge or "language
understanding".  How could they not be?

Turning to question 2 and the remaining aspects of question 1, here the
answers will involve the connection between phrase marker relations and
their interpretation in discourse.
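[Before going on: for readers outside syntax, the structural condition in
question 1 is mechanical enough to state in a few lines of code.  What
follows is a toy sketch reproducing the "he" vs. "his mother" contrast
above; the tree encoding, labels, and binary-branching simplification are
assumptions of this illustration, not anything from the binding-theory
literature.]

    # Toy sketch of c-command: a node c-commands its sister and
    # everything the sister dominates (binary branching assumed, so the
    # first branching node above a node is just its parent).  Trees are
    # (label, child, ...) tuples; a position is a tuple of child indices.

    def find(tree, word, path=()):
        """Tree-position of the leaf `word`, or None."""
        if tree == word:
            return path
        if isinstance(tree, tuple):
            for i, kid in enumerate(tree[1:], 1):
                p = find(kid, word, path + (i,))
                if p is not None:
                    return p
        return None

    def dominates(x, y):
        return x != y and y[:len(x)] == x

    def c_commands(a, b):
        """Neither node dominates the other, and a's parent dominates b."""
        if a == b or dominates(a, b) or dominates(b, a):
            return False
        return dominates(a[:-1], b)

    # "He(i) likes John(i)": the pronoun's NP c-commands John's NP.
    t1 = ("S", ("NP", "he"), ("VP", ("V", "likes"), ("NP", "John")))
    # "His(i) mother likes John(i)": the pronoun sits inside a larger NP.
    t2 = ("S", ("NP", ("NP", "his"), ("N", "mother")),
               ("VP", ("V", "likes"), ("NP", "John")))

    np = lambda t, w: find(t, w)[:-1]   # the NP node just above the word
    print(c_commands(np(t1, "he"),  np(t1, "John")))   # True
    print(c_commands(np(t2, "his"), np(t2, "John")))   # False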
And, of course, we ultimately want to have answers to these questions as
well, if we are interested in understanding linguistic knowledge and,
yes, how "language understanding" is possible.

Hestvik overstates the matter in accusing Wojcik of constructing a "lousy
experiment" pure and simple.  Wojcik has indeed constructed a lousy
experiment if his goal is simply to determine the structural factors that
help us to answer question 1, but that is apparently not the question he
is interested in answering.  So be it; both questions 1 and 2 are worth
answering, since the answers to both bear on linguistic competence and
language use.  It is not the case that the factors described in question
2 are in and of themselves "noise" or "garbage"--they constitute noise or
garbage only relative to experiments that seek to answer question 1
alone.

Wojcik, in turn, overstates the matter when he dismisses generative
syntax as a theory on which it is "futile" to build a theory of language
understanding.  The examples at hand have nothing to do with this
question.  The existence of circumstances under which the c-command
effect goes away does not mean that the c-command condition does not
exist, nor that investigating the condition is an exercise in futility
for any investigation of "language understanding".  It merely means that
we have (at least) two interacting phenomena to explain, not just one.

Wojcik somehow links his empirical statements about "generative theory"
to a view that aspects of generative theory like those being discussed
cannot be confirmed or disconfirmed.*  One hears this charge made now and
again, but it is simply belied by the actual practice of the field: for
example, the structural conditions referred to in question 1 have been a
topic of lively dispute since Langacker's and Ross's work in the 1960s.
Langacker's original proposals may be held to be disconfirmed, in the
form in which he stated them, by later work of Reinhart, Lasnik, Aoun and
Sportiche, Chomsky, and many others (though the basic properties of
Langacker's theory are with us to this day).

[*Mind you, if we take a literal-minded reading of Wojcik's actual claim
that it is merely "very difficult to set up experiments to confirm or
disconfirm anything in generative theory", then there is nothing to
quibble with.  If the theory is of sufficient richness or complexity that
it becomes difficult to determine at first glance the implications of
some particular set of data, this is no fault in the theory, but rather a
sign of its maturity--a good thing.  But I doubt that this is the meaning
that was intended here.]

Examples of confirmation and disconfirmation abound in the generative
syntax literature.  Taraldsen's explanation for the absence of
"that-trace" effects in Italian was disconfirmed by Rizzi.  My own
suggestions in a similar domain, concerning an interaction between
"doubly-filled COMPs" and "that-trace" effects, were disconfirmed by
Bennis.  Taraldsen's, Kayne's, and my own reduction of these effects to
the Nominative Island Condition was disconfirmed in an article by Freidin
and Lasnik.  On a different topic, Mark Baker's work (U of Chicago Press;
just published) confirms in an exciting way a movement theory of
incorporation phenomena, but Williams and DiSciullo (MIT Press) claim to
undermine some of the evidence Baker brings to bear.  The jury is still
out on this one, but it is perfectly clear that empirical issues will
decide the question, and how a decision can be made.
To take another example, a theory of passive or raising that lacks some
equivalent of "traces" (e.g. most lexicalist theories) is disproved by
Rizzi's observations in his paper "On Chain Formation" (in the Syntax &
Semantics volume on clitics; H. Borer, ed.).  For another example, a
theory of parasitic gaps that derives them in a manner fundamentally
distinct from "movement" structures (e.g. Taraldsen's theory, or
Chomsky's in "Concepts and Consequences") is disconfirmed if parasitic
gaps display all the familiar properties of movement, including island
sensitivity (as claimed by Chomsky in his later book "Barriers").  To
quote the King of Siam: "et cetera, et cetera, et cetera".  Where things
are not settled enough to claim "confirmation" or "disconfirmation", the
overwhelming bulk of the literature "seeks truth through facts" in the
manner of any science, developing ideas that may be confirmed in part or
disconfirmed in part by subsequent research.

In fact, for discussion bearing directly on question 2, there is
discussion squarely within "generative theory" by Higginbotham (recent LI
papers; e.g. LI 16.4, pp. 568ff), by Lasnik (in a forthcoming Reidel
book), by Reinhart (cf. her article in the volume on acquisition from
Reidel/Kluwer; B. Lust, ed.), and by various workers in the Kamp/Heim
theories of "Discourse Representation".  See also G. Evans' article that
appeared in LI around 1980 or so, or Jackendoff's 1972 book for older
discussions.  I'm sure there is plenty more.  Note that Wojcik himself
neither offers nor cites answers to questions 1 or 2 in his note (nor
have I; the literature that I cite probably does not settle the matter).
To the best of my knowledge, there is no approach outside "generative
grammar" that has made the kind of progress on these questions that
Wojcik demands.  If I am wrong, let's hear about it.

In the NLLT article that Wojcik cites, Postal implies that rhetorical
tricks are substituted for empirical evidence in the GB literature as a
matter of normal procedure.  This charge is logically unrelated to the
possibility of "confirming" or "disconfirming" alleged results in
generative grammar.  In fact, Postal (sometimes with Pullum) has written
numerous articles that disconfirm or claim to disconfirm various
proposals within generative syntax, particularly "GB syntax", so he at
least must feel that the exercise is feasible.  Returning to Postal's
actual contentions, let the readers of this column judge the merits of
the case for themselves by actually reading a source cited by Postal,
like Burzio's book "Italian Syntax", and asking whether this is a book of
rhetoric or a book of linguistic analysis and scholarship.

This is not to say that we can't all pick favorite examples that do fall
under Postal's rubric.  (Nor are such examples peculiar to "GB"; almost
any sufficiently populous school of generative grammar, linguistics, or
cognitive science will furnish examples.)  I do think nonetheless that
Postal's claim, if read as a general description of the literature as a
whole, is a canard.  However, as I say, judge for yourselves.

David Pesetsky
Dept. of Linguistics
University of Massachusetts
Amherst, MA 01003

------------------------------

Date: Mon, 6 Jun 88 07:34 EDT
From: Bruce E. Nevin <bnevin@cch.bbn.com>
Subject: re: learning from positive examples

I believe you are right.
The "argument from poverty of data" for UG goes something like this: language is too complex, and children's mental capacities too limited, for the language learning that we observe by children to be explicable other than by UG (Universal Grammar) constraints of the sort proposed in GB (Government-and-Binding) theory. Children turn out to have much greater mental capabilities than they had been credited with, and more is discovered as (a) research methodologies become more subtle and (b) expectations rise (Pygmalion effect?). Language turns out to have a fairly straightforward structure of dependencies and dependencies on dependencies when described in a maximally efficient and compact way, without what might be called "noise" (in the information-theoretic sense) in the description. I have given references to an existence proof of this proposition. Given these two facts, the argument from poverty of data loses much of its force. Reference: _Language and Information_, cp pp 97, 111-113. This is the Columbia U. Press book based on the Bampton Lectures delivered by Harris in the Fall of 1986. I am reviewing this book for a future issue of _Computational Linguistics_. Bruce ------------------------------ End of NL-KR Digest *******************