[comp.lang.prolog] Question about DCGs and natural language grammars

mark@adler.philosophie.uni-stuttgart.de (Mark Johnson) (11/21/90)

I have a question about the *conceptualization* of (annotated)
context-free grammars implicit in their "standard" DCG translations.
Maybe I'm just terribly confused (is this what happens when you
worry too much about GB parsing?), in which case maybe someone
can point out an obvious mistake I'm making.

What's worrying me is this: if we use a DCG grammar to
define a relation means/2 true of a string of English words
and some representation of its meaning (say, an encoding of
a first-order formula), we can prove things like the
following:

  S=[i,saw,a,man,with,a,telescope], means(S,M1), means(S,M2).

where M1 represents a meaning where the man has a telescope,
and M2 represents a meaning where the seeing is done with the
telescope.

That is, from our axioms we can prove that S means M1 *and* S means M2.
The problem is that in the real world it *doesn't*: the English sentence
S means M1 *or* M2.
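
A minimal sketch of the situation described above, for concreteness. The toy grammar, the lexicon, and the meaning terms see/2, see/3, having/2 and with/1 are illustrative assumptions, not taken from any real grammar; only means/2 and the example sentence come from the posting.

```prolog
% Toy DCG (an assumption for illustration): each rule threads a meaning
% term through the parse.
means(S, M) :- phrase(s(M), S).

s(M) --> np(X), vp(X, M).

% VP attachment: the PP modifies the seeing.
vp(X, see(X, Y, with(Z))) --> [saw], np_simple(Y), pp(Z).
% NP attachment: the PP is part of the object noun phrase.
vp(X, see(X, Y)) --> [saw], np(Y).

np(i) --> [i].
np(Y) --> np_simple(Y).
np(having(Y, Z)) --> np_simple(Y), pp(Z).

np_simple(N) --> [a], n(N).
pp(Z) --> [with], np_simple(Z).

n(man) --> [man].
n(telescope) --> [telescope].
```

With these clauses the conjunctive goal

  S=[i,saw,a,man,with,a,telescope], means(S,M1), means(S,M2), M1 \== M2.

succeeds, binding M1 and M2 to the two different meaning terms. Each binding comes from a separate derivation of means/2.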

There seem to be two ways out of this dilemma.  

First, you can deny that natural language ambiguity really leads to 
disjunctive, rather than conjunctive, consequents.  But I think that 
this problem arises in a more serious manner in other cases.

For example, suppose we try to write a predicate presupposes/2, which
is true of a string of English words and a formula expressing one
of the things it presupposes, e.g.

   S = [the,man,saw,the,woman], presupposes(S,exists(X,man(X))),
   presupposes(S,exists(Y,woman(Y))), ...

presupposes(Sentence,Presupp) :- parse(Sentence,Tree), presupp(Tree,Presupp).

But now the ambiguity problem comes back with a vengeance.  If we
try to find the presuppositions of an ambiguous sentence, say,
"I saw the man from the hill with the telescope", we can prove that
both the existence of the man from the hill, and also the
existence of the hill with the telescope are presupposed,
whereas these are alternative presuppositions corresponding
to alternative parses of the ambiguous sentence.
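
A sketch of how this comes out in Prolog, under stated assumptions: parse/2 below is a hand-written table of two bracketings standing in for a real parser, and the tree shapes and the exists/1 wrapper are invented for the illustration. Only presupposes/2 follows the clause given above.

```prolog
:- use_module(library(lists)).

% parse/2: hypothetical stand-in for a parser, listing two bracketings.
% Tree 1: "with the telescope" attaches to the verb phrase.
parse([i,saw,the,man,from,the,hill,with,the,telescope],
      s(np(i), vp(v(saw), np(the, man, pp(from, np(the, hill))),
                  pp(with, np(the, telescope))))).
% Tree 2: "with the telescope" attaches to "the hill".
parse([i,saw,the,man,from,the,hill,with,the,telescope],
      s(np(i), vp(v(saw), np(the, man,
                  pp(from, np(the, hill, pp(with, np(the, telescope)))))))).

% A definite noun phrase triggers an existence presupposition.
definite(np(the, _)).
definite(np(the, _, _)).

% presupp(Tree, P): P is triggered by some definite NP inside Tree.
presupp(Tree, exists(Tree)) :- definite(Tree).
presupp(Tree, P) :-
    compound(Tree),
    Tree =.. [_|Args],
    member(Sub, Args),
    presupp(Sub, P).

presupposes(Sentence, Presupp) :-
    parse(Sentence, Tree),
    presupp(Tree, Presupp).
```

Here presupposes(S, exists(np(the,man,pp(from,np(the,hill))))) and presupposes(S, exists(np(the,hill,pp(with,np(the,telescope))))) are both provable for the same S, although no single tree in the table contains both noun phrases.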

Of course there are a number of technical fixes for this problem,
so in practice there is no problem here.  But I think there is
a deeper issue here -- why doesn't the straight-forward approach
sketched above work?

Second, you can admit that we really do want to get disjunctive
consequents.  (This takes us out of Horn clause logic, but disjunctive
logic programming is possible).  Lexical ambiguity can be dealt with
by disjunctive axioms; e.g. the lexical entry for an ambiguous word
like "ball" might be something like:

lex(ball,n,X^(toy(X)&round(X)&...)) ; lex(ball,n,X^(dance(X)&social_event(X)&...)).

Such "disjunctive" axioms for ambiguous lexical entries express
the intuition that an ambiguous word means either one thing or
another thing, but not both things simultaneously. 

But I don't know how to write axioms that represent an (annotated) CFG
that result in syntactic ambiguities producing disjunctive consequents.
(Any suggestions?)

Thanks,

Mark Johnson
mj@cs.brown.edu
mark@adler.philosophie.uni-stuttgart.de

ted@nmsu.edu (Ted Dunning) (11/21/90)

In article <MARK.90Nov21120114@adler.philosophie.uni-stuttgart.de> mark@adler.philosophie.uni-stuttgart.de (Mark Johnson) writes:

   ... if we use a DCG grammar ... we can prove things like the
   following:

     S=[i,saw,a,man,with,a,telescope], means(S,M1), means(S,M2).

   where M1 represents a meaning where the man has a telescope,
   and M2 represents a meaning where the seeing is done with the
   telescope.

   That is, from our axioms we can prove that S means M1 *and* S means M2.
   The problem is that in the real world it *doesn't*: the English sentence
   S means M1 *or* M2.

huh?

without context, both interpretations of the sentence are perfectly
valid.  it may be that you want to add (implicitly, usually) another
axiom to your language understanding system that a sentence can have
only one meaning, go right ahead.  

not that prolog must in any sense reflect how people use language.

   For example, suppose we try to write a predicate presupposes/2 ...
   now the ambiguity problem comes back with a vengeance.  ... we can prove that
   both the existence of the man from the hill, and also the
   existence of the hill with the telescope are presupposed,
   whereas these are alternative presuppositions corresponding
   to alternative parses of the ambiguous sentence.

so?  why is this bad?

without context, both presuppositions _are_ possible.

there is nothing that says that prolog has to work like the inside of
somebody's head, nor that it must reflect truth.

   Lexical ambiguity can be dealt with
   by disjunctive axioms; e.g. the lexical entry for an ambiguous word
   like "ball" might be something like:

   lex(ball,n,X^(toy(X)&round(X)&...)) ; lex(ball,n,X^(dance(X)&social_event(X)&...)).

why not just

lex(ball,n,X^(toy(X)&round(X)&...)).
lex(ball,n,X^(dance(X)&social_event(X)&...)).

   Such "disjunctive" axioms for ambiguous lexical entries express
   the intuition that an ambiguous word means either one thing or
   another thing, but not both things simultaneously. 

not really.  i can still say lex(ball,_,M1), lex(ball,_,M2), M1 \== M2

   But I don't know how to write axioms that represent an (annotated) CFG
   that result in syntactic ambiguities producing disjunctive consequents.
   (Any suggestions?)

don't worry about it.  there isn't a problem.

--
I don't think the stories are "apocryphal".  I did it :-)  .. jthomas@nmsu.edu

morgan@bach.cogsci.uiuc.edu (Jerry Morgan) (11/22/90)

mark@adler.philosophie.uni-stuttgart.de (Mark Johnson) writes:

>What's worrying me is this: if we use a DCG grammar to
>define a relation means/2 true of a string of English words
>and some representation of its meaning (say, an encoding of
>a first-order formula), we can prove things like the
>following:

>  S=[i,saw,a,man,with,a,telescope], means(S,M1), means(S,M2).

>where M1 represents a meaning where the man has a telescope,
>and M2 represents a meaning where the seeing is done with the
>telescope.

>That is, from our axioms we can prove that S means M1 *and* S means M2.
>The problem is that in the real world it *doesn't*: the English sentence
>S means M1 *or* M2.

It seems to me the DCG system has got the facts right: it's true that
S means M1, and it's true that S means M2. What's missing is the
distinction between sentence and use of a sentence.

Even though (S means M1 and S means M2), on any particular occasion
of use where S is uttered by A, it's not true that (A means-to-convey M1
AND A means-to-convey M2); rather, (A means-to-convey M1 OR A m-to-c M2).

So you need some way of distinguishing sentence type and sentence token.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (11/23/90)

In article <MARK.90Nov21120114@adler.philosophie.uni-stuttgart.de>, mark@adler.philosophie.uni-stuttgart.de (Mark Johnson) writes:
> What's worrying me is this: if we use a DCG grammar to
> define a relation means/2 true of a string of English words
> and some representation of its meaning (say, an encoding of
> a first-order formula), we can prove things like the
> following:

>   S=[i,saw,a,man,with,a,telescope], means(S,M1), means(S,M2).

> where M1 represents a meaning where the man has a telescope,
> and M2 represents a meaning where the seeing is done with the
> telescope.

What this means is
	"M1 is a possible reading of S"
    and	"M2 is a possible reading of S"
which is true.  Where's the problem?

> That is, from our axioms we can prove that S means M1 *and* S means M2.

This is not a problem with DCGs, it's a problem with your English.
What you can prove is that M1 is a possible reading of S and that
M2 is a possible reading of S, and that's _true_.

> First, you can deny that natural language ambiguity really leads to 
> disjunctive, rather than conjunctive, consequents.

This is not a way out of the pseudo-problem.  You get means(S,M1)
and means(S,M2) as solutions precisely *because* means/2 is "disjunctive".

Let me give you an example.  There are two ways I can leave this office.
I can go through the door, or I can jump out the window (not a good idea,
I'm on the 11th floor).  So we have
	has_available_exit(ok, door).
	has_available_exit(ok, window).
meaning that I can go through the door OR I can go through the window.
From this we can prove
	?- has_available_exit(S, M1), has_available_exit(S, M2).
	S = ok,
	M1 = door,
	M2 = window
which *correctly* says that it is possible for me to go through the
door *and* that it is possible for me to go through the window.
No problem!  (As long as I _don't_ go through the window (:-).)
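
A small runnable version of the exit example, as a sketch. The helper available_exits/2 is a name introduced here for illustration and is not part of the posting.

```prolog
has_available_exit(ok, door).
has_available_exit(ok, window).

% available_exits(Who, Exits): Exits lists every exit open to Who,
% in clause order.  (Hypothetical helper.)
available_exits(Who, Exits) :-
    findall(E, has_available_exit(Who, E), Exits).
```

Note that the conjunctive query has four solutions in all: the first one a Prolog system reports is M1 = door, M2 = door, and the answer with M1 = door, M2 = window appears on backtracking. Asking ?- available_exits(ok, Es). gives Es = [door, window], one list holding the alternatives.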

>    S = [the,man,saw,the,woman], presupposes(S,exists(X,man(X))),
>    presupposes(S,exists(Y,woman(Y))), ...

> presupposes(Sentence,Presupp) :- parse(Sentence,Tree), presupp(Tree,Presupp).

> But now the ambiguity problem comes back with a vengeance.

There is NO ambiguity problem here, only a misreading.
presupposes(S, X^man(X))	gives you ONE presupposition, not all of them
presupposes(S, Y^woman(Y))	gives you ONE presupposition, not all of them.

If you want something that gives you *all* the presuppositions that are
implicit in a particular reading, you will have to write a predicate
that *DOES* that, i.e. that returns a structure representing a _set_ of
presuppositions, e.g.
	presupposes(S, [X^man(X),Y^woman(Y)])
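
One way such a predicate might be written, as a sketch only. The names reading/2 and presupposes_all/2 and the tree shape are assumptions made for the example; reading/2 stands in for a parser that yields one tree per reading.

```prolog
:- use_module(library(lists)).

% reading/2: hypothetical stand-in for a parser, one tree per reading.
reading([the,man,saw,the,woman],
        s(np(the, man), vp(v(saw), np(the, woman)))).

% presupp(Tree, P): one existence presupposition triggered inside Tree.
presupp(np(the, N), X^G) :- G =.. [N, X].
presupp(Tree, P) :-
    compound(Tree),
    Tree =.. [_|Args],
    member(Sub, Args),
    presupp(Sub, P).

% presupposes_all(S, Ps): Ps is the list of all presuppositions of ONE
% reading of S; backtracking moves on to the next reading, if any.
presupposes_all(S, Ps) :-
    reading(S, Tree),
    findall(P, presupp(Tree, P), Ps).
```

For the sentence above, presupposes_all/2 returns a two-element list of the form [X^man(X), Y^woman(Y)], with fresh variables. For an ambiguous sentence it would return one such list per reading, so the alternatives still arrive as separate solutions.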

> But I think there is
> a deeper issue here -- why doesn't the straight-forward approach
> sketched above work?

Because it's WRONG.  It simply doesn't say what you think it says.
This has nothing to do with DCGs or Prolog as such.  It's a question
of how to use logic to say what you mean.  What you are saying is not
what you mean.

> Second, you can admit that we really do want to get disjunctive
> consequents.

I don't know what "we" want, but in your examples it's not been what
_you_ need.
-- 
I am not now and never have been a member of Mensa.		-- Ariadne.

mark@adler.philosophie.uni-stuttgart.de (Mark Johnson) (11/24/90)

Richard A. O'Keefe (ok@goanna.cs.rmit.oz.au) seems to have
misunderstood my original posting, and interpreted it as
a "How can I do this in Prolog?" question or even a 
"See how stupid X is in Prolog!" comment.

As I said at the start of my original message, this is a
question of *conceptualization*.  I am *not* suggesting that
the DCG axioms are wrong, but that they are based on a conceptualization
of grammar that differs from one in which ambiguity is represented
by a disjunction of alternative readings or trees.  Much recent
work on feature structures represents, say, lexical ambiguity as
a *disjunction* of possible feature values, for example, so
I don't think this conception is incoherent.

I agree that there is a coherent interpretation of DCGs where
predicates like my "means" are interpreted as "is-a-possible-meaning-of",
but so what?  This is *not* the conception I want to axiomatize.

>> First, you can deny that natural language ambiguity really leads to
>> disjunctive, rather than conjunctive, consequents.

But of course that is exactly what you are doing - you are saying
that under the appropriate conception the consequents *are* conjunctive.

Let's see where that gets you...

>If you want something that gives you *all* the presuppositions that are
>implicit in a particular reading, you will have to write a predicate
>that *DOES* that, i.e. that returns a structure representing a _set_ of
>presuppositions, e.g.
>	presupposes(S, [X^man(X),Y^woman(Y)])

So if S is ambiguous, we get something like

	presupposes(S, [P1,P2,P3]), presupposes(S, [Q1,Q2]).

How can we read this in English?

	S presupposes P1 and P2 and P3, *or* it presupposes Q1 and Q2.

Isn't it a little funny that the set construction is used to
express English conjunction, and the conjunction connective is
used to express English disjunction?

I think I know what Richard will say - we must also interpret
"presupposes" as "is a possible set of presuppositions of".

That works, but what if that isn't the interpretation I am
interested in?  Let me be clear: I am not after a Prolog program
that "solves my problems", I am after an axiomatization that
expresses my conceptualization, or some principled explanation
of why such an axiomatization does not exist.

Note, by the way, what I am *not* saying.  I am *not* saying that
there is anything wrong with doing things the Prolog way: given the
limitations of Horn logic (no disjunctive conclusions) it's
clear that something like this has to be done if you want to
use Prolog.

But in full first-order logic it is a little strange to have to
interpret conjunction as disjunction, given that the language
has the means to express disjunction directly.

>> But I think there is
>> a deeper issue here -- why doesn't the straight-forward approach
>> sketched above work?

>Because it's WRONG.  It simply doesn't say what you think it says.
> ... What you are saying is not what you mean.

Huh?  I realized before I posted the article that the DCG axioms 
don't conceptualize the "means" relation the way I do; that's
*why* I posted the article.  I am asking for *other axioms*
that do express my conceptualization, or else principled
reasons for why such an axiomatization does not exist.

If you think that my conceptualization is "WRONG", maybe you
might like to explain why?  The existence of another conceptualization
does *not* show this.

Addenda:  
  1.  In my previous article I forgot to put in an
axiom that requires "means-in-context" (or whatever it was called)
to be single-valued.

  2.  Jochen Doerre and Andreas Eisle here at the IMSV have pointed
out that my disjunctively formulated lexical entries don't correspond
to my intended interpretation, since the same word can appear multiple
times in one utterance.

jtl@humanist.uio.no (Jan Tore Loenning) (11/28/90)

From following this discussion I am still not sure what the 
problem is, but here is my contribution to the confusion.

First, I think we agree that a sentence is not a string of 
words but a highly structured object (sign/complex AVM 
containing all kinds of information/c+f-
structure/d+s+...structure/etc. depending on your favourite 
linguistic model).  It is this structure that has an 
associated meaning (and an associated string), not the 
string.  (If the string has a meaning I see no problem in 
claiming that it has more than one).  The "mean" relation 
between string and meaning can then be taken to be a 
derived one

mean(String, Meaning) :- hsf(Struc, String), carries(Struc, Meaning).

where hsf(Struc, String) can be read "Struc has the surface 
form String".

A minimal assumption about the sentence structure is that one 
can derive the string and the meaning from it:

hsf(a,b) & hsf(a,c) -> b=c
carries(a,b) & carries(a,c) -> b=c

If one then for a particular string, s, makes the call

hsf(Struc, s), carries(Struc, M1), carries(Struc, M2)

M1 and M2 have to return the same value.  The call that 
makes them return different values

mean(s,M1), mean(s,M2)

is equivalent to

hsf(Struc1, s), carries(Struc1, M1), hsf(Struc2, s), carries(Struc2, M2)

It means something different, and it is not the one we are 
after.
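
The difference between the two calls can be checked on a toy database. This is a sketch: the structure names t_np and t_vp and the meaning terms are hypothetical placeholders for two full analyses of one string, and the facts are chosen so that the two uniqueness assumptions above hold.

```prolog
% Two structures (hypothetical names) sharing one surface form.
hsf(t_np, [i,saw,a,man,with,a,telescope]).
hsf(t_vp, [i,saw,a,man,with,a,telescope]).

% Each structure carries exactly one meaning.
carries(t_np, see(i, having(man, telescope))).
carries(t_vp, see(i, man, with(telescope))).

mean(String, Meaning) :-
    hsf(Struc, String),
    carries(Struc, Meaning).
```

With S bound to the word list, the goal hsf(Struc, S), carries(Struc, M1), carries(Struc, M2) only ever binds M1 and M2 to the same term, whereas mean(S, M1), mean(S, M2), M1 \== M2 succeeds because the two subgoals are free to pick different structures.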


Someone might say that I am begging the question here.  
Isn't it the same type of problem with the call

hsf(Struc1,s), hsf(Struc2,s).

How can the same string have two associated structures?  
The easiest is to answer by an analogy.  Suppose Ann and 
Mary both have red hair.  This can be represented by

red_hair(mary).
red_hair(ann).

But an equally good representation seems to be

has_hair(mary, red_hair).
has_hair(ann, red_hair).

Everything works fine if we ask something like "Who has red 
hair?"   Then it is ok to get the answer "Ann and Mary".  
But we get problems if we saw someone with red hair and we 
ask who we saw.  Now, "Ann or Mary" is correct, but "Ann 
and Mary" is not.  This is not particular to "see" (no 
intentionality lurking around).  Given a portion of red 
hair we cannot use the prolog representation to answer who 
it belongs to (again both girls).  The point here is that 
in

has_hair(mary, red_hair).

red_hair does not refer to a specific object, the girls 
don't have the same hair, it refers to a certain property 
the girls share.

Similarly, I think the string of words we give to our DCG-
parser does not represent a particular object, but a 
property that several sentences share.  If the goal is to get, from 
a string s, the result "struc1 or struc2 or ... or strucn", where 
these are all the structures such that hsf(struci, s), then this can 
be compared to the following task: from an instance r of red hair 
and knowledge of who is red-haired, determine whom this hair belongs 
to and get the answer "mary or ann or ...".  One way to obtain this 
is by the axiom

(i) instance_of(r, red_hair) -> 
             Ey(has_hair(y, red_hair) & belongs_to(r, y))

(here E is the existential quantifier) and an axiom of the 
form

(ii) Ay(has_hair(y, red_hair) <-> y=ann or y=mary or .. or 
                                       y=bill)

The list of names must be exhaustive.  From (i) and (ii), 
(iii) follows

(iii) instance_of(r, red_hair) -> 
             (belongs_to(r, ann) or ... or belongs_to(r, bill))

I don't know whether simpler solutions are possible.  We can 
think of a similar axiomatization of grammars.  But it is not 
obvious how one should give a Prolog implementation of the existential 
quantifier in (i), in particular not in combination with (ii).
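
One Horn-clause approximation, offered only as a sketch: treat the database as complete (the closed-world reading plays the role of (ii)) and collect the candidates with findall/3, reading the resulting list as the disjunction in (iii). The predicate possible_owners/2 and the three has_hair/2 facts are assumptions for the example. This does not prove (iii) inside the logic; the disjunction lives in how the list is read.

```prolog
% Assumed to be the complete list of red-haired people (closed world).
has_hair(ann, red_hair).
has_hair(mary, red_hair).
has_hair(bill, red_hair).

instance_of(r, red_hair).

% possible_owners(R, Owners): each member Y of Owners corresponds to one
% disjunct belongs_to(R, Y) of (iii).  (Hypothetical helper.)
possible_owners(R, Owners) :-
    instance_of(R, Kind),
    findall(Y, has_hair(Y, Kind), Owners).
```

The query ?- possible_owners(r, Os). gives Os = [ann, mary, bill], to be read as "r belongs to ann or mary or bill".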

Jan Tore Loenning
University of Oslo
jtl@ulrik.uio.no