nl-kr-request@CS.ROCHESTER.EDU (NL-KR Moderator Brad Miller) (11/11/87)
NL-KR Digest             (11/11/87 02:53:50)            Volume 3 Number 46

Today's Topics:
        Re: Practical effects of AI (speech)
        power of Montague syntax
        Re: Langendoen and Postal (posted by: B
        Re: Why can't my cat talk?

----------------------------------------------------------------------

Date: Tue, 3 Nov 87 09:29 EST
From: George Tatge <gt@hpfcmp.HP.COM>
Subject: Re: Practical effects of AI (speech)

>Those of us who work on speech will be very encouraged by this enthusiasm.
>However,
>
>(1) Speaker-independent continuous speech is much farther from reality
>    than some companies would have you think.  Currently, the best
>    speech recognizer is IBM's Tangora, which makes about 6% errors
>    on a 20,000-word vocabulary.  But the Tangora is for speaker-
>    dependent, isolated-word, grammar-guided recognition in a benign
>    environment.  Each of these four constraints cuts the error rate
>    by 3 or more times if used independently.  I don't know how well
>    they will do if you remove all four constraints, but I would guess
>    about 70% error rate.  So while speech recognition has made a lot
>    of advancements, it is still far from usable in the application you
>    mentioned.
>
>Kai-Fu Lee
>Computer Science Department
>Carnegie-Mellon University
>----------

Just curious what the definition of "best" is.  For example, I have seen
6% error rates and better on grammar-specific, speaker-dependent,
continuous speech recognition.  I would guess that for some applications
this is better than the "best" described above.

George (floundering in superlative ambiguity) Tatge

------------------------------

Date: Sun, 8 Nov 87 12:14 EST
From: Kai-Fu Lee <kfl@SPEECH2.CS.CMU.EDU>
Subject: Re: Practical effects of AI (speech)

In article <930001@hpfcmp.HP.COM>, gt@hpfcmp.HP.COM (George Tatge) writes:
> > >(1) Speaker-independent continuous speech is much farther from reality
> > ...
> >Kai-Fu Lee
>
> Just curious what the definition of "best" is.
> For example, I have seen
> 6% error rates and better on grammar-specific, speaker-dependent,
> continuous speech recognition.  I would guess that for some applications
> this is better than the "best" described above.

"Best" is not measured in terms of error rate alone.  More effort and new
technologies have gone into IBM's system than into any other system, and
I believe that it will do better than any other system on a comparable
task.  I guess this definition is subjective, but I think if you asked
other speech researchers, you would find that most people believe the
same.

I know many commercial (and research) systems have error rates lower than
6%.  But you have to remember that the IBM system works on a 20,000-word
vocabulary, and its grammar is a very loose one, accepting arbitrary
sentences in office correspondence.  Their grammar has a perplexity
(roughly speaking, the number of choices at each decision point) of
several hundred.  Nobody else has such a large vocabulary or such a
difficult grammar.

IBM has experimented with tasks like the one you mentioned.  In 1978,
they tried a 1000-word task with a very tight grammar (perplexity = 5?),
the same task CMU used on Hearsay and Harpy.  They achieved a 0.1% error
rate.

> George (floundering in superlative ambiguity) Tatge

Kai-Fu Lee

------------------------------

Date: Tue, 3 Nov 87 09:36 EST
From: Greg Lee <lee@uhccux.UUCP>
Subject: power of Montague syntax

I posted a question about the power of Montague syntax.  I guess the
answer is obvious.  No assumptions constrain the functions which
determine the form of phrases given the forms of their parts.  So such
functions could be specified as the product of a list of transformations,
or a Turing machine, for that matter.

So why is it that Montague grammar is widely regarded as a
non-transformational model?  Am I missing something?
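The point raised above -- that nothing constrains the functions combining
phrases -- can be made concrete with a small sketch.  The "rules" below are
hypothetical toy examples, not anything from Montague's own fragments: once
a syntactic operation may be any computable string function, a rule can do
things (such as reversing a daughter) that no ordinary phrase-structure
rule can express.

```python
# Toy illustration (hypothetical rules, not Montague's actual fragment):
# if the functions that build a phrase from its parts are unconstrained,
# any computable string operation can serve as a "syntactic rule".

def concat(a: str, b: str) -> str:
    # An ordinary, harmless combination operation: concatenation.
    return a + " " + b

def mirror(a: str, b: str) -> str:
    # A perverse but, on the unconstrained view, perfectly legal rule:
    # it reverses its first daughter character by character.  No single
    # phrase-structure rule behaves like this, yet nothing in the bare
    # formalism rules it out.
    return a[::-1] + " " + b

print(concat("the", "cat"))   # the cat
print(mirror("the", "cat"))   # eht cat
```

Since arbitrary functions of this kind can simulate any computation step
by step, a "grammar" built from them is as powerful as a Turing machine,
which is exactly the worry about calling the model non-transformational.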
Greg Lee, lee@uhccux.uhcc.hawaii.edu

------------------------------

Date: Tue, 3 Nov 87 18:14 EST
From: Jeffrey Goldberg <goldberg@russell.STANFORD.EDU>
Subject: Re: power of Montague syntax

In article <1057@uhccux.UUCP> lee@uhccux.UUCP (Greg Lee) writes:
>No assumptions constrain the functions
>which determine the form of phrases given the forms of their parts.
>So such functions could be specified as the product of a list
>of transformations, or a Turing machine, for that matter.
>So why is it that Montague grammar is widely regarded as a
>non-transformational model?  Am I missing something?
> Greg Lee, lee@uhccux.uhcc.hawaii.edu

Montague grammar is more or less a theory of semantics, though many of
its practitioners use some form of categorial grammar, or a PSG with
wrapping.  But MG as defined in "Introduction to Montague Semantics" by
D. Dowty, R. Wall, and S. Peters places no restriction on the syntactic
combination of elements and is very likely Turing-equivalent.

-jeff goldberg
--
Jeff Goldberg
ARPA goldberg@russell.stanford.edu
UUCP ...!ucbvax!russell.stanford.edu!goldberg

------------------------------

Date: Wed, 4 Nov 87 10:33 EST
From: Paul Neubauer <neubauer@bsu-cs.UUCP>
Subject: Re: power of Montague syntax

In article <1057@uhccux.UUCP>, lee@uhccux.UUCP (Greg Lee) writes:
> I posted a question about the power of Montague syntax.  I guess
> the answer is obvious.  No assumptions constrain the functions
> which determine the form of phrases given the forms of their parts.
> So such functions could be specified as the product of a list
> of transformations, or a Turing machine, for that matter.
>
> So why is it that Montague grammar is widely regarded as a
> non-transformational model?  Am I missing something?

I don't think you're missing anything.  If the functions are not defined,
then they could be anything, but since they are not defined to be
transformational-type rules, Montague grammar is not [explicitly] a
transformational grammar.
Unfortunately, we just don't know what it IS.  To be fair, though, I
don't think that the question of weak generative power worries most
Montague grammarians.  In fact, I don't think it really worries ME.  I
suppose that my former life as a Generative Semanticist has jaded me on
that question, but I can't get excited about the weak generative power of
an under-defined class of grammars when what I see as relevant is the
strong generative power of a particular, substantively defined grammar or
class of grammars.  [I use "substantive" in a more substantive sense than
Chomsky, whose "substantive" universals I still consider mostly formal.]

--
Paul Neubauer
UUCP: <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!neubauer

------------------------------

Date: Wed, 4 Nov 87 01:31 EST
From: goldfain@osiris.cso.uiuc.edu
Subject: Re: Langendoen and Postal (posted by: B

> /* Written 10:34 am Nov 1, 1987 by berke@CS.UCLA.EDU in comp.ai */
> /* ---------- "Langendoen and Postal (posted by: B" ---------- */
> I just read this fabulous book over the weekend, called "The Vastness
> of Natural Languages," by D. Terence Langendoen and Paul M. Postal.
> ...
> Their basic proof/conclusion holds that natural languages, as
> linguistics construes them (as products of grammars), are what they call
> mega-collections, which Quine calls proper classes, and which some
> people hold cannot exist.  That is, they maintain that (1) sentences
> cannot be excluded from being of any, even transfinite, size by the laws
> of a grammar, and (2) collections of these sentences are bigger than
> even the continuum.  They are the size of the collection of all sets:
> too big to be sets.
> ...
> /* End of text from osiris.cso.uiuc.edu:comp.ai */

Hang on a minute!  It *sounds* as though you are talking about
Context-Free Grammars/Languages (CFGs/CFLs) here.  Most linguists (I'd
wager) set up their CFGs as admitting only finite derivations over a
finite set of production rules, each rule only allowing finite expansion.
Thus, although usually a CFL is only a proper subset of this, we are
ALWAYS working WITHIN the set of finite strings (of arbitrary length)
over a finite alphabet.  Such a set is countably infinite.  Far from
being a proper class, this is a very manageable set.

If you move the discussion up to the cardinality of the set of
"discourses", which would be finite sequences of strings in the language,
you are still only up to the power set of the integers, which has the
same cardinality as the set of real numbers.  Again, this is a set, and
not a proper class.

I haven't seen the book you cite.  They must make some argument as to why
they think natural languages (or linguistic theories about them) admit
infinite sentences.  Even given that, we would have only the reals (i.e.,
the "continuum") as a cardinality, without some further surprising
claims.  Can you summarize their argument (if it exists)?

Mark Goldfain
arpa: goldfain@osiris.cso.uiuc.edu
Department of Computer Science
University of Illinois at Shampoo-Banana

------------------------------

Date: Fri, 6 Nov 87 20:47 EST
From: Mitchell Spector <spector@suvax1.UUCP>
Subject: Re: Langendoen and Postal (posted by: B

In article <8300011@osiris.cso.uiuc.edu>, goldfain@osiris.cso.uiuc.edu
comments on an article by berke@CS.UCLA.EDU:
> > /* Written 10:34 am Nov 1, 1987 by berke@CS.UCLA.EDU in comp.ai */
> > /* ---------- "Langendoen and Postal (posted by: B" ---------- */
> > /* End of text from osiris.cso.uiuc.edu:comp.ai */
>
> Hang on a minute!  It *sounds* as though you are talking about
> Context-Free Grammars/Languages (CFGs/CFLs) here...

The set of all finite sequences of finite strings in a language (the set
of "discourses") is still just a countably infinite set (assuming that
the alphabet is finite or countably infinite, of course).  The set of
infinite sequences of finite strings is uncountable, with the same
cardinality as the set of real numbers, as is the set of infinite
strings.
(By an infinite string or infinite sequence, I mean an object which is
indexed by the natural numbers 0, 1, 2, ....)

In general, sets of finite objects are finite or countably infinite.  (A
finite object is, vaguely speaking, one that can be identified by means
of a finite representation.  More specifically, this finite
representation or description must enable you to distinguish this object
from all the other objects in the set.)  If you want to get an
uncountable set, you must use objects which are themselves infinite as
members of the set.

Many people lose sight of the fact that a real number is an infinite
object (although an integer or a rational number is a finite object).
Any general method of identifying real numbers must use infinitely long
or large representations (for example, decimals, continued fractions,
Cauchy sequences, or Dedekind cuts).  Real numbers are much more
difficult to pin down than one might gather from many math classes.  This
misimpression is partly due to the fact that one deals only with a
relatively small (finite!) set of specific real numbers; these either
have their own names in mathematics or they can be defined by a finite
sequence of symbols in the usual mathematical notation.  The other real
numbers belong to a nameless horde which we use in general arguments but
never by specific mention.

I certainly agree with the general objections raised to the idea that
natural languages are uncountably large (or, worse yet, proper classes),
although I haven't read the book in question.  Maybe somebody can state
more precisely what the book claimed, but at first glance it seems to
indicate a lack of understanding of modern set theory.

By the way, logicians do study infinite languages, including both the
possibility of infinitely many symbols and that of infinitely long
sentences, but such languages are very different from what we think of
as "natural language."
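The countability claim above can be illustrated directly.  The sketch
below is just the standard length-then-alphabetical enumeration: it lists
every finite string over a finite alphabet, so each string -- however
long -- appears at some finite position, which is exactly what it means
for the set to be countable.

```python
from itertools import count, product

def enumerate_strings(alphabet):
    """Yield every finite string over `alphabet`, shortest first.

    Within each length, strings come out in alphabet order, so any
    given finite string is reached after finitely many steps.
    """
    for length in count(0):                          # lengths 0, 1, 2, ...
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)

# The first few strings over the two-letter alphabet {a, b}:
gen = enumerate_strings("ab")
print([next(gen) for _ in range(7)])
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

Note that this only enumerates finite strings; no such listing exists for
the set of *infinite* strings, which is why that set is uncountable, as
the message above says.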
It doesn't matter whether you're talking about context-free languages or
more general sorts of languages -- in any language used by people for
communication, the alphabet is finite, each word is finitely long, and
each sentence is finitely long.

-- Mitchell Spector                                        |"Give me a
Dept. of Computer Science & Software Eng., Seattle Univ.   | ticket to
Path: ...!uw-beaver!uw-entropy!dataio!suvax1!spector       | Mars!!"
or: dataio!suvax1!spector@entropy.ms.washington.edu        | -- Zippy the Pinhead

------------------------------

Date: Wed, 4 Nov 87 08:49 EST
From: necntc!adelie!mirror!ishmael!inmet!justin@ames.arpa
Subject: Re: Why can't my cat talk?

/* Written 4:03 pm Oct 31, 1987 by roberts@cognos.UUCP in inmet:comp.ai */
Should this crystallization hypothesis prove true, what does this tell us
about gorillas?  And is AMSLAN, in which I understand at least one
gorilla has achieved not only a considerable vocabulary but a remarkable
proficiency at combining "symbols" to denote new concepts, a natural
language?  That is to say, does mastery of a sign language require the
same brain functions as those required to speak a natural language?
/* End of text from inmet:comp.ai */

As I understand it, AMSLAN is, in fact, a proper natural language.  The
rub is that the gorillas learning it have only learned it to a point.
AMSLAN has its own particular syntax, and that seems to be the sticking
point.  While the gorillas seem perfectly able to learn the concepts, and
are able to stick them together, they don't seem to be able to understand
sophisticated *syntax* (beyond two-word combinations).  Just what this
implies about cognition, I'm not sure.

-- Justin du Coeur

------------------------------

Date: Wed, 4 Nov 87 10:20 EST
From: Alan Lovejoy <alan@pdn.UUCP>
Subject: Re: Why can't my cat talk?
In article <576@russell.STANFORD.EDU> goldberg@russell.UUCP (Jeffrey
Goldberg) writes:
/[I allege that most languages (especially primitive ones) rely more on
/ morphology than word order to encode syntax]
/I will take your claim seriously if you do the following:
/
/(1) Devise a sampling method that factors out things that should
/    be factored out.  (A linguist named Matthew Dryer has done
/    some excellent work on this problem, and has constructed a
/    method that I would certainly trust.)
/
/(2) Provide a definition of "primitive" which would yield the same
/    result when applied by a number of anthropologists.  (That is,
/    your definition must be explicit enough so that an arbitrary
/    anthropologist could determine what is "primitive".)
/
/(3) Provide a definition of whatever grammatical property you wish
/    to test for which would yield the same result when applied by a
/    number of linguists.  (That is, your definition must be
/    explicit enough so that an arbitrary linguist could tell
/    whether it is "free word order" (or whatever).)
/
/(4) Apply standard statistical techniques to determine
/    significance.
/
/Until you do something like that, your claim is like claiming
/"People with big feet like tomatoes" and basing this on the fact
/that you have met a couple of families with bigger feet than yours who
/served spaghetti with tomato sauce, and one even put tomatoes in the
/salad.

Excuse me, but I believe you were the one to propose a new theory
relating hand-movement sequences to syntax, and you are the one
publishing a paper expounding your theory.  Why should I do your research
work for you?  I was happy to point out the likely form of the attacks
you would receive once your paper is published, but I get paid to do
software engineering, not to publish research in cultural anthropology
(might be fun, but it doesn't pay enough :-) ).

--alan@pdn

------------------------------

Date: Thu, 5 Nov 87 12:00 EST
From: Elizabeth D. Zwicky <zwicky@dormouse.cis.ohio-state.edu>
Subject: Re: Why can't my cat talk?

In article <8986@shemp.UCLA.EDU> srt@CS.UCLA.EDU (Scott Turner) writes:
>Hypotheses drawn from
>degenerate cases like Genie need to be carefully tested in the normal
>adult population before they can be given any serious consideration.
>
> Scott R. Turner

Give me a break here.  You CANNOT test hypotheses about whether or not
there is a crystallization period after which language cannot be learned
without dealing with degenerate cases.  The case of someone who has been
deprived of all language contact for n years, starting at birth, whether
n is 2, 5, or 12, will always be a degenerate case.  Certainly, Genie is
not conclusive evidence, and such cases are (thank God!) rare, so the
evidence remains inconclusive.  However, in all known cases, children
deprived of language contact can still learn languages normally if they
start before puberty.  The idea of a crystallization period is supported
by the data about second-language learning in normal humans, but the
question I was answering was about learning of *first* languages.

Elizabeth Zwicky

------------------------------

Date: Fri, 6 Nov 87 07:07 EST
From: srt@CS.UCLA.EDU
Subject: Re: Why can't my cat talk?

In article <1125@tut.cis.ohio-state.edu>
zwicky@dormouse.cis.ohio-state.edu (Elizabeth D. Zwicky) writes:
>In article <8986@shemp.UCLA.EDU> srt@CS.UCLA.EDU (Scott Turner) writes:
>> Hypotheses drawn from
>>degenerate cases like Genie need to be carefully tested in the normal
>>adult population before they can be given any serious consideration.
>
>Give me a break here.

Take two, they're cheap.

> ...You CANNOT test hypotheses about whether or not
>there is a crystallization period after which language cannot be learned
>without dealing with degenerate cases.

Huh?  Studies about second-language learning and use clearly bear on this
question.  I don't consider adults who can learn a second language
degenerate.
(Well, no more degenerate than the average adult :-).  I agree with your
points, by the way.  I'm just cautioning against building models based on
people like Genie without having separate, confirming evidence that the
model is reasonable for normal people.

Scott R. Turner
UCLA Computer Science "Delving into mockery science"
Domain: srt@cs.ucla.edu
UUCP:  ...!{cepu,ihnp4,trwspp,ucbvax}!ucla-cs!srt

------------------------------

End of NL-KR Digest
*******************