berke@ucla-cs.UUCP (11/01/87)
I just read this fabulous book over the weekend, called "The Vastness
of Natural Languages," by D. Terence Langendoen and Paul M. Postal.

If you have read this, I have some questions, and could use some help,
especially on the more linguistic aspects of the book.

Are Langendoen or Postal on the net somewhere? They might be in England;
the publisher is Blackwell, 1984.

Their basic proof/conclusion holds that natural languages, as linguistics
construes them (as products of grammars), are what they call mega-collections,
Quine calls proper classes, and some people hold cannot exist. That is,
they maintain that (1) Sentences cannot be excluded from being of any,
even transfinite size, by the laws of a grammar, and (2) Collections of
these sentences are bigger than even the continuum. They are the size
of the collection of all sets: too big to be sets.

It's wonderfully written. Clear wording, proofs, etc. Good reading. Help!

Regards,
Pete
rapaport@sunybcs.uucp (William J. Rapaport) (11/02/87)
In article <8941@shemp.UCLA.EDU> berke@CS.UCLA.EDU (Peter Berke) writes:
>I just read this fabulous book over the weekend, called "The Vastness
>of Natural Languages," by D. Terence Langendoen and Paul M. Postal.
>
>Are Langendoen or Postal on the net somewhere?

Langendoen used to be on the net as:

    tergc%cunyvm@wiscvm.wisc.edu

but he's moved to, I think, U of Arizona. Postal, I think, used to be at
IBM Watson.
turpin@ut-sally.UUCP (Russell Turpin) (11/02/87)
In article <8941@shemp.UCLA.EDU>, berke@CS.UCLA.EDU writes:
> I just read this fabulous book over the weekend, called "The Vastness
> of Natural Languages," by D. Terence Langendoen and Paul M. Postal.
> ...
>
> Their basic proof/conclusion holds that natural languages, as linguistics
> construes them (as products of grammars), are what they call mega-collections,
> Quine calls proper classes, and some people hold cannot exist. That is,
> they maintain that (1) Sentences cannot be excluded from being of any,
> even transfinite size, by the laws of a grammar, and (2) Collections of
> these sentences are bigger than even the continuum. They are the size
> of the collection of all sets: too big to be sets.

Let me switch contexts. I have not read the above-mentioned book,
but it seems to me that this claim is just plain wrong. I would
think a minimum requirement for a sentence in a natural language
is that some person who knows the language can read and
understand the sentence in a finite amount of time. This would
exclude any infinitely long sentences.

Perhaps less obviously, it also excludes infinite languages. The
reason is that there will never be more than a finite number of
people (ET's included), and that each will fail to parse sentences
beyond some maximum length, given a finite life for each. (I am not
saying that natural languages include only those sentences that
are in fact spoken and understood, but that only those sentences
that could be understood are included.)

In this view, infinite languages are solely a mathematical construct.

Russell
lee@uhccux.UUCP (Greg Lee) (11/03/87)
In article <9445@ut-sally.UUCP> turpin@ut-sally.UUCP (Russell Turpin) writes:
>In article <8941@shemp.UCLA.EDU>, berke@CS.UCLA.EDU writes:
>> I just read this fabulous book over the weekend, called "The Vastness
>> of Natural Languages," by D. Terence Langendoen and Paul M. Postal.
>> ...
>
>Let me switch contexts. I have not read the above-mentioned book,
>but it seems to me that this claim is just plain wrong. I would
> ...
>also excludes infinite languages. The reason is that there will
>never be more than a finite number of people (ET's included), and
> ...
>Russell

Although the number of sentences in a natural language might be
finite, the most appropriate model for human language processing
might reasonably assume the contrary. Suppose, for instance, that
we wish to compare the complexities of various languages with
regard to how easily they could be used by humans, and that we
take the number of phrase structure rules in a phrase structure
grammar as a measure of such complexity. A grammar to generate
100,000 sentences of the pattern "Oh boy, oh boy, ...!" would be
much more complex than a grammar to generate an infinite number
of such sentences. And the pattern seems easy enough to learn ...

Concerning the length of sentences, I think Postal and Langendoen
are not very persuasive. Most of their arguments are to the effect
that previously given attempted demonstrations that sentences cannot
be of infinite length are incorrect. I think they make that point
very well. But obviously this is not enough to show that one should
assume some sentences of infinite length.

        Greg Lee, lee@uhccux.uhcc.hawaii.edu
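
An illustrative aside on the rule-counting comparison above. This is only a sketch: the grammar encoding, the function names, and the choice of a naive "one nonterminal per remaining repetition" bounded grammar are assumptions made for illustration, not anything taken from the post or the book. Cleverer encodings can shrink the bounded grammar considerably, so the sketch shows the direction of the comparison rather than its exact size.

```python
def unbounded_grammar():
    """Recursive grammar: S -> "oh boy" | "oh boy" "," S (no length limit)."""
    return {"S": [["oh boy"], ["oh boy", ",", "S"]]}


def bounded_grammar(n):
    """Naive grammar for 1..n repetitions (start symbol is S<n>).

    S<k> -> "oh boy" | "oh boy" "," S<k-1>, with S1 -> "oh boy" only.
    """
    rules = {}
    for k in range(1, n + 1):
        alternatives = [["oh boy"]]
        if k > 1:
            alternatives.append(["oh boy", ",", f"S{k - 1}"])
        rules[f"S{k}"] = alternatives
    return rules


def rule_count(grammar):
    """Count productions, the complexity measure used in the post."""
    return sum(len(alternatives) for alternatives in grammar.values())


if __name__ == "__main__":
    print(rule_count(unbounded_grammar()))       # rules for the infinite pattern
    print(rule_count(bounded_grammar(100_000)))  # rules for exactly 100,000 sentences
```

Under this naive encoding the bounded grammar needs 2n - 1 productions for n sentences, while the recursive one needs two regardless of how many sentences it generates.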
djh@beach.cis.ufl.edu (David J. Hutches) (11/03/87)
In article <9445@ut-sally.UUCP> turpin@ut-sally.UUCP (Russell Turpin) writes:
>In article <8941@shemp.UCLA.EDU>, berke@CS.UCLA.EDU writes:
>> ... That is,
>> they maintain that (1) Sentences cannot be excluded from being of any,
>> even transfinite size, by the laws of a grammar, and (2) Collections of
>> these sentences are bigger than even the continuum. They are the size
>> of the collection of all sets: too big to be sets.
>
>... I would
>think a minimum requirement for a sentence in a natural language
>is that some person who knows the language can read and
>understand the sentence in a finite amount of time. This would
>exclude any infinitely long sentences.
>
>Russell

Because of the processing capabilities of human beings (actually, on a
person-by-person basis), sentences of greater and greater length (and
complexity) are more and more difficult to understand. Past a certain
point, a human being will go into cognitive overload when asked to process
a sentence which his or her capacities (short-term memory, stack space,
whatever you want to call it) are not designed to handle. What the human
being can, in practice, process and what is *possible* in a language are
two different things.

I think that it is the case that some theories of language/grammar explain
the production of sentences which are grammatical by use of a generative
model. In such a model, it is possible to generate sentences of potentially
infinite length, even though it would not be possible for a human being to
understand them.

== David J. Hutches                               CIS Department            ==
==                                                University of Florida     ==
== Internet: djh@beach.cis.ufl.edu                Gainesville, FL  32611    ==
== UUCP: ...{ihnp4,rutgers}!codas!ufcsv!ufcsg!djh (904) 335-8049            ==
goldfain@osiris.cso.uiuc.edu (11/04/87)
> /* Written 10:34 am Nov 1, 1987 by berke@CS.UCLA.EDU in comp.ai */
> /* ---------- "Langendoen and Postal (posted by: B" ---------- */
> I just read this fabulous book over the weekend, called "The Vastness
> of Natural Languages," by D. Terence Langendoen and Paul M. Postal.
> ...
> Their basic proof/conclusion holds that natural languages, as linguistics
> construes them (as products of grammars), are what they call
> mega-collections, Quine calls proper classes, and some people hold cannot
> exist. That is, they maintain that (1) Sentences cannot be excluded from
> being of any, even transfinite size, by the laws of a grammar, and (2)
> Collections of these sentences are bigger than even the continuum. They are
> the size of the collection of all sets: too big to be sets.
> ...
> /* End of text from osiris.cso.uiuc.edu:comp.ai */

Hang on a minute! It *sounds* as though you are talking about Context-Free
Grammars/Languages (CFGs/CFLs) here. Most linguists (I'd wager) set up their
CFGs as admitting only finite derivations over a finite set of production
rules, each rule only allowing finite expansion. Thus, although usually a CFL
is only a proper subset of this, we are ALWAYS working WITHIN the set of
finite strings (of arbitrary length) over a finite alphabet.

Such a set is countably infinite. Far from being a proper class, this is
a very manageable set. If you move the discussion up to the cardinality of
the set of "discourses", which would be finite sequences of strings in the
language, you are still only up to the power set of the integers, which has
the same cardinality as the set of Real numbers. Again, this is a set, and
not a proper class.

I haven't seen the book you cite. They must make some argument as to why
they think natural languages (or linguistic theories about them) admit
infinite sentences. Even given that, we would have only the Reals (i.e. the
"Continuum") as a cardinality without some further surprising claims.

Can you summarize their argument (if it exists)?

                  Mark Goldfain              arpa: goldfain@osiris.cso.uiuc.edu
                  Department of Computer Science
                  University of Illinois at Shampoo-Banana
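
An illustrative aside on the claim that the finite strings over a finite alphabet form a countably infinite set. The sketch below lists every finite string in shortest-first order and computes each string's position in that list, which is one standard way to exhibit a one-to-one correspondence with the natural numbers. The function names and the two-letter alphabet are assumptions chosen for the example; the alphabet is assumed to be a string of distinct symbols.

```python
from itertools import count, islice, product


def all_strings(alphabet):
    """Yield every finite string over `alphabet`, shortest first."""
    for length in count(0):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def index_of(s, alphabet):
    """Return the position of `s` in the `all_strings` enumeration.

    This reads `s` as a bijective base-k numeral, where k is the
    alphabet size and the i-th symbol stands for the digit i + 1.
    """
    k = len(alphabet)
    n = 0
    for ch in s:
        n = n * k + alphabet.index(ch) + 1
    return n


if __name__ == "__main__":
    for position, s in enumerate(islice(all_strings("ab"), 10)):
        print(position, repr(s), index_of(s, "ab"))
```

Every finite string shows up at some finite position, however long it is, so no string is left out of the list.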
spector@suvax1.UUCP (Mitchell Spector) (11/07/87)
In article <8300011@osiris.cso.uiuc.edu>, goldfain@osiris.cso.uiuc.edu
comments on an article by berke@CS.UCLA.EDU:
>
> > /* Written 10:34 am Nov 1, 1987 by berke@CS.UCLA.EDU in comp.ai */
> > /* ---------- "Langendoen and Postal (posted by: B" ---------- */
> > I just read this fabulous book over the weekend, called "The Vastness
> > of Natural Languages," by D. Terence Langendoen and Paul M. Postal.
> > ...
> > Their basic proof/conclusion holds that natural languages, as linguistics
> > construes them (as products of grammars), are what they call
> > mega-collections, Quine calls proper classes, and some people hold cannot
> > exist. That is, they maintain that (1) Sentences cannot be excluded from
> > being of any, even transfinite size, by the laws of a grammar, and (2)
> > Collections of these sentences are bigger than even the continuum. They are
> > the size of the collection of all sets: too big to be sets.
> > ...
> > /* End of text from osiris.cso.uiuc.edu:comp.ai */
>
> Hang on a minute! It *sounds* as though you are talking about Context-Free
> Grammars/Languages (CFGs/CFLs) here. Most linguists (I'd wager) set up their
> CFGs as admitting only finite derivations over a finite set of production
> rules, each rule only allowing finite expansion. Thus, although usually a CFL
> is only a proper subset of this, we are ALWAYS working WITHIN the set of
> finite strings (of arbitrary length) over a finite alphabet.
> Such a set is countably infinite. Far from being a proper class, this is
> a very manageable set. If you move the discussion up to the cardinality of
> the set of "discourses", which would be finite sequences of strings in the
> language, you are still only up to the power set of the integers, which has
> the same cardinality as the set of Real numbers. Again, this is a set, and
> not a proper class.
>
>        Mark Goldfain              arpa: goldfain@osiris.cso.uiuc.edu
>        Department of Computer Science
>        University of Illinois at Shampoo-Banana

The set of all finite sequences of finite strings in a language (the set
of "discourses") is still just a countably infinite set (assuming that the
alphabet is finite or countably infinite, of course). The set of infinite
sequences of finite strings is uncountable, with the same cardinality as the
set of real numbers, as is the set of infinite strings. (By infinite string
or infinite sequence, I mean an object which is indexed by the natural
numbers 0, 1, 2, ....)

In general, sets of finite objects are finite or countably infinite. (A
finite object is, vaguely speaking, one that can be identified by means of a
finite representation. More specifically, this finite representation or
description must enable you to distinguish this object from all the other
objects in the set.) If you want to get an uncountable set, you must use
objects which are themselves infinite as members of the set.

Many people lose sight of the fact that a real number is an infinite object
(although an integer or a rational number is a finite object). Any general
method of identifying real numbers must use infinitely long or large
representations (for example, decimals, continued fractions, Cauchy
sequences, or Dedekind cuts). Real numbers are much more difficult to pin
down than one might gather from many math classes. This misimpression is
partly due to the fact that one deals only with a relatively small (finite!)
set of specific real numbers; these either have their own names in
mathematics or they can be defined by a finite sequence of symbols in the
usual mathematical notation. The other real numbers belong to a nameless
horde which we use in general arguments but never by specific mention.
I certainly agree with the general objections raised to the idea that
natural languages are uncountably large (or, worse yet, proper classes),
although I haven't read the book in question. Maybe somebody can state
more precisely what the book claimed, but it seems at first glance to
indicate a lack of understanding of modern set theory.

By the way, logicians do study infinite languages, including both the
possibility of infinitely many symbols and that of infinitely long
sentences, but such languages are very different from what we think of as
"natural language." It doesn't matter whether you're talking about
context-free languages or more general sorts of languages -- in any language
used by people for communication, the alphabet is finite, each word is
finitely long, and each sentence is finitely long.

--
Mitchell Spector                                        |"Give me a
Dept. of Computer Science & Software Eng., Seattle Univ.|   ticket to
Path: ...!uw-beaver!uw-entropy!dataio!suvax1!spector    |     Mars!!"
  or: dataio!suvax1!spector@entropy.ms.washington.edu   | -- Zippy the Pinhead
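
An illustrative aside on why finite "discourses" stay countable. One way to see it is to flatten each finite sequence of finite strings into a single finite string over a slightly larger alphabet, in a way that can be undone; the discourses then map one-to-one into a set of finite strings, which is countable. The sketch below assumes a hypothetical separator symbol `|` that does not occur in the language's alphabet; the names `encode` and `decode` are likewise illustrative.

```python
SEP = "|"  # hypothetical separator, assumed absent from the language's alphabet


def encode(discourse):
    """Flatten a finite list of finite strings into one finite string."""
    for sentence in discourse:
        if SEP in sentence:
            raise ValueError("separator must not occur inside a sentence")
    # Terminate every sentence with SEP so that empty sentences survive.
    return "".join(sentence + SEP for sentence in discourse)


def decode(flat):
    """Recover the original list of sentences from an encoded string."""
    # Splitting on SEP leaves one trailing empty piece, which is dropped.
    return flat.split(SEP)[:-1]


if __name__ == "__main__":
    discourse = ["oh boy", "oh boy, oh boy", ""]
    flat = encode(discourse)
    print(repr(flat))
    print(decode(flat) == discourse)
```

Because `decode` undoes `encode`, two different discourses never flatten to the same string. The same trick does not rescue infinite sequences, which is where the jump to the cardinality of the reals happens.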
goldfain@osiris.cso.uiuc.edu (11/12/87)
< /* Written 7:47 pm Nov 6, 1987 by spector@suvax1.UUCP in comp.ai */
< In article <8300011@osiris.cso.uiuc.edu>, goldfain@osiris.cso.uiuc.edu
< comments on an article by berke@CS.UCLA.EDU:
< > ...
< > Such a set is countably infinite. Far from being a proper class,
< > this is a very manageable set. If you move the discussion up to the
< > cardinality of the set of "discourses", which would be finite sequences of
< > strings in the language, you are still only up to the power set of the
< > integers, which has the same cardinality as the set of Real numbers.
< > Again, this is a set, and not a proper class.
< >        Mark Goldfain              arpa: goldfain@osiris.cso.uiuc.edu
< ---------------------
< The set of all finite sequences of finite strings in a language (the set
< of "discourses") is still just a countably infinite set (assuming that the
< alphabet is finite or countably infinite, of course). The set of infinite
< sequences of finite strings is uncountable, with the same cardinality as the
< set of real numbers, as is the set of infinite strings.
< ...
< I certainly agree with the general objections raised to the idea that
< natural languages are uncountably large (or, worse yet, proper classes),
< although I haven't read the book in question. Maybe somebody can state
< more precisely what the book claimed, but it seems at first glance to
< indicate a lack of understanding of modern set theory.
< ...
< Mitchell Spector                                        |"Give me a
< Dept. of Computer Science & Software Eng., Seattle Univ.|   ticket to
< Path: ...!uw-beaver!uw-entropy!dataio!suvax1!spector    |     Mars!!"
<   or: dataio!suvax1!spector@entropy.ms.washington.edu   | - Zippy the Pinhead
< ------------------

OOPS---OOPS---OOPS---OOPS---OOPS---OOPS----OOPS---OOPS----OOPS---OOPS----OOPS
Mitchell Spector is correct! I must have been thinking very sluggishly, and I
hope none of my professors (past or present) is watching. In any case, we
still have our basic objection, only it is now greatly strengthened.
Indeed, it is a well-known and oft-used proposition in computing theory that
the number of things that can be said in a finite-alphabet language is
COUNTABLE.
ENDOOPS---ENDOOPS---ENDOOPS---ENDOOPS---ENDOOPS---ENDOOPS---ENDOOPS---ENDOOPS

Continuing the "attack" ... recall that the original posting said:

> /* ---------- "Langendoen and Postal ---------- */
> ...
> Their basic proof/conclusion holds that natural languages, as linguistics
> construes them (as products of grammars), are what they call
> mega-collections, Quine calls proper classes, and some people hold cannot
> exist. That is, they maintain that (1) Sentences cannot be excluded from
> being of any, even transfinite size, by the laws of a grammar, and (2)
> Collections of these sentences are bigger than even the continuum. They are
> the size of the collection of all sets: too big to be sets. ...

My follow-up and Mitchell Spector's note explained why this statement is
incorrect. Note that from the above, the issue is not really "real English",
but the size of a language as SPECIFIED BY A GRAMMAR. I then asked:

> I haven't seen the book you cite. They must make some argument as to why
> they think natural languages (or linguistic theories about them)
> admit infinite sentences. Even given that, we would have only the Reals
> (i.e. the "Continuum") as a cardinality without some further surprising
> claims. Can you summarize their argument (if it exists) ?

The only response so far that hints as to their arguments is that posted by
lee@uhccux.UUCP :

> Concerning the length of sentences, I think Postal and Langendoen are not
> very persuasive. Most of their arguments are to the effect that previously
> given attempted demonstrations that sentences cannot be of infinite length
> are incorrect. I think they make that point very well. But obviously this
> is not enough to show that one should assume some sentences of infinite
> length.
>         Greg Lee, lee@uhccux.uhcc.hawaii.edu

Still, without more specifics, this whole argument may continue to be way out
in left field. At the risk of knocking down a mere "straw man", consider:

Suppose their case rests on experiments that set a size limit, say 100 words
(or more reasonably 100 constituents), then proceeded to test subjects,
finding that this was an insufficient upper bound, then tried 200, failing
again, then tried 500, still finding subjects who thought the sentences were
grammatical. Then:

1) I am really, really surprised. I think the experiments should be
   replicated simply because I find the above too ludicrous to swallow.

2) We still have a LONG way to go before we conclude that NO upper bound
   exists. (It is like the humorous story about the "engineer's proof" that
   all odd numbers > 1 are prime : "Let's see, 3 is prime, 5 is prime, 7 is
   prime, ... Yep! All of 'em must be!") Let them test a LARGE number like
   10,000 and get back to me when the results come in ...

3) Finally, the notion of a set with no upper bound is a different
   mathematical beast than a set containing non-finite elements! Consider
   the positive integers; there is no largest integer, but that does not
   mean that any of them must be infinite ... in fact, none of them are.

Note: The last point is rather briefly stated - if you haven't run across it
in a course or somewhere before, it deserves a moment or two to sink in.
These concepts generally get their first airing in Calculus courses, and
many students tend to really grasp them only after about their 2nd Calculus
course.

-----------------------------------
From here, further discussion seems pointless, unless some of the actual data
and claims from Langendoen and Postal are put forth.

                                                  - Mark Goldfain
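
An illustrative aside on point 3 above, that "no upper bound" and "contains an infinite element" are different things. The sketch borrows the "Oh boy, oh boy, ...!" pattern from earlier in the thread; the function names and the word-counting convention (whitespace-separated tokens) are assumptions made for the example. For any proposed bound it produces a longer sentence, yet every sentence it can ever return has a finite length.

```python
def sentence(repetitions):
    """Build "oh boy, oh boy, ...!" with the given number of repetitions."""
    return ", ".join(["oh boy"] * repetitions) + "!"


def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


def longer_than(bound):
    """Return a sentence with more than `bound` words.

    Each repetition contributes two words, so bound // 2 + 1 repetitions
    always give strictly more than `bound` words.
    """
    return sentence(bound // 2 + 1)


if __name__ == "__main__":
    for bound in (100, 200, 500, 10_000):
        s = longer_than(bound)
        print(bound, word_count(s))  # always finite, always above the bound
```

No single call ever yields an infinite sentence; it is only the collection of all possible outputs that has no largest member, exactly as with the positive integers.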