ix133@sdcc6.ucsd.EDU (Catherine L. Harris) (08/18/86)
(I'm cross-posting to net.ai because this article, part of the Where Does Structure Come From series, ends with an invitation to discuss PDP (connectionist) models of cognition -- which would be best included in net.cog-sci, except that there is no such group!)

Questions:
1. Why are languages so similar?
2. Why do children learning different languages show the same acquisition strategies and make the same patterns of errors?
3. How can children possibly learn a grammar without being explicitly taught any rules?

Answers:
1. They aren't (as much as was thought....)
2. They don't (as much as was hoped), and when they do, it's not for the right reason.
3. It's this new method. See, they don't let the kids hear anything else except the one (or few) languages they're supposed to be learning...

(attempts at an) Explanation: (I'm aware that I really don't even scratch the surface of these questions, but I'll, uh, leave that as an exercise... a follow-up...)

1. How similar languages are is very debatable. One side alleges that each year sees another few putative language universals hit the dust. I guess the other side states it differently. (Other-siders...?)

2. Chomsky's original explanation for similarities in acquisition patterns was that children are born with information about the structure of language -- that is, they are born with phrase structure trees; with a Universal Grammar. As the cross-linguistic data dribbled in, showing that languages aren't as similar as was thought, this Grammar in the Head idea was modified to become the current Parameter Setting theory. Kids come to the language learning task with information about the range of different permissible forms that a language can take. Their job is to scan the input for clues as to whether they're learning Turkish or English (or Chinese, Kaluli, Tagalog, Navaho, etc.). Is their language one which requires strict word order (e.g., English), or can word order vary (Turkish)?
Can the subject of a sentence be dropped if it's uninformative or redundant (Italian), or is the subject obligatory (English -- which is why we say "it's raining" rather than simply "raining")? When the child decides that her language is one which allows word order to vary, she sets the 'word-order-vary?' flag to 'yes', and after a few crunches of the gears whole classes of hypotheses about the structure of the target rule system no longer have to be considered.

The problem with Chomsky's parameter-setting model is that it predicts (a) sudden, all-or-none decisions; (b) decisions carried out in a pre-specified order; and (c) no opportunity to turn back once a parameter is set. Instead, what we find in the acquisition data is that children cycle in and out of different hypotheses over months or even years -- and even the adult "steady state" exhibits statistical variation that appears difficult to explain with discrete rules of the rewrite variety (or any kind of discrete rules, for that matter).

Chomsky predicts that the pattern of acquisition should be similar across languages, because the child's early language behavior is being driven, top-down fashion, from the genetically specified pool of hypotheses. Instead, we find that early language behavior closely mirrors the input language. It looks like children's developing rule systems are being driven, bottom up, by the input data.

(Digression: Not only do the order and type of hypotheses vary between children acquiring different languages, but they vary among children learning the *same* language. The amount of individual variation in language learning is probably the same as the variation in learning any complex skill -- people take their own strategies and make their own, often incorrect, representations of the target domain. So, some children seem to take an "analytical" approach to language.
They focus on trying to decompose the speech stream into its component parts and to learn how those parts function; they learn nouns first and speak "telegraphically". Another strategy for learning language has been called the "dramatic", "expressive", or "holistic" approach. These kids focus on language as a means to social goals; they learn sentences as unanalyzed wholes, are more socially gregarious, and make more use of imitation.)

One Alternative to the Endogenous Structure View

Jeffrey Goldberg says (in an immediately preceding article),

> Chomsky has set himself up asking the question: "How can children,
> given a finite amount of input, learn a language?" The only answer
> could be that children are equipped with a large portion of language to
> begin with. If something is innate then it will show up in all
> languages (a universal), and if something is unlearnable then it, too,
> must be innate (and therefore universal).

The important idea behind the nativist and language-modularity hypotheses is that language structure is too complex, time is too short, and the form of the input data (i.e., parents' speech to children) is too degenerate for the target grammar to be learned. Several people (e.g., Steven Pinker of MIT) have bolstered this argument with formal "learnability" analyses: you make an estimate of the power of the learning mechanism, make assumptions about factors in the learning situation (e.g., no negative feedback), and then mathematically prove that a given grammar (a transformational grammar, or a lexical-functional grammar, or whatever) is unlearnable.

My problem with these analyses -- and with nativist assumptions in general -- is that they don't consider a type of learning mechanism that may be powerful enough to learn something as complex as a grammar, even under the supposedly impoverished learning environment a child encounters.
The mechanism is what Rumelhart and McClelland (of UCSD) call the PDP approach (see their just-released book from MIT Press, *Parallel Distributed Processing: Explorations in the Microstructure of Cognition*). The idea behind PDP (and other connectionist approaches to explaining intelligent behavior) is that inputs from hundreds/thousands/millions of information sources jointly combine to specify a result. A rule-governed system is, according to this approach, best represented not by explicit rules (e.g., a set of productions or rewrite rules) but by a large network of units: input units, internal units, and output units. Given any set of inputs, the whole system iteratively "relaxes" to a stable configuration (e.g., the soap bubble relaxing to a sphere, our visual system finding one stable interpretation of a visual illusion).

While many/most people accept the idea that constraint-satisfaction networks may underlie phenomena like visual perception, they are more reluctant to see their application to language processing or language acquisition. There are currently (in the Rumelhart and McClelland work -- and I'm sure you cognitive science buffs have already rushed to your bookstore/library!) two convincing PDP models of language, one on sentence processing (case-role assignment) and the other on children's acquisition of past-tense morphology. While no one has yet tried to use this approach to explain syntactic acquisition, I see this as the next step.

For people interested in hard empirical, cross-linguistic data that supports a connectionist, non-nativist approach to acquisition, I recommend *Mechanisms of Language Acquisition*, Brian MacWhinney, Ed., in press.

I realize I rushed so fast over the explanation of what PDP is that people who haven't heard about it before may be lost.
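For readers new to the idea, the "relaxation" process can be made concrete with a toy Hopfield-style network. This is my own illustrative sketch, not a model from the Rumelhart and McClelland book: symmetric weights encode constraints between binary units, and each unit repeatedly takes the value its weighted inputs favor until no unit wants to change -- a stable configuration.

```python
def relax(weights, state, max_sweeps=100):
    """Asynchronously update binary (+1/-1) units until the state is stable.

    weights: symmetric matrix; weights[i][j] > 0 means units i and j
    support each other, < 0 means they inhibit each other.
    """
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            # Each unit adopts the value favored by its weighted inputs.
            net = sum(weights[i][j] * state[j] for j in range(n) if j != i)
            new = 1 if net >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:  # no unit wants to flip: a stable configuration
            break
    return state

# Units 0 and 1 support each other; both inhibit unit 2.
W = [[ 0,  1, -1],
     [ 1,  0, -1],
     [-1, -1,  0]]
print(relax(W, [1, -1, -1]))  # settles to [1, 1, -1]
```

The point of the toy is only that the "answer" (the final pattern of activation) is nowhere written down as an explicit rule; it emerges from many small constraints being jointly satisfied.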
I'd like to see a discussion on this -- perhaps other people can talk about the brand of connectionism they're encountering at their school/research/job and what they think its benefits and limitations are -- in explaining the psycholinguistic facts or just in general.

_______
Cathy Harris    "Sweating it out on the reaction time floor -- what,
                when you could be in that ole armchair theo-- ? Never mind;
                it's only til 1990!"
goldberg@SU-Russell.ARPA (Jeffrey Goldberg) (08/19/86)
I wish to make it clear that my own opinions are not reflected by the quote made...

In article <2814@sdcc6.ucsd.EDU> ix133@sdcc6.ucsd.EDU (Catherine L. Harris) writes:
>Jeffrey Goldberg says (in an immediately preceding article),
>
>> Chomsky has set himself up asking the question: "How can children,
>> given a finite amount of input, learn a language?" The only answer
>> could be that children are equipped with a large portion of language to
>> begin with. If something is innate then it will show up in all
>> languages (a universal), and if something is unlearnable then it, too,
>> must be innate (and therefore universal).

In that paragraph, I was presenting the Chomsky view (and ridiculing it). For those of you who didn't see my original posting, it is in net.nlang.

I will refrain from presenting a lengthy response to Harris's posting. (I have work to do, and I have sent more over the net in the past week than I have in my entire life.) But I will say that her attack on language universals is an attack on Chomsky, and there are people (linguists even) who believe in language universals but share her objections to the Chomsky line.

I realize that my original posting was very long (and should have been edited down), but I would suggest to Catherine Harris that she make a hard copy of it and read it more carefully. She will find that we agree more than we disagree.

>Cathy Harris "Sweating it out on the reaction time floor -- what,
> when you could be in that ole armchair theo-- ? Never mind;
> it's only til 1990!"

-Jeff Goldberg
 {ucbvax, pyramid}!glacier!russell!goldberg
--
/*
** Jeff Goldberg (best reached at GOLDBERG@SU-CSLI.ARPA)
*/