harnad@mind.UUCP (Stevan Harnad) (08/25/88)
On Pinker & Prince on Rules & Learning

Steve:

Having read your Cognition paper and twice seen your talk (latest at cogsci-88), I thought I'd point out what look like some problems with the argument (as I understand it). In reading my comments, please bear in mind that I am NOT a connectionist; I am on record as a sceptic about connectionism's current accomplishments (and how they are being interpreted and extrapolated) and as an agnostic about its future possibilities. (Because I think this issue is of interest to the connectionist/AI community as a whole, I am branching a copy of this challenge to connectionists and comp.ai.)

(1) An argument that pattern associators (henceforth "nets") cannot do something in principle cannot be based on the fact that a particular net (Rumelhart & McClelland 86/87) has not done it in practice.

(2) If the argument is that nets cannot learn past tense forms (from ecologically valid samples) in principle, then it's the "in principle" part that seems to be missing. For it certainly seems incorrect that past tense formation is not learnable in principle. I know of no poverty-of-the-stimulus argument for past tense formation. On the contrary, the regularities you describe -- both in the irregulars and the regulars -- are PRECISELY the kinds of invariances you would expect a statistical pattern learner that was sensitive to higher-order correlations to be able to learn successfully. In particular, the form-independent default option for the regulars should be readily inducible from a representative sample. (This is without even mentioning that surely no one imagines that past-tense formation is an independent cognitive module; it is probably learned jointly with other morphological regularities and irregularities, and there may well be degrees-of-freedom-reducing cross-talk.)
(3) If the argument is only that nets cannot learn past tense forms without rules, then the matter is somewhat vaguer and more equivocal, for there are still ambiguities about what it is to be or represent a "rule." At the least, there is the issue of "explicit" vs. "implicit" representation of a rule, and the related Wittgensteinian distinction between "knowing" a rule and merely being describable as behaving in accordance with a rule. These are not crisp issues, and hence not a solid basis for a principled critique. For example, it may well be that what nets learn in order to form past tenses correctly is describable as a rule, but not explicitly represented as one (as it would be in a symbolic program); the rule may simply operate as a causal I/O constraint. Ultimately, even conditional branching in a symbolic program is implemented as a causal constraint; "if/then" is really just an interpretation we can make of the software. The possibility of making such systematic, decomposable semantic interpretations is, of course, precisely what distinguishes the symbolic approach from the connectionistic one (as Fodor/Pylyshyn argue). But at the level of a few individual "rules," it is not clear that the higher-order interpretation AS a formal rule, and all of its connotations, is justified. In any case, the important distinction is that the net's "rules" are LEARNED from statistical regularities in the data, rather than BUILT IN (as they are, coincidentally, in both symbolic AI and poverty-of-the-stimulus-governed linguistics). [The intermediate case of formally INFERRED rules does not seem to be at issue here.]

So here are some questions:

(a) Do you believe that English past tense formation is NOT learnable (except as "parameter settings" on an innate structure, from impoverished data)? If so, what are the supporting arguments for that?
(b) If past tense formation IS learnable in the usual sense (i.e., by trial-and-error induction of regularities from the data sample), then do you believe that it is specifically unlearnable by nets? If so, what are the supporting arguments for that?

(c) If past tense formation IS learnable by nets, but only if the invariance that the net learns and that comes to causally constrain its successful performance is describable as a "rule," what's wrong with that?

Looking forward to your commentary on Lightfoot, where poverty-of-the-stimulus IS the explicit issue,

-- best wishes, Stevan Harnad

--
Stevan Harnad
ARPANET: harnad@mind.princeton.edu harnad@princeton.edu harnad@confidence.princeton.edu srh@flash.bellcore.com harnad@mind.uucp
BITNET: harnad%mind.princeton.edu@pucc.bitnet
UUCP: princeton!mind!harnad
CSNET: harnad%mind.princeton.edu@relay.cs.net