clutx.clarkson.edu (Roger Gonzalez,,,) (03/23/89)
I just read a rather scathing article by Steven Pinker (MIT) and Alan Prince (Brandeis) that tore PDP apart. Has anyone seen any responses to this article that defend Rumelhart and McClelland? The article bummed me out, and I'm not up enough on language processing to really debate what they claim.

++ Roger Gonzalez ++ spam@clutx.clarkson.edu ++ Clarkson University ++
"Just like I've always said; there's nothing an agnostic can't do if he's not sure he believes in anything or not!" - Monty Python
kortge@Portia.Stanford.EDU (Chris Kortge) (03/23/89)
In article <2726@sun.soe.clarkson.edu> spam@clutx.clarkson.edu writes:
>I just read a rather scathing article by Steven Pinker (MIT) and
>Alan Prince (Brandeis) that tore PDP apart. Has anyone seen any
>responses to this article that defend Rumelhart and McClelland?
>The article bummed me out, and I'm not up enough on language
>processing to really debate what they claim.

The article doesn't tear PDP apart at all (I assume you mean the one in _Cognition_, published also as a book, "Connections and Symbols"). What it does do is tear apart one very simple model of a complex phenomenon (learning of past tenses). As far as I could tell, virtually all its criticisms could be answered with a multilayer network model; that is, the faults of the R&M model derive mostly from the fact that it's just a single-layer associative network with hand-wired representations.

As I understand it (having heard Dave Rumelhart's response), the R&M model was never intended to be anything near a complete model of tense acquisition--rather, it was mainly supposed to demonstrate that "rule-like" behavior can coexist with "special case" behavior in the same set of connection weights.

If you want to read an article which _attempts_ to tear PDP _in general_ apart, read the one by Fodor and Pylyshyn in the same book. It didn't make much sense to me, but then I guess I don't have the MIT perspective. If someone really wants to blast PDP, there are _real_ problems with it, like scaling of learning time, which make more sense to focus on than the things F&P talk about.

Chris Kortge
kortge@psych.stanford.edu
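[For readers who haven't seen the model: the "rule plus exception in one set of weights" point can be illustrated with a toy single-layer pattern associator trained by the delta (Widrow-Hoff) rule. The stem feature vectors and the "+ed" output unit below are invented for illustration -- this is NOT R&M's Wickelfeature encoding, just a sketch of the idea that a single weight matrix can store both the regular "add -ed" mapping and an irregular exception like go/went.]

```python
import numpy as np

# Made-up distributed feature vectors for four verb stems (6 features each).
stems = {
    "walk": [1, 0, 0, 1, 0, 0],
    "jump": [0, 1, 0, 0, 1, 0],
    "need": [0, 0, 1, 0, 0, 1],
    "go":   [1, 1, 0, 0, 0, 1],   # the irregular item
}
# Regular past tenses echo the stem features and switch on a final "+ed"
# unit; "went" gets an arbitrary unrelated pattern with the "+ed" unit off.
pasts = {
    "walk": [1, 0, 0, 1, 0, 0, 1],
    "jump": [0, 1, 0, 0, 1, 0, 1],
    "need": [0, 0, 1, 0, 0, 1, 1],
    "go":   [0, 0, 1, 1, 1, 0, 0],
}

X = np.array([stems[w] for w in stems], dtype=float)
T = np.array([pasts[w] for w in stems], dtype=float)

# One linear weight matrix, trained online with the delta rule.
W = np.zeros((7, 6))
lr = 0.1
for _ in range(1000):
    for x, t in zip(X, T):
        W += lr * np.outer(t - W @ x, x)   # error-correcting update

# The same matrix recalls the three regular mappings AND the exception.
recalled = np.round(W @ X.T).T
print((recalled == T).all())
```

Since the four stem vectors are linearly independent, an exact solution exists and the delta rule converges to it, so all four past tenses are recalled from the one weight matrix. A real model would of course need distributed, overlapping codes to show the overregularization effects R&M cared about.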
manj@brand.usc.edu (B. S. Manjunath) (03/24/89)
In article <2726@sun.soe.clarkson.edu> spam@clutx.clarkson.edu writes:
>I just read a rather scathing article by Steven Pinker (MIT) and
                                          ^^^^^^^^^^^^^^^^^^^^^^^^
>Alan Prince (Brandeis) that tore PDP apart. Has anyone seen any
>responses to this article that defend Rumelhart and McClelland?
>The article bummed me out, and I'm not up enough on language
>processing to really debate what they claim.
>
>++ Roger Gonzalez ++ spam@clutx.clarkson.edu ++ Clarkson University ++

Could you please post a complete reference to this article (where it was published, etc.). I am sure many of us will be interested in getting hold of a copy.

Thank you,
bs manjunath.
clutx.clarkson.edu (Roger Gonzalez,,,) (03/24/89)
> The article doesn't tear PDP apart at all (I assume you mean the one in
> _Cognition_, published also as a book, "Connections and Symbols"). What it
> does do is tear apart one very simple model of a complex phenomenon (learning
> of past tenses). As far as I could tell, virtually all its criticisms
> could be answered with a multilayer network model; that is, the faults
> of the R&M model derive mostly from the fact that it's just a single
> layer associative network, with hand-wired representations.

Yeah, so I noticed after I waded through the last 50 pages. My impressions changed after I was done reading the article. Seems to me that they were getting all picky about details that I don't think R&M intended to be the god's truth about language processing. Correct me if I'm wrong, but weren't R&M just saying, "Look, here's a simple little model that does a pretty good job with past tenses... and look, it even seems to exhibit some of the behavior of children learning language"?

Some of P&P's accusations seemed pretty trivial anyway: "It can learn rules found in no human language." SOoooo? (Or are they assuming the ol' language acquisition device?)

Anyway, I'm reading the "nastier" article you suggested right now.

- Roger

++ Roger Gonzalez ++ spam@clutx.clarkson.edu ++ Clarkson University ++
"Just like I've always said; there's nothing an agnostic can't do if he's not sure he believes in anything or not!" - Monty Python
clutx.clarkson.edu (Roger Gonzalez,,,) (03/24/89)
> Could you please post a complete reference to this article (where
> it was published etc.). I am sure many of us will be interested in
> getting hold of a copy.

Yeesh! I got over 20 mail inquiries for this! The article was in "Connections and Symbols", MIT Press.

++ Roger Gonzalez ++ spam@clutx.clarkson.edu ++ Clarkson University ++
"Just like I've always said; there's nothing an agnostic can't do if he's not sure he believes in anything or not!" - Monty Python
kavuri@cb.ecn.purdue.edu (Surya N Kavuri ) (03/24/89)
In article <1078@Portia.Stanford.EDU>, kortge@Portia.Stanford.EDU (Chris Kortge) writes:
> In article <2726@sun.soe.clarkson.edu> spam@clutx.clarkson.edu writes:
> >I just read a rather scathing article by Steven Pinker (MIT) and
> >Alan Prince (Brandeis) that tore PDP apart. Has anyone seen any
> >responses to this article that defend Rumelhart and McClelland?
>
> If you want to read an article which _attempts_ to tear PDP _in general_
> apart, read the one by Fodor and Pylyshyn in the same book. It didn't
......

There is another paper with a similar claim: "Gradient descent fails to separate," by M. Brady and R. Raghavan. The paper shows the failure of BP on examples where there are no local minima. They assert (and they could be right, as such claims have been "romantic," as Minsky put it!) that least-squares solutions do not minimize the number of misclassifications. They have examples where the perceptron does well while gradient descent on the least-squares error fails. They conclude that the failure of gradient descent and least-squares error may be much more widespread than presumed.

SURYA
(FIAT LUX)
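[The perceptron-vs-least-squares point is easy to see with a toy data set -- the one below is made up for illustration, not taken from the Brady & Raghavan paper. The set is linearly separable (any threshold between 0 and 1 works), but the far-away positive point at x = -100 drags the least-squares fit until it misclassifies another positive point, while the perceptron, guaranteed to converge on separable data, finds a perfect separator.]

```python
import numpy as np

# Separable 1-D data: class +1 at {-100, 0}, class -1 at {1, 2}.
X = np.array([-100.0, 0.0, 1.0, 2.0])
y = np.array([1, 1, -1, -1])

# Least-squares linear fit to the +/-1 targets, classified by sign.
A = np.column_stack([X, np.ones_like(X)])      # design matrix [x, 1]
w_ls, b_ls = np.linalg.lstsq(A, y, rcond=None)[0]
ls_acc = np.mean(np.sign(w_ls * X + b_ls) == y)   # 0.75: x=0 is misclassified

# Classic perceptron rule (mistake-driven updates).
w, b = 0.0, 0.0
for _ in range(100):                           # far more sweeps than needed
    for xi, yi in zip(X, y):
        if yi * (w * xi + b) <= 0:
            w += yi * xi
            b += yi
p_acc = np.mean(np.sign(w * X + b) == y)       # 1.0: perfect separation

print(ls_acc, p_acc)
```

The squared-error criterion pays heavily for the large residual at x = -100 and buys it back by sacrificing the point at x = 0, even though a zero-mistake solution exists; the perceptron, which only counts mistakes, has no such trade-off.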
randall@alberta.UUCP (Allan F Randall) (03/26/89)
In article <1078@Portia.Stanford.EDU>, kortge@Portia.Stanford.EDU (Chris Kortge) writes:
> If you want to read an article which _attempts_ to tear PDP _in general_
> apart, read the one by Fodor and Pylyshyn in the same book. It didn't
> make much sense to me, but then I guess I don't have the MIT
> perspective. If someone really wants to blast PDP, there are _real_
> problems with it, like scaling of learning time, which make more
> sense to focus on than the things F&P talk about.
>
> Chris Kortge
> kortge@psych.stanford.edu

I think the reason Fodor and Pylyshyn do not concentrate on those issues is that they are criticizing the philosophy of connectionism as a general approach to studying the mind, rather than attacking specific problems with current systems. The points they do discuss are all intended to be general problems inherent in the philosophy of connectionism, which they do not anticipate being solved. There are a few aspects of their reasoning that puzzle me; I thought I'd respond by posting my own reactions to the article. I would be interested in hearing from anyone who perhaps understands their perspective better, particularly since I haven't read any connectionist responses to Fodor and Pylyshyn (does anybody know of any?).

First of all, though, I think they do a reasonably good job of clearing up a few contentious points concerning which issues are directly relevant to the connectionist/symbolist debate and which are not. For instance, while parallelism is central to most connectionist systems, there is nothing contrary to the classical symbolist view in massive parallelism. The same goes for the idea of soft or fuzzy constraints.

I have two major problems with the rest of their article. First, they seem very limited in the types of connectionist systems they are willing to discuss. Most of the examples they give are of systems where each node represents some concept or proposition.
Hence, they only discuss connectionist systems that already have a lot in common with symbolic systems. They talk very little about distributed representations and sub-symbolic processes. This seems strange to me, since I would consider these things to be the central justification for the connectionist approach. Fodor and Pylyshyn seem to be artificially limiting the connectionist architecture to a narrow form of symbolism and then judging it on its performance as a logic. What they fail to realize is that it is these very assumptions they use in judging PDP that connectionists are calling into question in the first place. *Of course* PDP, at least in its current forms, fails as a general logical inference mechanism. What (many of) the new connectionists are saying is that these systems work better at *certain types of tasks* than the classical systems. They are meant to address problems with the symbolist approach. Yes, they fail miserably at many things symbol systems do well, but this does not mean we must choose one over the other. This brings me to the other point, which I think is the key problem with Fodor and Pylyshyn's approach. They do not seem to consider the possibility of using *both* approaches. Their main argument is that mental representations have a "combinatorial syntax and semantics" and "structure sensitivity of processes." The upshot of this is that to do the sorts of things humans are good at, a system must have a systematic way to generate mental representations that have a constituent structure. Connectionist systems lack this language-like ability. This is an argument for a "Language of Thought." Because of this emphasis on the language-like aspects of cognition, many of Fodor and Pylyshyn's arguments are about the inability of PDP nets to deal with language. They then generalize to the rest of cognition. 
While this is not entirely invalid, I think it really weakens their argument, as language is the one aspect of cognition that seems to be the most symbolic and the least connectionistic. However, I would still agree with much of what they say. It is true that thought must have these properties. Cognition must be more than the statistical modelling of the environment. But Fodor and Pylyshyn give short shrift to the idea that both types of architectures will be needed to handle all aspects of cognition. Why could we not have a connectionist system modelling the environment and creating distributed representations that are used by a more classical symbolic processor? (This is, of course, only one way of looking at hybrid systems.) While Fodor and Pylyshyn do spend a little time discussing this sort of thing, it seems to be more of an afterthought, rather than a central part of their argument. This seems strange, especially since this is where the field of AI as a whole seems to be going. In short, Fodor and Pylyshyn are extreme symbolists. They believe in the classical symbolist view in its most extreme form: physical symbol systems are a necessary and sufficient condition for cognition. Their article does a good job of arguing for the "necessary" part, but pays little attention to the more central "sufficient" part. Like the extreme connectionists, they seem convinced that we must choose one or the other. They show that a pure connectionist system could not work and thus conclude that pure symbolism is the answer. To give them credit, while I disagree with their conclusions, I think they do a good job of explaining why an intelligent system must display these properties and why current connectionist architectures are insufficient on their own to do the job. They build a good case against extreme connectionism, but fail to explain why this implies the other extreme. 
To summarize, my problems with Fodor and Pylyshyn are: i) they criticize connectionism as (a) a symbolic logic and (b) a language, largely ignoring its other aspects as unimportant to cognition; ii) they ignore hybrid connectionist/symbolist approaches.

Finally, I think Fodor and Pylyshyn simply have different intuitions than I do. They seem to feel that the statistical and distributed nature of intelligence is really not very crucial, if it is there at all. While I disagree, I can certainly respect that. But I was disappointed with their article, because I didn't think it really addressed the issue. I would love to see a critique of connectionism that considered the "sufficient" aspect of symbol systems as rigorously as they discussed the "necessary" aspects.

-----------------------
Allan Randall
Dept. Computing Science
University of Alberta
Edmonton, AB