dvm@yale.UUCP (Drew Mcdermott) (03/17/89)
Here is a paper I wrote a few years ago and never did much with, which seems relevant to recent discussions. Excuse the TeXisms. -- Drew

                The Red Herring Turing Test

                        Drew McDermott
                        Yale University

In 1950, Alan Turing (in [<bibref>]) proposed his famous test for whether a machine is intelligent, namely, to see whether it could, when interrogated over a teletype, mimic a human so well as to be indistinguishable from the real thing.\footnote{$^1$}{Turing actually proposed something different, but this version makes more sense, so it is what people usually mean by ``Turing test.''} Since then, this test has become deeply embedded in the thought processes of cognitive scientists, to the point where many take for granted that it is an important criterion for whether a system is intelligent. Turing invented this test as a way of nudging people into entertaining an implausible and shocking idea, that machines could be intelligent. He wanted to duck the question, What is intelligence? What is often overlooked is that, if cognitive science is really ever to become a science, we do not have the option of ducking this question; we must answer it. Hence Turing's little scenario has become totally irrelevant to serious discussion about what thinking is. No one seems to realize this but me, hence this paper.

What confuses most people is that they mistake Turing's attempt to avoid the question for an attempt to answer it. But anyone who believes that Turing's test is an interesting test for intelligence is guilty of behaviorism, not a crime in itself, but shameful in anyone who believes in cognitive science, the antithesis of behaviorism. Of course, it is probably true that a system that could fool a trained panel of experts into believing it intelligent would in fact be intelligent, but it is a blatant waste of experts' time to have them sit on such panels, when they should be inquiring about how minds actually {\it work}.

Compare the following hypothetical case: Human explorers land on a planet whose inhabitants are somewhat technologically backward. The locals are impressed by human gadgets, especially radio. They decide to try to understand it, so they rustle up some philosophers in order first to arrive at a criterion for something's being a radio. Their first cut is that a radio is a device that emits sounds whenever similar sounds are made in the control room of the earthlings' spaceship. But others object that this criterion does not rule out ordinary telephony, so the criterion is modified. Perhaps they arrive at something like, ``A radio is a device that emits sounds similar to those made in the earthlings' spaceship while suspended from the ceiling by a nonconducting string.''

This is all amusing, but a waste of time if the aliens really want to understand radio. No one needs an ironclad behavioral criterion for ``radiohood,'' assuming that there are plenty of indisputably genuine radios around to study. Such a study might eventually lead to a deeper definition of radio as ``A receiver of signals encoded as modulated electromagnetic waves,'' but by the time the definition was available it would be relatively unimportant, when stacked up against the theory of electromagnetism.

Similarly with intelligence. If we ever have a theory that explains it, we will no longer care about distinguishing bogus understanding from the real thing. We will have a rich theory based on concepts we can now barely imagine, just as radio is based on something as unlikely as invisible electromagnetic waves.
I point all this out because it is often assumed that to seek computational theories of thought, the kind AI researchers look for, requires adopting the Turing Test as a criterion for intelligence, or even the only criterion. Often in the midst of a discussion about such issues, I find the Turing Test suddenly glued to me like flypaper, and I am unable to shake it off. It is not as if I have been forced to adopt it by the logic of my position; rather, it has been handed to me casually, almost as a favor, by whoever I am arguing with. ``You espouse computationalism? Then you won't want to be without this handy accessory, Turing's Test.'' Flail as I may to get rid of this unwelcome visitor, it stays glued, and the discussion degenerates.

A particularly bothersome case is Searle's Chinese Room argument [<bibref>], in which a non-Chinese-speaking person is hired to execute an algorithm that supposedly understands Chinese when executed by a digital computer. The person wouldn't thereby come to understand Chinese, but there's no essential difference between what the person does and what a computer would do, so there would be no reason to say the computer really understood Chinese either. QED.

I have always maintained [<bibref>] that the solution to this riddle is to see {\it two} understanders when an understanding person executes an understanding algorithm. It's quite simple, really:

    Computer executing algorithm:  0 + 1 = 1 understander
    Person executing algorithm:    1 + 1 = 2 understanders

The fact that these two understanders occupy the same body, and the way the two relate, should make us smile, not choke.

The Searly comeback is this: ``Come now, it's preposterous to imagine two souls inside one body like this. One of those understanders is there (and knows he's there), but the other is sanctioned only by the Turing Test, which you computationalists insist so much on. Without this behavioral test, what gives you the right to call it an understander at all?'' His actual words [<pageref>] are these (he has cast himself in the role of understander, hence the first-person pronouns):

``What {\it independent} grounds are there supposed to be for saying that the agent must have a subsystem within him which literally understands stories in Chinese? As far as I can tell the only grounds are that in the example I have the same input and output as native Chinese speakers and a program that goes from one to the other. ... The only motivation for saying there {\it must} be a subsystem in me which understands Chinese is that I have a program and I can pass the Turing test, I can fool native Chinese speakers. But precisely one of the points at issue is the adequacy of the Turing test.''

The fallacy here (one is tempted to call it sophistry) is an unjustified revision of the initial assumption halfway through a {\it reductio ad absurdum}. The argument starts by assuming there exists an algorithm $A$ which understands Chinese. No such algorithm has been written, and if none ever is the whole issue will be moot. But that's okay --- the argument is designed to reduce someone's position to an absurdity, so we should start by assuming that someone's position is true. Whose position? --- the computationalists'. But if AI ever succeeds in producing algorithm $A$, presumably it will be due to the discovery of a nontrivial theory of understanding Chinese. It is {\it this theory}, not Turing's Test, that will say whether an entity understands or does not understand Chinese.
One consequence of this theory is that any entity that executes $A$ will understand Chinese. Searle may feel that this idea is absurd, but he is not allowed to cite that feeling at this point. One cannot get away with a {\it reductio ad absurdum} of the form: ``Suppose my opponent's position were true. But that would be absurd.'' The argument has to cover some ground, and has to hit a conclusion that everyone agrees is absurd. Since Searle's argument doesn't do this, he is forced to drag in Turing's Test, and pronounce {\it it} absurd. But it is crucial to realize that {\it he} brought Turing's Test in; it didn't come with the position he is attacking. In other words, it is completely false that ``one of the points at issue is the adequacy of the Turing test.'' When he asks for ``independent grounds'' for believing there are two understanders, he overlooks where they come from: the theory behind $A$.\footnote{$^2$}{By the way, his use of the term ``subsystem'' is an effort at disinformation. No such subsystem is proposed by his opponents; rather, the entire system embodies two understanders.}

Let me dramatize this rebuttal by returning to the radio-free planet. Suppose that some budding electrical engineer proposes that diodes have something to do with radio reception, and proposes a simple circuit to demodulate radio waves. An alien Searle might respond thus: ``If this proposal were correct, we could make a radio by having someone hold the two halves of a diode, one in each hand, while electricity was conducted through his gut. Then, since we could suspend him from the ceiling by a nonconducting rope, your theory would have the consequence that this person would be a radio! This is ridiculous on its face, not to mention the consequence that his gut, which is manifestly a digesting organ, would have to be a conducting organ as well, which doesn't seem possible.''

This argument is not in detail analogous to Searle's Chinese Room argument, but it does show two similar features:

\item{1} Part of its force comes from its painting a ludicrous picture. In this case it's a picture of a person suspended from the ceiling and receiving radio waves. In the original argument, it's a picture of a person apparently understanding Chinese without knowing it. In the case of the radio, we are not fooled, since we know that you really could build a radio this way. In the case of the understanding algorithm, where we don't yet know if it would work, mere silliness should still carry little weight.

\item{2} The rest of the argument's force comes from its invocation of a pointless behavioral criterion. The electrical engineer doesn't need the definition of a radio as a device that emits sounds when suspended from the ceiling. If he didn't lack philosophical street smarts, he would insist that this crazy criterion be thrown out. In the original argument, there is no need at all for AI researchers to defend Turing's Test, but philosophers often succeed via presupposition in convincing us that we have some stake in it.

\noindent When these two features are eliminated, neither argument has any force at all. Searle ends up attributing to his opponents the premise that a computational theory of mind exists and has been confirmed by Turing's Test. But I don't want that premise; I want what Searle calls ``strong AI,'' the theory that any computational device that simulates thoughts and emotions in the right way would {\it have} those thoughts and emotions.
I don't want this premise ``as confirmed in a certain way''; I just want the premise, plain and simple. As Searle says at the outset [<pageref>]:

    One way to test any theory of the mind is to ask oneself what it would be like if my mind actually worked on the principles that the theory says all minds work on.

\noindent But if strong AI is actually true, then perforce any creature, including our deluded human CPU, will indeed exhibit understanding when he runs algorithm $A$. If strong AI is true, it will not matter whether it has been confirmed or even noticed by the human race. If it's true, then any system executing $A$ will understand, and that's that.

I realize that some may find my counterargument perplexingly vacuous. Nothing I have said here provides the slightest evidence that anything like an AI account of mind will prove correct. But my goal was only to knock Searle's argument down, not set up one of my own. In order to refute a {\it reductio ad absurdum} argument, it is not necessary to find reasons for believing the starting assumption after all; it is only necessary to find a flaw in the route to the alleged contradiction.

In fact, we know almost nothing about what a computational theory of mind would look like. Why should this bother us, seeing as how Searle is willing to reveal almost nothing about his alternative ``causal'' theory? My only point is that, given the anemic state of our theorizing, it is ridiculous to demand that we provide in advance any empirical test for the presence of understanding, intelligence, consciousness, or any other phenomenon. By the time a theory of these concepts is available, they will no doubt have all been revised beyond recognition anyway. (Suppose that early physicists had been required to provide in advance a test for whether their concept of energy matched the phenomenon exhibited by a person ``bursting with energy.'') Hence, we do not have to embrace Turing's Test, or provide any substitute, at this point in our efforts. Indeed, there is no reason for anyone to be drawn into an argument about the a priori plausibility of AI, except for the fascination and strongly conflicting intuitions that the subject affords.

If a computational theory of mind is ever found, Turing's Test will play no role in it. For one thing, it can never hope to provide a necessary condition for intelligence, but only a sufficient one. Presumably a full-blooded theory will allow us to locate many different degrees and types of intelligence in all sorts of entities that could never pass Turing's Test. Contrariwise, although it is implausible that something could pass Turing's Test and not be intelligent according to such a theory, it is possible that this is so. (For instance, people might have some blind spot which caused them to be inevitably gullible about certain patterns of behavior which a machine could duplicate without any real intelligence.) But even if passing the Turing Test does turn out to be a sufficient condition for intelligence, it will still be the theory of intelligence that will matter, not the test.
reiter@babbage.harvard.edu (Ehud Reiter) (03/20/89)
As a followup to Drew McDermott's excellent article on the Turing test and the Chinese Room problem, let me add one small note.

There is a certain mindset that equates intelligence with whatever humans do. Now, we know that there are plenty of types of reasoning which humans are pretty bad at. Multiplying large numbers is an obvious case. More interesting, perhaps, are:

- probabilistic reasoning. Kahneman and Tversky have shown that people make fundamental mistakes, such as ignoring prior probabilities and judging p(A&B) to be greater than p(A). (A worked example of the ``ignoring priors'' point follows this post.)

- predictive tasks. A long literature, dating back to Paul Meehl, shows that statistical techniques usually out-perform expert human judgement, provided that the data is quantifiable. (I sometimes wonder what the expert system people have to say about Meehl's findings. If a simple linear regression can do a better job than a human expert, why bother building a computer system that attempts to emulate human judgements?)

(See JUDGEMENT UNDER UNCERTAINTY: HEURISTICS AND BIASES, edited by D. Kahneman, P. Slovic, and A. Tversky, especially chapters 1 and 28.)

A machine that could pass the Turing test would have to be programmed to do as badly as humans at multiplying, probabilistic reasoning, and predictive tasks. But is it really critical to the definition of intelligence that, say, an entity ignore prior probabilities when making probabilistic judgements? I doubt it, and suggest that finding out how to do a good job on the above reasoning tasks is more important than finding out how to replicate the mistakes humans make.

Ehud Reiter
reiter@harvard              (ARPA,BITNET,UUCP)
reiter@harvard.harvard.EDU  (new ARPA)
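A worked example of the ``ignoring priors'' point, in the spirit of Kahneman and Tversky's base-rate studies (the numbers are hypothetical, chosen only for illustration): suppose a condition has a base rate of 1 in 1000, and a test for it is 95% accurate both on people who have the condition and on people who do not. Bayes' rule gives the probability of the condition given a positive test:

$$P(D \mid +) \;=\; {P(+ \mid D)\,P(D) \over P(+ \mid D)\,P(D) + P(+ \mid \neg D)\,P(\neg D)} \;=\; {0.95 \times 0.001 \over 0.95 \times 0.001 + 0.05 \times 0.999} \;\approx\; 0.019$$

Someone who ignores the prior $P(D) = 0.001$ will typically answer something near 0.95; the correct posterior is under 2%. This is the sense in which even very simple statistical machinery can be better calibrated than unaided human judgement on such tasks.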
mayoung@bnr-di.UUCP (Mark Young) (03/21/89)
In article <1441@husc6.harvard.edu>, reiter@babbage.harvard.edu (Ehud Reiter) writes:

> A machine that could pass the Turing test would have to be programmed to do
> as badly as humans at multiplying, probabilistic reasoning, and predictive
> tasks.

This is not strictly true. The machine would only have to be able to emulate this behaviour. Aftir all, I can imatate a guy who cant spel to good, even if my own spelling is (reasonably) good.

Remember that the point of the TT was for the machine to _fool_ the observer into thinking it was a person. If the machine feels that giving the wrong answer to a multiplication problem will help it in this task, it will give the wrong answer.

> Ehud Reiter

Mark Young