kck@g.gp.cs.cmu.edu (Karl Kluge) (06/10/89)
> From: gilbert@cs.glasgow.ac.uk (Gilbert Cockton)
> Subject: Turing Test and Subject Bias
>
> If a system passes the Turing Test with one subject, but not with
> another, should it be considered intelligent?
>
> If 55% of a sample of hundreds say the system is intelligent, is it?
>
> What if the subjects are:
>
> a) drunk or on drugs (or a') the experimenters are :-))?
> b) mentally subnormal?
> c) polite and don't want to upset the experimenters
>    (especially if a' applies too :-])?
>
> Then how valid is the Turing Test?

Bzzzt! Straw man alert! Sorry, what you're referring to here is clearly
not the Turing Test proposed by Turing but the pseudo-Turing Test one
more often hears described. The real question one wants to ask is: if,
out of a sample of 200 people, the fraction who pick the
computer-pretending-to-be-a-human as being the human is 50% (or the same
fraction as pick the man-posing-as-a-woman as being the woman in part
one of the "imitation game", take your pick), then what is the strongest
conclusion we should be willing to draw from this?

> Just what sort of Science did young Mr. Turing have in mind when he
> decided that subjective opinion could ever be a measure of system
> performance?

Probably the sort of science philosophers do. Since philosophers have
never bothered in 3000 years of bickering to define what they mean by
terms like "mind", "consciousness", and "intelligence", I hardly think
Turing can be faulted for not providing an iron-clad test for
determining whether a system rates being called
"intelligent"/"conscious". What he did was fall back on competence as a
test -- if the system can fool people sufficiently well that they can't
do better than chance at choosing the real human over the program
pretending to be a human, then the ball is in the philosophers' court to
either call the system "intelligent"/"conscious", or to provide us with
a distinction that lets humans retain those properties while excluding
the program.
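The "can't do better than chance" criterion can be made concrete with a
simple binomial calculation. The sketch below is my own illustration,
not anything Turing specified -- the sample size of 200, the one-sided
tail test, and the function name are all assumptions for the sake of
example. The point is that with 200 judges each guessing which of two
respondents is the human, chance performance is Binomial(200, 0.5), and
only a count of correct picks far above 100 would let us reject the
hypothesis that the judges are merely guessing.

```python
# Sketch: can 200 judges' performance be distinguished from chance?
# (Illustrative numbers and test choice; not from Turing's paper.)
from math import comb

def binom_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the probability of k or more
    correct identifications if every judge is merely guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n = 200          # judges in the panel
correct = 100    # judges who correctly picked the human (exactly 50%)

p_value = binom_tail(n, correct)
print(f"P(>= {correct} correct by pure guessing) = {p_value:.3f}")

# At exactly 50% correct the tail probability is large, so we cannot
# reject "the judges are guessing" -- the strongest conclusion is that
# this panel could not tell the program from a human.  By contrast,
# 130/200 correct would be wildly unlikely under guessing:
print(f"P(>= 130 correct by pure guessing) = {binom_tail(n, 130):.5f}")
```

On this reading, the test never certifies intelligence directly; it only
licenses (or blocks) the statistical claim that judges can discriminate
the program from a person.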
Not to say that one can't offer methodological criticisms of Turing's
experimental design. For instance, the subjects should either be unaware
of the nature of the task they're really doing, or should be offered
monetary rewards for correctly picking the human. Given the usual
population from which psych subjects are drawn (i.e., college
freshentities taking intro to psych), any or all of a) - c) above may
apply as well. Paying the subjects for correctly unmasking the inhuman
machine probably helps eliminate c). I think part one of the "imitation
game" may have been an attempt to tease out this sort of bias. And you
have to be doing pretty badly at picking random subjects if you end up
with "hundreds" who are "mentally subnormal".

> How do AI types *REALLY* test their systems?

Depends on which AI types you're talking about. The AI types who aren't
doing cognitive modeling generally use competence at the task the system
is supposed to perform. From what little exposure I have to the
cognitive modeling types here, the answer would seem to be comparing
learning curves / reaction times for the system and humans on a task.

By the way, I agree that the choice of field name was unfortunate.
Computer Aided Cognitive Modeling (which unfortunately is taken as an
acronym) and Computational Modeling of Complex Tasks / Perceptual Tasks
would have been better choices. Given the general discredit into which
the notion of intelligence as something measurable as a scalar has
fallen, the name makes the field sound not only pompous but out of date
as well.

Karl Kluge (kck@g.cs.cmu.edu)
--