kck@g.gp.cs.cmu.edu (Karl Kluge) (06/10/89)
> From: gilbert@cs.glasgow.ac.uk (Gilbert Cockton)
> Subject: Turing Test and Subject Bias
>
> If a system passes the Turing Test with one subject, but not with
> another, should it be considered intelligent?
>
> If 55% of a sample of hundreds say the system is intelligent, is it?
>
> What if the subjects are:
>
> a) drunk or on drugs (or a') the experimenters are :-))?
> b) mentally subnormal?
> c) polite and don't want to upset the experimenters
>    (especially if a' applies too :-])?
>
> Then how valid is the Turing Test?

Bzzzt! Straw man alert! Sorry, what you're referring to here is clearly
not the Turing Test proposed by Turing but the pseudo-Turing Test one
more often hears described. The real question one wants to ask is: if,
out of a sample of 200 people, the fraction who pick the
computer-pretending-to-be-a-human as being the human is 50% (or the same
fraction as pick the man-posing-as-a-woman as being the woman in part
one of the "imitation game", take your pick), then what is the strongest
conclusion we should be willing to draw from this?

> Just what sort of Science did young Mr. Turing have in mind when he
> decided that subjective opinion could ever be a measure of system
> performance?

Probably the sort of science philosophers do. Since philosophers have
never bothered in 3000 years of bickering to define what they mean by
terms like "mind", "consciousness", and "intelligence", I hardly think
Turing can be faulted for not providing an iron-clad test for
determining whether a system rates being called
"intelligent"/"conscious". What he did was fall back on competence as a
test -- if the system can fool people sufficiently well that they can't
do better than chance at choosing the real human over the program
pretending to be a human, then the ball is in the philosophers' court to
either call the system "intelligent"/"conscious", or to provide us with
a distinction that lets humans retain those properties while excluding
the program.
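The "can't do better than chance" criterion can be made concrete with a
simple binomial calculation. The sketch below is my own illustration,
not anything Turing specified -- the sample size of 200, the one-sided
tail test, and the function name are all assumptions for the sake of
example. The point is that with 200 judges each guessing which of two
respondents is the human, chance performance is Binomial(200, 0.5), and
only a count of correct picks far above 100 would let us reject the
hypothesis that the judges are merely guessing.

```python
# Sketch: can 200 judges' performance be distinguished from chance?
# (Illustrative numbers and test choice; not from Turing's paper.)
from math import comb

def binom_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the probability of k or more
    correct identifications if every judge is merely guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n = 200          # judges in the panel
correct = 100    # judges who correctly picked the human (exactly 50%)

p_value = binom_tail(n, correct)
print(f"P(>= {correct} correct by pure guessing) = {p_value:.3f}")

# At exactly 50% correct the tail probability is large, so we cannot
# reject "the judges are guessing" -- the strongest conclusion is that
# this panel could not tell the program from a human.  By contrast,
# 130/200 correct would be wildly unlikely under guessing:
print(f"P(>= 130 correct by pure guessing) = {binom_tail(n, 130):.5f}")
```

On this reading, the test never certifies intelligence directly; it only
licenses (or blocks) the statistical claim that judges can discriminate
the program from a person.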
Not to say that one can't offer methodological criticisms of Turing's
experimental design. For instance, the subjects should either be unaware
of the nature of the task they're really doing, or should be offered
monetary rewards for correctly picking the human. Given the usual
population from which psych subjects are drawn (i.e., college
freshentities taking intro to psych), any or all of a) - c) above may
apply as well. Paying the subjects for correctly unmasking the inhuman
machine probably helps eliminate c). I think part one of the "imitation
game" may have been an attempt to tease out this sort of bias. And you
have to be doing pretty badly at picking random subjects if you end up
with "hundreds" who are "mentally subnormal".

> How do AI types *REALLY* test their systems?

Depends on which AI types you're talking about. The AI types who aren't
doing cognitive modeling generally use competence at the task the system
is supposed to perform. From what little exposure I have to the
cognitive modeling types here, the answer would seem to be comparing
learning curves / reaction times for the system and humans on a task.

By the way, I agree that the choice of field name was unfortunate.
Computer Aided Cognitive Modeling (which unfortunately is taken as an
acronym) and Computational Modeling of Complex Tasks / Perceptual Tasks
would have been better choices. Given the general discredit into which
the notion of intelligence as something measurable as a scalar has
fallen, the name makes the field sound not only pompous but out of date
as well.

Karl Kluge (kck@g.cs.cmu.edu)
--