[net.sci] Parapsychology: more on the decline effect.

cooper@pbsvax.dec.com (07/04/86)

Bill Jefferys ({allegra,ihnp4}!{ut-sally,noao}!utastro!bill,
bill@astro.UTEXAS.EDU) has claimed as an explanation for the decline effect
in parapsychology (the tendency for the scoring rate to decrease over the
course of an experiment) the following: 

>Not if you have selected for further study precisely those individuals
>who scored high initially, and if you include those earlier trials
>in the overall test results.

I admitted that the error of inappropriately mixing screening with
testing trials might have occurred on rare occasions in the
parapsychological literature (as a chemist might admit that dirty test
tubes may have been the actual cause of some published results).  I thought
it was *very* unlikely that any such flaw was the source of any claimed
evidence for the decline effect which had appeared in a refereed
parapsychology journal or other legitimate piece of parapsychological
literature, however.  The error would have been too obvious in this case
to have been missed by the referees. 

I therefore challenged him to produce even a single example of this, by
"direct or indirect" citation. 

He responded with:

>The University of Texas doesn't subscribe to parapsychological journals,
>so I can't satisfy you completely. However,
> 
>A quick trip to the library turned up the book, "Parapsychology: Science
>or Magic?", by J. E. Alcock (Pergamon, 1981). His discussion and explanation
>of the "Decline Effect" is basically the same as the one I suggested.

This is a legitimate and reasonable answer to my request.  It is the type
of thing which I had in mind when I was speaking of an "indirect" citation. 

Unfortunately, in this case, it is not enough. 

I own or have read a fair amount of the critical literature, but I do not
own nor have I read this book.  I checked for it in the MIT library system
but no copy is owned by them either.  This is, of course, no criticism of
the book.  It just means that I cannot effectively respond without further
information. 

Since you have access to this book, please post the citations to the
literature which the author uses to support his/your thesis.  If he does
not provide such citations, then this book does not even partially answer
my challenge.  I do not question that others have made this accusation
before.  I have seen or heard it made a number of times.  I have just not
seen any concrete evidence to back it up. 

>He also cites Spencer Brown ("Probability and Scientific Inference", 1957)
>who showed that data from 100,000 published random numbers, analyzed using
>the Quartile method in use at Duke University, showed a very significant
>Quartile Decline effect.

Sorry, I fail to see how this supports your thesis at all. 

G. Spencer-Brown is best known for his book "The Laws of Form."  (I'm doing
this off the top of my head so I can't give publication info)  It is the
thesis of this work that there is something fundamentally wrong with all
existing formal logic, and therefore, with the foundations of all modern
mathematics.  A friend of mine once said of it, "It would clearly be a work
of genius if only it made any sense at all."  Many people who read it,
including those with a fair amount of mathematical sophistication, are left
with the feeling that Spencer-Brown seems to have said *something* of
importance but it's completely unclear what. 

Spencer-Brown considered himself a critic of parapsychology.  He felt that
parapsychologists believed they were investigating a physical phenomenon
while in reality, all they were doing was demonstrating that probability
and statistical theory were fundamentally flawed (of course).
Specifically, he believed that published random number tables were not
"random".  He never really specified the nature of their non-randomness
except that it would conveniently result in the success of many
parapsychological tests whatever use they made of the table (the dominant
methodology for randomization during the 50's in parapsychology was to use
published random number tables, such as the RAND 1,000,000 random digits,
to determine the targets for the experiment in one way or another). 

I should say that I would consider the demonstration of such an error in
elementary statistics and probability to be a *very* unexpected outcome.
In my opinion, it would, however, justify the most basic claim of
parapsychologists, i.e., that psi is something real and important.  I have
heard similar statements by other parapsychologists.  Of all the resolutions
I can think of for the mystery of psi, this one would have one of the
largest impacts on the practice of science and engineering. 

Anyway --

To prove his point Spencer-Brown performed the cited experiment.  He
entered a random number table (probably the RAND table, but I don't
remember for sure) at two arbitrary points.  He declared one entry point as
being for his targets and the other as being for his calls.  He then matched
the next N digits (I'll take Bill's word that it was 100,000) one by one and
came up with a significant number of matches and a scoring decline.  (I may
be misremembering the details here; it's been years since I read it.  I have
the feeling that his actual procedure was a bit more complex than this, but
I'm pretty sure that this was the essence). 
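For concreteness, the essence of such a table-matching experiment can be sketched as a modern toy simulation (mine, not Spencer-Brown's actual procedure; the stream length and seed are arbitrary).  Two independent streams of random decimal digits are matched position by position, and chance predicts about one hit in ten:

```python
import random

def match_count(n_digits, seed=None):
    # Match two independent streams of random decimal digits position by
    # position -- a sketch of the essence of the experiment, not the
    # exact procedure Spencer-Brown used.
    rng = random.Random(seed)
    targets = [rng.randrange(10) for _ in range(n_digits)]
    calls = [rng.randrange(10) for _ in range(n_digits)]
    return sum(t == c for t, c in zip(targets, calls))

n = 100_000
hits = match_count(n, seed=1)
# Chance expectation: n/10 = 10,000 hits, s.d. = sqrt(n * 0.1 * 0.9) ~ 95.
print(hits)
```

With 100,000 digit pairs any entry point should land within a few hundred hits of 10,000, which is why a "significant" score or decline found at one particular pair of entry points is best read as a fluke of those entry points.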

I won't bother to discuss generally the various interpretations of this
experiment here, as it's irrelevant to the point. 

If Spencer-Brown did what he claimed (entered the table with the first pair
of entry points he tried) then this is either an irrelevant fluke, or
*disproves* your point, since the decline effect appears without screening
trials to be inappropriately included with the data. 

If Spencer-Brown actually tried multiple sets of entry points until he
found some which gave many early hits, then it still gives no support to
your claims.  No one is denying that a decline effect would be produced if
someone committed this obvious and egregious error (although, the apparent
*continuous* decline which appears to be the characteristic of the actual
effect would not be produced this way).  Spencer-Brown was a *critic* of
parapsychology, however unconventional, rather than a parapsychologist.
His behavior therefore cannot be taken to say anything about the way that
parapsychologists conduct experiments. 

>I also would guess that some of the articles cited by Marks (1986: _Nature_
>vol 320, pp. 119-124) would treat this issue, particularly Refs. 42-45.
>Unfortunately, I don't have access to any of them so I don't know for
>sure.

I am only familiar with three of those four references.  None of these
touch upon the issue.  I doubt if the fourth supports your thesis.  First,
because if John Beloff, probably Great Britain's leading parapsychologist,
had made such a claim, I suspect I would have heard about it.  Secondly,
the decline effect does not seem to me to be particularly germane to the
topic of the paper, i.e., the inappropriateness of strict repeatability as
a criterion for the acceptance of evidence of psi.  Third, I have heard
Beloff discuss this topic elsewhere, and he made no mention of the decline
effect. 

As for the other references, none that I have read provide any solid
support for your thesis.  I am willing to accept an "indirect" reference
through the critical literature if I have access to the work in question.
But a reference to a set of references covering a moderate amount of the
technical critical literature of the last 20 years, with a comment of
"something here probably supports my claim" is hardly a reasonable
response. 
 
>The odd thing is that the "Decline Effect" is cited by parapsychologists
>as "Evidence" of the reality of Psi. In any other field, such an "effect"
>would be cited as evidence that the original observations were flawed 
>in some way, or were the result of a statistical fluke.

Unless it were found repeatedly under a variety of experimental conditions
and the scientists involved were at all competent.  Then they would give it
a name (such as "the decline effect" or "conditioning extinction" or
"pulsar slowdown" :-) and attempt to study it.

		Topher Cooper

USENET: ...{allegra,decvax,ihnp4,ucbvax}!decwrl!pbsvax.dec.com!cooper
INTERNET: cooper%pbsvax.DEC@decwrl.dec.com

Disclaimer:  This contains my own opinions, and I am solely responsible for
them.

bill@utastro.UUCP (William H. Jefferys) (07/10/86)

Topher Cooper says:

>Bill Jefferys ({allegra,ihnp4}!{ut-sally,noao}!utastro!bill,
>bill@astro.UTEXAS.EDU) has claimed as an explanation for the decline effect
>in parapsychology (the tendency for the scoring rate to decrease over the
>course of an experiment) the following: 

>>Not if you have selected for further study precisely those individuals
>>who scored high initially, and if you include those earlier trials
>>in the overall test results.

>I admitted that the error of inappropriately mixing screening with
>testing trials might have occurred on rare occasions in the
>parapsychological literature (as a chemist might admit that dirty test
>tubes may have been the actual cause of some published results).  I thought
>it was *very* unlikely that any such flaw was the source of any claimed
>evidence for the decline effect which had appeared in a refereed
>parapsychology journal or other legitimate piece of parapsychological
>literature, however.  The error would have been too obvious in this case
>to have been missed by the referees. 

Aha! I see the problem now. We have been talking at cross-purposes. It is
my fault; I was trying to keep my response to Dave short, and I worded it 
very badly. I didn't mean to accuse experimenters of such an obvious 
mistake. But I agree that that is how my comment reads. I have to apologize 
to you. I had in mind a more subtle kind of unconscious "selection" that 
could take place after the trials began in earnest.

There could be a "selection effect" of the following sort: Alcock cites a 
number of studies showing that initial success in this kind of experiment
is much more likely to lead to belief by the subjects (and presumably by
the experimenter) that nonchance effects are involved, than is initial
performance at the chance level followed by increasing success. (The
outcomes of the "random" results in some of these studies were, 
unknown to the subjects, manipulated by the experimenter).

In any population of subjects that has been selected for further study,
a certain fraction will (purely by chance) continue to score well for
a while, while others will revert quickly to the chance level. (This
assumes there are in fact no other unknown biases affecting the study). 
There are a number of factors at work that will tend to keep the subjects 
who continue to score well in the study for more and more trials, while 
those who initially fall by the wayside will tend to drop out. One is the 
tendency of the experimenter to want to go with a winner. The more 
spectacular the initial success, the more time will be spent with such a 
subject. After all, the name of the game is to publish papers, and no 
experimenter likes publishing papers about his failures. Another is the 
tendency of the subject to drop out if he or she is not showing "paranormal
abilities". The sooner such a subject's performance drops to the
chance level, the less the reinforcement, and the more likely it is
that he or she will drop out. This would happen even if the experimenter
tried to keep all subjects in to the bitter end of a prescribed number
of trials.

The net result would be a set of short trials with subjects who scored 
at the chance level; a smaller set of longer trials with subjects who 
initially scored above chance but reverted to chance rather quickly;
and a small set of subjects who initially scored spectacularly, who
got a lot of attention and many trials, and who also reverted to 
chance levels. In other words, a "Decline Effect". The reason why 
one doesn't see an "Increase Effect" is that few if any studies would 
do extensive testing on subjects that start out poorly.

The effect could be compounded if there are biases (e.g., unconscious
cueing) which initially inflate scores, but which come under better
control as attention is focussed on a high-scoring subject.
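The core of this scenario -- selection on initial success followed by regression to the chance level -- can be sketched in a toy simulation (entirely illustrative; the cutoff, subject count, and run lengths are invented, and the dropout dynamics described above are left out):

```python
import random

rng = random.Random(42)
P_HIT = 0.2         # chance hit rate with 5 Zener symbols
RUN = 25            # trials per run
N_SUBJECTS = 500
SCREEN_CUTOFF = 8   # retain subjects with >= 8 hits (chance mean is 5)
FOLLOW_RUNS = 4     # further runs for the retained subjects

def run_score():
    # Hits in one run for a subject with no ability at all.
    return sum(rng.random() < P_HIT for _ in range(RUN))

selected = []
for _ in range(N_SUBJECTS):
    first_run = run_score()
    if first_run >= SCREEN_CUTOFF:          # "go with a winner"
        selected.append([first_run] + [run_score() for _ in range(FOLLOW_RUNS)])

# Average score per run position, with the initial run pooled into the record.
means = [sum(s[i] for s in selected) / len(selected)
         for i in range(FOLLOW_RUNS + 1)]
print([round(m, 2) for m in means])  # first entry inflated, the rest near 5.0
```

The initial run is inflated by construction, while the follow-up runs sit at the chance level of 5 hits per 25 trials, so the pooled record shows a "decline" even though no subject has any ability.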

Is this an unreasonable scenario? If so, why? (Topher, if you respond
to this, please E-mail me a copy, as I will be at a meeting for a few
days & our machine expires news in 3 days).

-- 
Glend.	I can call spirits from the vasty deep.
Hot.	Why, so can I, or so can any man; But will they come when you
	do call for them?    --  Henry IV Pt. I, III, i, 53

	Bill Jefferys  8-%
	Astronomy Dept, University of Texas, Austin TX 78712   (USnail)
	{allegra,ihnp4}!{ut-sally,noao}!utastro!bill	(UUCP)
	bill@astro.UTEXAS.EDU.				(Internet)

cooper@pbsvax.dec.com (Topher Cooper DTN-225-5819) (07/16/86)

Bill Jefferys ({allegra,ihnp4}!{ut-sally,noao}!utastro!bill,
bill@astro.UTEXAS.EDU) has proposed an explanation for the decline effect
in parapsychology.  Essentially he proposes that parapsychologists tend to
"stick" with subjects who happen to score well initially (the reader is
referred to <981@utastro.UUCP> for details).  This would produce a decline
effect by selection.  He asks if this is "an unreasonable scenario." 

It is not unreasonable but it is irrelevant to what we have been
discussing. 

There is not one "decline effect" but many.  Among them are the run
decline effect, the score-sheet decline effect, the run series decline
effect, the experiment decline effect, and the subject decline effect.
These differ in the unit over which the decline is measured. 

A run is a series of trials which are done essentially in "one go".
Classically, a run corresponded to a single deck of 25 Zener cards. The run
decline effect is the tendency for the average score on the first trial of a
run to be higher than on the second, which in turn is higher than on the
third, etc. 

The score-sheet decline has to do with the old manual scoring sheets.  It
was found that there was frequently a declining trend from the average of
the first run on each sheet to the last run on each sheet. 

The run series decline is a tendency for scores to decline over runs in a
run series.  A run series is a part of an experiment over which conditions for
any given subject are held constant. 

The experiment decline is a tendency for the average score per run to
decrease over the course of an experiment. 

The subject decline effect is the tendency for the performance of a subject
used in a series of experiments to decline over time.  It is frequently
noted by researchers that good subjects seem to "burn out". 

These different decline effects can be put together into a single phenomenon
by assuming that there is a continuous decline for any given subject. Novel
conditions seem to cause some recovery over the basic trend. This is,
however, speculation -- the different decline effects may come from
distinct causes. 

The best established of these effects are the run decline and the
experiment decline.  We have been very specifically speaking of the
experiment decline. For example, in the article in which he makes this
proposal Bill Jefferys quotes me as saying: 

>>			. . .				   the decline effect
>>in parapsychology (the tendency for the scoring rate to decrease over the
>>course of an experiment) . . .

Nowhere in this discussion has anyone disputed this definition. 

A properly designed experiment of the general form that a parapsychology
experiment takes must include a clearly specified termination criterion.  The
termination criterion specifies when the experiment is to stop, and it must
be shown to be independent of the hypothesis under study.  Almost
universally, the simple criterion of stopping when the/each subject has
performed a specified number of trials is used. 

One therefore does *not* have some initially good subjects with many
(declining) trials, and (many more) initially poor subjects with a few
trials as required by Bill's hypothesis.  In a given experiment one has one
subject or several subjects with the same number of trials for each.  The
number of trials is determined before the experiment is conducted.  The
decline effect is then either demonstrated across all the subjects or for
those subjects whose *total* score is the highest. 
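To make the contrast concrete, here is a toy simulation (mine, purely illustrative; the subject counts and cutoffs are invented) of the protocol just described: every subject completes the same pre-set number of runs, and selection is by *total* score only.

```python
import random

rng = random.Random(7)
P_HIT, RUN, N_RUNS, N_SUBJECTS, TOP_K = 0.2, 25, 10, 500, 25

def run_score():
    # Hits in one run of RUN trials at the chance rate (5 Zener symbols).
    return sum(rng.random() < P_HIT for _ in range(RUN))

# Every subject completes all N_RUNS runs -- the termination criterion
# (a fixed number of trials) is set before the experiment begins.
subjects = [[run_score() for _ in range(N_RUNS)] for _ in range(N_SUBJECTS)]

# Select by *total* score only; no screening trials are mixed in.
best = sorted(subjects, key=sum, reverse=True)[:TOP_K]

half = N_RUNS // 2
first = sum(sum(s[:half]) for s in best) / (TOP_K * half)
second = sum(sum(s[half:]) for s in best) / (TOP_K * half)
print(round(first, 2), round(second, 2))  # both elevated above chance, no decline
```

Selection on the total is symmetric in time: it inflates the first and second halves of the selected records equally, so it cannot by itself manufacture the within-experiment decline under discussion.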

Bill says that 

>			    This would happen even if the experimenter
>tried to keep all subjects in to the bitter end of a prescribed number
>of trials.

but does not explain how -- presumably, by the experimenter failing to do so.  In
such a case, the experimental protocol would have been violated.  According
to standard practice, either the experiment would have to be abandoned or
the experimental report would have to mention this fact.  Selection biases
from weak subjects dropping out are well understood in parapsychology.  Any
analysis of such an experiment, for the decline effect or for any other
purpose, would have to take this into account. 

The scenario you propose *might* explain the *subject* decline effect, but
it requires that high psi scores be a result of chance.  This is the one
hypothesis which can be categorically rejected.  The odds against the
results of parapsychological experiments being due to chance rather than
to one or more systematic biases (conventional or unconventional)
are astronomical, to say the least.  The odds against subject differences
being by chance are only slightly smaller. 

A variant on this hypothesis might, however, reasonably explain the subject
decline effect.  If subjects' average score (as somehow magically measured
independently of any experiment decline effect) tends to both increase and
decrease over periods of years (e.g., oscillate) then subjects who are used
repeatedly in experiments would tend to be selected when they are scoring
high.  Inevitably, they would show a decline over many experiments. 

As far as I know the subject decline effect has simply been an observation
by experimenters.  I know of no study which purports to demonstrate it as a
consistent effect.  It is therefore at best a very weakly demonstrated
effect which is, nevertheless, plausible as an extrapolation of the
experiment decline effect.  This hypothesis, though interesting, does
little to change this basic picture. 

>The effect could be compounded if there are biases (e.g., unconscious
>cueing) which initially inflate scores, but which come under better
>control as attention is focussed on a high-scoring subject.

It would be, except that care is, of course, taken to apply controls
uniformly across subjects and across time.  This is another form of the
"incompetent experimenter" assumption. 

		Topher Cooper

USENET: ...{allegra,decvax,ihnp4,ucbvax}!decwrl!pbsvax.dec.com!cooper
INTERNET: cooper%pbsvax.DEC@decwrl.dec.com

Disclaimer:  This contains my own opinions, and I am solely responsible for
them.