[net.sport.baseball] NL catchers, AI, and philosophy

dpb@philabs.UUCP (Paul Benjamin) (08/11/85)

Now, why don't we expand this whole discussion so that others
can take part, too? After all, no one else has made any
contributions to this, and we might as well be conducting this by
mail, rather than over the net. So since the disagreement is
one of underlying philosophy, let's argue on that ground. It
may provoke more response. I make some relevant responses to
your statistical epic later on, but the important stuff is this:

You feel that by computing statistics based upon the raw numbers
which are available in various publications, you can attain an
understanding of the inner workings of the game, to the point
that you can describe the strengths and weaknesses of players,
discuss strategies, etc. I feel that this is not true, that instead,
statistics lead to a superficial appreciation of correlation, but
no definite understanding of cause-and-effect. 

This can be compared to the type of knowledge which current "Artificial
Intelligence" systems employ. Such systems, e.g. MYCIN and its
descendants, use associational knowledge to correlate stimuli and
desired responses, e.g., "If symptom A and symptom B are present,
then prescribe remedy X." This can be effective in many cases. But
there is a great deal of knowledge which cannot be captured this
way - the structural knowledge, i.e., WHY A and B tend to respond
to X. Even adding probabilities to this knowledge does not change
this limitation.
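
A minimal sketch of such an associational rule, assuming a lookup
table of symptom sets and remedies. The rule contents, the certainty
numbers, and the names RULES and prescribe are hypothetical and are
not taken from MYCIN itself:

```python
# Hypothetical associational rule base in the spirit of
# "If symptom A and symptom B are present, then prescribe remedy X."
# Rule contents and certainty values are invented for illustration.
RULES = [
    # (required symptoms, remedy, certainty attached to the association)
    (frozenset({"A", "B"}), "X", 0.8),
    (frozenset({"C"}), "Y", 0.6),
]


def prescribe(symptoms):
    """Return (remedy, certainty) for every rule whose symptoms are all present."""
    present = set(symptoms)
    return [(remedy, cf) for needed, remedy, cf in RULES if needed <= present]


print(prescribe(["A", "B"]))   # [('X', 0.8)]
print(prescribe(["A"]))        # []
```

The table records that A and B go with X, and how strongly, but holds
nothing about WHY; no lookup against it can answer that question.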

In the same way, baseball stats can reveal many correlations, and
can thus lead to a greater appreciation of the game and its
intricacies. But the structural knowledge, e.g., "Is player A
better defensively than player B?","Will player A contribute more
to my team than B?", is subjective, and hence dependent upon the
knowledge that a person has acquired about the game. Thus, the
only people who are qualified to make these judgements are baseball
professionals, not statisticians. Even though Davey Johnson uses a
compendium of stats to help him manage the Mets, HE still makes the
decisions, and can easily decide to ignore a stat. (After all, he
initially decides which stats to put into his machine, and which
to leave out.)

Also, these baseball pros have access to three sources of info that
you and I do not:

     1) They keep charts of every pitch, and every hit and play in
        the field. Thus, they have MUCH better data to rate the
        players;
     2) They see many more games, and from a better vantage point than
        we do (they have access to the tapes of the games, too, so
        they miss nothing that we see);
     3) They have the accumulated experience from their careers, which
        enables them to interpret what they see in ways that can be
        very different from the way we see things. It is precisely
        this lack of understanding that prompts people like us to
        revert to statistics. (Yes, I used to love stats, too. I
        really used to get off on dreaming up new measures, and reading
        all the numbers I could, until I became convinced of the
        lack of content of the stats.)

So, I leave the decisions to the pros, and hope that my hometown
pros make good decisions. The pros said last year that Pena was
better than Carter defensively. There was no outcry. Frankly, I
was surprised that Pena won it, and expected an outcry. But, when
there was none, I realized that Pena might actually have passed
Carter defensively. My amateur perceptions confirm this. Your stats
fail to convince me, for the reasons above. Send them to the people
who vote for these awards, or else it seems to me that you don't
really believe them yourself (or do you really think you know more
than they all do?)

(I will be away to conferences and vacation for two weeks. Postings
during this period may not be saved by my site.)
------------------------------------------------------------------
               Responses to statistical verbiage:

> Also, we really don't want to consider years
> before the principals established themselves in their respective teams
> starting lineups, so we consider Carter in 1975 and 1977-1984 (in 1975,
> Carter spent most of his playing time in the outfield (as the starting
> left fielder, generally) while serving as the #2 catcher; in 1976, he
> was just the back-up catcher) and Pena from 1981-1984.  

Why consider any years before 1984? The original question was which
should start the 1985 all-star game, not which was better in 1975.
I knew Carter was better than Pena before 1984 even without reading
any stats :-)

> It is widely recognized that the purpose of the offense is run
> production, and there are two distinct ways in which a hitter may
> contribute to it.  The first is to score runs, the second to drive
> them in.  Thus, traditionally, fans have placed great store in the
> most obvious measures of that production, runs scored and runs batted
> in.  Unfortunately, those traditional measures are heavily dependent
> on circumstances beyond the hitter's control: how well his teammates
> fare in doing THEIR job.  You can't score if no one drives you in, and
> you can't drive some one in if no one is on base.  If we are to
> evaluate individual performance, we must look at statistics that are
> NOT dependent on the action of anyone save the individual in question.

EXACTLY!!! If you can find a statistic that is truly independent of
teammates' contributions, I'd love to see it. All the stats you list
below (Putouts, %thrown out, DP's, BA, OBA, HR, R, RBI, slugging,
etc.) are dependent on: teammates' seasons, manager's tactics, place
in the lineup, ballparks, and others. Apparently one of your favorite
stats for pitchers is Earned Runs Prevented. You love to post a detailed
list to the net every so often. This stat is no more independent than
the old ERA. For example, if a pitcher pitches for a team which scores
fewer runs for him, then he may be lifted earlier, on the average.
This leads to fewer innings for him, and a lower score on ERP. This
effect can also be achieved by playing for a manager who loves to go
to the bullpen early (e.g. Chuck Tanner.) These stats can be revealing
at times (Gooden leads by a large margin no matter how you measure
things) but using them to make finer distinctions is meaningless.

> Thus, we look at On-Base Percentage (a.k.a. On-Base
> Average) to evaluate how well a player performs this function.  

You really love this stat. Fine. This is a free country. I think
it is just another meaningless stat. Hits are better than walks
any day. Since you love to compute, why not analyze how often runs
are scored with only walks, versus how often runs are scored with
only hits? Or how often runs are scored with no hits at all, versus
how often runs are scored without walks? This is baseball, not the 
on-base derby.

>(1) As we are limiting ourselves to statistics upon which the
>   performance of one's teammates has no direct bearing (OB
>   and SA), and for which there is no empirical evidence of a
>   substantial indirect bearing, it is irrelevant.

As stated above, this is an unproven assumption.

>(2) Pittsburgh was not a substantially less capable offensive
>   team than Montreal.  Pittsburgh's worst year was 1984

We are voting for the 1985 starting all-star catcher, remember?

> Both Carter and Pena batted in the middle of their respective orders
> (generally fifth for Carter, sixth for Pena), and probably have about
> 45 such opportunities in a season.  Assuming that the opportunities
> are uniformly distributed among out counts, 30 of these occurred with
> none or one out (actually, this probably overestimates the number of
> such opportunities, as outs accumulate as batters bat, thus implying
> that more runners are on base, on average, with two out than with none
> out).  Pena has good speed for a catcher, average for all runners, and
> would probably advance to third about 33% of the time (choosing the
> median value from Texas regulars); for Carter, my best guess is 20%
> (he's not as hopeless on the bases as Sundberg).  The difference,
> then, is probably about 30/3 - 30/5 = 4.  It does not make up for
> Pena's negative contribution in his stolen base attempts.

By your argument, speed is a negligible factor in baseball, at least
offensively. You apparently feel that breaking up double-plays at
second base, or avoiding them at first, or taking the extra base,
or causing an errant throw, are small factors. You like HRs and 
walks. Well, why not take up this argument some time with a professional
baseball person (which neither of us is) and tell him that speed is
negligible offensively? From everything I have heard and read, this is
not so. Again, we are fans. I trust what I hear from pros more than
your amateur judgement (or my own). 
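
For reference, the arithmetic in the quoted baserunning estimate can
be reproduced as below. The figures are the ones quoted (30
opportunities, advance rates of 1 in 3 and 1 in 5); the function name
extra_bases is a hypothetical helper, not anything from the thread:

```python
# Figures are the ones given in the quoted estimate; the function name
# is a hypothetical helper used only for this illustration.
def extra_bases(opportunities, rate_fast, rate_slow):
    """Expected additional advances for the faster runner over the slower one."""
    return opportunities * (rate_fast - rate_slow)


# 30 opportunities with none or one out; Pena advances 1 time in 3,
# Carter 1 time in 5, i.e. the quoted 30/3 - 30/5.
print(round(extra_bases(30, 1 / 3, 1 / 5)))   # 4
```

Note that this counts only the extra advance to third. Double plays
broken up or avoided and errant throws induced are not terms in it.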

			FIELDING

> As always, I advocate ignoring the number of errors and the fielding
> percentage, as lending substantial credence to those figures favors
> the sure handed man who doesn't cover much ground.  Total chances may
> be ignored, as we will treat assists and put outs separately.  Double
> plays should also be ignored, as they are more a function of
> opportunity (pitchers who tend to get grounders, pitchers who tend to
> load bases) than of skill.

Although I tend to agree with you here, can't you see that these
assumptions are subjective? How can you prove that the statistics
you favor are the "right" ones, and that the others can be ignored?

> If we assume that those strikeouts were as likely
> to occur at one time as another, 
> Catcher     PO              Est. K              Est. (PO-K)
> Carter     772           .83*861 = 715              57
> Pena       895           .86*992 = 853              42

Why make this assumption? It is quite possible that the Ks are not
exactly uniformly distributed. For example, it is possible that
Tanner used Pena's backup only with experienced pitchers. Now,
two of the three main K pitchers on Pitt in 84 were experienced,
but only one of the remaining two starters was a big K pitcher.
This would invalidate your assumption. Are you going to analyze
all 162 Pitt and 162 Montreal games? This is especially true since
your final numbers (57 and 42) are so small relative to the size of
the Ks. A few here and there could change things drastically.
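
A rough sketch of how sensitive the quoted table is, reading .83 and
.86 as the share of the team's strikeouts credited to each catcher
(that reading is an assumption about the quoted table). The one-point
shifts in those shares are hypothetical, chosen only to show the
effect; non_strikeout_putouts is an illustrative name:

```python
def non_strikeout_putouts(putouts, share_caught, team_strikeouts):
    """Putouts left after removing the strikeouts estimated for this catcher."""
    return putouts - share_caught * team_strikeouts


# Figures from the quoted table: (putouts, assumed share of Ks, team Ks).
carter = (772, 0.83, 861)
pena = (895, 0.86, 992)

print(round(non_strikeout_putouts(*carter)))   # 57
print(round(non_strikeout_putouts(*pena)))     # 42

# Hypothetical one-point shifts in the assumed shares (not from the thread).
print(round(non_strikeout_putouts(772, 0.84, 861)))   # 49
print(round(non_strikeout_putouts(895, 0.85, 992)))   # 52
```

Under those shifted shares the ordering reverses (49 against 52),
since one percentage point of 861 or 992 strikeouts is 9 or 10
putouts, while the whole gap in the table is 15.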

> Thus we find that, in 1984, Carter was more successful in hunting down
> foul balls and getting putouts at the plate in somewhat less time catching.

Awfully unstable conclusion. Particularly in view of your own statement
that Montreal's park is larger than Pitt's. Even 10 or so more foul
pops caught by Carter change the stats significantly.

> The results: inconclusive, but not consistent with Paul's
> claims of clear Pirate supremacy.

These are not just my claims. As I have said before, and will say
again and AGAIN, argue this with the voters for the Gold Glove.
They have more knowledge about the players and their abilities,
not just numbers. HOW DO YOU KNOW THAT YOU EVEN HAVE THE RIGHT
RAW DATA WITH WHICH TO COMPUTE STATISTICS? Couldn't some raw
numbers be not even available to you?

david@fisher.UUCP (David Rubin) (08/19/85)

[Frankly, I'm now eager to move on to other things.  Apparently, Paul
 will continue to insist upon ignoring my arguments and in setting up
 straw-men to knock down (he continues, for example, to point out the
 shortcomings of SA and OB without addressing my contention that their
 indirect shortcomings are far smaller in magnitude than those of
 statistics directly influenced by teammates, and to attribute falsely
 to me a belief that these statistics reveal all).  Rather than answer
 him point-by-point (and wind up repeating myself incessantly in again
 presenting an argument for him to again ignore by again answering the
 argument he wishes or thinks I made instead of the one I did), I will
 answer carefully only the "philosophy" arguments (they are relatively
 fresh) and will summarize why I find it useless to continue the
 "statistical" argument.                                              ]

>............................... So since the disagreement is
>one of underlying philosophy, let's argue on that ground....

If the disagreement were truly one of philosophy, we would have
immediately headed for that ground.  Instead, you began by attempting
to support Pena statistically.  This is inconsistent with the general
indictment of those statistical methods that follows.  I cannot help but
wonder whether these philosophical questions would have even been
appealed to had I accepted your original argument based upon BA and
R+RBI-HR.

>You feel that by computing statistics based upon the raw numbers
>which are available in various publications, you can attain an
>understanding of the inner workings of the game, to the point
>that you can describe the strengths and weaknesses of players,
>discuss strategies, etc.

I am not that sanguine.  I realize, though, that if I am to achieve
ANY understanding of the game, I cannot ignore what I and others
observe.  Statistics are no more or less than summaries of those
observations.  Poor summaries can obfuscate and mislead; good
summaries can illuminate and inform.

>                         I feel that this is not true, that instead,
>statistics lead to a superficial appreciation of correlation, but
>no definite understanding of cause-and-effect. 

Whether the appreciation is superficial or fundamental depends on the
quality of the analysis.  Cause-and-effect is something statistics can
only suggest. We must eventually face up to the decision on whether a
given statistical association is causal or not; we usually decide this
by examining whether a plausible mechanism exists to link the proposed
cause and effect.  Statistics can, however, guide us in a choice between two
plausible causes, for we may find that one proposed cause is far more
closely associated with the effect than another.  You seem to think
that I advocate using statistical methods without consideration for
previous knowledge; nothing could be further from the truth.  Our
differences arise not because I use the statistics without regard for
established knowledge, but because some things that you accept as
established I do not.
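
One way to picture the choice between two plausible causes is to
compare how closely each is associated with the effect. The sketch
below uses Pearson correlation as the measure of association, which
is an assumption (the argument above names no particular measure),
and every number in it is made up purely for illustration:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up numbers, for illustration only; they describe no real team.
effect = [3.1, 4.0, 4.6, 5.2, 6.1, 6.8]
cause_a = [0.30, 0.32, 0.33, 0.35, 0.37, 0.39]
cause_b = [12, 30, 8, 25, 15, 28]

# The more closely associated candidate is the better-supported cause,
# provided a plausible mechanism exists for both.
print(pearson(cause_a, effect) > pearson(cause_b, effect))   # True
```

The comparison only ranks candidates that are already plausible on
other grounds; by itself it establishes no mechanism.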

>................ baseball stats can reveal many correlations, and
>can thus lead to a greater appreciation of the game and its
>intricacies. But the structural knowledge, e.g., "Is player A
>better defensively than player B?","Will player A contribute more
>to my team than B?", is subjective, and hence dependent upon the
>knowledge that a person has acquired about the game. 

We may have a semantic problem here.  You apparently use "subjective"
to describe anything that is not directly observable; I use it to
describe that which cannot be logically inferred.  I do not contest
that these questions cannot be answered in any intelligent way by
watching a few games or reading the box score every day; I believe that
for most players, we can make such comparisons once we have available
enough experience.

>.........................................................Thus, the
>only people who are qualified to make these judgements are baseball
>professionals, not statisticians. 

This is the real philosophic difference.  In any field of endeavor,
the relative merit of two proposals should be decided (in my view) by
force of argument; the origin of the proposals, though it might
affect our original estimation of their likelihood of correctness,
ought not affect our final evaluation.  I will not leave war to
generals, religion to clergy, government to politicians, education to
teachers, nor baseball to its "professionals".  

>                                   Even though Davey Johnson uses a
>compendium of stats to help him manage the Mets, HE still makes the
>decisions, and can easily decide to ignore a stat. (After all, he
>initially decides which stats to put into his machine, and which
>to leave out.)

There is no other intelligent way to use them; remember, statistics
are summaries, and one should feel free to incorporate information
that is not contained in them or properly weighted by them (though the
latter case strongly urges the selection of a new statistic).  When I
questioned your arguments for considering other factors, it was not
because I objected to such a consideration, but because I viewed the
introduction of such factors, with no attempt to substantiate them, as
speculative.  I did not question, for example, Pena's better speed,
but did question the extent to which it contributed to the Pirates in
ways that were not already accounted for in the evidence that I
presented.

>Also, these baseball pros have access to three sources of info that
>you and I do not:
>     1) They keep charts of every pitch, and every hit and play in
>        the field. Thus, they have MUCH better data to rate the
>        players;

But, as they do not make this information generally available, we have
no assurance that they even use it intelligently.  You seem to be
arguing that baseball professionals are uniformly capable folk in the
handling of data; ho!

>     2) They see many more games, and from a better vantage point than
>        we do (they have access to the tapes of the games, too, so
>        they miss nothing that we see);

Even a baseball professional does not have more than 24 hours in a day
to watch tapes; they, too, must eventually synthesize what they see
and accept summaries of what they don't see.

>     3) They have the accumulated experience from their careers, which
>        enables them to interpret what they see in ways that can be
>        very different from the way we see things.

	a) It is not as different as you imply.
	b) Where it differs, the "expert" may be wrong.
	c) The "experts" themselves differ, in which case who
	   is truly expert is an open question....

>                                                  It is precisely
>        this lack of understanding that prompts people like us to
>        revert to statistics.

Actually, it is the limited storage capacity of the brain, combined
with the unachievability of omniprescience, that drives us to
statistics.  Summarize we must.

>So, I leave the decisions to the pros, and hope that my hometown
>pros make good decisions.

This myth of a great gulf between expert knowledge and layman
knowledge has been created in order to protect your hometown pros from
the consequences of their decisions.  With your attitude, all you can
do is hope.  I prefer to criticize, and by criticizing, provoke
change (well, at least provoke argument!).

To shorten my article, I will content myself with noting that you have
been all too slippery.  First you used statistics, then you declared
them entirely irrelevant.  First you appealed to 1984's performance,
then you declared everything before this year irrelevant.  First you
claim that Pena is a better catcher than Carter, then you reduce that
claim to defense, or to 1985 (April-June inclusive), or whatever
appears to be the path of least resistance.  In just the latest
instance, you responded to my overwhelming evidence that the Pirates
of 1984 were NOT less productive than the Expos of 1984 by declaring
1984 irrelevant to the issue at hand -- while in the same article, you
again bring up the supposedly critical 1984 Gold Glove! 

I have been clear about my claims:  that Carter has been, is, and likely
will be substantially more productive offensively; that in the past, his
defense has been about as keen as Pena's, although I have no information
for this year; and that his offense so clearly outdoes Pena that even if
Pena is having a better year defensively, Carter has more merit as a
starting all-star this year.  You have claimed superiority for Pena,
offensively and defensively, this year and in years past, retracting
those claims as appears seemly, and occasionally reintroducing them once
the evidence I presented against them recedes from memory.

Regarding statistics, it appears to me that you consider them targets
of opportunity to be exploited for your argument's benefit; you use
your position to select your statistics, rather than the other way
around.  You may naturally presume others treat them the same way, and
thus can lament that they cannot yield any information.  A more
accurate statement is that they cannot yield any information when so
abused.  You lament the flaws any statistic must possess, and then
falsely infer that all statistics are thus equally worthless.  My
"alternate" statistics were never presented as perfect, but rather as
substantial improvements.

I mean none of this in a hostile or ill-mannered way.  Rather, I am
somewhat saddened at finding yet another person who so fundamentally
misunderstands what Statistics is all about.  It is NOT a collection
of techniques used to crank out numbers that we use in some prescribed
fashion; it is instead the search for patterns, and the interpretation
of patterns (or lack of them) found.

					David Rubin
			{allegra|astrovax|princeton}!fisher!david