[net.sport.baseball] Return of Lineup Dependency

dpb@philabs.UUCP (Paul Benjamin) (10/02/85)

David Rubin writes (again!):

>>.......................................It is not just my responsibility to
>>prove that lineup dependencies exist. It is also yours to prove that
>>they don't!

>You can never prove something doesn't exist (how does one proceed in a
>disproof of existence?).  It is considered sensible in most circles to
>keep one's explanation of events as simple as possible: we need only
>consider new factors if they somehow improve our understanding of
>events.  It is therefore the burden of one who wishes to include an
>"effect" to show its inclusion improves our knowledge or understanding,
>for if we can do as well without it, we have no reason to use it.

That holds only IF we start with your explanation and then consider mine
as the "new" one, which I am sure is your point of view. But mine is
the opposite: to me my understanding of baseball is the original, and yours
is the "new" one. Thus, to me the burden is on you to show how SA+OBA is so
important in improving our understanding of baseball.

>>> All of Paul's explanations mean little,
>>> therefore, until he establishes that what his explanations explain has
>>> indeed happened!  Only in the case of Mattingly does he attempt to
>>> actually demonstrate that a lineup effect exists, and I will therefore
>>> concentrate on it.  Elsewhere, he merely shows lineup effects are
>>> consistent with his selected observations without either showing other
>>> explanations are inconsistent or that the observations would be
>>> inexplicable without lineup effects. 
>
>>This is exactly the point I have been making for weeks. "it may be
>>misleading unless we know that the circumstances of the two categories 
>>are otherwise similar..." 
>
>Strange, but when I did try to adjust for these effects in the
>Carter-Pena discussion, you protested vociferously!  I am all for
>adjusting for effects whose existence is demonstrable, and thus had
>called earlier for the inclusion of Palmer's "Park Factor", which
>considered, explicitly or implicitly, park dimensions, day/night
>balance at the home field, and the quality of the hitter's own
>pitching staff.  If it could be shown that some complex scheme to
>correct for the changing quality of the opposition is necessary (most
>teams remain about as talented in August as they were in May, and the
>ones that don't may not have a substantial effect), I would certainly
>entertain that correction.  At the time I first brought up the matter
>of such adjustments, you held your hands up to your ears and screamed
>that you didn't want to hear about such stuff; as those factors did not
>strongly affect the relative offensive merits of Carter and Pena, I
>didn't press the issue then.  Naturally, I'm stunned by your reversal;
>stunned, but not surprised.

You tried to adjust for only a few effects. I don't recall any adjustment
for differing opposition, or pitchers faced (some teams face more lefties
than others, because they have more left-handed hitting.) Again, the
burden of proof is on you, because you are the statistical fan. You make
the claim that these stats are so good, so you have to show that these
other effects are negligible, or cancel out. It is not up to me to derive
a complex scheme, or show its necessity - it's up to you to show that
your scheme is sufficient. All along, you have presented statistical
arguments without proper foundation. That is why I question your
understanding of statistics. For example, you went to a great deal of
trouble to fill screens with numbers about Pena's and Carter's OBA, and
to try to convince us of the importance of OBA, but never even showed
that the difference between the two was statistically significant! What
if their difference is not significant? Then half your argument was not
only in vain, but wrong, because Pena would have as good an OBA as
Carter. Now, the difference may be statistically significant, but my
point is that you never even showed it was!
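
For what it's worth, the check I am asking for is not hard. Here is a
minimal sketch of one standard way to do it, a two-proportion z-test on
times on base per plate appearance. The season lines in it are made up
for illustration (they are NOT Carter's or Pena's actual numbers), and
the function name is my own invention.

```python
import math

def oba_difference_z(on_base_a, pa_a, on_base_b, pa_b):
    """Two-proportion z-test for the gap between two on-base averages.

    Returns (z, two_sided_p_value). Assumes each plate appearance is an
    independent trial, which is itself a simplification.
    """
    p_a = on_base_a / pa_a
    p_b = on_base_b / pa_b
    # Pooled rate under the null hypothesis that both hitters are equal.
    pooled = (on_base_a + on_base_b) / (pa_a + pa_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / pa_a + 1 / pa_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical hitters: a .350 OBA and a .325 OBA over 600 PA each.
z, p_value = oba_difference_z(210, 600, 195, 600)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

With these invented numbers, a 25-point gap in OBA over a full season
falls well short of the usual 5% significance level, which is exactly
why the test has to be run before the gap is argued from.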

I never reversed my opinion about statistics. I have always held that
stats can be extremely useful. I DO feel that SA+OBA is woefully
inadequate, for reasons I have explained before, and that the proper
statistical understanding of baseball is much more complicated than
your simplistic explanations. It may well be that more comprehensive
stats would favor Carter over Pena even more, and I would find that
very convincing. It's your reasoning I object to, not the conclusion.
(I happen to have numerous friends who feel Carter is better. They
are much better at presenting justifications than you.)

>Incidentally, it is likely that Murphy derives more benefit from
>Fulton County Stadium than Guerrero derives from not having to face
>his own staff; adjusted statistics will likely favor Guerrero even
>more than the unadjusted ones do!

You obviously feel this way because you have to. You may even be right!
But I must repeat, YOU CAN'T TOSS AROUND STATS THIS WAY!!! In order to
make such a statement, you MUST demonstrate its truth. Adjust the
stats, and then make the statement. THIS IS WHAT I HAVE OBJECTED TO
IN ALL YOUR POSTINGS!!!!! It is certainly the case that a good argument
can be made for Carter, and I sincerely welcome such opinions. But
your statistical arguments are SO simplistic. Just because you can
compute something doesn't mean it exists. In order for your arguments
to convince me, your statistical treatment will have to improve greatly.
And don't just point to books such as "The Hidden Game of Baseball." I
have examined it, and once again I say, "Garbage in, garbage out", i.e.,
if you consider only certain types of stats, which ignore certain types
of information, such as the game situations and the surrounding lineup,
then you will derive only models of baseball which ignore these
aspects of the game. Empirical observation leads me (and others I
have talked to) to the conclusion that these are very important factors
in the game. In fact, these seem to be the factors that make it a game,
instead of a HR contest, or a basestealing contest, etc.

>>So, unless you can correct for ALL these factors, and others, to
>>ensure that your circumstances are similar, all the analyses that
>>you have posted are "worthless (possibly even worse: (they) may be
>>misleading".
>
>I will adjust for all the factors that can be demonstrated;
>"adjusting" for a factor that has not been demonstrated (and therefore
>cannot be quantified) is a theological exercise.  Rather than asking
>ourselves how much a factor affects our statistics, we wind up asking
>ourselves how much we BELIEVE a factor affects our statistics.

Now you repudiate your own law of empirics! To repeat Rubin's law
of empirics, you must show that all circumstances are the same in
two different situations, or else your results may be misleading.
Or are you suggesting that the difference between the LA pitching staff
and the Atlanta staff has not been demonstrated? Or that the fact that
hitters do worse against better pitching staffs has not been demonstrated?
Doesn't sound very convincing to me!

>>................................But even this attempt showed your
>>statistical inexperience. Saying, for example, that park A is 10%
>>harder to hit in than park B because the overall averages (of say, slugging)
>>are 10% lower, is a valuable and meaningful stat when applied to the whole
>>group of hitters - it provides information on the park to its owners.
>>But it is TOTALLY MEANINGLESS to apply this stat to individual batters
>>in this park. One must also know the shape of the distribution. It could
>>be that almost nobody hits 10% worse in that park - that many hit much worse
>>or better, and it averages out to 10%. For example, if a country's families
>>have 2.3 children on the average, it doesn't mean that anyone has 2.3
>>children, or even that most families have 2 or 3 children. Bimodal
>>distributions are not uncommon, and in these, almost no one is around
>>the mean.
>
>You are correct, but need not worry.  It is necessary to check that
>the detriment/advantage supplied by a home park affects the players
>equally (or that deviations from equality are random, rather than
>systematic).  You will be pleased, therefore, to hear that such
>deviations are binomially/normally distributed, and that where
>individual players fall on these distributions appears random, and
>that the distribution of ALL players is tighter once these effects are
>taken out.

I am not pleased. I do not care what the distribution of the whole
population is. You are missing my point. Where are Carter and Pena in the
distribution? That is the relevant fact that we need. Or, in comparing
Guerrero and Murphy, where are they? I don't care that the population
is normally distributed. What if one player is consistently 50 points
better in SA in Atlanta, and another is 15 points better? Then their
stats must be corrected this way, not according to the average for all
players. Again, I repeat what I said before, "Don't apply population
characteristics to individuals."
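
To make the point concrete, here is a small sketch with invented
per-batter park effects (percent change in slugging in one park versus
another). The numbers and the helper name are hypothetical; the only
purpose is to show a population that averages 10% worse while no
individual is anywhere near 10% worse.

```python
from statistics import mean

def batters_near_average(effects_pct, tolerance_pct):
    """Return the individual effects within tolerance of the group mean."""
    average = mean(effects_pct)
    return [e for e in effects_pct if abs(e - average) <= tolerance_pct]

# Hypothetical two-humped population: half the hitters are hurt badly
# by the park, the other half are helped a little.
park_effect_pct = [-25, -24, -26, -23, -27, 5, 4, 6, 3, 7]

average_effect = mean(park_effect_pct)
near_average = batters_near_average(park_effect_pct, tolerance_pct=5)

print("average effect:", average_effect)
print("batters within 5 points of the average:", near_average)
```

In a population like this one, correcting any single hitter by the
average would be wrong for every hitter in it.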

>>You'll see that when a runner
>>is on base, it affects (among other things):
>>    1) the way the pitcher throws. Using the stretch instead of a full
>>       windup definitely hurts most pitchers' performances. Otherwise, 
>>       there would be no need for anyone to ever windup.
>>    2) the pitch selection;
>>    3) the defensive alignment.
>
>No doubt, but none of these things is done often enough to
>substantially affect a player's OB or SA.  Let's say, for example,
>that a player gets 500 AB's.  On a really lousy team, there's a runner
>on when he bats, say (these are only guesses; if you have the real
>numbers, go ahead and substitute, as I doubt that I am SO far off as
>to invalidate my argument) 25% of the time, while with a really good 
>team, it might be 50% of the time.  The lucky player gets an extra 125
>AB's with runners on.  Consider #3;  this lucky player, if he's a
>right-handed pull or straight-away hitter gets the secondbaseman in a
>position where the second baseman is less likely to make the play.
>Let's say he is a contact-hitter who NEVER strikes out, and he hits
>lots of groundballs, with few down the line.  Then he might hit
>a groundball toward the secondbaseman about 20% of the time, and the
>secondbaseman may now convert only two thirds of them, rather than 
>three quarters of them, into outs.  So we have 125*.2*(.75-.67), or an
>extra two or so singles over the course of the season.  If the batter
>in question strikes out some, or hits a lot of fly balls, then the
>difference is even less.  Of course, with 125 extra shots at an RBI,
>THAT total will rise substantially.

Why would he hit a groundball toward 2b 20% of the time? It may be much
higher. From personal observations, it seems that many players have
distinct tendencies to hit particular pitches to a certain part of the
field. This is why managers change the position of the outfielders and
infielders. Haven't you ever noticed that some players consistently hit
grounders up the middle for outs, while others can hit grounders with the
same force up the middle for hits? This is because, as you can see on the
replays, an infielder was shifted towards the middle for the first player,
anticipating that he would hit such a grounder. Notice that this is in
conjunction with the pitcher, who must pitch a certain way. Often I have
seen a batter hit a gap shot, say in left-center, and the announcer point
out that the defense was shifted for him to pull it to right, and the
pitcher threw the pitch to the wrong location (outside to a lefty, or
inside to a righty.) In sum, I question your 20%, and your 2/3 and your
3/4. It seems to me that the secondbaseman should convert nearly all
the grounders hit at his normal position when he is there, and will
convert very few if he is moving to cover second base on a steal.
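
The whole estimate is a single product, so it is easy to see how far it
swings with the guesses. A rough sketch follows. The first set of inputs
is the quoted one; the second set is just as made up, chosen only to
reflect the tendencies I describe above, and the function name is mine.

```python
def extra_singles(extra_ab, grounder_share, out_rate_normal, out_rate_shifted):
    """Extra singles from grounders that a repositioned fielder fails to convert.

    extra_ab:         additional at-bats with a runner on
    grounder_share:   fraction of at-bats ending in a grounder toward 2b
    out_rate_normal:  fraction converted to outs from the normal position
    out_rate_shifted: fraction converted when holding or covering the bag
    """
    return extra_ab * grounder_share * (out_rate_normal - out_rate_shifted)

# The quoted guesses: 20% grounders, 3/4 converted normally, 2/3 when shifted.
quoted = extra_singles(125, 0.20, 3 / 4, 2 / 3)

# Hypothetical alternative: more grounders that way, nearly all converted
# from the normal position, few converted when breaking for second.
alternative = extra_singles(125, 0.30, 0.95, 0.25)

print(f"quoted guesses:      {quoted:.1f} extra singles")
print(f"alternative guesses: {alternative:.1f} extra singles")
```

With the quoted guesses the product comes to roughly two singles; with
the alternative guesses it is more than ten times that. Neither answer
means anything until the inputs are measured rather than guessed.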

But again, even if you are completely right, such bogus statistical
arguments are worthless. Where are the real stats? They are the only ones
that you can use! The rest is hopeless guesswork, and no better than
unsubstantiated opinion.

>My point is not that these things are fiction, only that it is
>unlikely that they SUBSTANTIALLY affect a player's SA or OBA.  

Unlikely? This is pure opinion, not statistically grounded theory.
I feel that personal stats are highly dependent on the players who bat
around you. You want another example? Ok, let's look at Mike Scioscia
of LA. He is second in the NL in OBA. Are you seriously going to try
to tell me he is the second best in the league at getting on base this
year? Baloney. There are several reasons he has such a high OBA:

   1) He bats in a predominantly right-handed lineup. The Dodgers
      tend to face more right-handers as a result (I read this somewhere
      recently). As a lefty, he thus faces fewer lefties.

   2) He is rested against the toughest lefties. He faces some, but
      doesn't have to face as many as other lefthanded batters usually do.

   3) He is having a better than average year.

Notice that he bats 7th in the LA lineup. Is LaSorda making a big mistake,
or does he realize that Scioscia's OBA is misleading? Can you correct
his OBA to reflect the disparity in right-handed pitching he faces,
or the fact that he doesn't play quite full-time? If not, why should I
trust his OBA stat?
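
One crude way such a correction might look is to re-weight a hitter's
splits to a common share of left-handed pitching. This is only a sketch
under stated assumptions: the splits and shares below are invented (they
are not Scioscia's), the function name is hypothetical, and it ignores
the fact that a platooned hitter's numbers against lefties come from a
small sample and exclude the toughest ones.

```python
def reweighted_oba(oba_vs_rhp, oba_vs_lhp, lhp_share):
    """OBA a hitter would show if lhp_share of his PA came against lefties."""
    return (1 - lhp_share) * oba_vs_rhp + lhp_share * oba_vs_lhp

# Hypothetical left-handed hitter: .400 against righties, .300 against lefties.
observed = reweighted_oba(0.400, 0.300, lhp_share=0.15)  # sheltered from lefties
adjusted = reweighted_oba(0.400, 0.300, lhp_share=0.30)  # assumed typical share

print(f"observed OBA: {observed:.3f}")
print(f"adjusted OBA: {adjusted:.3f}")
```

In this made-up case the sheltered hitter gives back 15 points of OBA
once he is charged with an ordinary diet of left-handers.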

>I do NOT say that lineup doesn't affect a team's performance, only that
>it has precious little effect on an individual's performance.

But then you weaken your entire argument for OBA+SA, because then these 
personal stats may have precious little to do with a team's performance! 
Perhaps we should look for personal stats which DO have a lot to do with
a team's performance. This two-line admission of yours effectively
states that it isn't so much what an individual's stats are, but how the 
manager uses them to build a team that counts. This undercuts your
entire goal, which is to be able to evaluate players in terms of
invariant accomplishments. But if the team's accomplishments are not
strongly connected to the individual's, then mustn't we admit that
baseball is a team game, and that individual accomplishments are
meaningless?

No, I don't like that conclusion at all. I much prefer to think that there
ARE personal stats which can reflect an individual's contribution to
his team's accomplishments, and these stats can be found. OBA+SA doesn't
do the whole job.

How about lineup-dependent stats? I suspect that managers use these anyway.
For instance, OBA would be most important for leadoff hitters, but for
4-5-6 hitters, the percentage of runners that they deliver, when given
an opportunity, should be most important. This need not be the same as SA,
since SA counts 1 for a single and 4 for a HR, even if the single is with
the bases loaded, and the HR is solo. I know you (Rubin) will say, "Oh,
but those are lineup dependent." Exactly, and so is the outcome of
baseball games! Can you really deny that the timing of hits is crucial?
OBA+SA do not include anything about the timing of hits or walks. 
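
Here is a small sketch of the kind of stat I have in mind, set next to
SA for comparison. The record format, the function names, and both
hitters are hypothetical; "runners delivered" counts only baserunners
driven in, not the batter himself.

```python
def slugging_average(at_bats):
    """at_bats: list of (total_bases, runners_on, runners_scored) tuples."""
    return sum(tb for tb, _, _ in at_bats) / len(at_bats)

def runners_delivered_pct(at_bats):
    """Fraction of the baserunners a hitter found on base that he drove in."""
    runners_on = sum(on for _, on, _ in at_bats)
    runners_scored = sum(scored for _, _, scored in at_bats)
    return runners_scored / runners_on if runners_on else 0.0

# Hitter A: one solo HR, then three outs with two men on each time.
hitter_a = [(4, 0, 0), (0, 2, 0), (0, 2, 0), (0, 2, 0)]
# Hitter B: one bases-loaded single scoring two, then three outs with nobody on.
hitter_b = [(1, 3, 2), (0, 0, 0), (0, 0, 0), (0, 0, 0)]

for name, ab in (("A", hitter_a), ("B", hitter_b)):
    print(name, slugging_average(ab), round(runners_delivered_pct(ab), 3))
```

Hitter A has four times the slugging average and delivers none of the
six runners he is handed; hitter B delivers two of his three. SA alone
cannot tell these two apart in the way a manager filling the 4-5-6
spots would want.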

As an example of the weakness of OBA+SA for 4-5-6 hitters, consider Jason
Thompson of the Pirates. His OBA+SA is nearly .800, which is excellent.
Yet he is on the bench, and the team is looking for a replacement. Why?
He is slow, and his defense is less than great. Also, at the time of his
benching, he led the NL in walks. This helped his OBA+SA greatly, but
did little for the team. He wasn't driving in the runs when he had the
opportunity - too many two-out singles with the bases empty, too few
hits with runners in scoring position. The Pirates would love to end up 
with a player with lower SA and OBA, but who delivers the RBI when
presented with the opportunity.

>>    As ANY real baseball fan knows, managers carefully
>>    pick the order to help run production, e.g. alternating left-handed
>>    and right-handed batters, and putting speedsters in front of hitters
>>    who hit well with men in scoring position. WHY WOULD THEY BOTHER TO 
>>    DO THIS IF THERE WERE NO LINEUP INTERACTION??? Why not bat Mattingly
>>    leadoff, to get him more atbats? Maybe the fact that he would be
>>    batting behind a much weaker hitter just MIGHT have a teeny-weeny
>>    little bit to do with it?!
>
>Nyahh.  The reason that we don't bat Mattingly lead-off is not that we
>fear his production will drop, but because we fear his production will
>be wasted.  There is a difference.

I disagree completely. His production would also drop if he batted
eighth in an NL lineup (or in most AL lineups). Or don't you believe that
either?

>>    Thus, we see that some excellent managers, such as Whitey Herzog,
>>    deliberately put a player like Coleman, who has a lower OBA and
>>    slugging average than McGee, in the spot where he will get the most 
>>    at-bats, thus effectively reducing the overall OBA and slugging pct of 
>>    his team. Do you really think he is deliberately reducing the run-scoring
>>    ability of his team? Or do you just think that all these baseball
>>    professionals are sadly misguided?
>
>I think Herzog is making a mistake.  Not a big one, but probably one
>that will cost him a few runs over the course of the season.  Herzog
>is not sadly misguided, just slightly in error.  Herzog makes
>mistakes, Benjamin makes mistakes, even Rubin makes mistakes!  That we
>HOPE that Herzog makes them less frequently is no guarantee of his
>infallibility.  I vaguely recall Herzog being fired from a couple of
>jobs.  Perhaps he did make mistakes...or do you believe that the
>professionals running the Rangers and the Royals did??  Some of these
>professionals must have erred if a firing was necessary.....Of course,
>you will argue that Herzog knows so much, I cannot question him.  Thus
>I ask you: if there are thirty professional managers who, in a given
>situation, would do ten different things, does that make most of them "sadly
>misguided"?  Of course not; men of good faith can disagree without
>calling one another idiots.  

Of course we all make mistakes. You like statistics. OK, let's talk 
statistics. The odds of you or me making an error about baseball are 
much higher than those of someone like Herzog or LaSorda.

As for being fired, everyone knows that managers are often made scapegoats.
Herzog was fired from KC because he had the gall to take a team to first-place
finishes a couple of times, then finish second! He basically was fired
because someone else was successful. As far as baseball expertise is
concerned, he undoubtedly has more than everyone on this net combined.

As far as your understanding of expert knowledge, it is completely
wrong. Of course experts disagree. That does not make them all wrong,
or any less expert. They are reaching their conclusions based on
knowledge that we don't even have (they have played and managed
professionally; we haven't.) Of course, we have knowledge they don't
(about statistics.) Where I disagree with you is that you feel that
your knowledge of stats can substitute for their direct knowledge of
the game. They can disagree, and it is an expert disagreement, but if
we disagree with them, then we are presumptuous, for we would be
saying that their knowledge of the game is not necessary, i.e., let's
fire Herzog and put a math professor in his place!

So yes, if you find yourself disagreeing with people like Herzog, then
I say the wise course is to reconsider, rather than assume that they
are making a mistake.

There is an old truism: It's better to be wrong for the right reasons
than right for the wrong reasons. The latter represents dumb luck.
The former is totally understandable - nobody's perfect.

>Certainly, it is NOT true that maximizing a team's OBA
>and/or SA is the SAME as maximizing the team's run production, and I
>have never said that it was.  I have suggested it's pretty darn close,
>though.  The relationship between team OBA, SA, and run production is
>close, but not exact.  It would cost the Dodgers some runs to bat
>Guerrero lead-off, but not because Guerrero wouldn't be a good lead off
>man.  You've merely shown that OBA, SA, and runs are not identical: 
>another straw man bites the dust!

And I have suggested that it may not even be close. But then again, you are
the one who loves the statistical approach, so you have to show that it
is close.

Guerrero would make an awful leadoff man. I would love to have you write
to LaSorda and tell him Guerrero would be a good leadoff man, and
see what he says.

>>The lineup can even affect the selection of relief pitchers. And haven't
>>you ever heard a manager say that what he really needs is a left-handed
>>power-hitter (or more speed in the lineup, etc.)? Why are these things
>>important to managers if the players in lineups don't interact?
>
>Again, you misunderstand what I am saying.  The new left-handed power
>hitter may see big changes in his RBI totals, and his new team may see
>a surge in runs scored, but the new player is unlikely to see any
>substantial change in his OBA and SA, once those two are properly
>adjusted.

But you must agree that right-handed batters do better against lefties,
and vice versa! Don't you? This is a well-known statistical fact. But
then, if we have, say, Carter batting between Hernandez and Strawberry,
then a manager will more likely bring up a lefty reliever to face the
heart of the Met order, since two of the three really dangerous hitters
on the club are lefty. I have seen this happen several times in just
the last couple of weeks. Few teams have so many good relievers that
they will bring in a lefty to face Hernandez, then a righty to face
Carter, then another lefty to face Straw. So then, Carter gains by
hitting between two lefties! His OBA and SA can go up. We don't know
the extent of this for him. But how about Jason Thompson? He bats
so much better against righties (when he hit 31 HR a few years ago,
30 were against righties.) Batting between two good righties should
help him a lot more than it would Strawberry, who hits both equally
well (so far this year, he has something like 17 HRs versus righties,
and 11 versus lefties, about right for the percentage of lefties in
the league.)

So personal stats, OBA+SA included, can vary with lineup. And the amount
of variation varies with the individual, so that it is definitely
possible to find players for whom there is no variation, or players
for whom there is a real variation.

>>> I suppose Paul believes Carter has a special dispensation: in
>>> moving from the Expos to the Mets, he gains by being surrounded by
>>> Keith, Darryl, and George, while those three do NOT gain from Gary's
>>> presence.  The fact is, the production of all four has remained about
>>> the same over the past two years, an argument AGAINST lineup effects.
>>Or an argument that Carter is about as productive as Hubie Brooks is.
>Correct.  It says a lot about lineup effects if they indicate that
>Carter is about the same hitter as Brooks is.  It says just how off
>the wall they are...

But their production ISN'T the same as last year. So your comment is
totally unjustified. 

>Of course, I should have expected this.  Brooks is about as productive
>a player as Pena, and so Paul must assert that Brooks is about on par
>with Carter.  That is, of course, why the Mets were obliged to throw
>in Youmans, Fitzgerald, and Winningham into a deal involving players
>of equal value.  Well, Paul, if you're right, the Mets and Expos
>managements must be mistaken about Carter's value vis a vis Brooks.
>So you, too, find yourself in contradiction with baseball "authority".
>Let us all savor this moment: it is as if the Pope were found guilty of
>heresy!

What absurdity! Do you ever read what I write? Of course Carter is 
more valuable than Brooks. The economics of the availability of people
at different positions enters here. After all, Carter is a very good defensive
catcher (the second-best in the league) and Brooks is a good thirdbaseman
(decent shortstop). I never said Brooks is of equal value as a player
to Carter. I said that they contributed about the same to the Met offensive
productivity. If you examine the stats (ahem!) you see that, even though
the Expos score 10% fewer runs than the Mets, Brooks had (at the end
of August) about the same R and RBI as Carter. Now, you can try to
ignore R and RBI since they are lineup dependent, but when Carter's
team scores clearly more than Brooks', and they both bat cleanup,
why does Brooks produce as many R and RBI as Carter (actually more RBI
and fewer R) even with fewer HR (a HR is the only way to add to both R 
and RBI simultaneously, and thus should help the HR hitter)? 

To emphasize that good catching ability is hard to find, and requires
a high price on the market, be aware that when the Dodgers were trying 
to get Pena, the Pirates demanded Marshall, A.Pena, and (I think)
Reynolds. The Dodgers didn't want to trade Marshall for anyone, so the
deal wasn't made. 

Thus, your statements about my contradicting baseball authority and
being found guilty of heresy are totally uncalled for.

>Read my lips:
>
>I HAVE NEVER NEVER NEVER NEVER DENIED THAT LINEUPS AFFECT RBI'S!!!!!
>
>To show an increase in RBI's shows the TEAM has had a better (or
>worse) year, not that the player has had a better or worse year.  

Read my lips: 

YOU ARE WRONG!!!

It is painfully obvious that a change in RBI can also be attributable
to a player's having a better or worse year. To state that it can show
only that the team had a better or worse year is extraordinarily
narrowminded. The team's performance must also be considered. Again,
you can easily dismiss Brooks having more RBI (as of the end of August)
than Carter IF MONTREAL IS OUTSCORING THE METS, BUT THEY ARE NOT!
They are 10% behind. So, this difference in RBI cannot possibly show
that the team Brooks is on is having a better year than the team Carter
is on.

>Fact is, these are the kind of outputs we would have expected from all
>four had Carter remained in Montreal.......

Carter has NEVER hit this many HRs before. Is this what you mean by
"expected"?

>P.P.P.S.  There is something called Linear Weights that does even
>better with run production than OBA and SA; it includes things you
>object to having left out, such as SB's.  It is a SLIGHT improvement,
>while being a GREAT increase in complexity.  The increased complexity,
>in my view, is too great to be justified by this slight improvement.
>You may well think otherwise.  

I do. If complexity works better, then it's necessary. Too bad for
simple minds.