david@fisher.UUCP (David Rubin) (06/01/85)
[This is very long...]

Some time ago I promised a breakdown of offensive production by position in the National League, as measured by runs created. Well, I finally got around to doing it, and the results are presented below. But before you peruse them, I thought I'd first explain briefly the idea behind a "run created".

Standard baseball statistics don't measure the quantities which we assume they do. For example, batting average is used by most fans to describe how often a hitter safely reaches base, when that is not what it measures at all. Al Oliver, for example, hits .300 and reaches base not much more than 30% of the time, while Gary Matthews could hit .250 and reach base 35% of the time (in fact, last year Matthews hit just under .300 and reached base 41% of the time). That is why Wally Backman led off for the Mets last season, even though his average was virtually the same as Mookie Wilson's (.280 vs. .276) and Wilson is somewhat faster: Backman reached base 36% of the time while Wilson reached base 31% of the time.

Also, standard baseball statistics are too heavily influenced by one's TEAMMATES' performances. Scoring a lot of runs doesn't necessarily mean you're the best at scoring, just that you were fortunate enough to have productive teammates behind you. Similarly, nearly ANYONE batting in the middle of a productive order will have a greater number of rbi's than Superman batting eighth for the Pittsburgh Pirates.

What is required is some measure of what a player's INDIVIDUAL contribution (1b's, 2b's, 3b's, hr's, sb's, sh's, sf's, cs's, etc.) would produce independently of any other player. Consider this a thought experiment, in which we measure how good Dale Murphy is as an offensive player by creating nine copies of him, fielding a Dale Murphy team, and watching how many runs they score per game during the season (they wouldn't stand a chance, though, of beating nine Rick Rhodens in a pennant race).
This is what Bill James's "Runs Created/27 Outs Made" attempts, though here we are concerned only with Runs Created over the course of a season, rather than with the rate at which they are created. Other "new" baseball statistics attempt the same thing, though I've done Runs Created because it is explained in Bill James's Abstracts, and is therefore widely available to other net readers.

Without explanation, Runs Created == A*B/C, where

    A = the number of runners produced by the batter
        (hits + walks + HBP - CS - GIDP)
    B = a measure of how well the batter advances base runners
        (total bases + .52*(steals + sacrifices)
                     + .26*(unintentional walks + HBP))
    C = the number of plate appearances
        (AB + BB + SH + SF + HBP)

For each player, stats were divided proportionately by the number of appearances at each position (excluding those positions at which he had no fielding chances, and treating the outfield as one position, a la official defensive statistics), added up for each team, and put through James's formula. The results were adjusted for home park and home pitching staff (a la Pete Palmer's method; James's is much more complicated and the numbers necessary for it weren't available to me), and the results are what is beneath this unbrief introduction:

    TEAM\POS    p    c   1b   2b   3b   ss   of  (of/3)
    Chi        13   65   97  116   81   35  270   (90)
    NY         13   48  124   84   79   63  269   (90)
    StL        18   68   72   75   84   68  275   (92)
    Phi         9   75  109   97  116   50  285   (95)
    Mon        13  103  104   55   78   34  325  (108)
    Pit        21   82   93  103   73   49  233   (78)
    SD         24   63   76   91   78   61  287   (96)
    Atl        14   46   76   69   61   52  240   (80)
    Hou        12   70   90   94   80   70  321  (107)
    LA         13   62   62   55   76   56  217   (72)
    Cin        13   51  101   62   75   66  285   (95)
    SF         11   96   88   65   73   49  370  (123)

    Medians    13   66   92   80   78   54  280   (93)

    Overall Median:               78*
    Without Pitchers:             80*
    Without Shortstops, too:      90*
    Dropping C's, SS's AND P's:   90*

    (* of/3 weighted triply)

Comment #1: A typical pitcher is typically bad, indeed, and hits only 1/7 as well as a typical firstbaseman-outfielder.
Comment #2: Secondbasemen and thirdbasemen pull their weight. This was a mild surprise for me, as I expected thirdbase to be one of the top producing positions, along with of and 1b.

Comment #3: The typical shortstop is a pretty awful hitter, producing barely more than half of what the typical 1b-of produces with the bat.

Comment #4: The typical catcher is weak, somewhat more than 2/3 as productive as those batsmen who play 1b-of.

Comment #5: The typical hitter hits like a secondbaseman. If pitchers were excluded, the typical hitter would hit like a thirdbaseman: little gain. If shortstops, too, were excluded, the typical remaining hitter would hit like a firstbaseman: bingo!

Comment #6: How this all pertains to the DH: If one were concerned with MEAN performance, there would be some justification (but still no individual justice) in excluding pitchers and no one else. However, it is my contention that the fan bases his expectations on MEDIAN, not MEAN, performance. Informal proof: consider a hypothetical league of six teams whose catchers had seasons like Davis (65), Porter/Nieto (68), Kennedy (63), Scioscia/Yeager (62), Virgil (75), and Carter (103). If you judged "typical" performance by medians, you would say that Porter/Nieto had typical offensive abilities for a catcher. If you judged "typical" performance by means, you would consider Virgil's performance most "typical". Since most fans, presented with a league such as this, would consider Davis, Porter, Kennedy, and Scioscia pretty "typical", with Virgil somewhat better and Carter far better, I submit that where there is a discrepancy between means and medians (i.e. skewness has set in), fans judge by medians. And if we are judging by medians, the NL would not have the ability of its "typical" hitters rise substantially unless pitchers AND shortstops were eliminated from the lineup.
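For readers who want to check the arithmetic, here is a minimal Python sketch of two calculations described above: the Runs Created formula (A*B/C) and the mean-versus-median comparison of Comment #6. It is an illustration only, not the program used to produce the table. The function and argument names are hypothetical, the sample stat line is made up purely to exercise the formula, and "sacrifices" in B is assumed to mean SH+SF, which the text above does not spell out. The park and pitching-staff adjustments are not included.

```python
from statistics import mean, median


def runs_created(hits, walks, hbp, cs, gidp, total_bases, steals,
                 sh, sf, intentional_walks, at_bats):
    """Unadjusted Runs Created = A*B/C, as quoted in the post."""
    # A: runners produced by the batter
    a = hits + walks + hbp - cs - gidp
    # B: advancement of base runners.
    # ASSUMPTION: "sacrifices" is taken to mean SH + SF.
    b = (total_bases
         + 0.52 * (steals + sh + sf)
         + 0.26 * (walks - intentional_walks + hbp))
    # C: plate appearances
    c = at_bats + walks + sh + sf + hbp
    return a * b / c


# Made-up stat line (not a real player), used only to run the formula.
example_rc = runs_created(hits=150, walks=60, hbp=2, cs=5, gidp=10,
                          total_bases=250, steals=20, sh=3, sf=5,
                          intentional_walks=6, at_bats=550)

# The six catchers from Comment #6, with their runs created.
catchers = {
    "Davis": 65,
    "Porter/Nieto": 68,
    "Kennedy": 63,
    "Scioscia/Yeager": 62,
    "Virgil": 75,
    "Carter": 103,
}

cat_median = median(catchers.values())
cat_mean = mean(catchers.values())

# Which catcher sits closest to the mean?
nearest_to_mean = min(catchers, key=lambda name: abs(catchers[name] - cat_mean))

if __name__ == "__main__":
    print(f"example runs created: {example_rc:.1f}")
    print(f"catcher median: {cat_median}")
    print(f"catcher mean:   {cat_mean:.1f}")
    print(f"closest to mean: {nearest_to_mean}")
```

For the six catchers the median comes out to 66.5, midway between Davis (65) and Porter/Nieto (68), while the mean is about 72.7, closest to Virgil (75). Carter's 103 pulls the mean up by roughly six runs but leaves the median where the bulk of the group sits, which is the skewness the comment describes.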
My apologies for bringing up a topic which had blissfully expired, but I felt obligated by my earlier promise to produce the numbers, and once the numbers were produced, there was no point in posting them without explanation and comment.

					David Rubin
			{allegra|astrovax|princeton}!fisher!david