[This article was first published on The Prince of Slides, and kindly contributed to R-bloggers.]
It’s been a while since I have posted here. I have been swamped with some papers I am trying to get out, finishing up the dissertation, and interviews (faculty ones in addition to others). I should have some big news in the next couple of weeks regarding this last activity. But something spurred me to take a break from this process and post today.
The Book Blog recently provided a link to Baseball Analytics claiming a huge increase in strike calling by umpires over the past 4 years. However, this is a somewhat questionable finding. Before reading on, go there and check out the discussion already taking place. Then come back.
I also ran into this “huge discovery” earlier last year when trying to come up with a topic for my Baseball ProGUESTus post. I spoke about the finding with Mike Fast, and we quickly realized (more Mike than I) that, rather than the umpires becoming more accurate within the strike zone, it was the stringers creating the “sz_top” and “sz_bot” measurements who were actually getting better over time. I talked briefly about this problem HERE, and temporarily took down my original umpire strike zone calculations because of that. As an example of the bias, below I have the strike rate of pitches within the rulebook zone using a fixed zone and the stringer-provided top and bottom of the zone numbers:
Fixed Zone:
2008: 86.08%
2009: 85.90%
2010: 86.78%
Stringer-Provided Top & Bottom:
2008: 79.99%
2009: 82.20%
2010: 85.55%
From the above fixed zone, we’d expect about a 0.034 to 0.060 increase in run scoring per game (or 0.017 to 0.030 per team per game) when accounting for the number of pitches that we expect to change from a ball to a strike here (between 550 and 970 per year total). Keep in mind this is a very rough estimate, and does not account for changing behavior of pitchers or batters. But that’s about a factor of 10 smaller than the estimate found with the stringer data (and the one that Baseball Analytics reports). So we should be wary about these numbers.
So, we can see here that one really needs to use a consistent top and bottom of the strike zone to ensure we don’t see weird changes like this due to something other than changes in umpire behavior. That’s not to say that there hasn’t been any change (and I suspect that younger umpires are better than their older counterparts partly due to the extensive technological monitoring and performance training they must go through) but the 5 percentage point increase in strikes within the zone is well above what that change really is.
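To make the fixed-versus-stringer comparison concrete, here is a minimal Python sketch of how an in-zone strike rate could be computed both ways. It is not the code behind the numbers above. The `sz_top` and `sz_bot` fields are the stringer measurements discussed earlier, while the `px` and `pz` field names, the fixed limits of 1.5 and 3.5 feet, and the choice to ignore the ball's radius are all assumptions made purely for illustration.

```python
from typing import NamedTuple

# Assumed fixed vertical limits in feet (illustrative values, not the author's).
FIXED_BOTTOM_FT = 1.5
FIXED_TOP_FT = 3.5
# Half of the 17-inch plate width, converted to feet.
HALF_PLATE_FT = 17 / 2 / 12


class Pitch(NamedTuple):
    px: float            # assumed name: horizontal location, feet from plate center
    pz: float            # assumed name: height at the plate, feet
    sz_top: float        # stringer-provided top of the zone, feet
    sz_bot: float        # stringer-provided bottom of the zone, feet
    called_strike: bool  # True if the umpire called the pitch a strike


def in_zone(pitch, fixed=True):
    """Return True if the pitch center lies inside the chosen zone."""
    if fixed:
        bottom, top = FIXED_BOTTOM_FT, FIXED_TOP_FT
    else:
        bottom, top = pitch.sz_bot, pitch.sz_top
    return abs(pitch.px) <= HALF_PLATE_FT and bottom <= pitch.pz <= top


def in_zone_strike_rate(pitches, fixed=True):
    """Share of called pitches inside the zone that were called strikes."""
    inside = [p for p in pitches if in_zone(p, fixed)]
    if not inside:
        return float("nan")
    return sum(p.called_strike for p in inside) / len(inside)
```

Running `in_zone_strike_rate` twice over the same called pitches, once with `fixed=True` and once with `fixed=False`, shows how much of a year-to-year trend comes from the zone definition rather than from the umpires.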
Another criticism is that this only calculates changes in strike calling on pitches within the zone. If umpires also decrease the number of strikes called on pitches outside the zone, then the net change in run scoring could be zero. Rather than calling more strikes, umpires could simply be getting better at their jobs. If anything, the latter is the more likely explanation, given the data I have. So let’s take a look at the fixed zone strike percentage on pitches that are outside of the rulebook zone:
Fixed Zone:
2008: 11.84%
2009: 11.51%
2010: 11.39%
Aha! So any increase we see on in-zone strikes tends to be cancelled out by a decrease in out-of-zone strikes. This change amounts to about 0.076 runs per game total (or about 0.038 per team per game), since more than twice as many of the pitches called by the umpire are outside the zone as within it.
That’s interesting. But this doesn’t explain everything. There could be a decrease in the quality of pitches outside the zone over time. And if pitchers know that they’ll get more strikes well outside the zone, they may try to nibble way out there more often. We’ll see that this might be going on a bit later. But note that using a discrete measurement also may be an issue. Not every pitch within the rulebook strike zone is created equal, nor is every pitch outside of it. There are varying degrees of strike likelihood depending on how close the pitch is to the edge of the zone.
To be clear, I think there is still something going on. I don’t believe it is anywhere near the effect size reported at Baseball Analytics. But there is certainly plenty of good reason to think monitoring and training using these advanced technologies is improving umpire performance. In fact, I believe we see this training taking hold for those younger umpires coming up through the ranks. This is, of course, a VERY interesting effect. But it almost certainly does not account for any huge change in run scoring.
So I figured I would go a little deeper. Below I have mapped out and measured the size of the strike zone across 2008, 2009 and 2010. These are maps I have used before, employing cross-validated smoothing parameters in order not to overfit. The contours tell you the boundary at which–within that contour–pitches are called strikes at least 50% of the time. In the tables, I map out the area of each contour, as well as the area within each contour. This will provide some evidence as to WHERE the umpires are changing their strike calling behavior, if they are at all.
As you can see, there are some slight changes. Right handed batters do seem to be getting more low strikes called against them. However, the net change is nearly zero once we account for the fewer strikes being called on the outside of the plate. We see a smaller change for lefties low and outside; however, it seems that the inside strike is being called less often against lefties.
But above I only plot the 50% contour, and the size is rather ambiguous. Luckily, we can actually measure the size of this zone in R (HOORAY R!). You have seen Josh Weinstock do this in the past, so the measurement is nothing new. We can do this at all the different contours to see if the umpire is changing on pitches down the middle, just those on the edges, or perhaps those on the well-outside edge of the previous zone. Below I have the 10% through 90% contour zone sizes from 2008 to 2010, and the change in each from year to year.
RHB |
Zone Size (sq. in.) | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% |
2008 | 672.77 | 586.56 | 530.90 | 485.95 | 445.19 | 404.87 | 361.54 | 309.86 | 235.65 |
2009 | 670.73 | 582.44 | 525.40 | 479.47 | 438.05 | 397.45 | 354.37 | 303.80 | 232.76 |
2010 | 681.09 | 594.13 | 535.39 | 487.17 | 443.46 | 400.94 | 356.69 | 306.37 | 238.80 |
Changes | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% |
2008 to 2009 | -2.04 | -4.12 | -5.50 | -6.48 | -7.14 | -7.42 | -7.17 | -6.06 | -2.89 |
2009 to 2010 | 10.36 | 11.69 | 9.99 | 7.70 | 5.41 | 3.49 | 2.32 | 2.57 | 6.04 |
2008 to 2010 | 8.32 | 7.57 | 4.49 | 1.22 | -1.73 | -3.93 | -4.85 | -3.49 | 3.15 |
Percent Changes | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% |
2008 to 2009 | -0.30% | -0.70% | -1.04% | -1.33% | -1.60% | -1.83% | -1.98% | -1.96% | -1.23% |
2009 to 2010 | 1.54% | 2.01% | 1.90% | 1.61% | 1.24% | 0.88% | 0.65% | 0.85% | 2.59% |
2008 to 2010 | 1.24% | 1.29% | 0.85% | 0.25% | -0.39% | -0.97% | -1.34% | -1.13% | 1.34% |
LHB |
Zone Size (sq. in.) | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% |
2008 | 658.93 | 575.93 | 521.70 | 477.83 | 437.93 | 398.50 | 356.20 | 306.04 | 235.00 |
2009 | 657.55 | 575.06 | 521.14 | 477.31 | 437.35 | 397.69 | 355.00 | 304.15 | 232.16 |
2010 | 675.05 | 587.10 | 527.48 | 478.58 | 434.54 | 392.08 | 348.32 | 299.01 | 233.88 |
Changes | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% |
2008 to 2009 | -1.38 | -0.87 | -0.56 | -0.52 | -0.58 | -0.81 | -1.20 | -1.89 | -2.84 |
2009 to 2010 | 17.50 | 12.04 | 6.34 | 1.27 | -2.81 | -5.61 | -6.68 | -5.14 | 1.72 |
2008 to 2010 | 16.12 | 11.17 | 5.78 | 0.75 | -3.39 | -6.42 | -7.88 | -7.03 | -1.12 |
Percent Changes | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% |
2008 to 2009 | -0.21% | -0.15% | -0.11% | -0.11% | -0.13% | -0.20% | -0.34% | -0.62% | -1.21% |
2009 to 2010 | 2.66% | 2.09% | 1.22% | 0.27% | -0.64% | -1.41% | -1.88% | -1.69% | 0.74% |
2008 to 2010 | 2.45% | 1.94% | 1.11% | 0.16% | -0.77% | -1.61% | -2.21% | -2.30% | -0.48% |
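For readers who want to see how numbers like those in the tables above can be produced, here is a small Python sketch (an illustration only, not the R code used for this post). It assumes a contour is available as an ordered list of `(x, y)` vertices in inches, which is a hypothetical input format, and uses the shoelace formula for the enclosed area. The `percent_change` helper matches the convention of the Percent Changes rows: later minus earlier, divided by earlier.

```python
def polygon_area(vertices):
    """Area enclosed by a closed contour, via the shoelace formula.

    `vertices` is a list of (x, y) pairs listed in order around the contour;
    the last point is joined back to the first automatically.
    """
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def percent_change(earlier, later):
    """Percent change from the earlier value to the later value."""
    return 100.0 * (later - earlier) / earlier


# A 20-by-24 inch rectangle stands in for a real contour here.
rectangle = [(0, 0), (20, 0), (20, 24), (0, 24)]
print(polygon_area(rectangle))

# RHB 10% contour, 2008 versus 2009, using the areas from the table.
print(round(percent_change(672.77, 670.73), 2))
```

A real smoothed contour would have many more vertices than the rectangle, but the same function applies as long as the points are listed in order around the boundary.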
You can see that, overall, the zone size isn’t changing at the huge rate suggested at Baseball Analytics. In fact, the inner contours (roughly 50% through 80%) shrank slightly from 2008 to 2010, so the strike rate within the core of the zone is, if anything, somewhat decreasing. What is very interesting, though, is that umpires DO tend to be calling more strikes on pitches well outside the strike zone, with an increase of about 1% to 2.5% in the size of the outermost (10% and 20%) contours. This is not a trivial change, and the majority of it DOES seem to be coming from the low strike, as you can see below. The strange thing is that this seems to contradict the in-and-out-of-zone numbers cited early on in this post. That likely means there’s something going on in between these contours.
One thing to remember is that these contours do have confidence intervals. And it could be that the CIs get larger as we get further out toward the 50%, 40%, 30%, 20% and 10% contours, since this is where the most variation comes in across umpires. Therefore, the differences we see should be attenuated somewhat to account for this uncertainty. I haven’t plotted the CIs here because they’ll just make things confusing to look at. But you must remember that these contour lines are not the end-all of the conversation on how the zone is being called. I think my method gives the best estimate that we can really get from this type of data, but that doesn’t mean it is anywhere near perfect.
Also, keep in mind I have not done this by count or pitch type, so everything is pooled together. So, if more of a certain type of pitch is being thrown, this could affect our results (and ultimately mean that umpires aren’t changing their behavior, it is the pitchers who are throwing more of certain types of pitches that umpires are more likely to call strikes outside the zone). Or, if umpires are changing behavior differently in different counts, we won’t pick this up and it means they are changing their behavior in some way that creates only a very small net effect when averaging across all of them. Certainly that would be the next step in the analysis, but I just don’t have the time right now. And once we start cutting up the data into smaller sample sizes, there are issues in the reliability and comparability of zone measurements across these very different sample sizes.
But we can see if pitchers tend to throw more to this area by taking the percentage of pitches thrown between each contour (i.e. between the 10% and 20% contour, between the 20% and 30% contour, etc.). Doing this, we’ll want to use the 2008 strike zone to predict the strike probability of our 2009 and 2010 pitches. That way, the contours are comparable. Additionally, we want to apply this model to ALL pitches thrown, not just those called by the umpire. This is because there could be changes in contact and swing rates, and we’re just interested in the pitchers’ behavior.
If pitchers in 2010 are throwing more to the 2008 10% and 20% contours, then we may be able to say that pitchers are trending toward throwing to these areas where the umpire seems to be expanding the zone a bit. Below, I have the changes across time for the areas that pitchers are throwing to:
RHB |
Pitches Seen in Area | 10-20% | 20-30% | 30-40% | 40-50% | 50-60% | 60-70% | 70-80% | 80-90% | > 90% |
2008 | 18,674 | 13,458 | 11,880 | 11,408 | 12,179 | 13,972 | 18,092 | 28,819 | 115,688 |
2009 | 18,673 | 13,379 | 11,746 | 11,482 | 11,745 | 13,728 | 17,926 | 28,864 | 116,928 |
2010 | 19,473 | 13,800 | 12,139 | 11,841 | 12,677 | 14,326 | 18,689 | 30,137 | 120,643 |
Rate Thrown To | 10-20% | 20-30% | 30-40% | 40-50% | 50-60% | 60-70% | 70-80% | 80-90% | > 90% |
2008 | 7.65% | 5.51% | 4.87% | 4.67% | 4.99% | 5.72% | 7.41% | 11.80% | 47.38% |
2009 | 7.64% | 5.47% | 4.80% | 4.70% | 4.80% | 5.62% | 7.33% | 11.81% | 47.83% |
2010 | 7.67% | 5.44% | 4.78% | 4.67% | 5.00% | 5.65% | 7.37% | 11.88% | 47.55% |
Change | 10-20% | 20-30% | 30-40% | 40-50% | 50-60% | 60-70% | 70-80% | 80-90% | > 90% |
2008 to 2009 | -0.0001 | -0.0004 | -0.0006 | 0.0002 | -0.0018 | -0.0011 | -0.0008 | 0.0000 | 0.0045 |
2009 to 2010 | 0.0004 | -0.0003 | -0.0002 | -0.0003 | 0.0019 | 0.0003 | 0.0003 | 0.0007 | -0.0028 |
2008 to 2010 | 0.0003 | -0.0007 | -0.0008 | -0.0001 | 0.0001 | -0.0008 | -0.0004 | 0.0007 | 0.0017 |
Percent Change | 10-20% | 20-30% | 30-40% | 40-50% | 50-60% | 60-70% | 70-80% | 80-90% | > 90% |
2008 to 2009 | -0.13% | -0.71% | -1.25% | 0.52% | -3.68% | -1.87% | -1.04% | 0.03% | 0.95% |
2009 to 2010 | 0.48% | -0.62% | -0.42% | -0.63% | 4.00% | 0.55% | 0.45% | 0.60% | -0.59% |
2008 to 2010 | 0.35% | -1.32% | -1.67% | -0.11% | 0.17% | -1.33% | -0.59% | 0.64% | 0.36% |
LHB |
Pitches Seen in Area | 10-20% | 20-30% | 30-40% | 40-50% | 50-60% | 60-70% | 70-80% | 80-90% | > 90% |
2008 | 14,073 | 10,126 | 8,883 | 8,645 | 9,254 | 10,250 | 13,191 | 20,763 | 86,135 |
2009 | 14,909 | 10,700 | 9,459 | 9,348 | 9,645 | 11,092 | 14,090 | 21,974 | 91,871 |
2010 | 14,291 | 10,218 | 8,970 | 8,612 | 9,174 | 10,486 | 13,464 | 20,813 | 86,686 |
Rate Thrown To | 10-20% | 20-30% | 30-40% | 40-50% | 50-60% | 60-70% | 70-80% | 80-90% | > 90% |
2008 | 7.76% | 5.58% | 4.90% | 4.77% | 5.10% | 5.65% | 7.27% | 11.45% | 47.50% |
2009 | 7.72% | 5.54% | 4.90% | 4.84% | 5.00% | 5.74% | 7.30% | 11.38% | 47.58% |
2010 | 7.82% | 5.59% | 4.91% | 4.71% | 5.02% | 5.74% | 7.37% | 11.39% | 47.44% |
Change | 10-20% | 20-30% | 30-40% | 40-50% | 50-60% | 60-70% | 70-80% | 80-90% | > 90% |
2008 to 2009 | -0.0004 | -0.0004 | 0.0000 | 0.0007 | -0.0011 | 0.0009 | 0.0002 | -0.0007 | 0.0008 |
2009 to 2010 | 0.0010 | 0.0005 | 0.0001 | -0.0013 | 0.0003 | -0.0001 | 0.0007 | 0.0001 | -0.0014 |
2008 to 2010 | 0.0006 | 0.0001 | 0.0001 | -0.0005 | -0.0008 | 0.0009 | 0.0009 | -0.0006 | -0.0006 |
Percent Change | 10-20% | 20-30% | 30-40% | 40-50% | 50-60% | 60-70% | 70-80% | 80-90% | > 90% |
2008 to 2009 | -0.52% | -0.77% | -0.01% | 1.54% | -2.13% | 1.62% | 0.31% | -0.62% | 0.16% |
2009 to 2010 | 1.30% | 0.92% | 0.21% | -2.64% | 0.52% | -0.10% | 0.98% | 0.09% | -0.29% |
2008 to 2010 | 0.77% | 0.14% | 0.21% | -1.14% | -1.62% | 1.52% | 1.29% | -0.52% | -0.13% |
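The banding behind the tables above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a hypothetical list of strike probabilities, one per pitch, already predicted by a single reference model (the 2008 zone in this post), and the sketch keeps a separate band for pitches below the 10% contour.

```python
from bisect import bisect_right
from collections import Counter

# Contour levels that separate the bands: 10%, 20%, ..., 90%.
EDGES = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
BAND_LABELS = ["<10%", "10-20%", "20-30%", "30-40%", "40-50%",
               "50-60%", "60-70%", "70-80%", "80-90%", ">90%"]


def band_index(strike_prob):
    """Index of the band a predicted strike probability falls into."""
    return bisect_right(EDGES, strike_prob)


def band_shares(strike_probs):
    """Share of all pitches that fall into each probability band."""
    if not strike_probs:
        raise ValueError("need at least one pitch")
    counts = Counter(band_index(p) for p in strike_probs)
    total = len(strike_probs)
    return [counts.get(i, 0) / total for i in range(len(BAND_LABELS))]


# Hypothetical predicted probabilities for four pitches.
for label, share in zip(BAND_LABELS, band_shares([0.05, 0.15, 0.15, 0.95])):
    print(label, share)
```

Because every season is scored with the same reference model, a shift in these shares reflects where pitches are being thrown rather than how they are being called.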
All in all, the tables above seem rather ambiguous, but I’d be interested in hearing any patterns that others see here. Going by the Change rows from 2008 to 2010, RHB have been seeing slightly fewer pitches between the 20% and 40% contours, while LHB have been seeing slightly fewer between the 40% and 60% contours; each of those shifts is under a tenth of a percentage point. Of course, we would also have to understand whether batters are changing their behavior in a way that affects run scoring more than we would expect from the lower strikes outside the zone or any of the changes in the tables above. But given that we’re talking about so few pitches in terms of overall ball-to-strike changes (or vice versa), I am going to be cautious about making any large statements about the effects of this on the run-scoring environment.
Lastly, what I think could also be going on is these younger umpires coming into the league. Umpires are being trained more and more using PITCHf/x-type technology, and they use it for learning about missed calls, problems in their own strike zone, and so on. They get reports for their games from the umpire’s association as they come up through the minor leagues. I think this should have a real effect on strike calling behavior, likely meaning that younger umps call strikes closer to the rulebook zone. As a rough comparison, below I have Mike Estabrook (a younger umpire) and his zone compared to, say, Jerry Crawford. There could be something to the idea that a younger umpire like Estabrook is willing to form his zone to the rulebook, as opposed to Crawford, who essentially has tenure as an MLB umpire. Keep in mind that this is only two umpires and a small sample, and we’ll probably need a few more years of data to detect any changes for younger umpires vs. older ones.
So that’s all I have for today. I have spent way too much time on this, but please provide comments if you have thoughts, criticisms, or flat out think I’m an idiot. There is certainly more to say than what I have here.
NOTE: At the suggestion of Tango, I provided some additional information below regarding the size of the strike zones, which I have used in a recent academic paper I have under review. In this case, rather than use square inches, I report the approximate number of baseballs that could fit side-by-side through the given area if it were, say, an actual square rather than strips of changes around the zone. Each baseball is treated as a 2.9-by-2.9 inch square (2.9 inches being roughly the diameter of the ball), or about 8.4 square inches. So, for example, 81 baseballs means a square 9 baseballs by 9 baseballs. Hopefully, this will help to visualize the size changes in the zone in a context relevant to the discussion, as I think Tango makes a very good point.
RHB |
Zone Size (in baseballs) | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% |
2008 | 80.00 | 69.75 | 63.13 | 57.78 | 52.94 | 48.14 | 42.99 | 36.84 | 28.02 |
2009 | 79.75 | 69.26 | 62.47 | 57.01 | 52.09 | 47.26 | 42.14 | 36.12 | 27.68 |
2010 | 80.99 | 70.65 | 63.66 | 57.93 | 52.73 | 47.67 | 42.41 | 36.43 | 28.39 |
Changes (in baseballs) | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% |
2008 to 2009 | -0.24 | -0.49 | -0.65 | -0.77 | -0.85 | -0.88 | -0.85 | -0.72 | -0.34 |
2009 to 2010 | 1.23 | 1.39 | 1.19 | 0.92 | 0.64 | 0.41 | 0.28 | 0.31 | 0.72 |
2008 to 2010 | 0.99 | 0.90 | 0.53 | 0.15 | -0.21 | -0.47 | -0.58 | -0.41 | 0.37 |
LHB |
Zone Size (in baseballs) | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% |
2008 | 78.35 | 68.48 | 62.03 | 56.82 | 52.07 | 47.38 | 42.35 | 36.39 | 27.94 |
2009 | 78.19 | 68.38 | 61.97 | 56.76 | 52.00 | 47.29 | 42.21 | 36.17 | 27.61 |
2010 | 80.27 | 69.81 | 62.72 | 56.91 | 51.67 | 46.62 | 41.42 | 35.55 | 27.81 |
Changes (in baseballs) | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% |
2008 to 2009 | -0.16 | -0.10 | -0.07 | -0.06 | -0.07 | -0.10 | -0.14 | -0.22 | -0.34 |
2009 to 2010 | 2.08 | 1.43 | 0.75 | 0.15 | -0.33 | -0.67 | -0.79 | -0.61 | 0.20 |
2008 to 2010 | 1.92 | 1.33 | 0.69 | 0.09 | -0.40 | -0.76 | -0.94 | -0.84 | -0.13 |
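The conversion from square inches to baseballs used in the tables above is simple enough to sketch in Python. The 2.9-inch diameter comes from the note above; the function names are made up for this illustration.

```python
import math

BALL_DIAMETER_IN = 2.9
# Each baseball is treated as a 2.9-by-2.9 inch square, about 8.41 sq. in.
BALL_SQUARE_AREA_SQ_IN = BALL_DIAMETER_IN ** 2


def area_in_baseballs(area_sq_in):
    """Convert a zone area in square inches to a number of baseballs."""
    return area_sq_in / BALL_SQUARE_AREA_SQ_IN


def side_in_baseballs(area_sq_in):
    """Side length, in baseballs, of a square with the same area."""
    return math.sqrt(area_in_baseballs(area_sq_in))


# RHB 10% contour in 2008: 672.77 square inches.
print(round(area_in_baseballs(672.77), 2))
print(round(side_in_baseballs(672.77), 2))
```

For the 2008 RHB 10% contour this gives roughly 80 baseballs, which is a square just under 9 baseballs on a side.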