Weighted Errors and Adjusted Fielding Percentage Explication
7/26/22 by Drew Duffy
The game of baseball has experienced notable progress in recent years in the way that players are evaluated with advanced statistics. Though offensive statistics have grown by leaps and bounds, it seems that analysis of players based on their defensive performance has remained effectively the same. While there have been notable discoveries with stats such as Outs Above Average (OAA) and Defensive Runs Saved (DRS), there is an untapped opportunity to analyze a player’s errors to adjust their overall defensive effectiveness. Errors remain one of the most subjective stats in baseball. While seemingly every game event can be broken down into thousands of different parts, errors are still somewhat basic. Currently, there are websites that break down errors into throwing and fielding errors, showing in which part of the play the error occurred. While this is beneficial, it still does not tell the whole story. I have developed a method that puts a value on errors based on the outcome of the error as well as the situation of the game.
This analysis might be considered similar to breaking down a quarterback’s interceptions. Consider a scenario where two quarterbacks each throw an interception in a given game. One interception is on a 50-yard “Hail Mary” as time expires in the first half, while the other occurs on first and ten in the opponent’s red zone. Though each quarterback would be assessed the same statistic of one interception, clearly the “Hail Mary” play is not nearly as detrimental to the team’s success as the latter. Not all interceptions are equal. Similarly, not all errors are equal (just about all plays in a given game could be described similarly).
The main premise of the analysis is that some errors made by fielders are more costly than others. This is not an outlandish idea. In fact, it is one that has already been applied in other aspects of the game, especially hitting. Some hits are valued more than others, as seen in traditional “Slugging Percentage” as well as wOBA or “Weighted On-Base Average”, which places value on a player’s offensive output based on how valuable a certain at-bat outcome ends up being. So why not do this with errors? My method takes into account two major factors. First, I have calculated the total number of bases that are given up on each error by looking at archived video and play-by-play descriptions. While it is somewhat cumbersome to determine this in retrospect, it could easily be reported as part of an official scoring decision, or by a statistician who is entering gameday data for a particular team. Second, I have used FanGraphs’ “leverage index”, which measures how critical the game situation is on a plate appearance-by-plate appearance basis. This number has served as my baseline for the net detrimental effect of an error in a given situation.
Here is an easy view into two different situations that will be treated much differently than they are currently. These two scenarios illustrate the reasoning for my research, and my hope is that you will understand the rationale behind the process as well. In the videos below, we see examples of errors impacting a game in different ways. In the first video, we have a blowout at Fenway Park. Alex Bregman is fielding a ground ball in the bottom of the 8th inning with a nine-run lead. He misplays the ball and batter Jackie Bradley Jr. reaches on the error. Bradley’s presence on first base has little to no effect on the outcome of the game that night. In the spreadsheet, we can see that the leverage index for that at-bat was 0.01, effectively 0.
The second video is much different. Ke’Bryan Hayes represents the winning run on first base in the bottom of the 10th inning. Bryan Reynolds hits the ball to Eric Hosmer at first, but Hosmer misses the ball and allows it to trickle into the outfield. Hayes eventually scores the winning run at home (after review) and the error winds up becoming the last play of the game. The leverage index for Reynolds’ at-bat was 3.21, a high-leverage point in the game.
(see videos below)
Errors themselves also vary greatly depending on the position on the field. There will not be as many errors in the outfield because outfielders are given fewer chances than infielders. As a result, the infielders see a much greater impact in my model compared to their outfield counterparts.
I began the research by looking at the position with the most errors committed, and thus one of the most difficult positions to play: shortstop. As of July 14th, the Atlanta Braves' Dansby Swanson had the best fielding percentage in MLB among shortstops, with a total of 336 chances (defined as the sum of Putouts, Assists, and Errors). At that point, Swanson had committed only 5 errors on the season. After looking into his plays individually, I counted an identical 5 “bases on errors”. This means that Dansby gave up the same number of bases as he committed errors, with each error costing only one base. This is the ideal outcome for most fielders. (There are situations that result in an error without giving up any bases, but we can get into that later.) If we compare Dansby Swanson’s fielding with another shortstop who committed the same number of errors, we can see just how valuable a deeper look into the play is. A comparable player in my data is Carlos Correa of the Minnesota Twins. By the conventional statistic, Correa and Swanson have committed the same number of errors, with 5 each. Their fielding percentages differ only because Swanson has had more “total chances”. To the naked eye, Correa and Swanson have been equally beneficial (or detrimental) to their teams’ successes, and are only separated by .008 in their fielding percentages.
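For readers who want to check the arithmetic, the conventional fielding percentage can be sketched in a few lines of Python. The function name is made up for illustration; the only figures used are Swanson's 336 total chances and 5 errors quoted above.

```python
def fielding_percentage(total_chances: int, errors: int) -> float:
    """Conventional fielding percentage: (putouts + assists) / total chances."""
    if total_chances <= 0:
        raise ValueError("total_chances must be positive")
    # Total chances = putouts + assists + errors, so the cleanly handled
    # chances are simply the chances that did not end in an error.
    return (total_chances - errors) / total_chances


# Dansby Swanson as of July 14th: 336 total chances, 5 errors.
swanson = fielding_percentage(336, 5)
print(f"{swanson:.3f}")  # 0.985
```

Because both shortstops have 5 errors, any gap in this number comes entirely from the denominator, which is why it cannot separate a cheap error from a costly one.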
But, using my model, we will see that there is another important layer beyond the conventional fielding percentage. Now we must tie in FanGraphs’ “Leverage Index.” Again, this is the value placed on individual situations. I have looked at each error's relation to the leverage index of the play to calculate the overall effect of each error.
To read more about Leverage Index, read here: https://library.fangraphs.com/misc/li/
Here we can take a look at Swanson and Correa side by side:
Table 1
In Table 1, we see the fielding percentages (both standard and adjusted) highlighted in light blue and the new rank in yellow. Again, both Correa and Swanson have committed 5 errors each, but here we see that Correa in fact doubled Swanson in bases given up on those 5 errors. The most interesting number is seen in the Σ Effect column, which is where the leverage index comes into the equation. Correa has a much larger effect because some of the errors he committed were at more inopportune times. Below we are able to see each individual error with the bases given up as well as the leverage index, which gives an overall effect:
Table 2
Table 2 shows us two things:
(1) Leverage Index matters. For an average play in a game, the leverage index hovers around 1.00. Any deviation from this number is indicative of a higher- or lower-stakes situation. Dansby Swanson committed errors where the “Leverage Index” was very low, meaning the situations were not as drastic when he booted a ball. On the other hand, Correa was not as fortunate. His significant outlier occurred in a game on June 30th against the Cleveland Guardians, where he actually committed two errors, the latter having a massive impact on the game and his team’s chance of winning. Correa overthrew the third baseman in the bottom of the 8th inning with the bases loaded (an LI of 6.05!), allowing the tying run to score and every subsequent runner to advance an extra base. This is a compelling example of the stat showing what it is intended to show. This error had a much more distinct and profound impact on the outcome of the game than an error committed at a less stressful point in the game.
(2) Bases on Errors matter. Dansby Swanson was consistently “good” in committing his errors, never compounding the original mistake. Correa, on the other hand, was not as fortunate. He committed one “one-base” error, three “two-base” errors, and one “three-base” error, which is reflected in his new overall fielding percentage. Counting bases on errors gives us a better insight into how important each play actually was. As it currently stands, we look at the box score after the daily slate of games and see the far right column representing errors. This column is usually occupied by a “1” or “2”, occasionally a “3” on a night that was one to forget defensively. My research has shown that there are some players whose errors and bases on errors are similar, which is generally better than a bigger gap between the two.
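The two factors above can be combined in a short Python sketch. This is only an illustration under a stated assumption: the effect of a single error is taken to be bases given up multiplied by the leverage index, and the Σ Effect is the sum of those products. The class name, function names, and all sample values are hypothetical placeholders, not rows from the workbook.

```python
from dataclasses import dataclass


@dataclass
class FieldingError:
    bases: int             # bases given up on the play
    leverage_index: float  # leverage index of the plate appearance


def error_effect(err: FieldingError) -> float:
    # ASSUMPTION: effect = bases given up * leverage index.
    return err.bases * err.leverage_index


def sum_effect(errors: list) -> float:
    # Sigma Effect: total of the per-error effects for one player.
    return sum(error_effect(e) for e in errors)


# Invented values for illustration only (not real play data).
low_stakes = [FieldingError(1, 0.5), FieldingError(1, 0.25)]
high_stakes = [FieldingError(2, 1.5), FieldingError(3, 2.0)]

print(f"{sum_effect(low_stakes):.2f}")   # 0.75
print(f"{sum_effect(high_stakes):.2f}")  # 9.00
```

Both hypothetical players have two errors, yet the second one's total is twelve times larger, which is the kind of separation the conventional error count hides.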
So where does all of this matter? Well, there are still great opportunities to continue the advancement of this stat into more useful and applicable metrics. For now, I have investigated the impact that this new “adjustment” has on the fielding percentage. By turning the Σ Effect into a decimal value, I have been able to adjust (down) the original fielding percentage with the effect to create a new value. Even knocking the fielding percentage down a few hundredths is impactful over a 162-game season, let alone a career. Now, this would only be a logical application if it really shifted the rankings of the existing fielding percentage leaders. I have found that some players had a dramatic change in ranking from this new way of thinking as evidenced below:
Table 3
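The adjustment itself can be sketched as follows. The exact conversion of the Σ Effect into a decimal is not spelled out here, so the `scale` divisor below is purely an assumption chosen for illustration, as are the function name and the sample numbers.

```python
def adjusted_fielding_percentage(
    total_chances: int, errors: int, sum_effect: float, scale: float
) -> float:
    """Conventional fielding percentage minus a penalty derived from Sigma Effect."""
    conventional = (total_chances - errors) / total_chances
    # ASSUMPTION: the Sigma Effect becomes a decimal by dividing by `scale`.
    # The real conversion may differ; only the downward direction is certain.
    penalty = sum_effect / scale
    return conventional - penalty


# Hypothetical player: 300 chances, 6 errors, Sigma Effect of 4.0.
adjusted = adjusted_fielding_percentage(300, 6, 4.0, scale=1000.0)
print(f"{adjusted:.3f}")  # 0.976
```

With these made-up inputs the conventional .980 drops to .976; a player with the same error count but a larger Σ Effect would drop further.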
The top 5 under the conventional fielding percentage view remains largely the same, with a little shuffle in the third, fourth, and fifth slots. At the sixth spot in the current model lies Francisco Lindor, who would drop to ninth in the new application. Correa would drop a whopping nine spots to #16. Nico Hoerner would move into the top 10 at the #7 spot. The biggest riser would be Geraldo Perdomo from Arizona, who would bump himself up seven spots to #13.
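A re-ranking like this one can be reproduced with a small Python sketch. The three players and their percentages below are invented placeholders, and the helper name is hypothetical; the point is only to show how rank movement is computed.

```python
def rank(values: dict) -> dict:
    """Map each name to its rank (1 = best) by descending value."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


# Invented percentages for illustration only.
conventional = {"Player A": 0.985, "Player B": 0.980, "Player C": 0.977}
adjusted = {"Player A": 0.981, "Player B": 0.968, "Player C": 0.972}

old_rank = rank(conventional)
new_rank = rank(adjusted)

# Positive movement means the player climbed under the adjusted view.
movement = {name: old_rank[name] - new_rank[name] for name in old_rank}
print(movement)  # {'Player A': 0, 'Player B': -1, 'Player C': 1}
```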
Across the league, the trend for bases on errors compared to errors themselves is similar. The research has shown that the closer the ratio of errors (E) to bases on errors (BoE) is to 1, the more beneficial the defensive outcome is. Dansby Swanson enjoys a 1:1 ratio and the overall #1 spot in the league among shortstops. In the bottom half of the league, the ratio closest to one belongs to Baltimore’s Jorge Mateo, at 11:13. The graph below gives a visual representation of how each player fares in regard to giving up bases on errors:
Graph 1
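The E-to-BoE ratio is simple to compute, as in the Python sketch below. The function name is hypothetical; the inputs are the error and base counts quoted in this post (Correa's 10 bases come from his one one-base, three two-base, and one three-base errors).

```python
from fractions import Fraction


def e_to_boe(errors: int, bases_on_errors: int):
    """Ratio of errors to bases on errors; None when no bases were given up."""
    if bases_on_errors == 0:
        # Errors that gave up no bases leave the ratio undefined.
        return None
    return Fraction(errors, bases_on_errors)


players = {
    "Swanson": (5, 5),
    "Correa": (5, 10),
    "Mateo": (11, 13),
}

for name, (errors, bases) in players.items():
    print(name, f"{float(e_to_boe(errors, bases)):.3f}")
# Swanson 1.000
# Correa 0.500
# Mateo 0.846
```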
Remember that bases on errors do not tell the entire story, however. Their counterpart in my model is the leverage index, and the two, factored together, create the overall effect of every error committed. The juxtaposition between bases and the sum of the effect shows the net effect of the errors per player, giving a good glimpse into how these players performed in certain situations. The higher red lines show more costly overall errors by the players, as we see with Carlos Correa. In a similar way, Trea Turner and Corey Seager have committed errors at poor times. These would cause a larger change in their fielding percentage rankings. It is interesting to see the players who enjoy a higher blue bar, illustrating that their errors come at lower leverage and sometimes in less impactful situations.
Graph 2
As I stated earlier, there is still a circumstance that is an outlier in games as well as in this calculation model. Say, for instance, a popup is hit in the air but it is in foul territory. A fielder runs to catch the ball, but it glances off his glove and falls harmlessly to the grass. The fielder in this instance should be appropriately charged with an error, though the result of that pitch is simply a foul ball. No bases are earned or given up in the way that I have calculated the other errors so far, showing a flaw in my process as it stands. For evidence, Ty France, first baseman for the Seattle Mariners, has committed two errors this season at the time of this research. Both of his errors occurred on dropped foul balls! Thus, in his row in the spreadsheet, there are no bases on errors, and the Leverage Index becomes a nonfactor. This is where the stat can continue to evolve into something easier to use and more beneficial, an aspect that I look forward to researching more deeply. I am continuing to search for new methods to account for this aspect of the data, since catching the ball would eliminate potential bases that would follow.
Of course, statistics do not, by themselves, tell the entire story of how a game or season transpires. However, as statistics play a greater and greater role in how the game is played (leading to shifts for particular batters, pitches offered to batters in particular situations, etc.), a deeper understanding of defensive metrics can help players and teams evaluate themselves and adjust accordingly. Weighting errors offers an opportunity to add more nuance to a stat that desperately needs to be adjusted. The assignment of “error” itself does not tell enough of the story - we need more context to truly assess the impact of the play. With our new understanding, the impact of errors becomes more complex and we gain a more definitive process for evaluating these plays - maybe even nudging the study of defensive statistics to have a little more meaning, like their offensive counterparts.
Thank you for reading and visiting the site. Feel free to look at the workbook sheet (in the upper right drop down menu) that I have used for the research so far. I hope to make an updated post at some point this year as well.
I would love to hear your thoughts/comments/questions/suggestions at duffydigest@gmail.com!
Bregman Error
Hosmer Error