Pieces of code can be found here: https://github.com/duffyd08/BaseballWork/blob/main/Duffy%20Defensive%20Rating
Goals:
1. Find a way to quantify "errors" based on their game impact
2. Use this aspect of fielding to build a more complete understanding of a defender's ability
3. Produce data on which fielders commit more costly errors and how we can use that to evaluate defensive effectiveness
4. Explore second order effects with this model, following how an error "trickles down" to other players
Three Major Iterations to my Original Process:
1. Automating the calculations
2. Shifting from a combination of Leverage Index and Bases to Win Probability Added
3. Taking “range” into account through creation of “Duffy Defensive Rating”
After releasing the first iteration of my project on “Weighted Errors”, I was very fortunate to receive some incredible feedback from many people within the game of baseball. MLB front office members, professional and college coaches, and even players offered insight into what they thought about the statistic, how it could be applied, and, most importantly, what needed to improve going forward. I have spent the past year looking into ways to iterate on my methods.
First, I needed to make the process more efficient. One of the first front office members I spoke with asked whether I could automate the process. At the time I could not, but I have spent the past few months learning better coding methods. I moved my project to Python, which has made it much easier to produce visualizations, extract insights, and work with large datasets.
Second, although “counting bases” is how I see the impact of an error on the field, combining leverage index and bases has a very similar effect to Win Probability Added (WPA). I now use WPA as the main metric in calculating my “weighted error” view.
Third, a common criticism of the idea was that it would penalize players who reach more balls but occasionally make an error, relative to counterparts who do not have the same “range”. To address this, I have created a new defensive rating, the “Duffy Defensive Rating”, which combines my weighted error statistic with more established defensive metrics: Outs Above Average (OAA) from BaseballSavant and Defensive Runs Saved (DRS) from FieldingBible. The new rating weights the advanced metrics more heavily than my own, because they capture more of a defender's overall ability.
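To make the weighting idea concrete, here is a minimal Python sketch of a rating built as a weighted sum. The function name and the 0.4/0.4/0.2 weights are illustrative assumptions, not necessarily the values behind the Duffy Defensive Rating. The only property taken from the description above is that OAA and DRS each count for more than the weighted error term.

```python
def defensive_rating(oaa, drs, weighted_error,
                     w_oaa=0.4, w_drs=0.4, w_err=0.2):
    """Combine three defensive inputs into a single number.

    The weights are placeholders chosen for illustration only.
    weighted_error is assumed to be a summed WPA value, so costly
    errors are negative and pull the rating down.
    """
    return w_oaa * oaa + w_drs * drs + w_err * weighted_error


# Hypothetical fielder: 10 OAA, 10 DRS, -1.0 summed WPA on errors.
example = defensive_rating(oaa=10, drs=10, weighted_error=-1.0)
print(round(example, 3))
```

Keeping the weights as keyword arguments makes it easy to test how sensitive the rankings are to the choice of weights.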
Process:
Underlying Issues:
Inconsistency in Data
Error Totals Do Not Align Across All Sources
Still learning Python as I go
I have used BaseballSavant’s “Search” function to pull data on all errors in the Savant database since 2008, though I have only been able to use data from 2016 onward for my Duffy Defensive Rating. The Savant data is very useful, though it is not without its flaws. Errors are admittedly difficult to pull and are still left to the discretion of the official scorer. The error data that I was able to read from the play-by-play data does not perfectly align with other sites, such as Baseball Reference. Because of this, I would not call this list of errors completely comprehensive or accurate.
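As a rough illustration of the "weighted error" idea, the sketch below sums the win probability change on each error play by fielder and season. The record layout (fielder, season, wpa) and the sample values are hypothetical, not actual Savant column names or data. The wpa value is assumed to already be expressed from the fielding team's point of view, so costly errors are negative.

```python
from collections import defaultdict


def weighted_errors(error_plays):
    """Sum WPA on error plays for each (fielder, season) pair.

    error_plays: iterable of dicts with the hypothetical keys
    "fielder", "season" and "wpa" (fielding team's perspective).
    """
    totals = defaultdict(float)
    for play in error_plays:
        totals[(play["fielder"], play["season"])] += play["wpa"]
    return dict(totals)


# Made-up sample rows, for illustration only.
sample = [
    {"fielder": "Fielder A", "season": 2023, "wpa": -0.05},
    {"fielder": "Fielder A", "season": 2023, "wpa": -0.10},
    {"fielder": "Fielder B", "season": 2023, "wpa": -0.02},
]
totals = weighted_errors(sample)
for key, value in sorted(totals.items()):
    print(key, round(value, 3))
```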
I have learned Python through a series of videos on Twitch (Nick Wan’s Bootcamp) and YouTube. Much of my experience has also come from trial and error, sticking mostly to what makes sense for me when building models and visualizations. I still have a lot of work to do with this language!
Since I have pulled data from many different sources, it has been very difficult to focus on the more specific aspects of the calculation that would add nuance. For example, I tried hard to compile a ‘position’ column, but had some difficulty reconciling FanGraphs and Baseball Savant. I want to make specific positions a part of the rating going forward and hope to continue predicting Gold Glove Award winners. Similarly, I will look to include age breakdowns in my future research, to see whether there is a certain age at which players commit more costly errors or show a large dip in defensive rating beyond random chance.
Results:
I have included a couple of visualizations to show some of the outcomes of the data. First, I begin with an overall look at Win Probability Added on errors since the 2008 season with a box plot seen below. This shows year by year trends in how costly errors have been on a "macro" scale. The normal range has been somewhat consistent over the last 15 years, but some seasons stand out above others.
Using my "Duffy Defensive Rating", I have tried to make a more comprehensive defensive statistic incorporating my view on how to properly penalize bad errors. I have used a small multiple time series plot here to show the individual seasons since 2016 (when OAA was established). I think this graph is a cool way to visualize how many players are committing costly errors: the farther a data point extends to the left, the more impactful those errors were. As the data shows, the persistence of impactful errors in the game is likely a byproduct of some below-average fielders. The best of the best are making very few errors, and at low-leverage times. The relatively flat regression lines should not be a huge surprise, as the "weighted error" aspect of my defensive rating carries the lowest weight. Outs Above Average and Defensive Runs Saved are much more impactful in the rating because they are more encompassing of a defender's ability.
The shortstop position has always been important to me in my research, as this is one of the spots on the field where defenders shine or falter the most, since it is arguably the hardest. Gold Glove winners at shortstop in the past few years have included Javy Báez, Carlos Correa, and Dansby Swanson. I decided to look at their yearly trends to see which direction these players have trended in both the "weighted error" portion of my research and their overall impact. As seen below, all three shortstops have had a different trajectory under my defensive rating system, with Swanson currently trending upward (having won two straight NL Gold Glove awards) and Correa trending down after his 2021 Gold Glove. Báez posted a 31.0 DRS and 31.0 OAA in 2019, which, combined with his errors, gave him a defensive rating of 24.822, second in all my research only to Nick Ahmed in 2018 (25.26).
Next up is a look at all three of these shortstops in their weighted errors, my original idea. I recall that two summers ago, when I first began researching this idea, Correa was a victim of some costly errors. Though my original idea of counting bases penalized him too much, WPA is more forgiving of his plays. He has very regularly made relatively meaningless errors throughout his entire career. Báez is an interesting case: his drop between 2019 and 2021 (setting aside the shortened 2020 season) is the biggest swing in the data, from -0.11 in '19 to -0.79 in '21. Swanson has been pretty consistent throughout his career, though he saw a slight dip in '23 relative to prior years.
I have included the top and bottom ten players based on my defensive rating metric since 2016, shown below:
Second Order Effects:
During my outreach, I was told to look into the second and third order effects that may arise. I decided to try to quantify the trickle down effect on the pitchers who suffered a poor play in the field behind them. While this is an imperfect way to measure the effect of defensive plays on pitchers, it does give a bit more insight into how defenders are affecting pitchers aside from earned runs.
I calculated the average pitches per out for each pitcher in the 2023 season by dividing his total pitches by his outs recorded (3*IP). Then, for each position player, I counted the errors he committed behind each pitcher. For this study, I have assumed that each error costs one opportunity for an out, though that may be too simplistic. Ex: Trea Turner committed 14 errors that came up in the Statcast data. Since each pitcher has a different "average pitches per out", I summed the relevant pitcher's average for each of those errors to get his league-leading 80.56 pitches.
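The calculation above can be sketched in a few lines of Python. The function names, pitcher labels, and numbers here are hypothetical and only illustrate the arithmetic. The sketch also assumes innings pitched arrive in box-score notation, where ".1" and ".2" mean one and two outs rather than decimal fractions, so they are converted to outs before dividing.

```python
def ip_to_outs(ip):
    """Convert box-score innings pitched (e.g. "6.1") to outs recorded."""
    whole, _, partial = str(ip).partition(".")
    return 3 * int(whole) + int(partial or 0)


def pitches_per_out(total_pitches, ip):
    """Average pitches a pitcher needs to record one out."""
    return total_pitches / ip_to_outs(ip)


def pitches_cost(errors_behind, rates):
    """Total "average pitches" attributed to one fielder.

    errors_behind: {pitcher: errors the fielder made behind him}
    rates: {pitcher: that pitcher's average pitches per out}
    Assumes each error costs exactly one out opportunity.
    """
    return sum(count * rates[p] for p, count in errors_behind.items())


# Made-up pitchers and totals, for illustration only.
rates = {
    "Pitcher 1": pitches_per_out(95, "6.1"),   # 95 pitches / 19 outs
    "Pitcher 2": pitches_per_out(120, "6.2"),  # 120 pitches / 20 outs
}
cost = pitches_cost({"Pitcher 1": 2, "Pitcher 2": 1}, rates)
print(cost)
```

Because the per-pitcher rate is applied once per error, a fielder's total depends on which pitchers happened to be on the mound as well as on his error count.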
Below is a histogram of the average pitches per out in 2023:
The scatterplot below shows the "Total # of Average Pitches" for each player against my weighted error view. As expected, a low error count combined with a low weighted error number likely means fewer pitches "cost". One drawback of this data is that average pitches per out is a flawed metric, especially with a small sample size. There are pitchers who have tough outings and could even be sent to the minor leagues, leaving them with a high number of pitches per out. Max Muncy sits at 68.9 pitches and -0.369 on his weighted error value, a pretty big disparity between the two categories.
Conclusions:
Again, I am aware of the many shortcomings of my defensive model and where it can improve, but I am pleased with the work that I have done over the last few months to get this out to more eyes. I hoped to keep this page brief, as it should read a bit easier than my original findings. The brevity should allow for easier understanding and more focus on the visual aids. As George E. P. Box once put it, "all models are wrong, but some are useful". My hope is that this ever-evolving defensive model will bring some more substance to the original idea that I came up with nearly 18 months ago. I am looking forward to tweaking this in an effort to tell more of a story about a season, player, or play type through numbers, and I cannot wait to hear more feedback. As always, please leave your thoughts with me via email: duffydigest@gmail.com or on Twitter/X as I grow my presence there @DigestDuffy. While you are on the page, feel free to read my past work on Weighted Errors in 2022 here and look at some of the Data Visualizations I have created in the tab or on Twitter!
Thank you for taking the time to read and explore with me!
-Drew