Evaluating MLB Umpire Performance using Statistical Period-Constrained Neural Networks
Post Date: 10/29/2024
This work is my master's thesis in Statistical Science, in which I evaluate MLB umpire performance using publicly available pitch data. It was advised by Paul A. Parker of the UCSC Statistics Department. I won an award for this work at the UConn Sports Analytics Symposium, where I presented a poster alongside other graduate students. The easiest introduction to the project is the poster below. For more detail, the pre-print thesis paper and the accompanying presentation slides are also embedded below.
This work responds to the growing need for a statistically robust approach to estimating the called strike zones of MLB umpires, and introduces a novel metric for evaluating umpire accuracy. To advance beyond existing methodologies, we develop a period-constrained random-weight neural network that predicts the probability of a called strike at any location in and around the rectangular MLB strike zone. By working in polar coordinates, we derive an explicit form of the contour line, which lets us measure umpire accuracy by comparing the contour with the MLB strike zone. The model is fit with Bayesian inference, which permits uncertainty quantification for the contour line. With these novel metrics for umpire accuracy, we can also evaluate and compare umpire performance league-wide, inform decisions about crucial game assignments, and contribute to the ongoing dialogue on fair play in baseball.
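To make the idea concrete, here is a minimal sketch of how a period-constrained random-weight network in polar coordinates can yield an explicit contour. It is not the thesis model. As assumptions for illustration only, the hidden layer sees the angle only through (cos θ, sin θ), so it is 2π-periodic by construction; the radius enters the logit linearly; and the output weights are fit with ridge-penalized logistic regression rather than Bayesian inference. All function names, the number of hidden units, and the synthetic circular "zone" are hypothetical.

```python
import numpy as np

def to_polar(x, z, center=(0.0, 0.0)):
    """Convert pitch locations to polar coordinates around an assumed zone center."""
    dx, dz = x - center[0], z - center[1]
    return np.hypot(dx, dz), np.arctan2(dz, dx)

def random_hidden_layer(n_hidden, rng):
    """Draw hidden-layer weights once; they stay fixed (random-weight network)."""
    return rng.normal(size=(2, n_hidden)), rng.normal(size=n_hidden)

def periodic_features(theta, W, b):
    """Hidden units see only (cos theta, sin theta), so they are 2*pi-periodic."""
    circ = np.column_stack([np.cos(theta), np.sin(theta)])
    return np.tanh(circ @ W + b)

def design_matrix(r, theta, W, b):
    """Columns: intercept, periodic hidden units, radius (linear, by assumption)."""
    return np.column_stack([np.ones_like(r), periodic_features(theta, W, b), r])

def fit_logistic(X, y, ridge=1e-2, iters=50):
    """Ridge-penalized logistic regression for the output weights (Newton steps)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (p - y) + ridge * beta
        hess = (X.T * (p * (1.0 - p))) @ X + ridge * np.eye(X.shape[1])
        beta -= np.linalg.solve(hess, grad)
    return beta

def contour_radius(theta, beta, W, b):
    """Radius where P(strike) = 0.5, i.e. where the logit equals zero.

    Because the logit is linear in r, the contour has the closed form
    r(theta) = -(beta_0 + h(theta) @ beta_h) / gamma.
    """
    H = periodic_features(theta, W, b)
    return -(beta[0] + H @ beta[1:-1]) / beta[-1]

# Synthetic demo (made-up data): a circular "zone" of radius 1.
rng = np.random.default_rng(0)
n = 4000
x, z = rng.uniform(-2, 2, n), rng.uniform(-2, 2, n)
r, theta = to_polar(x, z)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-5.0 * (1.0 - r)))).astype(float)

W, b = random_hidden_layer(10, rng)
beta = fit_logistic(design_matrix(r, theta, W, b), y)
grid = np.linspace(0.0, 2.0 * np.pi, 200)
radii = contour_radius(grid, beta, W, b)
print("estimated contour radius range:", radii.min(), radii.max())
```

The contour comes out in closed form only because the radius is linear in the logit, which is a choice made for this sketch. A Bayesian fit like the one described above would replace the single point estimate of the output weights with posterior draws, each giving its own contour.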
You can use the results of this project to compare umpires from the 2023 MLB regular season in this Shiny App (takes ~30 seconds to load). It lets you select an umpire, see their estimated zone, and compare their departure score with those of other umpires (lower is better!). There is also a table of the results if you'd like to use it for another project.