Feature engineering is the core driver of model performance in this system. Raw MLB data is transformed into structured predictive signals through multiple layers of aggregation and normalization designed to reduce noise and isolate true performance signal.
Rolling Performance Windows
Each player is evaluated across multiple rolling time horizons (5, 7, 10, 14, 21, and 30 games), allowing the model to capture both short-term form and long-term baseline skill. Short windows react quickly to changes in form but have higher variance, while longer windows are more stable and less sensitive to small-sample fluctuations.
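As a rough illustration of the rolling windows, the sketch below computes trailing means over the horizons listed above for a single per-game stat series. The function name `rolling_means`, the choice to exclude the current game from its own window, and the use of `None` for incomplete windows are assumptions made for this example, not details taken from the system itself.

```python
from typing import Dict, List, Optional, Sequence

# Window lengths (in games) as listed in the text.
WINDOWS = (5, 7, 10, 14, 21, 30)


def rolling_means(
    values: Sequence[float], windows: Sequence[int] = WINDOWS
) -> List[Dict[int, Optional[float]]]:
    """For each game i, return the mean of the w games before i, per window w.

    Assumption: the current game is excluded so each feature only uses
    information available before that game. A window with fewer than w
    prior games yields None.
    """
    # Prefix sums let every window mean be computed in O(1).
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)

    rows: List[Dict[int, Optional[float]]] = []
    for i in range(len(values)):
        row: Dict[int, Optional[float]] = {}
        for w in windows:
            if i >= w:
                row[w] = (prefix[i] - prefix[i - w]) / w
            else:
                row[w] = None
        rows.append(row)
    return rows


if __name__ == "__main__":
    hits_per_game = [1, 0, 2, 1, 0, 3, 1, 0]  # made-up illustrative values
    for i, row in enumerate(rolling_means(hits_per_game, windows=(5, 7))):
        print(i, row)
```

Returning `None` rather than a partial-window mean keeps short histories from being mistaken for full windows; a real pipeline might instead fall back to a shrunken estimate.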
Player performance is split by opponent handedness (for example, a hitter's results against left-handed versus right-handed pitchers) to capture matchup-specific effects that aggregate statistics hide. This is especially important in MLB because platoon advantages are common and pitcher-batter interactions are asymmetric across handedness.
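A minimal sketch of the handedness split, assuming plate appearances arrive as `(batter_id, pitcher_hand, success)` tuples. That record layout, the function name `split_by_handedness`, and the generic "success" outcome are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (successes, trials, raw rate) for one batter against one pitcher hand.
Split = Tuple[int, int, float]


def split_by_handedness(
    plate_appearances: Iterable[Tuple[str, str, bool]]
) -> Dict[Tuple[str, str], Split]:
    """Aggregate outcomes separately for each (batter, pitcher hand) pair."""
    totals = defaultdict(lambda: [0, 0])  # key -> [successes, trials]
    for batter_id, pitcher_hand, success in plate_appearances:
        cell = totals[(batter_id, pitcher_hand)]
        cell[0] += int(success)
        cell[1] += 1
    return {key: (s, n, s / n) for key, (s, n) in totals.items()}


if __name__ == "__main__":
    # Made-up plate appearances for one hypothetical batter.
    pas = [
        ("batter_1", "L", True),
        ("batter_1", "L", False),
        ("batter_1", "R", True),
        ("batter_1", "R", True),
    ]
    for key, split in sorted(split_by_handedness(pas).items()):
        print(key, split)
```

Splitting halves (or worse) the sample behind each estimate, which is one reason raw split rates like these are usually stabilized before use.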
Empirical Bayes shrinkage is applied to reduce noise in small-sample performance estimates by pulling extreme values toward league averages. The pull is strongest for small samples and weakens as sample size grows, so well-established rates are left largely unchanged. This improves stability and reduces overreaction to short-term streaks.
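The shrinkage idea can be sketched with a beta-binomial style estimator, in which the league average acts as a prior worth a fixed number of pseudo-observations. The function name `shrink_rate` and the parameter `prior_strength` are assumptions for this example; in a full empirical Bayes treatment the prior strength would be estimated from the league-wide distribution rather than passed in by hand.

```python
def shrink_rate(
    successes: float, trials: float, league_rate: float, prior_strength: float
) -> float:
    """Shrink an observed rate toward the league rate.

    Equivalent to a weighted average of the observed rate and league_rate,
    with weight trials / (trials + prior_strength) on the observed rate.
    prior_strength is a hypothetical tuning value (pseudo-observations).
    """
    if trials < 0 or prior_strength <= 0:
        raise ValueError("trials must be >= 0 and prior_strength must be > 0")
    return (successes + prior_strength * league_rate) / (trials + prior_strength)


if __name__ == "__main__":
    league = 0.300  # illustrative league-average rate, not real data
    # Same observed rate (0.500), very different sample sizes.
    print(shrink_rate(5, 10, league, prior_strength=100))      # stays near league
    print(shrink_rate(500, 1000, league, prior_strength=100))  # stays near observed
```

With no observations the estimate equals the league rate, and as the number of trials grows it converges to the observed rate.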
Pitcher and hitter true talent estimates are combined into a unified interaction signal using a log5-style transformation, producing a probabilistic representation of expected matchup outcomes.
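The text does not give the exact transformation, so the following is a sketch of one common form: the odds-ratio version of log5 with a league baseline. The function name `log5_matchup` and its argument names are assumptions, and the inputs are taken to be rates for the same event (the batter's rate of achieving it, the pitcher's rate of allowing it, and the league rate).

```python
def log5_matchup(batter_rate: float, pitcher_rate: float, league_rate: float) -> float:
    """Combine batter and pitcher rates for one event into a matchup probability.

    Odds-ratio form of log5:
        P = (b * p / L) / (b * p / L + (1 - b) * (1 - p) / (1 - L))
    where b is the batter's rate, p the rate the pitcher allows, and L the
    league rate. All three must lie strictly between 0 and 1.
    """
    for name, value in (
        ("batter_rate", batter_rate),
        ("pitcher_rate", pitcher_rate),
        ("league_rate", league_rate),
    ):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")

    event = batter_rate * pitcher_rate / league_rate
    no_event = (1.0 - batter_rate) * (1.0 - pitcher_rate) / (1.0 - league_rate)
    return event / (event + no_event)


if __name__ == "__main__":
    # Illustrative rates only: above-average batter against a pitcher who
    # allows the event less often than league average.
    print(log5_matchup(batter_rate=0.360, pitcher_rate=0.280, league_rate=0.320))
```

Two sanity checks follow from the formula: a league-average pitcher leaves the batter's rate unchanged, and two league-average players produce the league rate.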
Early versions of the system included features derived from season-level aggregates that unintentionally leaked future information: a full-season aggregate includes games played after the game being predicted. After these features were identified and removed, in-sample accuracy dropped but out-of-sample performance improved significantly.
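To make the leakage mechanism concrete, the sketch below contrasts a full-season mean, which gives every game the same value including information from later games, with a to-date mean that uses only earlier games. Both function names are hypothetical, and the to-date mean is just one leak-free alternative; the source does not state what replaced the removed features.

```python
from typing import List, Optional, Sequence


def season_mean_leaky(values: Sequence[float]) -> List[float]:
    """Full-season mean attached to every game. Leaks: early games see later ones."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


def to_date_mean(values: Sequence[float]) -> List[Optional[float]]:
    """Mean of games strictly before each game; None for the first game."""
    out: List[Optional[float]] = []
    total = 0.0
    for i, v in enumerate(values):
        out.append(total / i if i > 0 else None)
        total += v  # the current game only becomes visible to later games
    return out


if __name__ == "__main__":
    season_a = [1, 0, 0, 1]  # made-up per-game values
    season_b = [1, 0, 0, 0]  # identical except for the final game
    # The leaky feature for game 0 changes when only the last game changes.
    print(season_mean_leaky(season_a)[0], season_mean_leaky(season_b)[0])
    # The to-date feature for games 0..2 is unaffected by the last game.
    print(to_date_mean(season_a)[:3], to_date_mean(season_b)[:3])
```

A simple check like the one in the usage example, perturbing a future game and confirming that earlier feature values do not move, is one way to screen features for this kind of leakage.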