Tyler Janak Baseball Analytics

Overview

TheBullpenBet is a fully automated analytics pipeline designed to simulate how a real-world predictive modeling system would operate under live conditions. It continuously ingests MLB data, processes player and team-level features, generates probabilistic predictions for multiple market types, and evaluates those predictions against real outcomes as games complete. Unlike a static model or notebook-based project, this system is designed to function as a persistent, self-updating analytics engine.

Key capabilities include:

automated daily data ingestion
rolling feature generation across multiple time horizons
multi-model inference system
real-time evaluation against sportsbook pricing
continuous historical re-scoring of all predictions

Launch Live App

Problem Context

MLB presents a uniquely difficult environment for predictive modeling due to the interaction of high variance, sparse events, and strong contextual dependency between players.

Unlike many structured prediction problems, outcomes are heavily influenced by:

matchup-specific effects (pitcher vs hitter interaction)
small sample volatility in daily performance
role-dependent variability (bullpen usage, lineup order, etc.)

This makes it difficult to extract stable signal without careful feature design and strict evaluation discipline.

Data Sources

The system is built on three primary data streams that together form a full representation of game context, player performance, and market expectations. Statcast data provides high-resolution pitch-level tracking, which is aggregated into player and team-level features such as rolling hitting performance, strikeout rates, and batted-ball quality metrics. The MLB Stats API provides structured game context, including confirmed lineups, starting pitchers, and official box scores used for evaluation. Sportsbook odds data provides the market baseline, allowing the system to compare model-implied probabilities against real pricing in order to estimate expected value.
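
The comparison between model-implied probabilities and sportsbook pricing can be sketched as follows. This is a minimal illustration that assumes prices arrive as American odds and that expected value is measured per unit staked; the function names are hypothetical and are not taken from the project's code.

```python
def american_to_implied_prob(odds: int) -> float:
    """Break-even probability implied by American odds (bookmaker margin included)."""
    if odds == 0:
        raise ValueError("American odds cannot be zero")
    if odds < 0:
        return -odds / (-odds + 100)
    return 100 / (odds + 100)


def profit_per_unit(odds: int) -> float:
    """Profit on a 1-unit stake if the bet wins."""
    if odds == 0:
        raise ValueError("American odds cannot be zero")
    return 100 / -odds if odds < 0 else odds / 100


def expected_value(model_prob: float, odds: int) -> float:
    """Expected profit per unit staked, given the model's win probability."""
    return model_prob * profit_per_unit(odds) - (1 - model_prob)


# Example: the model says 55% on a side priced at -110.
edge = expected_value(0.55, -110)
```

A positive value means the model's probability exceeds the break-even probability implied by the price. The implied probabilities of both sides of a market sum to more than 1 because of the bookmaker's margin, so a production system would likely remove that margin before comparing.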

Feature Engineering

Feature engineering is the core driver of model performance in this system. Raw MLB data is transformed into structured predictive signals through multiple layers of aggregation and normalization designed to reduce noise and isolate true performance signal.

Rolling Performance Windows

Each player is evaluated across multiple rolling time horizons (5, 7, 10, 14, 21, and 30 games), allowing the model to capture both short-term form and long-term baseline skill. Short windows are more reactive but higher variance, while longer windows provide stability and reduce sensitivity to small sample fluctuations.
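
A minimal sketch of the rolling windows, assuming one numeric value per game in chronological order. The window sizes come from the text above; the function name and the choice to emit None until a full window is available are assumptions made for illustration.

```python
WINDOWS = (5, 7, 10, 14, 21, 30)


def rolling_means(values, windows=WINDOWS):
    """For each game i, average the previous `w` games for every window w.

    Only games strictly before game i are used, so the feature for a game
    never includes that game's own result. A window with fewer than `w`
    prior games yields None (an assumption; partial windows are another option).
    """
    rows = []
    for i in range(len(values)):
        row = {}
        for w in windows:
            prior = values[max(0, i - w):i]
            row[w] = sum(prior) / w if len(prior) == w else None
        rows.append(row)
    return rows


# Example: hits per game over 11 games.
features = rolling_means([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
```

Each row holds one feature per horizon, so the model sees short-term form and the longer baseline side by side.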

Handedness Splits

Player performance is separated by opponent handedness to capture matchup-specific effects that are not visible in aggregate statistics. This is especially important in MLB due to the frequency of platoon advantages and pitcher-batter interaction asymmetry.
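
As a rough sketch, a handedness split amounts to grouping a player's plate appearances by the opposing hand before computing a rate. The record layout below (a hand label plus a success flag) is a hypothetical simplification.

```python
from collections import defaultdict


def split_rates(plate_appearances):
    """Return {opponent_hand: (success_rate, sample_size)}.

    `plate_appearances` is an iterable of (opponent_hand, success) pairs,
    for example ("L", True) for a hit against a left-handed pitcher.
    """
    totals = defaultdict(lambda: [0, 0])  # hand -> [successes, trials]
    for hand, success in plate_appearances:
        totals[hand][0] += int(success)
        totals[hand][1] += 1
    return {hand: (s / n, n) for hand, (s, n) in totals.items()}


pas = [("L", True), ("L", False), ("R", False), ("R", False), ("R", True), ("R", False)]
splits = split_rates(pas)
```

Returning the sample size alongside the rate matters because splits cut each sample roughly in half, which makes the shrinkage step more important.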

True Talent Estimation

Empirical Bayes shrinkage is applied to reduce noise in small sample performance estimates by pulling extreme values toward league averages in proportion to sample size. This improves stability and reduces overreaction to short-term streaks.
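
One common way to express this kind of shrinkage for a rate statistic is a beta-binomial posterior mean, sketched below. The prior strength (how many league-average pseudo-trials to add) is a hypothetical constant here; in an empirical Bayes treatment it would be estimated from the league-wide distribution, and the source does not state the value or method used.

```python
def shrink_rate(successes: float, trials: float, league_rate: float,
                prior_strength: float) -> float:
    """Pull an observed rate toward the league rate in proportion to sample size.

    Equivalent to adding `prior_strength` pseudo-trials at the league rate:
    small samples stay near the league average, large samples approach the
    player's observed rate.
    """
    if trials < 0 or prior_strength <= 0:
        raise ValueError("trials must be >= 0 and prior_strength must be > 0")
    return (successes + prior_strength * league_rate) / (trials + prior_strength)


# A 5-for-10 hot streak barely moves the estimate; 150-for-500 moves it more.
small_sample = shrink_rate(5, 10, league_rate=0.25, prior_strength=100)
large_sample = shrink_rate(150, 500, league_rate=0.25, prior_strength=100)
```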

Matchup Modeling

Pitcher and hitter true talent estimates are combined into a unified interaction signal using a log5-style transformation, producing a probabilistic representation of expected matchup outcomes.
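
For reference, the standard log5 formula for a binary matchup outcome is sketched below. The source describes its transformation only as "log5-style", so the exact form used in the project may differ; the function name and arguments are illustrative.

```python
def log5(batter_rate: float, pitcher_rate: float, league_rate: float) -> float:
    """Expected matchup probability from batter, pitcher, and league rates.

    All three rates describe the same event (for example, strikeout per
    plate appearance) and must lie strictly between 0 and 1.
    """
    for r in (batter_rate, pitcher_rate, league_rate):
        if not 0 < r < 1:
            raise ValueError("rates must be strictly between 0 and 1")
    num = batter_rate * pitcher_rate / league_rate
    den = num + (1 - batter_rate) * (1 - pitcher_rate) / (1 - league_rate)
    return num / den


# A league-average pitcher leaves the batter's rate unchanged.
neutral = log5(0.30, 0.25, 0.25)
# A batter and pitcher who both exceed the league rate compound each other.
stacked = log5(0.30, 0.30, 0.25)
```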

Data Leakage Controls

Early versions of the system included features derived from season-level aggregates, which unintentionally leaked future information into each prediction. After these features were identified and removed, in-sample accuracy dropped but out-of-sample performance improved significantly.
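
The difference between a leaky season-level aggregate and a point-in-time feature can be illustrated with a small sketch. Both functions are hypothetical and assume one value per game in chronological order.

```python
def season_aggregate_feature(values):
    """Leaky: every game row sees the full-season mean, including future games."""
    if not values:
        return []
    full_season_mean = sum(values) / len(values)
    return [full_season_mean] * len(values)


def season_to_date_feature(values):
    """Point-in-time: each game row sees only the mean of games played before it."""
    out, running_total = [], 0.0
    for games_played, value in enumerate(values):
        out.append(running_total / games_played if games_played else None)
        running_total += value
    return out


runs = [1, 0, 3, 4]
leaky = season_aggregate_feature(runs)
safe = season_to_date_feature(runs)
```

In the leaky version the first game's feature already reflects the last game's result, which inflates in-sample accuracy without helping on games that have not been played yet.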

Modeling System

The system uses a multi-model architecture designed around different prediction targets rather than a single unified model.

Game Outcome Model

A gradient boosted model (XGBoost) is used to estimate the probability of a home team win. The feature set includes team-level rolling performance metrics, starting pitcher indicators, bullpen strength, and home field advantage. Model selection is governed by a chronological validation process where new models are only deployed if they outperform the current production model.
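
The deployment gate can be sketched as a champion/challenger comparison on the most recent games. The source does not say which metric defines "outperform"; log loss is assumed here, and the function names, the "date" key, and the holdout size are all hypothetical.

```python
import math


def log_loss(probs, outcomes, eps=1e-15):
    """Mean binary log loss; lower is better."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(outcomes)


def chronological_holdout(games, n_holdout):
    """Sort games by date and hold out the most recent `n_holdout` for validation."""
    ordered = sorted(games, key=lambda g: g["date"])
    return ordered[:-n_holdout], ordered[-n_holdout:]


def should_promote(challenger_probs, champion_probs, outcomes):
    """Promote only if the challenger strictly beats the production model."""
    return log_loss(challenger_probs, outcomes) < log_loss(champion_probs, outcomes)


games = [{"date": "2026-04-03"}, {"date": "2026-04-01"},
         {"date": "2026-04-04"}, {"date": "2026-04-02"}]
train, holdout = chronological_holdout(games, n_holdout=2)
```

Splitting by date rather than at random keeps the validation games strictly later than the training games, which mirrors how the model is used in production.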

NRFI Model

The No-Run First Inning model evaluates multiple algorithms, including logistic regression, random forest, and gradient boosting. Logistic regression is typically selected due to its stability under highly correlated feature sets and its strong generalization performance.

Player Projection Models

Each player statistic is modeled using two complementary approaches: a direct prediction model and a rate-based model scaled by expected playing time. These are combined using a weighted ensemble to balance responsiveness and stability. For sparse outcomes such as home runs, Poisson-based loss functions are used to better model discrete event distributions.
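
A minimal sketch of the two ideas in this paragraph: blending a direct projection with a rate-based one, and scoring count predictions with a Poisson negative log-likelihood. The blend weight and all names are assumptions; the source does not give the actual weights or the training setup.

```python
import math


def rate_based_projection(rate_per_pa: float, expected_pa: float) -> float:
    """Scale a per-plate-appearance rate by expected playing time."""
    return rate_per_pa * expected_pa


def blend(direct: float, rate_based: float, weight_direct: float = 0.5) -> float:
    """Weighted ensemble of the two projections (weight is a hypothetical default)."""
    return weight_direct * direct + (1 - weight_direct) * rate_based


def poisson_nll(y_true, y_pred):
    """Mean Poisson negative log-likelihood: mu - y*log(mu) + log(y!)."""
    total = 0.0
    for y, mu in zip(y_true, y_pred):
        if mu <= 0:
            raise ValueError("predicted means must be positive")
        total += mu - y * math.log(mu) + math.lgamma(y + 1)
    return total / len(y_true)


# Home runs in four games: a low predicted mean fits sparse counts better.
observed = [0, 0, 1, 0]
loss_low = poisson_nll(observed, [0.25] * 4)
loss_high = poisson_nll(observed, [1.0] * 4)
```

Unlike a classification loss, the Poisson likelihood treats two home runs as a different outcome from one and naturally handles predicted means well below 1.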

Model Performance Summary

The system is evaluated continuously against MLB box scores in a live environment where all historical predictions are re-scored over time. The figures below cover the 2026 season through 6/23/2026. If you are interested in my full testing statistics for each model, feel free to reach out.

Game Outcome Model

608–562 record
52.2% accuracy
1,080+ evaluated games

NRFI Model

549–512 record
51.5% accuracy

Player-level performance

Strengths:

Walks (Hitters)
Innings Pitched (Pitchers)

Weaknesses:

Plate Appearances (Hitters)
Earned Runs (Pitchers)


Fantasy Baseball Applications

The same automated pipeline powering TheBullpenBet extends naturally into fantasy baseball, because the original models already generate the player predictions that fantasy decisions depend on. The system produces both daily and season-long projections, which support decisions on daily fantasy lineups, redraft trade valuations, and dynasty league rankings. It also shows the week's biggest risers and fallers, giving a quick read on which players' underlying metrics are trending up or down.

Key Features

Daily fantasy projections driven by rolling Statcast features and matchup models, updated automatically each day as lineups confirm.
Season-long projections built from calibrated ML predictions that combine true talent estimates, handedness splits, and opponent context — not just raw counting stats.
Interactive dashboards for hitters, pitchers, and favorable matchup identification, designed to support decisions across all fantasy formats.
Continuously evaluated against official MLB box scores using the same re-scoring infrastructure as the main models, so projection accuracy is measured in a live environment rather than a static backtest.

What separates these projections from publicly available tools is the matchup-modeling layer: pitcher-hitter matchups are modeled using log5-style probability estimates and Statcast batted-ball metrics rather than surface-level splits or aggregate stats.

Key Learnings

This project highlighted several important principles in applied predictive modeling.
Feature leakage was the most significant risk and had the largest impact on real-world performance when corrected.
Loss function selection played a major role in modeling sparse outcomes, with Poisson loss outperforming standard classification approaches for rare events.
Calibration significantly improved the usability of model outputs by aligning predicted probabilities with observed frequencies.

Finally, continuous evaluation through re-scoring all historical predictions provided a more realistic measure of model quality than static test sets.
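
The calibration point above can be checked with a reliability table, which compares the mean predicted probability in each bin with the observed frequency. This is a generic sketch with equal-width bins; the bin count and names are assumptions, and the source does not describe the calibration method it used.

```python
def reliability_table(probs, outcomes, n_bins=10):
    """Group predictions into equal-width bins and compare predicted vs. observed."""
    bins = [[0, 0.0, 0] for _ in range(n_bins)]  # [count, sum of probs, sum of outcomes]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx][0] += 1
        bins[idx][1] += p
        bins[idx][2] += y
    return [
        {"bin": i, "n": n, "mean_predicted": sp / n, "observed_rate": so / n}
        for i, (n, sp, so) in enumerate(bins)
        if n
    ]


table = reliability_table([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1], n_bins=2)
```

For a well-calibrated model the two columns track each other closely in every bin that has a meaningful sample size.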
