About — Ozzy Analytics

What Makes This Different

Most sports models are black boxes. Ours is a transparent, bottom-up simulation that models every plate appearance individually — 10,000 times per game — using real Statcast data, professional scouting metrics, and game-time weather conditions. No shortcuts, no vibes, no "trust me" numbers.

10,000 Simulations per game

8 PA outcome categories

2,000+ Player profiles tracked

30 Team-specific bullpens

The Pipeline

Every day before first pitch, the model runs a complete pipeline:

1. Data Ingestion

Pull confirmed lineups and live odds from every major sportsbook.

2. Player Profiles

Build matchup-specific probabilities for every batter-pitcher combination using 3 years of Statcast + skills metrics.

3. Simulation

Simulate each game 10,000 times PA-by-PA, incorporating park factors, weather, bullpen tiers, and baserunning.

4. Edge Detection

Compare model probabilities to market lines. Only bet when the edge exceeds 6% and passes confidence filters.

Player Profiles

Every player in the model has a detailed outcome profile built from multiple data sources, ensuring accuracy from Opening Day through October.

Marcel projections — Tom Tango's open projection system using 3 years of weighted Statcast history (5/4/3), regression to league average, and age curve adjustments. Provides day-1 profiles for 2,000+ players.
Baseball HQ skills metrics — professional scouting-grade indicators (contact rate, barrel rate, speed scores, swinging-strike rate) blended 50/50 with Marcel. Skills-based rates are more predictive than raw counting stats.
In-season cumulative updates — as real games are played, actual results gradually replace preseason projections. By mid-season, the model is driven primarily by current performance.
Platoon splits — every rate is computed separately vs LHP and vs RHP (batters) and vs LHB and vs RHB (pitchers). The platoon advantage is one of the most reliable effects in baseball.
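As a rough illustration of the Marcel weighting and the 50/50 skills blend, here is a minimal sketch for a single rate stat. The 5/4/3 weights and the 50/50 blend come from the description above. The 1,200-PA regression amount is the figure commonly cited for Marcel hitters and is an assumption here, as are the function names and the example player. Age adjustment and platoon splits are left out.

```python
def marcel_rate(events, pas, league_rate, weights=(5, 4, 3), regress_pa=1200):
    """Weighted multi-year rate, regressed toward the league average.

    events, pas: per-season totals, most recent season first.
    regress_pa:  plate appearances of league-average performance mixed in
                 (assumed value; the source does not state it).
    """
    weighted_events = sum(w * e for w, e in zip(weights, events))
    weighted_pas = sum(w * p for w, p in zip(weights, pas))
    return (weighted_events + regress_pa * league_rate) / (weighted_pas + regress_pa)


def blend_with_skills(marcel, skills, skills_weight=0.5):
    """50/50 blend of the Marcel rate with a skills-based rate."""
    return skills_weight * skills + (1.0 - skills_weight) * marcel


# Hypothetical hitter: home runs and PA for the last three seasons.
hr_rate = marcel_rate(events=[30, 25, 20], pas=[600, 600, 600], league_rate=0.03)
blended = blend_with_skills(hr_rate, skills=0.045)
```

A player with no history falls back to the league rate, which is what makes day-1 profiles possible for players with thin records.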

Matchup Blending

When a specific batter faces a specific pitcher, their individual outcome rates are combined using the multiplicative odds-ratio method:

P(outcome) = (batter_rate × pitcher_rate) / league_rate

This is more accurate than the commonly used log5 formula, which introduces ~7% systematic compression on the OUT rate. That small error compounds across 70+ plate appearances per game into meaningful win-probability distortion. Our method avoids this while preserving correct behavior when both players are league-average: in that case the formula returns the league rate.
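The formula can be sketched per outcome as follows. Note that, as written, it multiplies rates rather than odds. The final renormalization step (so the categories sum to 1) is an assumption, since the text does not say how that is handled, and the four collapsed categories and all numbers are illustrative only.

```python
def matchup_probs(batter, pitcher, league):
    """Combine batter and pitcher rates per outcome, then renormalize.

    Each argument maps outcome name -> rate, with the same keys in all three.
    """
    raw = {k: batter[k] * pitcher[k] / league[k] for k in league}
    total = sum(raw.values())  # renormalization is an assumed step
    return {k: v / total for k, v in raw.items()}


# Hypothetical rates; each dict sums to 1.
league = {"K": 0.22, "BB": 0.08, "HR": 0.03, "OTHER": 0.67}
batter = {"K": 0.18, "BB": 0.10, "HR": 0.05, "OTHER": 0.67}
pitcher = {"K": 0.28, "BB": 0.06, "HR": 0.025, "OTHER": 0.635}

probs = matchup_probs(batter, pitcher, league)
```

Passing the league dict for both players returns the league rates unchanged, which is the league-average behavior described above.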

Full-Game Simulation

The simulation engine doesn't just predict outcomes — it plays entire baseball games, tracking every baserunner, every out, every scoring opportunity. This captures non-linear interactions that simple models miss.
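To make "plays entire baseball games" concrete, here is a deliberately simplified sketch of one half-inning. The eight outcome names are an assumption (the text gives the count, not the list), and the baserunning rule (every runner advances exactly as many bases as the hit) is a simplification that ignores steals, errors, and productive outs.

```python
import random

# Assumed category names; the source only says there are eight.
OUTCOMES = ("K", "OUT", "BB", "HBP", "1B", "2B", "3B", "HR")
HIT_BASES = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}


def simulate_half_inning(probs, rng):
    """Play plate appearances until three outs; return runs scored."""
    outs, runs = 0, 0
    bases = [False, False, False]  # first, second, third
    weights = [probs[o] for o in OUTCOMES]
    while outs < 3:
        outcome = rng.choices(OUTCOMES, weights=weights)[0]
        if outcome in ("K", "OUT"):
            outs += 1
        elif outcome in ("BB", "HBP"):
            # Only forced runners advance.
            if bases[0]:
                if bases[1]:
                    if bases[2]:
                        runs += 1
                    bases[2] = True
                bases[1] = True
            bases[0] = True
        else:
            n = HIT_BASES[outcome]
            new_bases = [False, False, False]
            for i, occupied in enumerate(bases):
                if occupied:
                    if i + n >= 3:
                        runs += 1
                    else:
                        new_bases[i + n] = True
            if n == 4:
                runs += 1  # the batter scores on a home run
            else:
                new_bases[n - 1] = True
            bases = new_bases
    return runs


def mean_runs(probs, n_sims=10_000, seed=0):
    """Average runs per half-inning over n_sims simulated half-innings."""
    rng = random.Random(seed)
    return sum(simulate_half_inning(probs, rng) for _ in range(n_sims)) / n_sims
```

A full game would repeat this for both lineups over nine or more innings, swapping in matchup-specific probabilities for each batter and pitcher.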

⚡

Park Factors

6-component park adjustments (HR, 1B, 2B, 3B, BB, K) from Baseball HQ. Coors Field isn't just "hitter-friendly" — it's +30% runs, -11% K, +14% BA.

🌡

Weather

Real-time temperature and wind adjustments. Cold suppresses HR by up to 15%; wind blowing out at Wrigley can add 1-2 expected runs. Based on analysis of 2,391 games.

⛏

Tiered Bullpen

Relievers split into high-leverage and low-leverage tiers. Close games get the team's best arms; blowouts get mop-up guys. Mirrors actual managerial decisions.

📈

Times Through Order

Batters hit better the more they see a pitcher. Hit rates rise 10% on the 2nd pass and 20% on the 3rd and later passes, calibrated to MLB data.

🏃

Baserunning

Speed-adjusted stolen base attempts, wild pitches, errors, productive outs, and sacrifice flies. Every run-scoring mechanism is modeled.

🏗

Extra Innings

Ghost runner on 2B, high-leverage bullpen always deployed. The model handles the full complexity of modern extra-inning rules.
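Several of the adjustments above (park factors, times through the order) are multipliers on individual outcome rates. One way to apply them is sketched below. The multipliers for the 2nd and 3rd+ pass and the Coors strikeout figure come from the text. The choice to let the in-play OUT category absorb the difference so probabilities still sum to 1 is an assumption, as are the outcome names and the base rates.

```python
HITS = ("1B", "2B", "3B", "HR")
TTO_HIT_MULT = {1: 1.00, 2: 1.10, 3: 1.20}  # 3 means third pass or later


def adjust(probs, multipliers):
    """Scale the listed outcomes; OUT absorbs the remainder (assumed rule)."""
    adjusted = {k: p * multipliers.get(k, 1.0) for k, p in probs.items() if k != "OUT"}
    out = 1.0 - sum(adjusted.values())
    if out < 0:
        raise ValueError("multipliers push total probability above 1")
    adjusted["OUT"] = out
    return adjusted


def tto_adjust(probs, times_through):
    """Apply the times-through-order hit multiplier."""
    mult = TTO_HIT_MULT[min(times_through, 3)]
    return adjust(probs, {h: mult for h in HITS})


# Hypothetical base rates summing to 1.
base = {"K": 0.22, "OUT": 0.455, "BB": 0.08, "HBP": 0.01,
        "1B": 0.14, "2B": 0.045, "3B": 0.005, "HR": 0.045}

third_pass = tto_adjust(base, 3)
at_coors = adjust(base, {"K": 0.89})  # the -11% K figure quoted for Coors
```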

Team Strength Layer

PA-by-PA simulation captures matchup-level detail but can miss broader team quality signals. An Elo rating system (K=20, home field +24 pts, between-season regression) provides a team-strength prior that's blended 50/50 with simulation output. This combines the best of both approaches: granular matchup modeling and macro team quality.
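A minimal sketch of the Elo layer, using the K=20 and +24 home-field values quoted above. The 400-point logistic scale is the standard Elo convention and is assumed here, the function names are illustrative, and between-season regression is omitted because its size is not stated.

```python
K_FACTOR = 20
HOME_ADVANTAGE = 24  # rating points added to the home team


def home_win_prob(home_elo, away_elo):
    """Standard Elo expectation with a home-field bonus (400-point scale assumed)."""
    diff = home_elo + HOME_ADVANTAGE - away_elo
    return 1.0 / (1.0 + 10 ** (-diff / 400))


def update(home_elo, away_elo, home_won):
    """Return both ratings after one game; the update is zero-sum."""
    expected = home_win_prob(home_elo, away_elo)
    delta = K_FACTOR * ((1.0 if home_won else 0.0) - expected)
    return home_elo + delta, away_elo - delta


def blended_prob(sim_prob, elo_prob, sim_weight=0.5):
    """50/50 blend of simulation output and the Elo prior."""
    return sim_weight * sim_prob + (1.0 - sim_weight) * elo_prob


elo_p = home_win_prob(1500, 1500)       # evenly rated teams
final_p = blended_prob(0.58, elo_p)     # 0.58 is a hypothetical simulation output
```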

Edge Detection & Bet Sizing

Finding edges isn't enough — you need to size them correctly and filter out noise. The model uses a multi-layer approach:

Two markets — moneyline (straight win) and run line (dog +1.5 spread). Each market has its own edge parameters tuned via grid search.
Simulation-derived probabilities: cover probabilities for run lines come directly from the distribution of final margins across the 10,000 simulated games, not from heuristics. The model counts how often each team wins by 2+ runs in simulation.
Alpha shrinkage — model probabilities are blended toward market consensus. The market is usually right; we only bet when the disagreement is significant and justified.
Confidence gating — early-season bets with thin data are down-weighted. Model-market agreement adds confidence; large disagreements are penalized.
6% minimum edge — small edges are more likely to be noise than signal. This threshold was optimized via grid search across multiple seasons.
15% maximum edge — when our model disagrees with the market by more than 15%, the market is almost always right. These bets are filtered out.
Quarter-Kelly sizing — mathematically optimal growth rate, reduced by 75% for safety. No single bet exceeds 5% of bankroll.
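The filtering steps in this list can be sketched as below. The 6% and 15% thresholds come from the text. Everything else is an assumption made for illustration: the alpha value is left as a required parameter because the tuned value is not given, "edge" is taken to mean the probability gap between the shrunk model estimate and the no-vig market estimate, and the function names are hypothetical.

```python
def american_to_implied(odds):
    """Implied probability (vig included) from American odds."""
    if odds > 0:
        return 100 / (odds + 100)
    return -odds / (-odds + 100)


def no_vig(p_a, p_b):
    """Remove the bookmaker margin by normalizing the two sides."""
    total = p_a + p_b
    return p_a / total, p_b / total


def shrink(model_p, market_p, alpha):
    """Blend the model toward the market; alpha=1 trusts the model fully."""
    return alpha * model_p + (1.0 - alpha) * market_p


def passes_filter(model_p, market_p, alpha, min_edge=0.06, max_edge=0.15):
    """Keep only edges inside the [min_edge, max_edge] band."""
    edge = shrink(model_p, market_p, alpha) - market_p
    return min_edge <= edge <= max_edge


def cover_probability(dog_margins, spread=1.5):
    """Share of simulated games in which the underdog covers +spread."""
    return sum(m + spread > 0 for m in dog_margins) / len(dog_margins)


# Hypothetical line: underdog +120, favorite -140.
dog_p, fav_p = no_vig(american_to_implied(120), american_to_implied(-140))
bet = passes_filter(model_p=0.56, market_p=dog_p, alpha=0.7)
```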

How We Use Units

Every pick is sized in units so followers can scale to any bankroll without doing the math. The convention is the standard used across sports betting:

1 unit (1u) = 1% of starting bankroll = $100

Our tracked bankroll started at $10,000 (100u) on Opening Day 2026. Every P&L, wager, and bankroll figure on the site is shown in units first, with the dollar equivalent in parentheses.

Scale to Your Own Roll

Whatever you're working with, 1u = 1% of your bankroll. When a pick reads "CHC ML — 1.5u", you bet 1.5% of your own roll.

Your Bankroll    1 Unit    A 1.5u Play    A 5u Play
$500             $5        $7.50          $25
$1,000           $10       $15            $50
$5,000           $50       $75            $250
$10,000          $100      $150           $500

Why Our Plays Vary in Size

We don't flat-bet. The model sizes each play using quarter-Kelly based on edge and confidence:

stake = 0.25 × (bp − q) / b

Where b is decimal odds minus 1, p is our model's win probability, and q is 1−p. The result is scaled to bankroll.
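The stake formula translates directly into code. In this sketch the cap is an optional parameter, since how the 5% cap interacts with displayed units is not fully specified here. The function name and the example odds and probability are illustrative.

```python
def quarter_kelly(decimal_odds, p, fraction=0.25, cap=None):
    """Fraction of bankroll to stake: fraction * (b*p - q) / b, floored at 0."""
    b = decimal_odds - 1.0   # net payout per unit staked
    q = 1.0 - p              # probability of losing
    full_kelly = (b * p - q) / b
    stake = max(0.0, fraction * full_kelly)  # never bet a negative edge
    if cap is not None:
        stake = min(stake, cap)
    return stake


# Hypothetical play: even money (decimal 2.0) with a 55% model probability.
stake = quarter_kelly(decimal_odds=2.0, p=0.55)
units = stake * 100  # 1u = 1% of bankroll
```

In this example full Kelly is 10% of bankroll, so the quarter-Kelly stake is 2.5%, or 2.5u.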

Low-confidence edges (just above the 6% minimum) land around 1–3u
Typical plays sit in the 3–7u range
High-conviction spots — strong edge on a fair-priced favorite — can hit 8–12u
Hard cap: no single bet exceeds 5% of bankroll in Kelly terms, though the unit display can be higher on low-vig favorites

Full Kelly maximizes long-run log-growth but has brutal variance. Quarter-Kelly gives up more than half of that growth (it keeps roughly 44% under the usual small-edge approximation) but cuts volatility to about a quarter of full Kelly's, which is the right tradeoff for a public track record that has to survive bad weeks.
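The growth side of this tradeoff can be checked numerically for any single bet. The sketch below computes expected log-growth per bet at a given stake and the share of full-Kelly growth retained at a fractional stake. For small edges that share is approximately 2c − c² for Kelly fraction c. The function names and the example bet are illustrative.

```python
import math


def log_growth(f, decimal_odds, p):
    """Expected log-growth per bet when staking fraction f of bankroll."""
    b = decimal_odds - 1.0
    return p * math.log(1.0 + b * f) + (1.0 - p) * math.log(1.0 - f)


def growth_retained(kelly_fraction, decimal_odds, p):
    """Growth at kelly_fraction * full Kelly, relative to full Kelly.

    Assumes the bet has a positive edge, so full Kelly is greater than 0.
    """
    b = decimal_odds - 1.0
    full = (b * p - (1.0 - p)) / b
    return log_growth(kelly_fraction * full, decimal_odds, p) / log_growth(full, decimal_odds, p)


# Hypothetical even-money bet with a 55% win probability.
quarter = growth_retained(0.25, decimal_odds=2.0, p=0.55)
half = growth_retained(0.50, decimal_odds=2.0, p=0.55)
```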

Follow at Your Own Pace

Nothing here is advice. If 10u reads too aggressive for your situation, cap your own plays at 2u or 3u and take the ones you like most. The math doesn't care about the absolute size — only the fraction of bankroll. Never bet money you can't afford to lose.

Out-of-Sample Validation

The ultimate test of any model is performance on data it has never seen. The model was developed on 2021–2024 data and validated on the complete 2025 MLB season with zero parameter adjustments:

+8.8% ML ROI (366 bets)

48.4% Win rate at avg +97

0.2428 Brier score

Moneyline bets were profitable out-of-sample on the full 2025 season. The model performed better on unseen 2025 data than on its 2024 training set, which is the reverse of the pattern overfitting usually produces.
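For readers unfamiliar with the metrics above, here is how a Brier score and an ROI figure are typically computed. This is a generic sketch with made-up inputs, not the site's backtest code. As a reference point, a forecast of 50% on every game scores 0.25, and lower is better.

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 results."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def roi(stakes, profits):
    """Total profit divided by total amount wagered."""
    return sum(profits) / sum(stakes)


# Made-up example: three forecasts and their results (1 = win, 0 = loss).
score = brier_score([0.60, 0.45, 0.52], [1, 0, 1])
# Made-up example: two $100 bets, one won at +97 and one lost.
example_roi = roi([100, 100], [97, -100])
```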


Under the Hood

This isn't a weekend project. The model represents thousands of hours of research, development, and backtesting across 12 major iterations:

4 years of Statcast data — every pitch thrown in MLB from 2021–2024, totaling billions of data points
Baseball HQ subscription data — professional-grade skills metrics not available in free datasets
Custom Marcel projection system — built from scratch using Tom Tango's open methodology
Bayesian regression framework — outcome-specific shrinkage weights calibrated separately for batters and pitchers
Pre-computed PA arrays — 9.4x simulation speedup enables 10,000 sims per game in under 4 seconds
Rolling backtest framework — no look-ahead bias, only data available before each game is used
Grid-searched betting parameters — alpha, edge thresholds, and confidence gates optimized across multiple seasons

Disclaimer

This model is for educational and entertainment purposes. Past backtest performance does not guarantee future results. Sports betting involves risk — only bet what you can afford to lose. The model's edge, while validated out-of-sample, may not persist as markets adapt.
