I built a Python system to optimise a bracket entry for a World Cup 2026 prediction pool. The pool scores 104 match score predictions, stage advancement for all 48 teams, and five joker picks at 2× points. Below is a summary of the components and findings.
Elo Ratings
Trained on 49,000+ international matches (1872–2026) from the martj42/international_results dataset. K-factor scales by tournament importance (20–60). Home advantage of +100 rating points on non-neutral matches. Goal-difference weighting matches the World Football Elo Ratings methodology.
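As a rough illustration of the update described above, here is a minimal sketch of one Elo step with a +100 home bonus and the goal-difference multiplier published by World Football Elo Ratings (1, 1.5, 1.75, then +1/8 per extra goal). The function names and the default `k=40.0` are illustrative assumptions, not the project's actual API.

```python
def goal_diff_multiplier(goal_diff: int) -> float:
    """Margin-of-victory multiplier in the World Football Elo Ratings style."""
    n = abs(goal_diff)
    if n <= 1:
        return 1.0
    if n == 2:
        return 1.5
    if n == 3:
        return 1.75
    return 1.75 + (n - 3) / 8


def expected_score(rating_a: float, rating_b: float, home_adv: float = 0.0) -> float:
    """Expected result for team A (1 = win, 0.5 = draw, 0 = loss)."""
    diff = rating_a + home_adv - rating_b
    return 1.0 / (1.0 + 10 ** (-diff / 400))


def update_elo(rating_home, rating_away, goals_home, goals_away,
               k=40.0, neutral=False):
    """Return the new (home, away) ratings after one match.

    k is assumed to be chosen by the caller from the 20-60 range
    according to tournament importance.
    """
    home_adv = 0.0 if neutral else 100.0
    expected = expected_score(rating_home, rating_away, home_adv)
    if goals_home > goals_away:
        actual = 1.0
    elif goals_home == goals_away:
        actual = 0.5
    else:
        actual = 0.0
    delta = k * goal_diff_multiplier(goals_home - goals_away) * (actual - expected)
    return rating_home + delta, rating_away - delta
```

The update is zero-sum: whatever the home side gains, the away side loses.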
Poisson Match Prediction
Expected goals per team are derived from the Elo difference between the two teams.
Win/draw/loss probabilities are computed from independent Poisson distributions truncated at 8 goals, with PMF vectors cached per expected-goals value. The stack uses only the Python standard library (no numpy, scipy, or pandas); all math is hand-rolled.
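The outcome calculation can be sketched as follows. The expected-goals values are taken as inputs here, since the mapping from Elo difference to goals is not reproduced in this sketch. Renormalising the truncated PMF so it sums to 1 is an assumption, and the function names are hypothetical.

```python
import math
from functools import lru_cache

MAX_GOALS = 8  # truncation point stated in the text


@lru_cache(maxsize=None)
def poisson_pmf_vector(lam: float) -> tuple:
    """Poisson PMF for 0..MAX_GOALS goals, renormalised to sum to 1 (assumption)."""
    pmf = [math.exp(-lam) * lam ** k / math.factorial(k)
           for k in range(MAX_GOALS + 1)]
    total = sum(pmf)
    return tuple(p / total for p in pmf)


def outcome_probs(lam_a: float, lam_b: float):
    """Return (win, draw, loss) probabilities for team A.

    Treats the two teams' goal counts as independent, so the joint
    probability of a scoreline is the product of the two PMF entries.
    """
    pmf_a = poisson_pmf_vector(lam_a)
    pmf_b = poisson_pmf_vector(lam_b)
    win = draw = loss = 0.0
    for goals_a, p_a in enumerate(pmf_a):
        for goals_b, p_b in enumerate(pmf_b):
            p = p_a * p_b
            if goals_a > goals_b:
                win += p
            elif goals_a == goals_b:
                draw += p
            else:
                loss += p
    return win, draw, loss
```

Caching by expected-goals value means repeated matchups with the same rates reuse the same nine-element vector instead of recomputing factorials.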
Backtest
Evaluated across all 22 World Cups (1930–2022), 964 matches. Elo built from prior data only.
| Metric | Value |
|---|---|
| Overall accuracy | 54.8% |
| Winner accuracy (decisive only) | 70.4% |
| Brier score | 0.196 |
| Log loss | 0.991 |
| Champions correct | 3/22 |
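The 54.8% overall accuracy is close to the rate at which favourites actually win: 44% of matches pair teams within 100 Elo points of each other, so a large share of results are close to coin flips. The Brier score falls within the published academic range (0.19–0.22; Groll, Ley, Zeileis), which suggests the model is reasonably well calibrated.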
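For reference, the two probabilistic metrics in the table can be computed along these lines. This is a sketch under one assumption: the Brier score is averaged over the three outcome classes as well as over matches. Other write-ups sum over classes instead, which gives values three times larger. The function names and the dict-based forecast format are hypothetical.

```python
import math

OUTCOMES = ("home", "draw", "away")


def brier_score(forecasts, results):
    """Mean squared error between forecast probabilities and one-hot outcomes.

    Averaged over matches and over the three classes (assumed convention).
    """
    total = 0.0
    for probs, result in zip(forecasts, results):
        total += sum((probs[o] - (1.0 if o == result else 0.0)) ** 2
                     for o in OUTCOMES)
    return total / (len(results) * len(OUTCOMES))


def log_loss(forecasts, results, eps=1e-12):
    """Mean negative log-probability assigned to the outcome that occurred."""
    total = sum(math.log(max(probs[result], eps))
                for probs, result in zip(forecasts, results))
    return -total / len(results)
```

Under this convention, a forecast of 1/3 for every outcome scores a Brier of 2/9 (about 0.222) and a log loss of ln 3 (about 1.099), which gives a baseline to read the reported figures against if the backtest uses the same normalisation.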
A 225-combination parameter grid search (K-factor, home advantage, Gaussian vs Poisson, recency weighting) improved Brier by 0.4% — negligible. Recency weighting worsened accuracy by 3–5pp.
Monte Carlo Simulation
Seeded tournament runner for the 2026 format (12 groups of 4, top 2 + 8 best third-place advance).
- Deterministic per-matchup randomness via SHA-256 of match ID and team names.
- Truncated Poisson sampling (0–5 goals).
- Three-stage knockout resolution: 90 mins → extra time (0.35× goal rate) → penalties (empirical shootout distribution).
- Tournament modifiers: Germany +60, Croatia +55, England −20, Mexico +55. Multiplicative factors: knockout pedigree ×1.03, home continent ×1.02, generation decay ×0.96.
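The first three bullets can be sketched together as below. Including a tournament seed in the hash, the 50/50 shootout default (standing in for the empirical shootout distribution), and all function names are assumptions for illustration. The tournament modifiers are left out.

```python
import hashlib
import math
import random

MAX_SAMPLE_GOALS = 5       # truncation used when sampling
EXTRA_TIME_FACTOR = 0.35   # goal rate multiplier in extra time


def match_rng(tournament_seed, match_id, team_a, team_b):
    """Derive a reproducible RNG from a SHA-256 hash of the match identity."""
    key = f"{tournament_seed}|{match_id}|{team_a}|{team_b}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def sample_goals(rng, lam, max_goals=MAX_SAMPLE_GOALS):
    """Draw a goal count from a Poisson distribution truncated to 0..max_goals."""
    weights = [math.exp(-lam) * lam ** k / math.factorial(k)
               for k in range(max_goals + 1)]
    return rng.choices(range(max_goals + 1), weights=weights)[0]


def play_knockout(rng, team_a, team_b, lam_a, lam_b, shootout_win_prob_a=0.5):
    """Resolve a knockout tie: 90 minutes, then extra time, then penalties."""
    goals_a = sample_goals(rng, lam_a)
    goals_b = sample_goals(rng, lam_b)
    stage = "90"
    if goals_a == goals_b:
        # Extra time adds goals at a reduced rate on top of the 90-minute score.
        goals_a += sample_goals(rng, lam_a * EXTRA_TIME_FACTOR)
        goals_b += sample_goals(rng, lam_b * EXTRA_TIME_FACTOR)
        stage = "extra time"
    if goals_a == goals_b:
        # Placeholder shootout: a biased coin instead of an empirical table.
        stage = "penalties"
        winner = team_a if rng.random() < shootout_win_prob_a else team_b
    else:
        winner = team_a if goals_a > goals_b else team_b
    return winner, (goals_a, goals_b), stage
```

Because the RNG depends only on the hashed key, replaying the same match in the same seeded tournament gives the same result regardless of the order in which matches are simulated.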
Bracket Optimizer
Simulated annealing over 104 match scores, 5 joker placements, and 32 knockout team predictions. Six mutation strategies:
- Score shift: ±1 on one team
- Score swap: exchange two match scores
- Joker swap: move a joker
- KO team flip: swap home/away in one knockout match
- Macro-swap: swap group winner/runner-up, rebuild entire knockout bracket
- Multi-flip: shift 2–5 random scores simultaneously
The temperature cools by a fixed decay per step over 10,000 iterations. Candidates are evaluated in parallel via ProcessPoolExecutor across CPU cores, with 2,500 tournament seeds per candidate. The optimizer uses a contrarian objective. Spine Lock prevents macro-swaps on groups where #1 leads #2 by Elo.
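A generic simulated annealing loop with one of the six mutations (score shift) might look like the sketch below. The starting temperature `t_start=1.0` and `decay=0.999` are placeholder values, not the project's settings. The `evaluate` callable stands in for the Monte Carlo expected-value estimate, and the parallel evaluation, contrarian objective, and Spine Lock are omitted.

```python
import math
import random


def score_shift(bracket, rng):
    """Score shift mutation: +/-1 on one team's goals in one match."""
    scores = list(bracket)
    i = rng.randrange(len(scores))
    home, away = scores[i]
    if rng.random() < 0.5:
        home = max(0, home + rng.choice((-1, 1)))
    else:
        away = max(0, away + rng.choice((-1, 1)))
    scores[i] = (home, away)
    return tuple(scores)


def anneal(initial, evaluate, mutations,
           t_start=1.0, decay=0.999, iterations=10_000, seed=0):
    """Maximise evaluate(bracket) with Metropolis acceptance and geometric cooling."""
    rng = random.Random(seed)
    current, current_score = initial, evaluate(initial)
    best, best_score = current, current_score
    temperature = t_start
    for _ in range(iterations):
        mutate = rng.choice(mutations)
        candidate = mutate(current, rng)
        candidate_score = evaluate(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with prob exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
        temperature *= decay
    return best, best_score
```

Early on, the high temperature lets the search accept worse brackets and escape local optima. As the temperature decays, the loop behaves more and more like greedy hill climbing.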
| Metric | Value |
|---|---|
| Random baseline EV | ~57 pts |
| Optimizer EV | ~113 pts |
| 2022 deterministic backtest | 154 pts (38% of max 640) |
| Main bottleneck | 53% of KO matches have 0 correct teams |
Synthetic Opponent Field
Generates opponent brackets with cognitive biases to estimate win probability against the pool:
- Chalk bias (0.6): convert close matches to favourite wins, avoid draws
- Local bias (0.25): boost CONCACAF host teams one round
- Recency bias (0.15): boost recent champions
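One way to read the three numbers is as mixture weights over opponent types (they sum to 1), though that interpretation is an assumption. The sketch below samples a bias type per synthetic opponent and shows a simple chalk transform that turns predicted draws into one-goal wins for the higher-rated side. All names are hypothetical, and the local and recency transforms are not shown.

```python
import random

# Assumed to be mixture weights over opponent types.
BIAS_WEIGHTS = {"chalk": 0.6, "local": 0.25, "recency": 0.15}


def sample_bias(rng: random.Random) -> str:
    """Pick the bias type for one synthetic opponent."""
    names = list(BIAS_WEIGHTS)
    weights = [BIAS_WEIGHTS[name] for name in names]
    return rng.choices(names, weights=weights)[0]


def apply_chalk(score, elo_home, elo_away):
    """Avoid draws: give the higher-rated team one extra goal in a drawn prediction."""
    home, away = score
    if home != away:
        return score
    if elo_home >= elo_away:
        return (home + 1, away)
    return (home, away + 1)
```

For example, `apply_chalk((1, 1), 1800, 1700)` returns `(2, 1)`, while a prediction that already has a winner is left unchanged.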
Empirical Score Distributions
Frequency tables from all 964 World Cup matches, split by group (727) and knockout (237). Used as sampling weights instead of the Poisson model for deterministic bracket generation.
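A frequency table of this kind can be built and sampled with the standard library alone. The sketch below assumes matches arrive as `(home_goals, away_goals)` pairs. The function names are hypothetical, and no real World Cup counts are included.

```python
import random
from collections import Counter


def build_score_table(matches):
    """Count how often each (home_goals, away_goals) scoreline occurs."""
    return Counter((home, away) for home, away in matches)


def sample_score(table, rng):
    """Draw a scoreline with probability proportional to its observed frequency."""
    scores = list(table)
    weights = [table[score] for score in scores]
    return rng.choices(scores, weights=weights)[0]


def modal_score(table):
    """Most frequent scoreline, usable as a fully deterministic pick."""
    return table.most_common(1)[0][0]
```

In practice one table would be built per stage (group and knockout), since the two score distributions are kept separate. A seeded `random.Random` keeps the sampled bracket reproducible.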
Improvement Paths
- Dixon-Coles corrections: low-score adjustment + time-decayed parameter estimation.
- Player-level data: aggregated FIFA ratings or market values per squad.
- Bookmaker odds: blend model probabilities with market-implied odds.
- Form metrics: rolling goal difference, xG differentials, streak indicators.
Code
~3,500 lines of Python, zero external dependencies, MIT-licensed.
- elo.py: Elo system + Poisson prediction
- wl/simulation.py: Monte Carlo tournament runner
- wl/optimizer.py: Simulated annealing optimizer
- wl/field.py: Synthetic opponent generation
- wl/empirical.py: Empirical score distributions
- backtest.py: Historical validation (22 World Cups)
- score_pool.py: Pool scoring function
CLI commands: ratings, predict, simulate, backtest, optimize, entry.