Poisson#
The Poisson model is a widely used statistical approach for predicting football match outcomes. It treats the number of goals scored by each team as independent Poisson-distributed random variables.
Because it models the randomness inherent in goal scoring, it can be used to estimate match result probabilities, goal totals, and probabilities for betting markets such as over/under goals and Asian handicaps.
Its simplicity, interpretability, and robustness have made it a foundational tool in football analytics and betting.
[1]:
import penaltyblog as pb
Get data from football-data.co.uk#
[2]:
fb = pb.scrapers.FootballData("ENG Premier League", "2019-2020")
df = fb.get_fixtures()
df.head()
[2]:
id | date | datetime | season | competition | div | time | team_home | team_away | fthg | ftag | ... | b365_cahh | b365_caha | pcahh | pcaha | max_cahh | max_caha | avg_cahh | avg_caha | goals_home | goals_away |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1565308800---liverpool---norwich | 2019-08-09 | 2019-08-09 20:00:00 | 2019-2020 | ENG Premier League | E0 | 20:00 | Liverpool | Norwich | 4 | 1 | ... | 1.91 | 1.99 | 1.94 | 1.98 | 1.99 | 2.07 | 1.90 | 1.99 | 4 | 1 |
1565395200---bournemouth---sheffield_united | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Bournemouth | Sheffield United | 1 | 1 | ... | 1.95 | 1.95 | 1.98 | 1.95 | 2.00 | 1.96 | 1.96 | 1.92 | 1 | 1 |
1565395200---burnley---southampton | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Burnley | Southampton | 3 | 0 | ... | 1.87 | 2.03 | 1.89 | 2.03 | 1.90 | 2.07 | 1.86 | 2.02 | 3 | 0 |
1565395200---crystal_palace---everton | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Crystal Palace | Everton | 0 | 0 | ... | 1.82 | 2.08 | 1.97 | 1.96 | 2.03 | 2.08 | 1.96 | 1.93 | 0 | 0 |
1565395200---tottenham---aston_villa | 2019-08-10 | 2019-08-10 17:30:00 | 2019-2020 | ENG Premier League | E0 | 17:30 | Tottenham | Aston Villa | 3 | 1 | ... | 2.10 | 1.70 | 2.18 | 1.77 | 2.21 | 1.87 | 2.08 | 1.80 | 3 | 1 |
5 rows × 111 columns
Train the model#
[3]:
clf = pb.models.PoissonGoalsModel(
    df["goals_home"], df["goals_away"], df["team_home"], df["team_away"]
)
clf.fit()
The model’s parameters#
[4]:
clf
[4]:
Module: Penaltyblog
Model: Poisson
Number of parameters: 41
Log Likelihood: -1057.712
AIC: 2197.424
Team Attack Defence
------------------------------------------------------------
Arsenal 1.133 -0.937
Aston Villa 0.84 -0.618
Bournemouth 0.813 -0.65
Brighton 0.777 -0.837
Burnley 0.87 -0.91
Chelsea 1.349 -0.806
Crystal Palace 0.543 -0.922
Everton 0.899 -0.795
Leicester 1.306 -1.084
Liverpool 1.536 -1.283
Man City 1.721 -1.206
Man United 1.286 -1.216
Newcastle 0.755 -0.766
Norwich 0.391 -0.521
Sheffield United 0.761 -1.163
Southampton 1.052 -0.719
Tottenham 1.218 -0.953
Watford 0.706 -0.669
West Ham 1.014 -0.688
Wolves 1.031 -1.125
------------------------------------------------------------
Home Advantage: 0.229
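The AIC in the summary can be cross-checked against the other two figures using the standard definition AIC = 2k - 2 ln(L). The sketch below uses only values printed in the summary. The parameter count of 41 is consistent with 20 attack ratings, 20 defence ratings, and one home-advantage term.

```python
# Values copied from the model summary above.
n_teams = 20
n_params = 41
log_likelihood = -1057.712

# Standard definition: AIC = 2k - 2 * log-likelihood.
aic = 2 * n_params - 2 * log_likelihood

print(f"Parameters: {n_params} (attack + defence per team, plus home advantage)")
print(f"AIC: {aic:.3f}")
```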
[5]:
clf.get_params()
[5]:
{'attack_Arsenal': np.float64(1.1331228901941925),
'attack_Aston Villa': np.float64(0.8398879945877571),
'attack_Bournemouth': np.float64(0.8130891418922739),
'attack_Brighton': np.float64(0.7765246201514078),
'attack_Burnley': np.float64(0.8703020733730807),
'attack_Chelsea': np.float64(1.3487330505653232),
'attack_Crystal Palace': np.float64(0.542542658860153),
'attack_Everton': np.float64(0.8994384134087898),
'attack_Leicester': np.float64(1.3056835701150769),
'attack_Liverpool': np.float64(1.5361418198993209),
'attack_Man City': np.float64(1.7212739991171064),
'attack_Man United': np.float64(1.2855299321448665),
'attack_Newcastle': np.float64(0.7545391466661135),
'attack_Norwich': np.float64(0.39112619748378996),
'attack_Sheffield United': np.float64(0.7614765214466674),
'attack_Southampton': np.float64(1.0516113492381278),
'attack_Tottenham': np.float64(1.217789270936845),
'attack_Watford': np.float64(0.706464518130879),
'attack_West Ham': np.float64(1.0135091468268111),
'attack_Wolves': np.float64(1.0312136849518139),
'defense_Arsenal': np.float64(-0.9373592173792837),
'defense_Aston Villa': np.float64(-0.61838890813481),
'defense_Bournemouth': np.float64(-0.6497773955489574),
'defense_Brighton': np.float64(-0.8366635726155827),
'defense_Burnley': np.float64(-0.9096437372541103),
'defense_Chelsea': np.float64(-0.8056792271120288),
'defense_Crystal Palace': np.float64(-0.9217221459400613),
'defense_Everton': np.float64(-0.7950817796413925),
'defense_Leicester': np.float64(-1.0841102810145857),
'defense_Liverpool': np.float64(-1.2833717774714317),
'defense_Man City': np.float64(-1.2063222477241144),
'defense_Man United': np.float64(-1.2156318569089062),
'defense_Newcastle': np.float64(-0.7660202258760153),
'defense_Norwich': np.float64(-0.5206704779032895),
'defense_Sheffield United': np.float64(-1.162693150527101),
'defense_Southampton': np.float64(-0.7187475334394907),
'defense_Tottenham': np.float64(-0.9533484235323105),
'defense_Watford': np.float64(-0.6693845588122723),
'defense_West Ham': np.float64(-0.6878743729022483),
'defense_Wolves': np.float64(-1.125240717318774),
'home_advantage': np.float64(0.22925183510740513)}
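The fitted parameters are on a log scale. The sketch below assumes the common log-linear form, in which the home goal expectation is exp(home_advantage + attack_home + defense_away) and the away goal expectation is exp(attack_away + defense_home). That form is an assumption rather than something stated on this page, but it reproduces the goal expectations the model reports for Liverpool v Wolves in the next section. The function name `expected_goals` is hypothetical and is not part of penaltyblog.

```python
import math

# Parameter values copied from the clf.get_params() output above.
params = {
    "attack_Liverpool": 1.5361418198993209,
    "defense_Liverpool": -1.2833717774714317,
    "attack_Wolves": 1.0312136849518139,
    "defense_Wolves": -1.125240717318774,
    "home_advantage": 0.22925183510740513,
}


def expected_goals(params, home, away):
    """Return (home, away) goal expectations under an assumed log-linear model."""
    home_mu = math.exp(
        params["home_advantage"]
        + params[f"attack_{home}"]
        + params[f"defense_{away}"]
    )
    away_mu = math.exp(params[f"attack_{away}"] + params[f"defense_{home}"])
    return home_mu, away_mu


home_mu, away_mu = expected_goals(params, "Liverpool", "Wolves")
print(home_mu, away_mu)
```

Read this way, a larger attack value means more goals scored, and a more negative defence value means fewer goals conceded.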
Predict Match Outcomes#
[6]:
probs = clf.predict("Liverpool", "Wolves")
probs
[6]:
Module: Penaltyblog
Class: FootballProbabilityGrid
Home Goal Expectation: [1.89677094]
Away Goal Expectation: [0.77712187]
Home Win: 0.6385320878305208
Draw: 0.21487358807665044
Away Win: 0.1465943221683105
1x2 Probabilities#
[7]:
probs.home_draw_away
[7]:
[np.float64(0.6385320878305208),
np.float64(0.21487358807665044),
np.float64(0.1465943221683105)]
[8]:
probs.home_win
[8]:
np.float64(0.6385320878305208)
[9]:
probs.draw
[9]:
np.float64(0.21487358807665044)
[10]:
probs.away_win
[10]:
np.float64(0.1465943221683105)
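The 1x2 probabilities can be reproduced from the two goal expectations alone. The sketch below builds a grid of scoreline probabilities from two independent Poisson distributions and sums the cells where the home side scores more, the same, or fewer goals. The helper `poisson_pmf` and the truncation at 15 goals are assumptions made for illustration, not a description of the library's internals.

```python
import math


def poisson_pmf(k, mu):
    """Probability of exactly k goals when the goal expectation is mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)


# Goal expectations copied from the prediction output above.
home_mu = 1.89677094
away_mu = 0.77712187
max_goals = 15  # assumed truncation; the tail beyond this is negligible here

scores = range(max_goals + 1)

# grid[h][a] is the probability of the scoreline h-a under independence.
grid = [
    [poisson_pmf(h, home_mu) * poisson_pmf(a, away_mu) for a in scores]
    for h in scores
]

home_win = sum(grid[h][a] for h in scores for a in scores if h > a)
draw = sum(grid[h][h] for h in scores)
away_win = sum(grid[h][a] for h in scores for a in scores if h < a)

print(home_win, draw, away_win)
```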
Probability of Total Goals >1.5#
[11]:
probs.total_goals("over", 1.5)
[11]:
np.float64(0.7465632504485541)
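This figure can be checked by hand. The sum of two independent Poisson variables is itself Poisson, with a mean equal to the sum of the two means, so the probability of more than 1.5 total goals is one minus the probability of zero or one goal. A minimal sketch, using the goal expectations from the prediction output:

```python
import math

# Goal expectations copied from the prediction output above.
home_mu = 1.89677094
away_mu = 0.77712187

# Total goals follow a Poisson distribution with mean home_mu + away_mu.
total_mu = home_mu + away_mu

# P(total <= 1) = P(0 goals) + P(1 goal) = exp(-mu) * (1 + mu)
p_under_1_5 = math.exp(-total_mu) * (1 + total_mu)
p_over_1_5 = 1 - p_under_1_5

print(p_over_1_5)
```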
Probability of Asian Handicap 1.5#
[12]:
probs.asian_handicap("home", 1.5)
[12]:
np.float64(0.3844258513443399)
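The value above coincides with the probability that the home side wins by two or more goals. That suggests the handicap is treated as a margin the home team must exceed, although this reading is inferred from the numbers rather than taken from the library documentation. The sketch below sums the scorelines where the home margin is greater than 1.5; `poisson_pmf` and the truncation at 15 goals are illustrative assumptions.

```python
import math


def poisson_pmf(k, mu):
    """Probability of exactly k goals when the goal expectation is mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)


# Goal expectations copied from the prediction output above.
home_mu = 1.89677094
away_mu = 0.77712187
handicap = 1.5
max_goals = 15  # assumed truncation

# Sum every scoreline in which the home margin exceeds the handicap.
p_home_covers = sum(
    poisson_pmf(h, home_mu) * poisson_pmf(a, away_mu)
    for h in range(max_goals + 1)
    for a in range(max_goals + 1)
    if h - a > handicap
)

print(p_home_covers)
```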
Probability of both teams scoring#
[13]:
probs.both_teams_to_score
[13]:
np.float64(0.4592035336970112)
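Under the independence assumption, this probability has a closed form: each team scores at least once with probability 1 - exp(-mu), and the two events multiply. A minimal sketch, using the goal expectations from the prediction output:

```python
import math

# Goal expectations copied from the prediction output above.
home_mu = 1.89677094
away_mu = 0.77712187

# P(team scores at least once) = 1 - P(team scores zero) = 1 - exp(-mu)
p_home_scores = 1 - math.exp(-home_mu)
p_away_scores = 1 - math.exp(-away_mu)

# Independence lets the two probabilities be multiplied.
p_btts = p_home_scores * p_away_scores

print(p_btts)
```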