Bivariate Poisson Model
The Bivariate Poisson Model extends the standard Poisson model to account for correlation between the numbers of goals scored by the two teams in a football match.
Independent Poisson models assume the two teams' goal counts are unrelated. The Bivariate Poisson approach instead adds a dependency structure between them, which can capture real-world interactions such as the interplay between one team's attack and the other's defence.
Where that dependence is present, the model can give more accurate probability estimates for match outcomes, correct scores, and betting markets such as Asian handicaps and total goals.
It is most useful for competitive matches in which the two teams' performances are not entirely independent.
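One common way to build this dependence is to write home goals as X = W1 + W3 and away goals as Y = W2 + W3, where W1, W2 and W3 are independent Poisson variables. This construction is an assumption here, since the page does not spell out penaltyblog's internal parameterisation. The shared term W3 gives Cov(X, Y) = lam3. A minimal sketch of the resulting joint probability, using illustrative rates rather than fitted values:

```python
from math import comb, exp, factorial


def bivariate_poisson_pmf(x, y, lam1, lam2, lam3):
    """P(X=x, Y=y) for X = W1 + W3, Y = W2 + W3 with W_i ~ Poisson(lam_i)."""
    base = exp(-(lam1 + lam2 + lam3)) * lam1**x / factorial(x) * lam2**y / factorial(y)
    # Sum over the possible values k of the shared component W3.
    shared = sum(
        comb(x, k) * comb(y, k) * factorial(k) * (lam3 / (lam1 * lam2)) ** k
        for k in range(min(x, y) + 1)
    )
    return base * shared


# Illustrative rates (not fitted values) and a truncated score grid.
lam1, lam2, lam3 = 1.4, 0.9, 0.1
max_goals = 15
grid = [
    [bivariate_poisson_pmf(x, y, lam1, lam2, lam3) for y in range(max_goals)]
    for x in range(max_goals)
]

total = sum(sum(row) for row in grid)
mean_x = sum(x * p for x, row in enumerate(grid) for p in row)
mean_y = sum(y * p for row in grid for y, p in enumerate(row))
mean_xy = sum(x * y * p for x, row in enumerate(grid) for y, p in enumerate(row))
cov = mean_xy - mean_x * mean_y

print(total, mean_x, mean_y, cov)
```

Setting `lam3 = 0` recovers two independent Poisson distributions. Because `lam3` cannot be negative, this construction can only represent non-negative correlation between the two scores.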
[1]:
import penaltyblog as pb
Get data from football-data.co.uk
[2]:
fb = pb.scrapers.FootballData("ENG Premier League", "2019-2020")
df = fb.get_fixtures()
df.head()
[2]:
id | date | datetime | season | competition | div | time | team_home | team_away | fthg | ftag | ... | b365_cahh | b365_caha | pcahh | pcaha | max_cahh | max_caha | avg_cahh | avg_caha | goals_home | goals_away |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1565308800---liverpool---norwich | 2019-08-09 | 2019-08-09 20:00:00 | 2019-2020 | ENG Premier League | E0 | 20:00 | Liverpool | Norwich | 4 | 1 | ... | 1.91 | 1.99 | 1.94 | 1.98 | 1.99 | 2.07 | 1.90 | 1.99 | 4 | 1 |
1565395200---bournemouth---sheffield_united | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Bournemouth | Sheffield United | 1 | 1 | ... | 1.95 | 1.95 | 1.98 | 1.95 | 2.00 | 1.96 | 1.96 | 1.92 | 1 | 1 |
1565395200---burnley---southampton | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Burnley | Southampton | 3 | 0 | ... | 1.87 | 2.03 | 1.89 | 2.03 | 1.90 | 2.07 | 1.86 | 2.02 | 3 | 0 |
1565395200---crystal_palace---everton | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Crystal Palace | Everton | 0 | 0 | ... | 1.82 | 2.08 | 1.97 | 1.96 | 2.03 | 2.08 | 1.96 | 1.93 | 0 | 0 |
1565395200---tottenham---aston_villa | 2019-08-10 | 2019-08-10 17:30:00 | 2019-2020 | ENG Premier League | E0 | 17:30 | Tottenham | Aston Villa | 3 | 1 | ... | 2.10 | 1.70 | 2.18 | 1.77 | 2.21 | 1.87 | 2.08 | 1.80 | 3 | 1 |
5 rows × 111 columns
Train the Model
[3]:
clf = pb.models.BivariatePoissonGoalModel(
    df["goals_home"], df["goals_away"], df["team_home"], df["team_away"]
)
clf.fit()
The model’s parameters
[4]:
clf
[4]:
Module: Penaltyblog
Model: Bivariate Poisson
Number of parameters: 42
Log Likelihood: -1059.876
AIC: 2203.752
Team Attack Defence
------------------------------------------------------------
Arsenal 1.134 -0.987
Aston Villa 0.835 -0.652
Bournemouth 0.802 -0.687
Brighton 0.767 -0.877
Burnley 0.882 -0.939
Chelsea 1.356 -0.85
Crystal Palace 0.535 -0.957
Everton 0.885 -0.843
Leicester 1.325 -1.119
Liverpool 1.55 -1.347
Man City 1.745 -1.249
Man United 1.294 -1.274
Newcastle 0.747 -0.803
Norwich 0.361 -0.552
Sheffield United 0.756 -1.212
Southampton 1.059 -0.751
Tottenham 1.221 -1.003
Watford 0.705 -0.699
West Ham 1.014 -0.724
Wolves 1.025 -1.187
------------------------------------------------------------
Home Advantage: 0.237
Correlation: -3.0
[5]:
clf.get_params()
[5]:
{'attack_Arsenal': np.float64(1.133560802574037),
'attack_Aston Villa': np.float64(0.8350898400984498),
'attack_Bournemouth': np.float64(0.8015404096015637),
'attack_Brighton': np.float64(0.7674328208660928),
'attack_Burnley': np.float64(0.8822591220552471),
'attack_Chelsea': np.float64(1.3562926053643534),
'attack_Crystal Palace': np.float64(0.5348767383835888),
'attack_Everton': np.float64(0.8854463751643209),
'attack_Leicester': np.float64(1.3253210988334698),
'attack_Liverpool': np.float64(1.5500067231127181),
'attack_Man City': np.float64(1.7450410219723662),
'attack_Man United': np.float64(1.2938830262372865),
'attack_Newcastle': np.float64(0.7470789004929902),
'attack_Norwich': np.float64(0.36056444644586966),
'attack_Sheffield United': np.float64(0.7561475016812931),
'attack_Southampton': np.float64(1.0594558601023647),
'attack_Tottenham': np.float64(1.2214463479299582),
'attack_Watford': np.float64(0.7054564312709098),
'attack_West Ham': np.float64(1.014009505646218),
'attack_Wolves': np.float64(1.0250904221669037),
'defense_Arsenal': np.float64(-0.9867989766505458),
'defense_Aston Villa': np.float64(-0.6517511449587307),
'defense_Bournemouth': np.float64(-0.6874367252280075),
'defense_Brighton': np.float64(-0.877458942752178),
'defense_Burnley': np.float64(-0.9385627948200023),
'defense_Chelsea': np.float64(-0.8499169474665187),
'defense_Crystal Palace': np.float64(-0.9570569580061927),
'defense_Everton': np.float64(-0.8426424728271178),
'defense_Leicester': np.float64(-1.1187653346121664),
'defense_Liverpool': np.float64(-1.3469873675965822),
'defense_Man City': np.float64(-1.2485881916058201),
'defense_Man United': np.float64(-1.2737712763058457),
'defense_Newcastle': np.float64(-0.8028015712987764),
'defense_Norwich': np.float64(-0.5523345592912768),
'defense_Sheffield United': np.float64(-1.2120340284112339),
'defense_Southampton': np.float64(-0.7508665627531523),
'defense_Tottenham': np.float64(-1.003192069120226),
'defense_Watford': np.float64(-0.6985873243764104),
'defense_West Ham': np.float64(-0.7242617861788704),
'defense_Wolves': np.float64(-1.1868096051415054),
'home_advantage': np.float64(0.23692061130011463),
'correlation_log': np.float64(-2.999999999999999),
'lambda3': np.float64(0.049787068367863986)}
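The last two entries appear to be linked: `lambda3` matches the exponential of `correlation_log`. On that reading, the `Correlation: -3.0` line in the summary is a log-scale value rather than a correlation coefficient. This is inferred from the printed numbers, not from the library's documentation. A quick check:

```python
import math

# Values copied from the clf.get_params() output above.
correlation_log = -2.999999999999999
lambda3_printed = 0.049787068367863986

# Assumed relationship: lambda3 = exp(correlation_log).
lambda3 = math.exp(correlation_log)
print(lambda3, lambda3_printed)
```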
Predict Match Outcomes
[6]:
probs = clf.predict("Liverpool", "Wolves")
probs
[6]:
Module: Penaltyblog
Class: FootballProbabilityGrid
Home Goal Expectation: [1.82233333]
Away Goal Expectation: [0.72477288]
Home Win: 0.6357320764164407
Draw: 0.2213346734110996
Away Win: 0.14290631670926676
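The two goal expectations can be reproduced from the fitted parameters with a log-linear form: the home rate is exp(attack_home + defense_away + home_advantage) and the away rate is exp(attack_away + defense_home). This form is inferred by matching the printed numbers; whether the library computes it in exactly this way is an assumption.

```python
import math

# Parameter values copied from the clf.get_params() output.
attack_liverpool = 1.5500067231127181
defense_liverpool = -1.3469873675965822
attack_wolves = 1.0250904221669037
defense_wolves = -1.1868096051415054
home_advantage = 0.23692061130011463

# Assumed log-linear rates for Liverpool (home) vs Wolves (away).
home_expectation = math.exp(attack_liverpool + defense_wolves + home_advantage)
away_expectation = math.exp(attack_wolves + defense_liverpool)

print(home_expectation, away_expectation)
```

A more negative defence parameter lowers the opponent's expected goals, which is why the strongest defences in the table have the most negative values.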
1x2 Probabilities
[7]:
probs.home_draw_away
[7]:
[np.float64(0.6357320764164407),
np.float64(0.2213346734110996),
np.float64(0.14290631670926676)]
[8]:
probs.home_win
[8]:
np.float64(0.6357320764164407)
[9]:
probs.draw
[9]:
np.float64(0.2213346734110996)
[10]:
probs.away_win
[10]:
np.float64(0.14290631670926676)
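For betting use, these probabilities are often converted to fair (margin-free) decimal odds by taking reciprocals. A small sketch using the values printed above; the variable names are illustrative:

```python
# 1x2 probabilities copied from probs.home_draw_away above.
home_draw_away = [0.6357320764164407, 0.2213346734110996, 0.14290631670926676]

# The three outcomes should account for (almost) all of the probability mass.
total_probability = sum(home_draw_away)

# Fair decimal odds are the reciprocals of the probabilities.
fair_odds = [1.0 / p for p in home_draw_away]

print(total_probability)
print(fair_odds)
```

The probabilities sum to slightly less than 1, presumably because the underlying score grid is truncated at a maximum number of goals.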
Probability of Total Goals > 1.5
[11]:
probs.total_goals("over", 1.5)
[11]:
np.float64(0.7356970375004634)
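As a cross-check, this figure can be approximated by hand. The sketch assumes a shared-component construction (home = W1 + W3, away = W2 + W3), with the W1 and W2 rates equal to the printed goal expectations and the W3 rate equal to `lambda3`. A match then has at most one goal only when W3 = 0 and W1 + W2 ≤ 1. These modelling choices are inferred from the outputs on this page rather than documented behaviour.

```python
import math

lam_home = 1.82233333          # Home Goal Expectation printed by clf.predict
lam_away = 0.72477288          # Away Goal Expectation printed by clf.predict
lam3 = 0.049787068367863986    # lambda3 printed by clf.get_params

# P(total goals <= 1) = P(W3 = 0) * P(W1 + W2 <= 1)
p_under = math.exp(-(lam_home + lam_away + lam3)) * (1.0 + lam_home + lam_away)
p_over = 1.0 - p_under

print(p_over)
```

The result agrees closely with the model's value; the small remaining gap is consistent with the library truncating its score grid.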
Probability of Asian Handicap 1.5
[12]:
probs.asian_handicap("home", 1.5)
[12]:
np.float64(0.3756034531458142)
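The value above is consistent with reading `asian_handicap("home", 1.5)` as the probability that the home side wins by two or more goals. That interpretation is inferred from the numbers, not from the library's documentation. Under a shared-component construction the common term cancels in the goal difference, so independent Poisson counts with the printed expectations are enough for a rough check:

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson variable with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


lam_home = 1.82233333   # Home Goal Expectation printed by clf.predict
lam_away = 0.72477288   # Away Goal Expectation printed by clf.predict
max_goals = 25          # truncation point for the sketch (an arbitrary choice)

# Sum the probability of every scoreline where the home margin is at least 2.
p_home_by_two = sum(
    poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
    for h in range(max_goals)
    for a in range(max_goals)
    if h - a >= 2
)

print(p_home_by_two)
```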
Probability of both teams scoring
[13]:
probs.both_teams_to_score
[13]:
np.float64(0.45978389607673564)
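This market also has a short closed form under the shared-component assumption (home = W1 + W3, away = W2 + W3), using inclusion-exclusion over the events "home fails to score" and "away fails to score". The rates below are the printed goal expectations and `lambda3`; treating them this way is an assumption inferred from the outputs on this page.

```python
import math

lam_home = 1.82233333          # Home Goal Expectation printed by clf.predict
lam_away = 0.72477288          # Away Goal Expectation printed by clf.predict
lam3 = 0.049787068367863986    # lambda3 printed by clf.get_params

p_home_blank = math.exp(-(lam_home + lam3))             # W1 = 0 and W3 = 0
p_away_blank = math.exp(-(lam_away + lam3))             # W2 = 0 and W3 = 0
p_both_blank = math.exp(-(lam_home + lam_away + lam3))  # 0-0 scoreline

# P(both score) = 1 - P(home blank) - P(away blank) + P(both blank)
p_btts = 1.0 - p_home_blank - p_away_blank + p_both_blank

print(p_btts)
```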
Train the model with recent matches weighted more heavily
[14]:
weights = pb.models.dixon_coles_weights(df["date"], 0.001)
clf = pb.models.BivariatePoissonGoalModel(
    df["goals_home"], df["goals_away"], df["team_home"], df["team_away"], weights
)
clf.fit()
[15]:
clf
[15]:
Module: Penaltyblog
Model: Bivariate Poisson
Number of parameters: 42
Log Likelihood: -881.987
AIC: 1847.973
Team Attack Defence
------------------------------------------------------------
Arsenal 1.146 -1.008
Aston Villa 0.813 -0.675
Bournemouth 0.813 -0.685
Brighton 0.758 -0.882
Burnley 0.877 -0.963
Chelsea 1.363 -0.852
Crystal Palace 0.526 -0.94
Everton 0.88 -0.858
Leicester 1.302 -1.095
Liverpool 1.548 -1.321
Man City 1.752 -1.292
Man United 1.321 -1.3
Newcastle 0.773 -0.798
Norwich 0.292 -0.555
Sheffield United 0.756 -1.197
Southampton 1.082 -0.784
Tottenham 1.222 -1.026
Watford 0.726 -0.701
West Ham 1.03 -0.739
Wolves 1.02 -1.215
------------------------------------------------------------
Home Advantage: 0.245
Correlation: -3.0
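The weighting used here follows the Dixon and Coles idea of down-weighting older matches exponentially. A minimal sketch, assuming each match gets weight exp(-xi × days before the most recent fixture). The helper name `time_decay_weights` is hypothetical, and penaltyblog's own implementation may choose the reference date differently:

```python
import math
from datetime import date


def time_decay_weights(dates, xi):
    """Exponential time-decay weights relative to the latest date (hypothetical helper)."""
    latest = max(dates)
    return [math.exp(-xi * (latest - d).days) for d in dates]


# Illustrative match dates spanning the 2019-2020 season.
match_dates = [date(2019, 8, 9), date(2020, 1, 1), date(2020, 7, 26)]
weights = time_decay_weights(match_dates, xi=0.001)

print(weights)
```

Larger values of `xi` discount old matches faster. With `xi = 0.001`, a match played a year before the latest fixture carries a weight of roughly exp(-0.365) ≈ 0.69.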