Negative Binomial Model#
The Negative Binomial Goals Model is a statistical approach for predicting football match outcomes. It is especially useful when goal-scoring data exhibit overdispersion (variance greater than the mean), which a standard Poisson model cannot capture because it constrains the variance to equal the mean.
By adding a dispersion parameter to account for this extra variability, the Negative Binomial model can provide more accurate probability estimates for betting markets such as match results, total goals, and Asian handicaps.
This makes it a useful option for football prediction and betting analysis when scoring patterns are highly variable.
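To make the idea of overdispersion concrete, here is a minimal standard-library sketch that compares a Poisson distribution with a Negative Binomial distribution sharing the same mean. It assumes the common mean/size parameterisation, in which the variance is `mu + mu**2 / r`; the helper names and the values of `mu` and `r` are illustrative and are not part of penaltyblog.

```python
import math


def poisson_pmf(k: int, mu: float) -> float:
    """Poisson probability of exactly k goals given mean mu."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def negbin_pmf(k: int, mu: float, r: float) -> float:
    """Negative Binomial probability of k goals with mean mu and size r.

    Assumed parameterisation: variance = mu + mu**2 / r.
    """
    p = r / (r + mu)
    log_coeff = math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
    return math.exp(log_coeff + r * math.log(p) + k * math.log(1 - p))


def mean_and_variance(pmf, max_goals: int = 60):
    """Numerically compute mean and variance of a pmf over 0..max_goals."""
    probs = [pmf(k) for k in range(max_goals + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var


mu, r = 1.5, 5.0  # illustrative values only
pois_mean, pois_var = mean_and_variance(lambda k: poisson_pmf(k, mu))
nb_mean, nb_var = mean_and_variance(lambda k: negbin_pmf(k, mu, r))

print(f"Poisson: mean={pois_mean:.3f} variance={pois_var:.3f}")
print(f"NegBin:  mean={nb_mean:.3f} variance={nb_var:.3f}")
```

Both distributions have the same mean, but the Negative Binomial spreads more probability into the tails, which is the extra variability the model is designed to absorb.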
[1]:
import penaltyblog as pb
Get data from football-data.co.uk#
[2]:
fb = pb.scrapers.FootballData("ENG Premier League", "2019-2020")
df = fb.get_fixtures()
df.head()
[2]:
id | date | datetime | season | competition | div | time | team_home | team_away | fthg | ftag | ... | b365_cahh | b365_caha | pcahh | pcaha | max_cahh | max_caha | avg_cahh | avg_caha | goals_home | goals_away |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1565308800---liverpool---norwich | 2019-08-09 | 2019-08-09 20:00:00 | 2019-2020 | ENG Premier League | E0 | 20:00 | Liverpool | Norwich | 4 | 1 | ... | 1.91 | 1.99 | 1.94 | 1.98 | 1.99 | 2.07 | 1.90 | 1.99 | 4 | 1 |
1565395200---bournemouth---sheffield_united | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Bournemouth | Sheffield United | 1 | 1 | ... | 1.95 | 1.95 | 1.98 | 1.95 | 2.00 | 1.96 | 1.96 | 1.92 | 1 | 1 |
1565395200---burnley---southampton | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Burnley | Southampton | 3 | 0 | ... | 1.87 | 2.03 | 1.89 | 2.03 | 1.90 | 2.07 | 1.86 | 2.02 | 3 | 0 |
1565395200---crystal_palace---everton | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Crystal Palace | Everton | 0 | 0 | ... | 1.82 | 2.08 | 1.97 | 1.96 | 2.03 | 2.08 | 1.96 | 1.93 | 0 | 0 |
1565395200---tottenham---aston_villa | 2019-08-10 | 2019-08-10 17:30:00 | 2019-2020 | ENG Premier League | E0 | 17:30 | Tottenham | Aston Villa | 3 | 1 | ... | 2.10 | 1.70 | 2.18 | 1.77 | 2.21 | 1.87 | 2.08 | 1.80 | 3 | 1 |
5 rows × 111 columns
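Before fitting, it can be worth checking whether the goal counts look overdispersed at all. The sketch below computes a simple dispersion index (sample variance divided by mean); values well above 1 hint that a Negative Binomial model may fit better than a Poisson one. The function name is hypothetical, and the example uses only the five home-goal values visible in the preview above, which is far too few to draw conclusions from.

```python
from statistics import mean, variance


def dispersion_index(goals) -> float:
    """Sample variance divided by the mean; about 1 is expected for Poisson data."""
    return variance(goals) / mean(goals)


# Home goals from the five fixtures shown in the preview above.
home_goals_head = [4, 1, 3, 0, 3]
print(round(dispersion_index(home_goals_head), 3))  # 1.227
```

On the full season, the same quantity could be computed with pandas as `df["goals_home"].var() / df["goals_home"].mean()`.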
Train the model#
[3]:
# Fit the model on home goals, away goals, home team and away team
clf = pb.models.NegativeBinomialGoalModel(
    df["goals_home"], df["goals_away"], df["team_home"], df["team_away"]
)
clf.fit()
The model’s parameters#
[4]:
clf
[4]:
Module: Penaltyblog
Model: Negative Binomial
Number of parameters: 42
Log Likelihood: -1058.161
AIC: 2200.323
Team Attack Defence
------------------------------------------------------------
Arsenal 1.132 -0.935
Aston Villa 0.841 -0.618
Bournemouth 0.815 -0.649
Brighton 0.776 -0.836
Burnley 0.869 -0.908
Chelsea 1.346 -0.809
Crystal Palace 0.542 -0.922
Everton 0.899 -0.796
Leicester 1.305 -1.081
Liverpool 1.537 -1.286
Man City 1.721 -1.207
Man United 1.287 -1.213
Newcastle 0.754 -0.765
Norwich 0.392 -0.52
Sheffield United 0.761 -1.163
Southampton 1.05 -0.719
Tottenham 1.22 -0.952
Watford 0.708 -0.668
West Ham 1.014 -0.686
Wolves 1.031 -1.125
------------------------------------------------------------
Home Advantage: 0.228
Dispersion: 169.346
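One way to read the dispersion value is through the variance it implies. The sketch below assumes the reported dispersion acts as the Negative Binomial size parameter `r`, so that the variance is `mu + mu**2 / r`; this parameterisation is an assumption, not something the output above confirms. The mean of 1.5 goals is purely illustrative.

```python
def negbin_variance(mu: float, r: float) -> float:
    """Variance of a Negative Binomial with mean mu and size r (assumed form)."""
    return mu + mu ** 2 / r


dispersion = 169.346  # fitted value reported above
mu = 1.5              # illustrative goal expectation, not taken from the model

var = negbin_variance(mu, dispersion)
print(f"variance / mean = {var / mu:.4f}")
```

Under that assumption, a dispersion this large gives a variance only about 1% above the mean, so the fitted model behaves almost like a Poisson model for this season.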
[5]:
clf.get_params()
[5]:
{'attack_Arsenal': np.float64(1.1318184252081676),
'attack_Aston Villa': np.float64(0.8410063036949371),
'attack_Bournemouth': np.float64(0.8146683441148156),
'attack_Brighton': np.float64(0.7764979411735493),
'attack_Burnley': np.float64(0.8688673699944128),
'attack_Chelsea': np.float64(1.3458143148336632),
'attack_Crystal Palace': np.float64(0.5424131018043771),
'attack_Everton': np.float64(0.8988762107645227),
'attack_Leicester': np.float64(1.3054952339728547),
'attack_Liverpool': np.float64(1.5372109102620741),
'attack_Man City': np.float64(1.7212414135168892),
'attack_Man United': np.float64(1.2866458632709807),
'attack_Newcastle': np.float64(0.7536019331172694),
'attack_Norwich': np.float64(0.3917266275415292),
'attack_Sheffield United': np.float64(0.7614969746976284),
'attack_Southampton': np.float64(1.0497226370648858),
'attack_Tottenham': np.float64(1.2201655557604885),
'attack_Watford': np.float64(0.7077285582936104),
'attack_West Ham': np.float64(1.0139352315932835),
'attack_Wolves': np.float64(1.031067049320062),
'defence_Arsenal': np.float64(-0.9345884934078494),
'defence_Aston Villa': np.float64(-0.617765273844357),
'defence_Bournemouth': np.float64(-0.6494334697777882),
'defence_Brighton': np.float64(-0.8355570494688354),
'defence_Burnley': np.float64(-0.9079517500454989),
'defence_Chelsea': np.float64(-0.808783576668979),
'defence_Crystal Palace': np.float64(-0.9216013109469254),
'defence_Everton': np.float64(-0.7962299637683887),
'defence_Leicester': np.float64(-1.0814302729541432),
'defence_Liverpool': np.float64(-1.2857220836264767),
'defence_Man City': np.float64(-1.2072699089858834),
'defence_Man United': np.float64(-1.2127221097660073),
'defence_Newcastle': np.float64(-0.7652960967925935),
'defence_Norwich': np.float64(-0.5196332971694073),
'defence_Sheffield United': np.float64(-1.1626348508558566),
'defence_Southampton': np.float64(-0.7186375293496271),
'defence_Tottenham': np.float64(-0.9519522095011826),
'defence_Watford': np.float64(-0.6683202768010704),
'defence_West Ham': np.float64(-0.6861901852523249),
'defence_Wolves': np.float64(-1.124515844873471),
'home_advantage': np.float64(0.22801810499398117),
'dispersion': np.float64(169.3456461442496)}
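Because `get_params()` returns a flat dictionary keyed by `attack_<team>` and `defence_<team>`, it can be handy to split it into per-team tables. The helper below is a hypothetical convenience function, not part of penaltyblog, and the dictionary is a rounded excerpt of the output above rather than the full result.

```python
def split_params(params):
    """Split a flat parameter dict into attack, defence and other parameters."""
    attack, defence, other = {}, {}, {}
    for name, value in params.items():
        prefix, _, team = name.partition("_")
        if prefix == "attack":
            attack[team] = float(value)
        elif prefix == "defence":
            defence[team] = float(value)
        else:
            # e.g. home_advantage and dispersion keep their full names
            other[name] = float(value)
    return attack, defence, other


# Rounded excerpt of the parameters shown above.
params_excerpt = {
    "attack_Liverpool": 1.537,
    "attack_Man City": 1.721,
    "attack_Norwich": 0.392,
    "defence_Liverpool": -1.286,
    "defence_Man City": -1.207,
    "defence_Norwich": -0.520,
    "home_advantage": 0.228,
    "dispersion": 169.346,
}

attack, defence, other = split_params(params_excerpt)
ranked_by_attack = sorted(attack, key=attack.get, reverse=True)
print(ranked_by_attack)
```

With the fitted model, `split_params(clf.get_params())` should give the same structure for all 20 teams.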
Predict Match Outcomes#
[6]:
probs = clf.predict("Liverpool", "Wolves")
probs
[6]:
Module: Penaltyblog
Class: FootballProbabilityGrid
Home Goal Expectation: [1.89783388]
Away Goal Expectation: [0.77518386]
Home Win: 0.6381424457118116
Draw: 0.21493549027932218
Away Win: 0.14688631509986005
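The goal expectations above can be reproduced from the fitted parameters if the model is assumed to be log-linear: the home expectation is the exponential of home advantage plus home attack plus away defence, and the away expectation is the exponential of away attack plus home defence. That structure is an inference from the numbers rather than something the output states, and `goal_expectations` is a hypothetical helper. The parameter values are copied from the `get_params()` output.

```python
import math

# Values copied from the clf.get_params() output above.
params = {
    "attack_Liverpool": 1.5372109102620741,
    "defence_Liverpool": -1.2857220836264767,
    "attack_Wolves": 1.031067049320062,
    "defence_Wolves": -1.124515844873471,
    "home_advantage": 0.22801810499398117,
}


def goal_expectations(params, home: str, away: str):
    """Expected goals for each side under an assumed log-linear model."""
    home_exp = math.exp(
        params["home_advantage"]
        + params[f"attack_{home}"]
        + params[f"defence_{away}"]
    )
    away_exp = math.exp(params[f"attack_{away}"] + params[f"defence_{home}"])
    return home_exp, away_exp


home_exp, away_exp = goal_expectations(params, "Liverpool", "Wolves")
print(round(home_exp, 4), round(away_exp, 4))  # 1.8978 0.7752
```

The results agree with the reported Home Goal Expectation and Away Goal Expectation, which supports the log-linear reading of the attack, defence and home advantage parameters.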
1x2 Probabilities#
[7]:
probs.home_draw_away
[7]:
[np.float64(0.6381424457118116),
np.float64(0.21493549027932218),
np.float64(0.14688631509986005)]
[8]:
probs.home_win
[8]:
np.float64(0.6381424457118116)
[9]:
probs.draw
[9]:
np.float64(0.21493549027932218)
[10]:
probs.away_win
[10]:
np.float64(0.14688631509986005)
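The 1x2 probabilities come from a grid of scoreline probabilities: home wins sit below the diagonal, draws on it, and away wins above it. The sketch below illustrates that aggregation using independent Poisson distributions as a stand-in for the fitted Negative Binomial, so the results are only approximately equal to the values above. The function names are illustrative and do not describe penaltyblog's internals.

```python
import math


def poisson_pmf(k: int, mu: float) -> float:
    """Poisson probability of exactly k goals given mean mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def score_grid(home_exp: float, away_exp: float, max_goals: int = 15):
    """grid[h][a] = probability of the scoreline h-a (Poisson stand-in)."""
    return [
        [poisson_pmf(h, home_exp) * poisson_pmf(a, away_exp)
         for a in range(max_goals + 1)]
        for h in range(max_goals + 1)
    ]


def home_draw_away(grid):
    """Sum the grid below, on and above the diagonal."""
    home = draw = away = 0.0
    for h, row in enumerate(grid):
        for a, p in enumerate(row):
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away


# Goal expectations reported for Liverpool vs Wolves above.
home, draw, away = home_draw_away(score_grid(1.89783388, 0.77518386))
print(f"home={home:.3f} draw={draw:.3f} away={away:.3f}")
```

The Poisson stand-in lands within about half a percentage point of the model's output, which is expected given how large the fitted dispersion is.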
Probability of Total Goals >1.5#
[11]:
probs.total_goals("over", 1.5)
[11]:
np.float64(0.7449383696657509)
Probability of Asian Handicap 1.5#
[12]:
probs.asian_handicap("home", 1.5)
[12]:
np.float64(0.384774084343017)
Probability of both teams scoring#
[13]:
probs.both_teams_to_score
[13]:
np.float64(0.45696238689736385)
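The total goals, Asian handicap and both-teams-to-score figures above can all be seen as sums over the same scoreline grid, each with a different condition on the home and away goals. The sketch below shows this with a Poisson grid as an approximation of the fitted model, so the numbers will be close to, but not identical to, the outputs above. It reads `asian_handicap("home", 1.5)` as the probability that the home side wins by more than 1.5 goals; that reading is an assumption that happens to be consistent with the reported value. `poisson_grid` and `market_probability` are hypothetical helpers.

```python
import math


def poisson_grid(home_exp: float, away_exp: float, max_goals: int = 15):
    """grid[h][a] = probability of the scoreline h-a under independent Poissons."""
    def pmf(k, mu):
        return math.exp(-mu) * mu ** k / math.factorial(k)

    return [
        [pmf(h, home_exp) * pmf(a, away_exp) for a in range(max_goals + 1)]
        for h in range(max_goals + 1)
    ]


def market_probability(grid, condition) -> float:
    """Sum the probabilities of all scorelines (h, a) satisfying condition."""
    return sum(
        p
        for h, row in enumerate(grid)
        for a, p in enumerate(row)
        if condition(h, a)
    )


# Goal expectations reported for Liverpool vs Wolves above.
grid = poisson_grid(1.89783388, 0.77518386)

over_1_5 = market_probability(grid, lambda h, a: h + a > 1.5)
home_by_more_than_1_5 = market_probability(grid, lambda h, a: h - a > 1.5)
btts = market_probability(grid, lambda h, a: h > 0 and a > 0)

print(f"over 1.5 goals:        {over_1_5:.3f}")
print(f"home wins by > 1.5:    {home_by_more_than_1_5:.3f}")
print(f"both teams to score:   {btts:.3f}")
```

Expressing each market as a condition on the grid keeps the aggregation logic in one place and makes it easy to add further markets, such as exact scores or other goal lines.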