Negative Binomial Model

The Negative Binomial goals model is a statistical approach to predicting football match outcomes. It is especially useful when goal-scoring data exhibit overdispersion (variance greater than the mean), which a standard Poisson model cannot capture because it forces the variance to equal the mean.

By adding a dispersion parameter to account for this extra variability, the Negative Binomial model can provide more accurate probability estimates for betting markets such as match results, total goals, and Asian handicaps.
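To make the role of the extra parameter concrete, here is a minimal sketch of a Negative Binomial probability mass function written in terms of a mean `mu` and a dispersion (size) `r`, with variance `mu + mu**2 / r`. This mean/dispersion form is an assumption made for illustration; it is not taken from penaltyblog's source, and the helper names `nb_pmf` and `poisson_pmf` are hypothetical.

```python
import math


def nb_pmf(k, mu, r):
    """P(k goals) for a Negative Binomial with mean mu and dispersion r (assumed form)."""
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def poisson_pmf(k, mu):
    """P(k goals) for a Poisson with mean mu, for comparison."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def moments(pmf_values):
    """Mean and variance of a distribution given as a list of P(k), k = 0, 1, ..."""
    mean = sum(k * p for k, p in enumerate(pmf_values))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf_values))
    return mean, var


mu = 1.5  # illustrative goal expectation, not a fitted value
for r in (2.0, 20.0, 2000.0):
    pmf_values = [nb_pmf(k, mu, r) for k in range(60)]
    mean, var = moments(pmf_values)
    print(f"r={r:>7}: mean={mean:.3f} variance={var:.3f}")
```

Under this form, a small `r` inflates the variance well above the mean, while a large `r` makes the distribution almost indistinguishable from a Poisson with the same mean.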

This makes it a useful tool for football prediction and betting analysis when scoring patterns are highly variable.

[1]:
import penaltyblog as pb

Get data from football-data.co.uk

[2]:
fb = pb.scrapers.FootballData("ENG Premier League", "2019-2020")
df = fb.get_fixtures()

df.head()
[2]:
id                                           date        datetime             season     competition         div  time   team_home       team_away         fthg  ftag  ...  b365_cahh  b365_caha  pcahh  pcaha  max_cahh  max_caha  avg_cahh  avg_caha  goals_home  goals_away
1565308800---liverpool---norwich             2019-08-09  2019-08-09 20:00:00  2019-2020  ENG Premier League  E0   20:00  Liverpool       Norwich           4     1     ...  1.91       1.99       1.94   1.98   1.99      2.07      1.90      1.99      4           1
1565395200---bournemouth---sheffield_united  2019-08-10  2019-08-10 15:00:00  2019-2020  ENG Premier League  E0   15:00  Bournemouth     Sheffield United  1     1     ...  1.95       1.95       1.98   1.95   2.00      1.96      1.96      1.92      1           1
1565395200---burnley---southampton           2019-08-10  2019-08-10 15:00:00  2019-2020  ENG Premier League  E0   15:00  Burnley         Southampton       3     0     ...  1.87       2.03       1.89   2.03   1.90      2.07      1.86      2.02      3           0
1565395200---crystal_palace---everton        2019-08-10  2019-08-10 15:00:00  2019-2020  ENG Premier League  E0   15:00  Crystal Palace  Everton           0     0     ...  1.82       2.08       1.97   1.96   2.03      2.08      1.96      1.93      0           0
1565395200---tottenham---aston_villa         2019-08-10  2019-08-10 17:30:00  2019-2020  ENG Premier League  E0   17:30  Tottenham       Aston Villa       3     1     ...  2.10       1.70       2.18   1.77   2.21      1.87      2.08      1.80      3           1

5 rows × 111 columns
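Before fitting, it can be worth checking whether the goal counts look overdispersed at all. The sketch below computes the variance-to-mean ratio for the five home-goal values shown in the preview above; five matches are far too few to conclude anything, so this only illustrates the calculation. On the full DataFrame the same ratio could be computed with `df["goals_home"].var() / df["goals_home"].mean()`.

```python
from statistics import mean, variance

# goals_home for the five fixtures in the preview above (illustration only)
home_goals = [4, 1, 3, 0, 3]

sample_mean = mean(home_goals)
sample_var = variance(home_goals)  # sample variance, n - 1 denominator

# A ratio near 1 is what a Poisson model expects; well above 1 suggests overdispersion
dispersion_index = sample_var / sample_mean
print(f"mean={sample_mean:.2f} variance={sample_var:.2f} ratio={dispersion_index:.2f}")
```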

Train the model

[3]:
clf = pb.models.NegativeBinomialGoalModel(
    df["goals_home"], df["goals_away"], df["team_home"], df["team_away"]
)
clf.fit()

The model’s parameters

[4]:
clf
[4]:
Module: Penaltyblog

Model: Negative Binomial

Number of parameters: 42
Log Likelihood: -1058.161
AIC: 2200.323

Team                 Attack               Defence
------------------------------------------------------------
Arsenal              1.132                -0.935
Aston Villa          0.841                -0.618
Bournemouth          0.815                -0.649
Brighton             0.776                -0.836
Burnley              0.869                -0.908
Chelsea              1.346                -0.809
Crystal Palace       0.542                -0.922
Everton              0.899                -0.796
Leicester            1.305                -1.081
Liverpool            1.537                -1.286
Man City             1.721                -1.207
Man United           1.287                -1.213
Newcastle            0.754                -0.765
Norwich              0.392                -0.52
Sheffield United     0.761                -1.163
Southampton          1.05                 -0.719
Tottenham            1.22                 -0.952
Watford              0.708                -0.668
West Ham             1.014                -0.686
Wolves               1.031                -1.125
------------------------------------------------------------
Home Advantage: 0.228
Dispersion: 169.346
[5]:
clf.get_params()
[5]:
{'attack_Arsenal': np.float64(1.1318184252081676),
 'attack_Aston Villa': np.float64(0.8410063036949371),
 'attack_Bournemouth': np.float64(0.8146683441148156),
 'attack_Brighton': np.float64(0.7764979411735493),
 'attack_Burnley': np.float64(0.8688673699944128),
 'attack_Chelsea': np.float64(1.3458143148336632),
 'attack_Crystal Palace': np.float64(0.5424131018043771),
 'attack_Everton': np.float64(0.8988762107645227),
 'attack_Leicester': np.float64(1.3054952339728547),
 'attack_Liverpool': np.float64(1.5372109102620741),
 'attack_Man City': np.float64(1.7212414135168892),
 'attack_Man United': np.float64(1.2866458632709807),
 'attack_Newcastle': np.float64(0.7536019331172694),
 'attack_Norwich': np.float64(0.3917266275415292),
 'attack_Sheffield United': np.float64(0.7614969746976284),
 'attack_Southampton': np.float64(1.0497226370648858),
 'attack_Tottenham': np.float64(1.2201655557604885),
 'attack_Watford': np.float64(0.7077285582936104),
 'attack_West Ham': np.float64(1.0139352315932835),
 'attack_Wolves': np.float64(1.031067049320062),
 'defence_Arsenal': np.float64(-0.9345884934078494),
 'defence_Aston Villa': np.float64(-0.617765273844357),
 'defence_Bournemouth': np.float64(-0.6494334697777882),
 'defence_Brighton': np.float64(-0.8355570494688354),
 'defence_Burnley': np.float64(-0.9079517500454989),
 'defence_Chelsea': np.float64(-0.808783576668979),
 'defence_Crystal Palace': np.float64(-0.9216013109469254),
 'defence_Everton': np.float64(-0.7962299637683887),
 'defence_Leicester': np.float64(-1.0814302729541432),
 'defence_Liverpool': np.float64(-1.2857220836264767),
 'defence_Man City': np.float64(-1.2072699089858834),
 'defence_Man United': np.float64(-1.2127221097660073),
 'defence_Newcastle': np.float64(-0.7652960967925935),
 'defence_Norwich': np.float64(-0.5196332971694073),
 'defence_Sheffield United': np.float64(-1.1626348508558566),
 'defence_Southampton': np.float64(-0.7186375293496271),
 'defence_Tottenham': np.float64(-0.9519522095011826),
 'defence_Watford': np.float64(-0.6683202768010704),
 'defence_West Ham': np.float64(-0.6861901852523249),
 'defence_Wolves': np.float64(-1.124515844873471),
 'home_advantage': np.float64(0.22801810499398117),
 'dispersion': np.float64(169.3456461442496)}

Predict Match Outcomes

[6]:
probs = clf.predict("Liverpool", "Wolves")
probs
[6]:
Module: Penaltyblog

Class: FootballProbabilityGrid

Home Goal Expectation: [1.89783388]
Away Goal Expectation: [0.77518386]

Home Win: 0.6381424457118116
Draw: 0.21493549027932218
Away Win: 0.14688631509986005
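The goal expectations above can be recovered from the fitted parameters if we assume a log-linear form: the home expectation is the exponential of home attack plus away defence plus home advantage, and the away expectation is the exponential of away attack plus home defence. That form is an assumption inferred from the numbers in this notebook rather than from the library's source, and `expected_goals` is a hypothetical helper.

```python
import math

# Fitted values copied from the clf.get_params() output above
attack = {"Liverpool": 1.5372109102620741, "Wolves": 1.031067049320062}
defence = {"Liverpool": -1.2857220836264767, "Wolves": -1.124515844873471}
home_advantage = 0.22801810499398117


def expected_goals(home, away):
    """Goal expectations under an assumed log-linear attack/defence model."""
    mu_home = math.exp(attack[home] + defence[away] + home_advantage)
    mu_away = math.exp(attack[away] + defence[home])
    return mu_home, mu_away


mu_home, mu_away = expected_goals("Liverpool", "Wolves")
print(f"home={mu_home:.4f} away={mu_away:.4f}")
```

If the assumption holds, the printed values should agree with the Home Goal Expectation and Away Goal Expectation reported by `clf.predict`.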

1x2 Probabilities

[7]:
probs.home_draw_away
[7]:
[np.float64(0.6381424457118116),
 np.float64(0.21493549027932218),
 np.float64(0.14688631509986005)]
[8]:
probs.home_win
[8]:
np.float64(0.6381424457118116)
[9]:
probs.draw
[9]:
np.float64(0.21493549027932218)
[10]:
probs.away_win
[10]:
np.float64(0.14688631509986005)
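As a rough illustration of where 1x2 probabilities come from, the sketch below builds a grid of scoreline probabilities from two independent Negative Binomial distributions and sums the cells where the home side scores more, the same, or fewer goals. Independence between the two scores, the mean/dispersion form of the distribution, and the 15-goal cap are all assumptions; `nb_pmf` and `score_grid` are hypothetical helpers, not penaltyblog functions.

```python
import math


def nb_pmf(k, mu, r):
    """P(k goals) for a Negative Binomial with mean mu and dispersion r (assumed form)."""
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def score_grid(mu_home, mu_away, r, max_goals=15):
    """grid[h][a] = P(home scores h, away scores a), assuming independent scores."""
    home = [nb_pmf(k, mu_home, r) for k in range(max_goals + 1)]
    away = [nb_pmf(k, mu_away, r) for k in range(max_goals + 1)]
    return [[ph * pa for pa in away] for ph in home]


# Goal expectations and dispersion taken from the outputs above
grid = score_grid(1.89783388, 0.77518386, 169.3456461442496)
n = len(grid)

home_win = sum(grid[h][a] for h in range(n) for a in range(n) if h > a)
draw = sum(grid[h][h] for h in range(n))
away_win = sum(grid[h][a] for h in range(n) for a in range(n) if h < a)
print(f"home={home_win:.3f} draw={draw:.3f} away={away_win:.3f}")
```

Every other market in this notebook can be read as a different way of summing cells of the same grid.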

Probability of Total Goals >1.5

[11]:
probs.total_goals("over", 1.5)
[11]:
np.float64(0.7449383696657509)
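A total-goals probability can be sketched as a sum over every scoreline whose combined goals exceed the line. The example uses a tiny made-up 3x3 grid so the arithmetic is easy to follow; the numbers are illustrative, not model output, and `prob_total_over` is a hypothetical helper.

```python
# Made-up marginal probabilities for 0, 1 and 2 goals (illustration only)
home_probs = [0.2, 0.5, 0.3]
away_probs = [0.5, 0.3, 0.2]

# grid[h][a] = P(home scores h, away scores a), assuming independent scores
grid = [[ph * pa for pa in away_probs] for ph in home_probs]


def prob_total_over(grid, line):
    """Sum the scoreline probabilities where total goals exceed the line."""
    return sum(
        p
        for h, row in enumerate(grid)
        for a, p in enumerate(row)
        if h + a > line
    )


over_1_5 = prob_total_over(grid, 1.5)
under_1_5 = 1.0 - over_1_5
print(f"over 1.5: {over_1_5:.2f}, under 1.5: {under_1_5:.2f}")
```

With a half-goal line there is no push, so the over and under probabilities sum to one.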

Probability of Asian Handicap 1.5#

[12]:
probs.asian_handicap("home", 1.5)
[12]:
np.float64(0.384774084343017)
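The value above is lower than the home-win probability, which is consistent with reading the 1.5 line as the margin the home side must exceed, that is, winning by two or more goals. That reading is an assumption about the call's semantics. Under it, the calculation can be sketched as a sum over scorelines where the goal difference beats the line; the 3x3 grid is made up for illustration and `prob_home_margin_over` is a hypothetical helper.

```python
# Made-up scoreline probabilities: grid[h][a] = P(home scores h, away scores a)
grid = [
    [0.10, 0.06, 0.04],
    [0.25, 0.15, 0.10],
    [0.15, 0.09, 0.06],
]


def prob_home_margin_over(grid, line):
    """Sum the scoreline probabilities where home goals minus away goals exceed the line."""
    return sum(
        p
        for h, row in enumerate(grid)
        for a, p in enumerate(row)
        if h - a > line
    )


# Home side must win by 2+ to beat a 1.5 line, but only by 1+ to beat a 0.5 line
covers_1_5 = prob_home_margin_over(grid, 1.5)
covers_0_5 = prob_home_margin_over(grid, 0.5)
print(f"margin > 1.5: {covers_1_5:.2f}, margin > 0.5: {covers_0_5:.2f}")
```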

Probability of both teams scoring#

[13]:
probs.both_teams_to_score
[13]:
np.float64(0.45696238689736385)
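If the two scores are treated as independent, the both-teams-to-score probability has a simple closed form: the chance that the home side scores at least once multiplied by the chance that the away side does. The sketch below assumes independence and the mean/dispersion form of the Negative Binomial, under which the probability of zero goals is `(r / (r + mu)) ** r`; `nb_prob_zero` is a hypothetical helper.

```python
def nb_prob_zero(mu, r):
    """P(0 goals) for a Negative Binomial with mean mu and dispersion r (assumed form)."""
    return (r / (r + mu)) ** r


# Goal expectations and dispersion taken from the outputs above
dispersion = 169.3456461442496
p_home_scores = 1.0 - nb_prob_zero(1.89783388, dispersion)
p_away_scores = 1.0 - nb_prob_zero(0.77518386, dispersion)

btts = p_home_scores * p_away_scores
print(f"both teams to score: {btts:.3f}")
```

If these assumptions match what the model does, the result should land close to the value returned by `probs.both_teams_to_score`.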