{ "cells": [ { "cell_type": "markdown", "id": "81215b75-f5f8-4c17-9cd4-08b1b0ed4234", "metadata": {}, "source": [ "# Bivariate Weibull Count Plus Copula Model\n", "\n", "The Bivariate Weibull Count Plus Copula Model predicts football match outcomes by jointly modeling the number of goals scored by each team.\n", "\n", "It extends traditional Poisson-based approaches by using a Weibull count distribution for each team's goals. The extra shape parameter lets the model fit goal counts that are more or less dispersed than a Poisson model allows; a shape of 1 recovers the Poisson case.\n", "\n", "The copula component explicitly models the dependence between home and away goal counts, which can improve predictions for correct scores, match outcomes, and betting markets such as total goals and Asian handicaps.\n", "\n", "The model is most useful when the relationship between the two teams' scoring matters for the forecast." ] }, { "cell_type": "code", "execution_count": 1, "id": "1f931497-c1f9-4cb4-969a-058676e42a24", "metadata": { "tags": [] }, "outputs": [], "source": [ "import penaltyblog as pb" ] }, { "cell_type": "markdown", "id": "4a1b5c76-8f47-4f59-8351-d5add2f69309", "metadata": {}, "source": [ "## Get Data from football-data.co.uk" ] }, { "cell_type": "code", "execution_count": 2, "id": "949b129d-e4e5-4975-8318-dd601d918e90", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table>\n",
"<thead>\n",
"<tr><th>id</th><th>date</th><th>datetime</th><th>season</th><th>competition</th><th>div</th><th>time</th><th>team_home</th><th>team_away</th><th>fthg</th><th>ftag</th><th>...</th><th>b365_cahh</th><th>b365_caha</th><th>pcahh</th><th>pcaha</th><th>max_cahh</th><th>max_caha</th><th>avg_cahh</th><th>avg_caha</th><th>goals_home</th><th>goals_away</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>1565308800---liverpool---norwich</th><td>2019-08-09</td><td>2019-08-09 20:00:00</td><td>2019-2020</td><td>ENG Premier League</td><td>E0</td><td>20:00</td><td>Liverpool</td><td>Norwich</td><td>4</td><td>1</td><td>...</td><td>1.91</td><td>1.99</td><td>1.94</td><td>1.98</td><td>1.99</td><td>2.07</td><td>1.90</td><td>1.99</td><td>4</td><td>1</td></tr>\n",
"<tr><th>1565395200---bournemouth---sheffield_united</th><td>2019-08-10</td><td>2019-08-10 15:00:00</td><td>2019-2020</td><td>ENG Premier League</td><td>E0</td><td>15:00</td><td>Bournemouth</td><td>Sheffield United</td><td>1</td><td>1</td><td>...</td><td>1.95</td><td>1.95</td><td>1.98</td><td>1.95</td><td>2.00</td><td>1.96</td><td>1.96</td><td>1.92</td><td>1</td><td>1</td></tr>\n",
"<tr><th>1565395200---burnley---southampton</th><td>2019-08-10</td><td>2019-08-10 15:00:00</td><td>2019-2020</td><td>ENG Premier League</td><td>E0</td><td>15:00</td><td>Burnley</td><td>Southampton</td><td>3</td><td>0</td><td>...</td><td>1.87</td><td>2.03</td><td>1.89</td><td>2.03</td><td>1.90</td><td>2.07</td><td>1.86</td><td>2.02</td><td>3</td><td>0</td></tr>\n",
"<tr><th>1565395200---crystal_palace---everton</th><td>2019-08-10</td><td>2019-08-10 15:00:00</td><td>2019-2020</td><td>ENG Premier League</td><td>E0</td><td>15:00</td><td>Crystal Palace</td><td>Everton</td><td>0</td><td>0</td><td>...</td><td>1.82</td><td>2.08</td><td>1.97</td><td>1.96</td><td>2.03</td><td>2.08</td><td>1.96</td><td>1.93</td><td>0</td><td>0</td></tr>\n",
"<tr><th>1565395200---tottenham---aston_villa</th><td>2019-08-10</td><td>2019-08-10 17:30:00</td><td>2019-2020</td><td>ENG Premier League</td><td>E0</td><td>17:30</td><td>Tottenham</td><td>Aston Villa</td><td>3</td><td>1</td><td>...</td><td>2.10</td><td>1.70</td><td>2.18</td><td>1.77</td><td>2.21</td><td>1.87</td><td>2.08</td><td>1.80</td><td>3</td><td>1</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>5 rows × 111 columns</p>\n",
"</div>
" ], "text/plain": [ " date datetime \\\n", "id \n", "1565308800---liverpool---norwich 2019-08-09 2019-08-09 20:00:00 \n", "1565395200---bournemouth---sheffield_united 2019-08-10 2019-08-10 15:00:00 \n", "1565395200---burnley---southampton 2019-08-10 2019-08-10 15:00:00 \n", "1565395200---crystal_palace---everton 2019-08-10 2019-08-10 15:00:00 \n", "1565395200---tottenham---aston_villa 2019-08-10 2019-08-10 17:30:00 \n", "\n", " season competition \\\n", "id \n", "1565308800---liverpool---norwich 2019-2020 ENG Premier League \n", "1565395200---bournemouth---sheffield_united 2019-2020 ENG Premier League \n", "1565395200---burnley---southampton 2019-2020 ENG Premier League \n", "1565395200---crystal_palace---everton 2019-2020 ENG Premier League \n", "1565395200---tottenham---aston_villa 2019-2020 ENG Premier League \n", "\n", " div time team_home \\\n", "id \n", "1565308800---liverpool---norwich E0 20:00 Liverpool \n", "1565395200---bournemouth---sheffield_united E0 15:00 Bournemouth \n", "1565395200---burnley---southampton E0 15:00 Burnley \n", "1565395200---crystal_palace---everton E0 15:00 Crystal Palace \n", "1565395200---tottenham---aston_villa E0 17:30 Tottenham \n", "\n", " team_away fthg ftag \\\n", "id \n", "1565308800---liverpool---norwich Norwich 4 1 \n", "1565395200---bournemouth---sheffield_united Sheffield United 1 1 \n", "1565395200---burnley---southampton Southampton 3 0 \n", "1565395200---crystal_palace---everton Everton 0 0 \n", "1565395200---tottenham---aston_villa Aston Villa 3 1 \n", "\n", " ... b365_cahh b365_caha pcahh \\\n", "id ... \n", "1565308800---liverpool---norwich ... 1.91 1.99 1.94 \n", "1565395200---bournemouth---sheffield_united ... 1.95 1.95 1.98 \n", "1565395200---burnley---southampton ... 1.87 2.03 1.89 \n", "1565395200---crystal_palace---everton ... 1.82 2.08 1.97 \n", "1565395200---tottenham---aston_villa ... 
2.10 1.70 2.18 \n", "\n", " pcaha max_cahh max_caha \\\n", "id \n", "1565308800---liverpool---norwich 1.98 1.99 2.07 \n", "1565395200---bournemouth---sheffield_united 1.95 2.00 1.96 \n", "1565395200---burnley---southampton 2.03 1.90 2.07 \n", "1565395200---crystal_palace---everton 1.96 2.03 2.08 \n", "1565395200---tottenham---aston_villa 1.77 2.21 1.87 \n", "\n", " avg_cahh avg_caha goals_home \\\n", "id \n", "1565308800---liverpool---norwich 1.90 1.99 4 \n", "1565395200---bournemouth---sheffield_united 1.96 1.92 1 \n", "1565395200---burnley---southampton 1.86 2.02 3 \n", "1565395200---crystal_palace---everton 1.96 1.93 0 \n", "1565395200---tottenham---aston_villa 2.08 1.80 3 \n", "\n", " goals_away \n", "id \n", "1565308800---liverpool---norwich 1 \n", "1565395200---bournemouth---sheffield_united 1 \n", "1565395200---burnley---southampton 0 \n", "1565395200---crystal_palace---everton 0 \n", "1565395200---tottenham---aston_villa 1 \n", "\n", "[5 rows x 111 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fb = pb.scrapers.FootballData(\"ENG Premier League\", \"2019-2020\")\n", "df = fb.get_fixtures()\n", "\n", "df.head()" ] }, { "cell_type": "markdown", "id": "9257f0fc-5f2b-402f-9209-d005d14880be", "metadata": {}, "source": [ "## Train the Model" ] }, { "cell_type": "code", "execution_count": 3, "id": "7d39d92f-6fa0-4a2a-8a48-22d214e38efc", "metadata": { "tags": [] }, "outputs": [], "source": [ "clf = pb.models.WeibullCopulaGoalsModel(\n", " df[\"goals_home\"], df[\"goals_away\"], df[\"team_home\"], df[\"team_away\"]\n", ")\n", "clf.fit()" ] }, { "cell_type": "markdown", "id": "63a12589-0066-431f-8444-92e2944b55a4", "metadata": {}, "source": [ "## The model's parameters" ] }, { "cell_type": "code", "execution_count": 4, "id": "ffe48c5e-3e8c-4a99-be9f-a1b46307c981", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Module: Penaltyblog\n", "\n", "Model: Bivariate Weibull Count + Copula\n", "\n", "Number of 
parameters: 43\n", "Log Likelihood: -1049.502\n", "AIC: 2185.004\n", "\n", "Team Attack Defence \n", "------------------------------------------------------------\n", "Arsenal 1.163 -0.828 \n", "Aston Villa 0.81 -0.466 \n", "Bournemouth 0.783 -0.504 \n", "Brighton 0.741 -0.705 \n", "Burnley 0.825 -0.786 \n", "Chelsea 1.344 -0.674 \n", "Crystal Palace 0.475 -0.791 \n", "Everton 0.853 -0.649 \n", "Leicester 1.317 -0.994 \n", "Liverpool 1.559 -1.179 \n", "Man City 1.709 -1.132 \n", "Man United 1.292 -1.156 \n", "Newcastle 0.715 -0.608 \n", "Norwich 0.338 -0.355 \n", "Sheffield United 0.716 -1.067 \n", "Southampton 1.024 -0.583 \n", "Tottenham 1.245 -0.835 \n", "Watford 0.7 -0.665 \n", "West Ham 1.017 -0.566 \n", "Wolves 1.034 -1.029 \n", "------------------------------------------------------------\n", "Home Advantage: 0.237\n", "Weibull Shape: 1.191\n", "Kappa: -1.124" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "clf" ] }, { "cell_type": "code", "execution_count": 5, "id": "fc93ec32-d113-4155-a516-abfe58dc8469", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "{'attack_Arsenal': np.float64(1.16259374410796),\n", " 'attack_Aston Villa': np.float64(0.8095022336740563),\n", " 'attack_Bournemouth': np.float64(0.7828356246312403),\n", " 'attack_Brighton': np.float64(0.7411382844943274),\n", " 'attack_Burnley': np.float64(0.8245955288804583),\n", " 'attack_Chelsea': np.float64(1.3435746341571566),\n", " 'attack_Crystal Palace': np.float64(0.47529592595775816),\n", " 'attack_Everton': np.float64(0.8527277199218508),\n", " 'attack_Leicester': np.float64(1.3171425299831485),\n", " 'attack_Liverpool': np.float64(1.5585738201391832),\n", " 'attack_Man City': np.float64(1.7093313894060342),\n", " 'attack_Man United': np.float64(1.2918401696521093),\n", " 'attack_Newcastle': np.float64(0.7147581606014619),\n", " 'attack_Norwich': np.float64(0.33772678732379247),\n", " 'attack_Sheffield United': 
np.float64(0.7159069282535363),\n", " 'attack_Southampton': np.float64(1.023930037749901),\n", " 'attack_Tottenham': np.float64(1.2449775431354324),\n", " 'attack_Watford': np.float64(0.699657704066482),\n", " 'attack_West Ham': np.float64(1.0172106325524315),\n", " 'attack_Wolves': np.float64(1.033576507630148),\n", " 'defense_Arsenal': np.float64(-0.8281738204952019),\n", " 'defense_Aston Villa': np.float64(-0.46550368389038743),\n", " 'defense_Bournemouth': np.float64(-0.5042771094115648),\n", " 'defense_Brighton': np.float64(-0.7054618304504835),\n", " 'defense_Burnley': np.float64(-0.7855322875689625),\n", " 'defense_Chelsea': np.float64(-0.6738309989097833),\n", " 'defense_Crystal Palace': np.float64(-0.7906810455804004),\n", " 'defense_Everton': np.float64(-0.6491088057506019),\n", " 'defense_Leicester': np.float64(-0.9943647106590421),\n", " 'defense_Liverpool': np.float64(-1.1793125500974093),\n", " 'defense_Man City': np.float64(-1.1320386579970085),\n", " 'defense_Man United': np.float64(-1.1559852251791218),\n", " 'defense_Newcastle': np.float64(-0.6081339959788542),\n", " 'defense_Norwich': np.float64(-0.3553524059816674),\n", " 'defense_Sheffield United': np.float64(-1.0673273897843734),\n", " 'defense_Southampton': np.float64(-0.5825445597267889),\n", " 'defense_Tottenham': np.float64(-0.8354532263823686),\n", " 'defense_Watford': np.float64(-0.6653714381676077),\n", " 'defense_West Ham': np.float64(-0.5660042508403601),\n", " 'defense_Wolves': np.float64(-1.0294936203432046),\n", " 'home_advantage': np.float64(0.23662029387143488),\n", " 'shape': np.float64(1.1914362553954994),\n", " 'kappa': np.float64(-1.1239649112332324)}" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "clf.get_params()" ] }, { "cell_type": "markdown", "id": "43bb1f12-7010-421b-bf93-bb8e1dba2df6", "metadata": {}, "source": [ "## Predict Match Outcomes" ] }, { "cell_type": "code", "execution_count": 6, "id": 
"3a047b77-707d-46b6-bcf8-57f3356efee3", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "Module: Penaltyblog\n", "\n", "Class: FootballProbabilityGrid\n", "\n", "Home Goal Expectation: [2.15050026]\n", "Away Goal Expectation: [0.86438583]\n", "\n", "Home Win: 0.6321733110211908\n", "Draw: 0.2082763227463461\n", "Away Win: 0.15955036614422274" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "probs = clf.predict(\"Liverpool\", \"Wolves\")\n", "probs" ] }, { "cell_type": "markdown", "id": "2a5274e7-d13e-455b-8e77-a6f51ba6f830", "metadata": {}, "source": [ "### 1x2 Probabilities" ] }, { "cell_type": "code", "execution_count": 7, "id": "cc1d6199-c35e-4ea3-bf82-a89c31a7277d", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "[np.float64(0.6321733110211908),\n", " np.float64(0.2082763227463461),\n", " np.float64(0.15955036614422274)]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "probs.home_draw_away" ] }, { "cell_type": "code", "execution_count": 8, "id": "eef96983-d83d-4c39-bd49-47cb4a704ab4", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "np.float64(0.6321733110211908)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "probs.home_win" ] }, { "cell_type": "code", "execution_count": 9, "id": "e08561b2-07ed-47b3-89d7-14c0a05cf854", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "np.float64(0.2082763227463461)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "probs.draw" ] }, { "cell_type": "code", "execution_count": 10, "id": "594e21a7-9a75-49a3-b3e8-50fa4bd8ac51", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "np.float64(0.15955036614422274)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "probs.away_win" ] }, { "cell_type": "markdown", "id": 
"9996be1b-acf8-4305-9bf0-6e4832505d47", "metadata": {}, "source": [ "### Probablity of Total Goals >1.5" ] }, { "cell_type": "code", "execution_count": 11, "id": "8da5ea91-ff28-4c6d-b6bf-0d5ef417da2b", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "np.float64(0.8056653102057293)" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "probs.total_goals(\"over\", 1.5)" ] }, { "cell_type": "markdown", "id": "5a0876d3-9d69-4b63-ae8a-d2b3b8f40aa6", "metadata": {}, "source": [ "### Probability of Asian Handicap 1.5" ] }, { "cell_type": "code", "execution_count": 12, "id": "280e7570-5010-4b39-8104-71ca27e4005a", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "np.float64(0.38394326568912934)" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "probs.asian_handicap(\"home\", 1.5)" ] }, { "cell_type": "markdown", "id": "f1205e38-8afc-45fc-ba5f-59292aad9e21", "metadata": {}, "source": [ "## Probability of both teams scoring" ] }, { "cell_type": "code", "execution_count": 13, "id": "1b63af09-9383-4c5a-ae1c-dadb1a57193a", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "np.float64(0.4978154468351224)" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "probs.both_teams_to_score" ] }, { "cell_type": "markdown", "id": "8d8ace53-efc0-4227-bbdc-8c54e8b1e05c", "metadata": {}, "source": [ "## Train the model with more recent data weighted to be more important" ] }, { "cell_type": "code", "execution_count": 14, "id": "c5fd1d29-cdac-4b70-b4a7-04f64cb87eea", "metadata": { "tags": [] }, "outputs": [], "source": [ "weights = pb.models.dixon_coles_weights(df[\"date\"], 0.001)\n", "\n", "clf = pb.models.WeibullCopulaGoalsModel(\n", " df[\"goals_home\"], df[\"goals_away\"], df[\"team_home\"], df[\"team_away\"], weights\n", ")\n", "clf.fit()" ] }, { "cell_type": "code", "execution_count": 15, "id": 
"954c880f-e861-406c-9ef5-b3b05bfa7f6e", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "Module: Penaltyblog\n", "\n", "Model: Bivariate Weibull Count + Copula\n", "\n", "Number of parameters: 43\n", "Log Likelihood: -873.221\n", "AIC: 1832.443\n", "\n", "Team Attack Defence \n", "------------------------------------------------------------\n", "Arsenal 1.136 -0.816 \n", "Aston Villa 0.746 -0.455 \n", "Bournemouth 0.757 -0.47 \n", "Brighton 0.692 -0.669 \n", "Burnley 0.781 -0.772 \n", "Chelsea 1.313 -0.639 \n", "Crystal Palace 0.43 -0.739 \n", "Everton 0.809 -0.628 \n", "Leicester 1.252 -0.932 \n", "Liverpool 1.516 -1.117 \n", "Man City 1.674 -1.142 \n", "Man United 1.283 -1.146 \n", "Newcastle 0.704 -0.56 \n", "Norwich 0.229 -0.32 \n", "Sheffield United 0.676 -1.016 \n", "Southampton 1.01 -0.577 \n", "Tottenham 1.207 -0.822 \n", "Watford 0.689 -0.634 \n", "West Ham 0.998 -0.546 \n", "Wolves 0.985 -1.022 \n", "------------------------------------------------------------\n", "Home Advantage: 0.244\n", "Weibull Shape: 1.189\n", "Kappa: -1.151" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "clf" ] }, { "cell_type": "code", "execution_count": null, "id": "22cb6d4c-bc32-4a75-b803-b96c959184a8", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.1" } }, "nbformat": 4, "nbformat_minor": 5 }