{ "cells": [ { "cell_type": "markdown", "id": "81215b75-f5f8-4c17-9cd4-08b1b0ed4234", "metadata": {}, "source": [ "# Zero-inflated Poisson\n", "\n", "The Zero-Inflated Poisson (ZIP) model is an extension of the standard Poisson model designed to handle an excess of zero-goal outcomes in football match data. \n", "\n", "A standard Poisson model can underestimate how often matches end goalless, for example when teams play defensively or attack poorly. \n", "\n", "The ZIP model addresses this by mixing the Poisson distribution with a separate zero-generating process, controlled by an additional zero-inflation parameter, which can improve predictions for match results, goal distributions, and betting markets such as correct scores and over/under goals. \n", "\n", "This makes it particularly useful for leagues or teams where 0-0 results occur more often than a simple Poisson distribution would suggest." ] }, { "cell_type": "code", "execution_count": 1, "id": "1f931497-c1f9-4cb4-969a-058676e42a24", "metadata": { "tags": [] }, "outputs": [], "source": [ "import penaltyblog as pb" ] }, { "cell_type": "markdown", "id": "4a1b5c76-8f47-4f59-8351-d5add2f69309", "metadata": {}, "source": [ "## Get data from football-data.co.uk" ] }, { "cell_type": "code", "execution_count": 2, "id": "949b129d-e4e5-4975-8318-dd601d918e90", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "| id | date | datetime | season | competition | div | time | team_home | team_away | fthg | ftag | ... | b365_cahh | b365_caha | pcahh | pcaha | max_cahh | max_caha | avg_cahh | avg_caha | goals_home | goals_away |\n", "
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n", "
| 1565308800---liverpool---norwich | 2019-08-09 | 2019-08-09 20:00:00 | 2019-2020 | ENG Premier League | E0 | 20:00 | Liverpool | Norwich | 4 | 1 | ... | 1.91 | 1.99 | 1.94 | 1.98 | 1.99 | 2.07 | 1.90 | 1.99 | 4 | 1 |\n", "
| 1565395200---bournemouth---sheffield_united | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Bournemouth | Sheffield United | 1 | 1 | ... | 1.95 | 1.95 | 1.98 | 1.95 | 2.00 | 1.96 | 1.96 | 1.92 | 1 | 1 |\n", "
| 1565395200---burnley---southampton | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Burnley | Southampton | 3 | 0 | ... | 1.87 | 2.03 | 1.89 | 2.03 | 1.90 | 2.07 | 1.86 | 2.02 | 3 | 0 |\n", "
| 1565395200---crystal_palace---everton | 2019-08-10 | 2019-08-10 15:00:00 | 2019-2020 | ENG Premier League | E0 | 15:00 | Crystal Palace | Everton | 0 | 0 | ... | 1.82 | 2.08 | 1.97 | 1.96 | 2.03 | 2.08 | 1.96 | 1.93 | 0 | 0 |\n", "
| 1565395200---tottenham---aston_villa | 2019-08-10 | 2019-08-10 17:30:00 | 2019-2020 | ENG Premier League | E0 | 17:30 | Tottenham | Aston Villa | 3 | 1 | ... | 2.10 | 1.70 | 2.18 | 1.77 | 2.21 | 1.87 | 2.08 | 1.80 | 3 | 1 |\n", "
5 rows × 111 columns
\n", "