Ranked Probability Scores#

The Ranked Probability Score (RPS) is a metric for evaluating the accuracy of probabilistic forecasts. It is particularly well suited to football predictions because the possible outcomes have an inherent order (win, draw, and loss).

It quantifies how closely the predicted probabilities align with the actual result, penalising probability placed on outcomes far (in rank) from the one that occurred.

A lower RPS indicates more accurate and reliable predictions, making it an excellent metric for assessing and comparing the performance of football forecasting models.
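Formally, for a single forecast over K ordered outcomes (K = 3 here: win, draw, loss), the RPS is the mean squared difference between the cumulative forecast probabilities and the cumulative observed outcome:

\mathrm{RPS} = \frac{1}{K - 1} \sum_{k=1}^{K-1} \left( \sum_{i=1}^{k} (p_i - o_i) \right)^2

where p_i is the forecast probability of outcome i and o_i is 1 if outcome i occurred and 0 otherwise. Because the differences are cumulative, a forecast that places spare probability on a draw when the home side wins is penalised less than one that places it on an away win.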

[1]:
import penaltyblog as pb

rps_average#

The rps_average function takes one or more sets of probabilities and their observed outcomes and calculates the average ranked probability score across all the sets.

[2]:
predictions = [[0.8, 0.1, 0.1], [0.2, 0.1, 0.7], [0.1, 0.1, 0.8]]
observed = [0, 2, 1]
rps_score = pb.metrics.rps_average(predictions, observed)

rps_score
[2]:
0.13833333333333334
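As a sanity check, the same value can be reproduced by hand with NumPy: score each set on its cumulative probabilities, then average. A minimal sketch, reusing the predictions and observed defined above:

import numpy as np

def rps_manual(probs, outcome):
    # Cumulative forecast probabilities minus the cumulative observed outcome,
    # squared and averaged over the first K - 1 categories.
    probs = np.asarray(probs, dtype=float)
    observed_vec = np.zeros_like(probs)
    observed_vec[outcome] = 1.0
    cum_diff = np.cumsum(probs) - np.cumsum(observed_vec)
    return np.sum(cum_diff[:-1] ** 2) / (len(probs) - 1)

scores = [rps_manual(p, o) for p, o in zip(predictions, observed)]
sum(scores) / len(scores)  # 0.13833333333333334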

rps_array#

The rps_array function takes one or more sets of probabilities and their observed outcomes and returns the individual ranked probability score for each set.

The examples below are taken from Constantinou and Fenton (2012), Solving the problem of inadequate scoring rules for assessing probabilistic football forecast models.

[3]:
predictions = [
    [1, 0, 0],
    [0.9, 0.1, 0],
    [0.8, 0.1, 0.1],
    [0.5, 0.25, 0.25],
    [0.35, 0.3, 0.35],
    [0.6, 0.3, 0.1],
    [0.6, 0.25, 0.15],
    [0.6, 0.15, 0.25],
    [0.57, 0.33, 0.1],
    [0.6, 0.2, 0.2],
]

observed = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]

pb.metrics.rps_array(predictions, observed)
[3]:
array([0.     , 0.005  , 0.025  , 0.15625, 0.1225 , 0.185  , 0.09125,
       0.11125, 0.09745, 0.1    ])
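
The average reported by rps_average should simply be the mean of the per-set scores returned by rps_array, so the two functions can be used interchangeably depending on whether you need the aggregate or the per-match breakdown:

import numpy as np

scores = pb.metrics.rps_array(predictions, observed)
np.isclose(scores.mean(), pb.metrics.rps_average(predictions, observed))  # expected: True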