The Application of Statistical Models to English Premier League Football Data

reesnj/Modelling-Football-Data

The Application of Statistical Models to the 2019/20 English Premier League Football Season

Author: Nathan Rees (784823)

A collection of R scripts supporting a final-year Mathematics project that applies statistical models to English Premier League (EPL) football data. The analysis is split into the following two sections.

Logistic Regression

This section builds a logistic regression model to estimate the probability of a home win. The model uses the following predictor variables: ability of the home team, ability of the away team, geographical distance between the two teams, and the impact of Covid-19. The raw data, R script, and outputs can all be found in the folder titled "logistic_regression".
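As a rough illustration of this kind of model, the sketch below fits a logistic regression with base R's glm() on simulated data. It is not the project's script: the data frame, the column names (home_ability, away_ability, distance_km, covid), and the coefficients used to simulate outcomes are all hypothetical stand-ins for the four predictors listed above.

```r
# Minimal sketch on simulated data; every name and value here is an assumption.
set.seed(1)
n <- 380  # one EPL season: 20 teams, each hosting the other 19

matches <- data.frame(
  home_ability = rnorm(n),            # hypothetical ability score, home team
  away_ability = rnorm(n),            # hypothetical ability score, away team
  distance_km  = runif(n, 0, 500),    # hypothetical distance between the teams
  covid        = rbinom(n, 1, 0.25)   # hypothetical 0/1 Covid-19 indicator
)

# Simulate home wins from an assumed linear predictor on the log-odds scale
eta <- 0.2 + 0.8 * matches$home_ability - 0.8 * matches$away_ability +
  0.001 * matches$distance_km - 0.3 * matches$covid
matches$home_win <- rbinom(n, 1, plogis(eta))

# Fit the logistic regression (binomial family, logit link)
fit <- glm(home_win ~ home_ability + away_ability + distance_km + covid,
           data = matches, family = binomial(link = "logit"))
print(summary(fit))

# Predicted probability of a home win for one hypothetical fixture
new_match <- data.frame(home_ability = 1, away_ability = -0.5,
                        distance_km = 200, covid = 0)
p_home <- predict(fit, newdata = new_match, type = "response")
print(p_home)
```

Passing type = "response" to predict() returns the fitted probability rather than the log-odds, which is the quantity this section is interested in.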

Survival Analysis

This section applies survival analysis techniques to investigate the time taken to score the first goal in a football match. The analysis considers the effects of the following independent variables: ability of the reference team, ability of the adverse (opposing) team, location of the match, and the impact of Covid-19. The raw data, R script, and outputs can all be found in the folder titled "survival_analysis".
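One way to set up a time-to-first-goal analysis is sketched below using the survival package, with a Kaplan-Meier estimate and a Cox proportional hazards model. Which techniques the project itself uses is not stated here, so both choices are assumptions, as are the simulated data, the column names (ref_ability, adv_ability, home, covid), and the decision to treat a team that has not scored by 90 minutes as right-censored.

```r
# Minimal sketch on simulated data; every name and value here is an assumption.
library(survival)

set.seed(1)
n <- 500  # hypothetical number of team-match observations

goals <- data.frame(
  ref_ability = rnorm(n),            # hypothetical ability score, reference team
  adv_ability = rnorm(n),            # hypothetical ability score, adverse team
  home        = rbinom(n, 1, 0.5),   # 1 if the reference team plays at home
  covid       = rbinom(n, 1, 0.25)   # hypothetical 0/1 Covid-19 indicator
)

# Simulate latent first-goal times (in minutes) from an exponential model
rate <- 0.015 * exp(0.4 * goals$ref_ability - 0.4 * goals$adv_ability +
                      0.2 * goals$home)
latent <- rexp(n, rate)

# No goal within 90 minutes is treated as right-censored at 90
goals$scored  <- as.integer(latent <= 90)
goals$minutes <- pmin(latent, 90)

# Kaplan-Meier estimate of the "no goal yet" curve, split by match location
km <- survfit(Surv(minutes, scored) ~ home, data = goals)
print(km)

# Cox proportional hazards model with the four covariates
cox <- coxph(Surv(minutes, scored) ~ ref_ability + adv_ability + home + covid,
             data = goals)
print(summary(cox))
```

In the Cox output, an exponentiated coefficient above 1 corresponds to a higher hazard of scoring, that is, a shorter expected wait for the first goal.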
