Pitcher Model

This repository contains a script that is deployed as a Google Cloud Function and is currently scheduled to run daily at 7 AM. The entrypoint main.py triggers a sequence that fetches the current day's schedule and runs a predictive model to estimate the number of strikeouts and walks the starting pitcher will record in each game. The model predictions are then written to a database and displayed on my personal website.
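The daily sequence described above can be sketched roughly as follows. This is only an illustration of the flow (schedule, then predictions, then database write), not the code in main.py: the function name `run_daily`, its three callable parameters, and the row field names are all hypothetical.

```python
from datetime import date


def run_daily(today, fetch_schedule, predict_game, write_predictions):
    """Fetch today's games, predict each starter's strikeouts and walks, store the rows.

    All three callables are hypothetical stand-ins for the real ETL, model,
    and export steps; they are passed in so the flow can be tested in isolation.
    """
    rows = []
    for game in fetch_schedule(today):
        strikeouts, walks = predict_game(game)
        rows.append(
            {
                "game_date": today.isoformat(),
                "pitcher": game["pitcher"],
                "pred_strikeouts": strikeouts,
                "pred_walks": walks,
            }
        )
    # A single write at the end keeps the database step separate from prediction.
    write_predictions(rows)
    return rows
```

Passing the steps in as callables is a design choice made for this sketch: it lets the sequence run against stubs without touching a live schedule feed or database.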

Files

exports.py handles writing to the database and is also equipped to send a summary email upon model completion (currently deactivated).
download_game_level_data.py handles the ETL pipeline for extracting game-level variables, including the season and historical stats of the pitcher, the pitcher's team, the batting team, and the specific batters in the lineup. Relevant data is stored in a highly compressed format in the /data directory.
main.py calls the ETL pipeline and trains the model on the most up-to-date data to make predictions for today's games. The current best-performing model is an implementation of gradient-boosted trees (xgboost). To estimate model error on unseen games, the model is first trained with the previous two weeks of data held out, and the MSE on that holdout is saved to a table. The model is then retrained on all past data to generate predictions.
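The holdout-then-refit procedure can be sketched as below. This is a minimal illustration under stated assumptions, not the code in main.py: the row layout (`game_date`, `strikeouts`), the helper names, and the `MeanModel` class are hypothetical, and `MeanModel` is a trivial stand-in for the xgboost model so the example runs with the standard library alone.

```python
from datetime import date, timedelta
from statistics import mean


def split_by_holdout(rows, today, holdout_days=14):
    """Split rows into (train, holdout); holdout is the last `holdout_days` days before `today`."""
    cutoff = today - timedelta(days=holdout_days)
    train = [r for r in rows if r["game_date"] < cutoff]
    holdout = [r for r in rows if cutoff <= r["game_date"] < today]
    return train, holdout


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


class MeanModel:
    """Hypothetical stand-in for the real model: always predicts the training mean."""

    def fit(self, rows):
        self.value = mean(r["strikeouts"] for r in rows)
        return self

    def predict(self, rows):
        return [self.value for _ in rows]


def evaluate_then_refit(rows, today, model_factory=MeanModel):
    """Score a model on the two-week holdout, then refit a fresh model on all past data."""
    train, holdout = split_by_holdout(rows, today)
    scored = model_factory().fit(train)
    holdout_mse = mean_squared_error(
        [r["strikeouts"] for r in holdout], scored.predict(holdout)
    )
    final_model = model_factory().fit(train + holdout)
    return holdout_mse, final_model
```

Splitting by date rather than at random keeps the holdout strictly later than the training data, so the saved MSE reflects how the model performs on games it could not have seen.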

Notebooks

strike_zone_models.ipynb demonstrates the various types of models we can fit to subsets of pitch data.
xgboost_BayesOptimization.ipynb visualizes the feature selection and hyperparameter selection for our model predictions.


TimCSheehan/pitcher_model_deploy
