
An ETL project looking at box office data, IMDB ratings data, and Metacritic scores.


3065923/Movie_Database

 
 


Movie_Database

We like movies! So do most people, but what factors actually lead the average person to crack open their wallet and spend hard-earned cash on a movie ticket? Is it determined by genre? Is the MPAA rating a factor? What about the critical reaction to the film? What choices can producers make, when deciding which movies to produce, to maximize their profits at both the domestic and worldwide box office? Using data from Box Office Mojo, IMDB, and Metacritic, can we create a database to determine which factors lead to the highest chance of profitability?

Data we are using

Box Office Mojo CSV pulled from Kaggle - From this CSV we will pull data on the Movie Title, Movie Year, Budget, Domestic Revenue, International Revenue, Worldwide Revenue, MPAA Rating, Runtime, & Genre
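As a rough illustration of the extract step for the Box Office Mojo file, the sketch below keeps only the fields listed above and drops everything else. It uses only the Python standard library. The column names in `WANTED`, the helper `extract_columns`, and the sample row are all hypothetical, since the actual headers in the Kaggle CSV are not shown here.

```python
import csv
import io

# Hypothetical header names; the real Kaggle CSV may label these differently.
WANTED = [
    "title", "year", "budget", "domestic", "international",
    "worldwide", "mpaa", "run_time", "genre",
]


def extract_columns(csv_file, wanted=WANTED):
    """Return one dict per row, keeping only the wanted columns."""
    reader = csv.DictReader(csv_file)
    missing = [col for col in wanted if col not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"CSV is missing columns: {missing}")
    return [{col: row[col] for col in wanted} for row in reader]


# Made-up sample row, used only to show the shape of the result.
SAMPLE = (
    "title,year,budget,domestic,international,worldwide,mpaa,run_time,genre,extra\n"
    "Example Film,2010,100,150,250,400,PG-13,120,Action,ignored\n"
)

rows = extract_columns(io.StringIO(SAMPLE))
```

With a real file, the same function could be given an open file handle instead of the in-memory sample. Values come back as strings, so numeric fields would still need converting before any analysis.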

IMDB Movie Data CSV pulled from Kaggle -

Metacritic Movie Reviews: 8934 values

IMDB Movie Data: 1000 values

Box Office Mojo: 2476 movies
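Because the three sources differ so much in size, it matters how they are combined. If only films present in all three sources are kept, and titles are unique within each source, the combined set can be no larger than the smallest source. The sketch below shows one way such a join could work, matching on a normalized title. The helpers `normalize_title` and `inner_join_on_title`, the field names, and the sample records are assumptions made for illustration, not the project's actual transform step.

```python
def normalize_title(title):
    """Lowercase and collapse whitespace so titles from different sources match."""
    return " ".join(title.lower().split())


def inner_join_on_title(*sources):
    """Merge lists of dicts on their 'title' key, keeping only titles found in every source."""
    indexed = [{normalize_title(row["title"]): row for row in src} for src in sources]
    common = set(indexed[0]).intersection(*indexed[1:])
    merged = []
    for key in sorted(common):
        combined = {}
        # Apply sources in reverse so the first source wins on shared keys such as 'title'.
        for index in reversed(indexed):
            combined.update(index[key])
        merged.append(combined)
    return merged


# Made-up records with hypothetical field names.
box_office = [
    {"title": "Example Film", "worldwide": 400},
    {"title": "Other Film", "worldwide": 50},
]
imdb = [{"title": "example film", "imdb_rating": 7.1}]
metacritic = [{"title": "Example  Film", "metascore": 65}]

merged = inner_join_on_title(box_office, imdb, metacritic)
```

Matching on title alone would confuse remakes that share a name, so a real join would probably also key on the release year.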

Limitations

Only US films between 2006-2016

What could be improved

Scraping IMDB to find more data for a larger sample set

Assumptions

PG-13 movies and the action genre will be the most profitable, mostly because of superhero movies

Postgres ERD/Histograms
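The assumption that PG-13 and action films are the most profitable could be checked by grouping films and comparing average profit per group. The sketch below assumes profit means worldwide revenue minus budget, which is only one possible definition. The function `mean_profit_by`, the field names, and the sample figures are hypothetical.

```python
from collections import defaultdict


def mean_profit_by(rows, key):
    """Average (worldwide - budget) per distinct value of `key`, e.g. 'mpaa' or 'genre'."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [profit sum, film count]
    for row in rows:
        # Assumed definition of profit: worldwide revenue minus production budget.
        profit = float(row["worldwide"]) - float(row["budget"])
        totals[row[key]][0] += profit
        totals[row[key]][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


# Made-up figures, used only to exercise the function.
films = [
    {"mpaa": "PG-13", "genre": "Action", "budget": "100", "worldwide": "400"},
    {"mpaa": "PG-13", "genre": "Comedy", "budget": "100", "worldwide": "200"},
    {"mpaa": "R", "genre": "Action", "budget": "20", "worldwide": "50"},
]

by_rating = mean_profit_by(films, "mpaa")
by_genre = mean_profit_by(films, "genre")
```

A mean is sensitive to a few very large releases, so a median or a profit-to-budget ratio might describe the typical film in each group more fairly.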


Languages

  • Jupyter Notebook 100.0%