
akuyper/STAT359_CV


STATS 359 - Divvy Bike Data Science Project

Control Variables Team

Olivier Gabison, Mimi Wang, Austin Kim, Akshya Dhinakaran, Sam Dailley

Introduction

The goal of our project is to compile, clean, and analyze Divvy bike data from Chicago, USA. Our responsibility was to determine which control variables to include in the datasheets we analyze, and then to share those datasheets with other groups. We have written R scripts that run on the historical Divvy bike datasets to create datasheets for further analysis.

Directions

Follow these steps to use or build on our scripts:

  1. git clone git@github.com:austinkim118/STAT359_CV.git
  2. After cloning the GitHub repo, move all the datasets you are using into the cloned folder
  3. Modify the R scripts so that they output datasheets to your desired location (and modify any file names as needed)
  4. Run your script!
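As a rough illustration of step 3, one way to make the output location easy to change is to keep it in a single settings block at the top of a script. This is only a sketch in base R: `input_file`, `output_dir`, `make_output_path`, `write_datasheet`, and the `_controls` suffix are hypothetical names chosen for the example and do not come from the project's scripts.

```r
# Hypothetical settings block; adjust these two values for your own setup.
input_file <- "divvy_trips.csv"                  # assumed name of a dataset in the cloned folder
output_dir <- file.path("output", "datasheets")  # desired location for the datasheets

# Build an output path from the input file name,
# e.g. "divvy_trips.csv" becomes "<output_dir>/divvy_trips_controls.csv".
make_output_path <- function(input_file, output_dir, suffix = "_controls") {
  base <- tools::file_path_sans_ext(basename(input_file))
  file.path(output_dir, paste0(base, suffix, ".csv"))
}

# Create the output folder if it does not exist yet, then write the datasheet.
write_datasheet <- function(df, input_file, output_dir) {
  dir.create(output_dir, recursive = TRUE, showWarnings = FALSE)
  out <- make_output_path(input_file, output_dir)
  write.csv(df, out, row.names = FALSE)
  invisible(out)
}
```

With helpers like these, changing `output_dir` in one place redirects every datasheet the script writes.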

Size of dataset issues

Because the dataset files were so large (up to 35 GB), we were unable to run our scripts locally on our machines. To get around this, we split our data by year and ran the R scripts on Quest.
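Splitting a large CSV by year can be sketched in base R by reading the file in chunks, so the whole dataset never has to fit in memory. This is not necessarily how our scripts do it. The function name `split_csv_by_year`, the `start_time` column, and the `trips_<year>.csv` file names are assumptions made for the example. It also assumes that each record sits on a single line and that timestamps begin with a four-digit year (for example `2019-01-02 11:00:00`); other date formats would need a different year extraction.

```r
# Split a large CSV into one file per year, reading a chunk of lines at a time.
split_csv_by_year <- function(path, out_dir, date_col = "start_time",
                              chunk_size = 100000L) {
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  con <- file(path, open = "r")
  on.exit(close(con))

  # Read the header once so every chunk can reuse the column names.
  header_line <- readLines(con, n = 1L)
  col_names <- scan(text = header_line, what = character(), sep = ",", quiet = TRUE)
  if (!(date_col %in% col_names)) stop("Column not found: ", date_col)

  seen <- character(0)  # years already written during this run
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break

    # Keep every column as text so values are copied through unchanged.
    chunk <- read.csv(text = lines, header = FALSE, col.names = col_names,
                      colClasses = "character", check.names = FALSE)

    # Assumption: the first four characters of the timestamp are the year.
    pieces <- split(chunk, substr(chunk[[date_col]], 1, 4))

    for (yr in names(pieces)) {
      out <- file.path(out_dir, paste0("trips_", yr, ".csv"))
      first <- !(yr %in% seen)  # first write this run: overwrite and add header
      write.table(pieces[[yr]], out, sep = ",", row.names = FALSE,
                  col.names = first, append = !first, qmethod = "double")
      seen <- union(seen, yr)
    }
  }
  invisible(sort(seen))
}
```

Each per-year file can then be processed as a separate job, which keeps memory use bounded by `chunk_size` rather than by the size of the full dataset.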

Important Things to Know

  • Due to the extremely large sizes of our datasets, we ran all our R scripts on Quest, Northwestern's high-performance computing system.
  • To download data from the City of Chicago (historical Divvy bike data, community areas), you need to create an account on the website and request an API token.
  • You can modify the R scripts to adjust the boundaries for the community areas. However, when doing so, keep in mind that all the shapefiles must be downloaded into the same folder for the scripts to work.
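Because a shapefile is really a set of sibling files, a quick check before running a script can catch a missing piece early. The sketch below uses base R only. `find_missing_shapefile_parts` and the example path are hypothetical, and the extension list reflects the usual shapefile layout (`.shp`, `.shx`, and `.dbf` are required; `.prj` is optional but commonly shipped), not anything specific to our scripts.

```r
# Return the paths of shapefile components that are missing next to `shp_path`.
# An empty result means every listed component is present in the same folder.
find_missing_shapefile_parts <- function(shp_path,
                                         extensions = c("shp", "shx", "dbf", "prj")) {
  base <- tools::file_path_sans_ext(shp_path)
  parts <- paste0(base, ".", extensions)
  parts[!file.exists(parts)]
}

# Example usage (hypothetical path):
# missing <- find_missing_shapefile_parts("community_areas.shp")
# if (length(missing) > 0) stop("Missing shapefile parts: ", toString(missing))
```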

Citations

Divvy Bike Official Website

Data Sets Used
