Problem Set 1 for Big Data and Economics

Problem Set 1 for Big Data Class

Due: 9/24/2023

Group problem set

For this project you will work in groups of two. You may choose your own partner. This will test your ability to collaborate on GitHub.

Problem Set description

In this problem set, you will experiment with cleaning a messy dataset and producing some basic results. You will replicate graphs from the paper Diversifying Society’s Leaders? The Determinants and Causal Effects of Admission to Highly Selective Private Colleges by Raj Chetty, David J. Deming, and John N. Friedman (and a variety of co-authors). This working paper uses anonymized admissions data from 139 elite colleges linked to income tax records to ask two questions: whether these highly selective schools show a preference for high-income students beyond what SAT/ACT scores explain, and how attending one of these schools affects future earnings.

See a non-technical summary and a New York Times report for more information. Please familiarize yourself with this working paper.

I have provided starter code, which you will use to complete this project. The code includes:

  • housekeeping.R
  • download_data.R
  • clean_data.R
  • PS1_writeup.Rmd

The problem set questions are in PS1_writeup.Rmd.

Grading

I will be grading this problem set based on the following criteria:

  • Quality of code (33%): Is it well-commented? Is it easy to follow? Can I run it?
  • Quality of graphs (33%): Are they well-labeled? Do they have titles? Do they have legends? Are they formatted well?
  • Quality of answers (33%): Are they clear? Do they answer the question?

Submitting project

In order to submit this project, you will need to:

  1. Stage your changes using RStudio, GitHub Desktop, or git from the command line
  2. Commit these changes
  3. Pull from the repository (always pull before you push)
  4. Push your changes to the repository
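
If you work from the command line, the four steps above can be sketched as a small shell function. This is only an illustration: `submit_changes` is a hypothetical helper name that is not part of the starter code, and the sketch assumes you run it inside your clone of the repository, with a remote already configured and your Git identity set.

```shell
# Hypothetical helper (not part of the starter code): runs the four
# submission steps in order and stops at the first one that fails.
# Usage: submit_changes "Your commit message"
submit_changes() {
  # 1. Stage every new, modified, and deleted file in the repository.
  git add -A || return 1
  # 2. Commit the staged changes with the message passed as the first argument.
  git commit -m "$1" || return 1
  # 3. Pull your partner's work before pushing (always pull before you push).
  git pull || return 1
  # 4. Push your commit(s) to the shared repository.
  git push
}
```

For example, `submit_changes "Answer question 1"` would stage, commit, pull, and push in one go. If you and your partner have both committed since the last pull, recent versions of Git may ask how to reconcile the divergent branches; `git pull --no-rebase` requests an ordinary merge.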

ChatGPT/GitHub Copilot

I encourage you to actively use generative AI to assist with writing code for this assignment.
