Skip to content

Commit

Permalink
Initial commit
Browse files Browse the repository at this point in the history
  • Loading branch information
github-classroom[bot] authored Mar 20, 2024
0 parents commit 20b768a
Show file tree
Hide file tree
Showing 49 changed files with 118,375 additions and 0 deletions.
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
/.DS_Store
/.ipynb_checkpoints
55 changes: 55 additions & 0 deletions ATMS 523 HW 5.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
# ATMS 523

## Module 4 Project

Submit this code as a pull request back to GitHub Classroom by the date and time listed in Canvas.

For this assignment, use the dataset called `radar_parameters.csv` provided in the GitHub repository in the folder `homework`.

The provided data has

## Dataset Description

The training data consists of polarimetric radar parameters calculated from a disdrometer (an instrument that measures rain drop sizes, shapes, and rainfall rate) measurements from several years in Huntsville, Alabama. A model called `pytmatrix` is used to calculate polarimetric radar parameters from the droplet observations, which can be used as a way to compare what a remote sensing instrument would see and rainfall.

## Data columns

Features (radar measurements):

`Zh` - radar reflectivity factor (dBZ) - use the formula $dBZ = 10\log_{10}(Z)$

`Zdr` - differential reflectivity

`Ldr` - linear depolarization ratio

`Kdp` - specific differential phase

`Ah` - specific attenuation

`Adp` - differential attenuation

Target :

`R` - rain rate

## Your assignment

1. Split the data into a 70-30 split for training and testing data.

2. Using the split created in (1), train a multiple linear regression dataset using the training dataset, and validate it using the testing dataset. Compare the $R^2$ and root mean square errors of model on the training and testing sets to a baseline prediction of rain rate using the formula $Z = 200 R^{1.6}$.

3. Repeat 1 doing a grid search over polynomial orders, using a grid search over orders 0-21, and use cross-validation of 7 folds. For the best polynomial model in terms of $R^2$, does it outperform the baseline and the linear regression model in terms of $R^2$ and root mean square error?

4. Repeat 1 with a Random Forest Regressor, and perform a grid_search on the following parameters:

```python
{'bootstrap': [True, False],
'max_depth': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, None],
'max_features': ['auto', 'sqrt'],
'min_samples_leaf': [1, 2, 4],
'min_samples_split': [2, 5, 10],
'n_estimators': [200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2000]}
```
Can you beat the baseline, or the linear regression, or best polynomial model with the best optimized Random Forest Regressor in terms of $R^2$ and root mean square error?


1,341 changes: 1,341 additions & 0 deletions BicycleWeather.csv

Large diffs are not rendered by default.

Loading

0 comments on commit 20b768a

Please sign in to comment.