generated from swnesbitt/ATMS-523-Module-5
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
0 parents
commit 20b768a
Showing
49 changed files
with
118,375 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
/.DS_Store | ||
/.ipynb_checkpoints |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
# ATMS 523 | ||
|
||
## Module 4 Project | ||
|
||
Submit this code as a pull request back to GitHub Classroom by the date and time listed in Canvas. | ||
|
||
For this assignment, use the dataset called `radar_parameters.csv` provided in the GitHub repository in the folder `homework`. | ||
|
||
The provided data has | ||
|
||
## Dataset Description | ||
|
||
The training data consists of polarimetric radar parameters calculated from a disdrometer (an instrument that measures rain drop sizes, shapes, and rainfall rate) measurements from several years in Huntsville, Alabama. A model called `pytmatrix` is used to calculate polarimetric radar parameters from the droplet observations, which can be used as a way to compare what a remote sensing instrument would see and rainfall. | ||
|
||
## Data columns | ||
|
||
Features (radar measurements): | ||
|
||
`Zh` - radar reflectivity factor (dBZ) - use the formula $dBZ = 10\log_{10}(Z)$ | ||
|
||
`Zdr` - differential reflectivity | ||
|
||
`Ldr` - linear depolarization ratio | ||
|
||
`Kdp` - specific differential phase | ||
|
||
`Ah` - specific attenuation | ||
|
||
`Adp` - differential attenuation | ||
|
||
Target : | ||
|
||
`R` - rain rate | ||
|
||
## Your assignment | ||
|
||
1. Split the data into a 70-30 split for training and testing data. | ||
|
||
2. Using the split created in (1), train a multiple linear regression dataset using the training dataset, and validate it using the testing dataset. Compare the $R^2$ and root mean square errors of model on the training and testing sets to a baseline prediction of rain rate using the formula $Z = 200 R^{1.6}$. | ||
|
||
3. Repeat 1 doing a grid search over polynomial orders, using a grid search over orders 0-21, and use cross-validation of 7 folds. For the best polynomial model in terms of $R^2$, does it outperform the baseline and the linear regression model in terms of $R^2$ and root mean square error? | ||
|
||
4. Repeat 1 with a Random Forest Regressor, and perform a grid_search on the following parameters: | ||
|
||
```python | ||
{'bootstrap': [True, False], | ||
'max_depth': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, None], | ||
'max_features': ['auto', 'sqrt'], | ||
'min_samples_leaf': [1, 2, 4], | ||
'min_samples_split': [2, 5, 10], | ||
'n_estimators': [200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2000]} | ||
``` | ||
Can you beat the baseline, or the linear regression, or best polynomial model with the best optimized Random Forest Regressor in terms of $R^2$ and root mean square error? | ||
|
||
|
Large diffs are not rendered by default.
Oops, something went wrong.
Oops, something went wrong.