Task3 #3

Merged
merged 5 commits into from
Oct 6, 2024
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,2 +1,3 @@
 cleaned_data.csv
-processed_data.csv
+processed_data.csv
+df_woe.csv
9 changes: 7 additions & 2 deletions README.md
@@ -28,7 +28,12 @@ This notebook focuses on feature engineering to enhance the dataset for modeling
- **Encoding Categorical Variables**: Applying Weight of Evidence (WOE) transformation to categorical features for better model interpretability.
- **Handling Missing Values**: Implementing strategies for filling or removing missing values in the dataset.
- **Normalization/Standardization**: Scaling numerical features to ensure they are on a similar scale, improving model performance.

### 4. `Default_Estimator_and_WOE_Binning.ipynb`
This notebook focuses on feature engineering using the RFMS (Recency, Frequency, Monetary, Seniority) formalism and applying Weight of Evidence (WoE) binning for customer risk classification. The main steps include:
- **RFMS Feature Engineering**: Calculating Recency, Frequency, Monetary, and Seniority features from the transaction data.
- **Risk Label Assignment**: Classifying customers as 'good' or 'bad' based on their RFMS score.
- **WoE Binning**: Transforming RFMS features using WoE based on the RiskLabel.
- **Information Value (IV) Calculation**: Evaluating the importance of each RFMS feature using IV to assess predictive power.
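
The notebook's own diff is not rendered on this page, so the following is only a minimal sketch of how the RFMS, WoE, and IV steps listed above might look in pandas. The column names (`CustomerId`, `TransactionId`, `TransactionStartTime`, `Amount`), the function names, the quantile binning, and the smoothing constant `eps` are assumptions made for illustration and are not taken from the notebook. The rule that turns an RFMS score into a 'good'/'bad' label is not shown in this PR, so the sketch takes the label as an input. WoE is computed here as ln(share of good / share of bad) per bin; some references use the opposite sign.

```python
import numpy as np
import pandas as pd


def rfms_features(tx: pd.DataFrame, snapshot: pd.Timestamp) -> pd.DataFrame:
    """Aggregate transactions into per-customer RFMS features.

    Assumes hypothetical columns: CustomerId, TransactionId,
    TransactionStartTime (datetime64) and Amount.
    """
    g = tx.groupby("CustomerId")
    return pd.DataFrame({
        # Days since the customer's most recent transaction
        "Recency": (snapshot - g["TransactionStartTime"].max()).dt.days,
        # Number of transactions
        "Frequency": g["TransactionId"].count(),
        # Total transaction amount
        "Monetary": g["Amount"].sum(),
        # Days since the customer's first transaction
        "Seniority": (snapshot - g["TransactionStartTime"].min()).dt.days,
    })


def woe_iv(feature: pd.Series, is_bad: pd.Series, bins: int = 4, eps: float = 0.5):
    """Return a per-bin WoE table and the total IV for one numeric feature.

    `is_bad` is 1 for 'bad' customers and 0 for 'good' ones. `eps` is an
    assumed smoothing constant that keeps empty cells from producing log(0).
    """
    df = pd.DataFrame({
        "bin": pd.qcut(feature, q=bins, duplicates="drop"),
        "bad": is_bad.astype(int),
    })
    table = df.groupby("bin", observed=True)["bad"].agg(bad="sum", total="count")
    table["good"] = table["total"] - table["bad"]

    # Share of all good / bad customers that falls into each bin
    dist_good = (table["good"] + eps) / (table["good"] + eps).sum()
    dist_bad = (table["bad"] + eps) / (table["bad"] + eps).sum()

    table["woe"] = np.log(dist_good / dist_bad)
    table["iv"] = (dist_good - dist_bad) * table["woe"]
    return table, float(table["iv"].sum())
```

Under these assumptions, calling `woe_iv(rfms["Recency"], labels)` for each RFMS column would give one IV value per feature, which can then be compared to rank the features by predictive power.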

## Requirements
To run the notebooks, you will need:
@@ -49,7 +54,7 @@ To run the notebooks, you will need:


## Conclusion
-The outputs from the EDA and feature engineering notebooks will be utilized in subsequent modeling tasks to develop a robust credit scoring model. Your contributions and feedback are welcome!
+The outputs from the EDA, feature engineering and WOE binning notebooks will be utilized in subsequent modeling tasks to develop a robust credit scoring model. Your contributions and feedback are welcome!

---

721 changes: 721 additions & 0 deletions notebooks/Default_Estimator_and_WOE_Binning.ipynb

Large diffs are not rendered by default.