yohannestayez · Oct 4, 2024
diff --git a/README.md b/README.md
@@ -20,19 +20,33 @@ This notebook performs Exploratory Data Analysis (EDA) on the cleaned dataset. T
 - **Correlation Analysis**: Understanding relationships between numerical features.
 - **Outlier Detection**: Using box plots to identify potential outliers.
 
+
+### 3. `Feature_Engineering.ipynb`
+This notebook focuses on feature engineering to enhance the dataset for modeling. Key tasks include:
+- **Aggregate Features**: Creating new features such as total transaction amount, average transaction amount, transaction count, and standard deviation of transaction amounts for each customer.
+- **Time-Based Features**: Extracting features from the transaction timestamp (hour, day, month, year).
+- **Encoding Categorical Variables**: Applying Weight of Evidence (WOE) transformation to categorical features for better model interpretability.
+- **Handling Missing Values**: Implementing strategies for filling or removing missing values in the dataset.
+- **Normalization/Standardization**: Rescaling numerical features to a comparable range so that features with large magnitudes do not dominate scale-sensitive models.
+
+
 ## Requirements
 To run the notebooks, you will need:
 - Python 3.x
 - Pandas
+- NumPy
 - Matplotlib
 - Seaborn
+- Scikit-learn
+- Scorecardpy
 
 
 ## Getting Started
 1. Clone the repository or download the project files.
 2. Install the required packages listed in `requirements.txt`.
 3. Open the notebooks in Jupyter Notebook or any compatible IDE.
-4. Execute the cells in the order provided to complete the data cleaning and EDA.
+4. Execute the cells in the order provided to complete the data cleaning, EDA, and feature engineering processes.
+
 
 ## Conclusion
 The outputs from the EDA and feature engineering notebooks will be utilized in subsequent modeling tasks to develop a robust credit scoring model. Your contributions and feedback are welcome!
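
The Weight of Evidence (WOE) encoding listed under `Feature_Engineering.ipynb` can be illustrated with a small, library-free sketch. This is not the notebook's code, and it does not use Scorecardpy; it only shows the underlying calculation. It assumes a binary target where 1 means "bad" and 0 means "good", and it assumes the convention WOE = ln(share of goods / share of bads); some tools use the inverse ratio, which flips the sign. The function name, the `smoothing` parameter, and the toy `channels` / `defaulted` data are hypothetical.

```python
import math
from collections import Counter


def weight_of_evidence(categories, targets, smoothing=0.5):
    """Return {category: WOE} for a binary target (1 = bad, 0 = good).

    Assumed convention: WOE = ln(share of goods / share of bads).
    `smoothing` is added to every count so that a category with no
    goods or no bads does not produce log(0) or a division by zero.
    """
    good = Counter()
    bad = Counter()
    for cat, y in zip(categories, targets):
        if y == 1:
            bad[cat] += 1
        else:
            good[cat] += 1

    levels = set(good) | set(bad)
    # Totals include the smoothing mass so the shares still sum to 1.
    total_good = sum(good.values()) + smoothing * len(levels)
    total_bad = sum(bad.values()) + smoothing * len(levels)

    woe = {}
    for cat in levels:
        good_share = (good[cat] + smoothing) / total_good
        bad_share = (bad[cat] + smoothing) / total_bad
        woe[cat] = math.log(good_share / bad_share)
    return woe


# Hypothetical toy data: a categorical column and a default flag.
channels = ["web", "web", "web", "app", "app", "app", "app", "pos"]
defaulted = [0, 0, 1, 0, 1, 1, 1, 0]

woe_map = weight_of_evidence(channels, defaulted)
# Replace each category label with its WOE value.
encoded = [woe_map[c] for c in channels]
```

Under this convention, a positive WOE marks a category with relatively more goods than bads, and a negative WOE marks the opposite. In practice the mapping should be fitted on training data only and then applied to validation and test data.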