The primary objective of this project is to analyze the Breast Cancer Dataset from Kaggle. Breast cancer is the most prevalent cancer among women globally. To contribute to cancer detection, the project classifies tumors as malignant (cancerous) or benign (non-cancerous) using deep learning techniques.
Breast cancer accounts for 25% of all cancer cases and affected over 2.1 million people in 2015 alone. The dataset provides features that describe tumor characteristics: radius, texture, perimeter, area, smoothness, compactness, concavity, concave points, symmetry, and fractal dimension. The key challenge lies in accurately classifying tumors based on these features.
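As a rough illustration of how these features are usually laid out, the sketch below builds the expected column names. It assumes the Kaggle file follows the Wisconsin Diagnostic layout, where each of the ten measurements appears three times (mean, standard error, and "worst" value) and the diagnosis is coded as `M` or `B`. The suffixes, the label codes, and the helper names `expected_feature_columns` and `encode_diagnosis` are assumptions for illustration, not taken from the project itself.

```python
# Base measurements listed in the dataset description.
BASE_FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave points", "symmetry",
    "fractal_dimension",
]

# Assumed suffixes: mean, standard error, and "worst" value per tumor.
STATISTICS = ["mean", "se", "worst"]


def expected_feature_columns():
    """Return the assumed feature column names, grouped by statistic."""
    return [f"{feature}_{stat}" for stat in STATISTICS for feature in BASE_FEATURES]


def encode_diagnosis(label):
    """Map the assumed diagnosis codes to integers: 'M' -> 1, 'B' -> 0."""
    mapping = {"M": 1, "B": 0}
    if label not in mapping:
        raise ValueError(f"unexpected diagnosis label: {label!r}")
    return mapping[label]
```

Once the CSV is loaded, comparing this list against the actual column names is a quick way to confirm whether the assumed layout holds before any modeling starts.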
To execute this project, the following Python libraries are required:
- Pandas
- NumPy
- Matplotlib
- Seaborn
- Scikit-learn
- TensorFlow
- Keras
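A quick way to confirm that the environment is ready is to check that each library can be found before running the analysis. The sketch below uses only the standard library; the helper name `missing_modules` is hypothetical, and the import names are the usual ones for these packages (scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names for the libraries listed above; note that scikit-learn
# is imported as "sklearn".
REQUIRED_MODULES = [
    "pandas", "numpy", "matplotlib", "seaborn",
    "sklearn", "tensorflow", "keras",
]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

This check only reports whether a module can be located; it does not verify versions or compatibility between TensorFlow and Keras.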
The project is organized into the following sections:
- Objectives and Goal
- Setup
- Data Description
- Feature Engineering
- Tools and Functions
- Exploratory Data Analysis
- Data Preprocessing
- Artificial Neural Network (ANN)
- Conclusions
This outline sets the stage for a comprehensive analysis of the Breast Cancer Dataset, including feature engineering, exploratory data analysis, and the implementation of an Artificial Neural Network for predictive modeling.