Vaccine Usage Analysis and Prediction

Project Overview

This project aims to predict the likelihood of individuals taking an H1N1 flu vaccine based on their characteristics and attitudes. By analyzing a dataset containing various features related to behaviors, perceptions, and demographics, we build a predictive model that can help healthcare professionals and policymakers target vaccination campaigns more effectively.

Problem Statement

The goal is to predict the probability of individuals taking an H1N1 flu vaccine using various machine learning algorithms. The dataset includes information about individuals' behaviors, perceptions, and demographics. We aim to determine which model provides the best accuracy for this prediction task.

Dataset

The dataset used for this project includes features related to:

Demographics (age, sex, income, etc.)
Behaviors (frequency of hand washing, doctor visits, etc.)
Perceptions (worry about H1N1, awareness of vaccine, etc.)

Algorithms and Performance

We compared several machine learning algorithms to determine which provides the best accuracy for predicting vaccine acceptance. The algorithms and their accuracies are as follows:

Logistic Regression: 84.6%
Decision Tree: 77%
XGBoost: 83%
AdaBoost: 85%
Random Forest: 84%

Chosen Model

The Random Forest algorithm was selected for its balance of accuracy and interpretability. While AdaBoost provided slightly higher accuracy, the Random Forest model was chosen for its robustness and performance consistency.

Implementation

Dependencies

The project requires the following libraries:

Python 3.7+
Pandas
NumPy
Scikit-learn
Matplotlib
Seaborn

Setup

Clone the repository:

git clone https://github.com/yourusername/vaccine-usage-analysis.git
cd vaccine-usage-analysis

Install the required libraries:
```
pip install -r requirements.txt
```

Data Preprocessing

The dataset is preprocessed to handle missing values, encode categorical variables, and normalize numerical features. Key preprocessing steps include:

Filling missing values with appropriate imputation techniques.
Encoding categorical features using one-hot encoding.
Normalizing numerical features to ensure they have a similar scale.

Model Training

The preprocessed data is split into training and testing sets. The Random Forest model is trained using the training set and evaluated on the testing set. Hyperparameter tuning is performed using GridSearchCV to optimize the model's performance.

Evaluation

The model's performance is evaluated using accuracy, precision, recall, and F1-score. The confusion matrix is also analyzed to understand the model's predictions better.

Results

The Random Forest model achieved an accuracy of 84%, making it a reliable choice for predicting vaccine acceptance. Other models' performances are compared to highlight the strengths and weaknesses of each approach.

Visualizations

The following visualizations are included in the analysis:

Count of H1N1 worry levels
Number of health workers
Sex distribution of vaccine consumers
H1N1 awareness levels
Frequency of hand washing

Conclusion

The project successfully demonstrates the use of machine learning algorithms to predict vaccine acceptance. The insights gained can help healthcare professionals and policymakers design more effective vaccination campaigns.

Future Work

Future improvements can include:

Incorporating additional features to enhance model accuracy.
Exploring more advanced algorithms and ensemble techniques.
Deploying the model as a web application for real-time predictions.

Repository Structure

data/: Contains the dataset.
notebooks/: Jupyter notebooks for data exploration and model training.
scripts/: Python scripts for preprocessing and model training.
results/: Results and visualizations.
README.md: Project documentation.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
Dashboard.ipynb		Dashboard.ipynb
Project07_ran_forst.ipynb		Project07_ran_forst.ipynb
README.md		README.md
coronavirus.png		coronavirus.png
dashboard_data.csv		dashboard_data.csv
fff.png		fff.png
github.png		github.png
hw.png		hw.png
instagram.png		instagram.png
linkedin.png		linkedin.png
project07_encoded.csv		project07_encoded.csv
race.png		race.png
requirements.txt		requirements.txt
sex.png		sex.png
trial.ipynb		trial.ipynb
trial.py		trial.py
updated trial .ipynb		updated trial .ipynb
vaccine02.png		vaccine02.png
vaccine_.png		vaccine_.png
workout.ipynb		workout.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Vaccine Usage Analysis and Prediction

Project Overview

Problem Statement

Dataset

Algorithms and Performance

Chosen Model

Implementation

Dependencies

Setup

Data Preprocessing

Model Training

Evaluation

Results

Visualizations

Conclusion

Future Work

Repository Structure

About

Releases

Packages

Languages

thiruppu/H1N1-Vaccine-Prediction-using-Machine-Learning

Folders and files

Latest commit

History

Repository files navigation

Vaccine Usage Analysis and Prediction

Project Overview

Problem Statement

Dataset

Algorithms and Performance

Chosen Model

Implementation

Dependencies

Setup

Data Preprocessing

Model Training

Evaluation

Results

Visualizations

Conclusion

Future Work

Repository Structure

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages