Detection of spam emails and spam URLs by classification with machine learning
The data sets used in the project were taken from the addresses below:
- Spam Email Data Set ➡️ https://www.kaggle.com/mfaisalqureshi/spam-email?select=spam.csv
- Spam URL Data Set ➡️ https://www.kaggle.com/shivamb/spam-url-prediction
Thank you to the friends who provided these datasets.
- The email data set (spam.csv) contains a total of 5572 records: 83% safe and 17% spam.
- The URL data set (url_spam_classification.csv) contains a total of 148,303 records: 68% safe and 32% spam.
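
The class shares quoted above can be derived from a label column by counting. The snippet below is a minimal sketch using only the standard library; the label names `safe` and `spam` and the toy list are assumptions for illustration, not the actual column values of the CSV files.

```python
from collections import Counter


def class_shares(labels):
    """Return each label's share of the data set as a whole-number percentage."""
    counts = Counter(labels)
    total = len(labels)
    return {label: round(100 * n / total) for label, n in counts.items()}


# Hypothetical toy labels; the real ones would be read from the data set's label column.
toy_labels = ["safe"] * 5 + ["spam"] * 1
print(class_shares(toy_labels))
```

Because both data sets are imbalanced, the share of the majority class is a useful reference point when reading the success rates.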
Classification algorithms used for spam email detection:
* Decision Tree
* K-Nearest Neighbors
* Random Forest
* Support Vector Machine
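
The four email classifiers listed above could be trained along the following lines with scikit-learn. This is only a sketch under assumptions: the README does not say how the messages were turned into features, so the TF-IDF step, the hyperparameters, and the toy messages are all illustrative choices rather than the project's actual setup.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Toy messages invented for illustration; the project uses spam.csv instead.
messages = [
    "Win a free prize now, click here",
    "Free entry: claim your cash prize today",
    "Urgent! You have won a free ticket",
    "Claim your free reward, call now",
    "Are we still meeting for lunch tomorrow?",
    "Can you send me the report by Friday?",
    "I will call you when I get home",
    "Thanks for your help with the project",
]
labels = ["spam", "spam", "spam", "spam", "safe", "safe", "safe", "safe"]

# Assumed hyperparameters; the README does not state the ones actually used.
classifiers = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=3),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Support Vector Machine": SVC(kernel="linear"),
}


def fit_email_models(texts, targets):
    """Fit one TF-IDF + classifier pipeline per algorithm and return them by name."""
    models = {}
    for name, clf in classifiers.items():
        model = make_pipeline(TfidfVectorizer(), clf)
        model.fit(texts, targets)
        models[name] = model
    return models


models = fit_email_models(messages, labels)
for name, model in models.items():
    print(name, "->", model.predict(["free prize, click now"])[0])
```

Wrapping the vectorizer and the classifier in one pipeline keeps the vocabulary learned during training tied to the model, so new messages are transformed the same way at prediction time.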
Classification algorithms used for spam URL detection:
* Decision Tree
* Stochastic Gradient Descent
* Multinomial Naive Bayes
* Linear Support Vector Classifier (LinearSVC)
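
For the URL classifiers, each URL first has to be split into tokens. The sketch below assumes a simple bag-of-words representation that splits on every non-alphanumeric character; this tokenization, the 0/1 labels, and the example URLs are assumptions made for illustration, since the README does not describe the features the project used.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Invented example URLs; the project uses url_spam_classification.csv instead.
urls = [
    "http://free-prizes.example.net/win?id=123",
    "http://example.org/claim/free/gift-card",
    "http://cheap-pills.example.biz/buy-now",
    "http://example.info/win/free/cash",
    "https://example.com/docs/getting-started",
    "https://example.com/blog/2021/release-notes",
    "https://example.edu/courses/machine-learning",
    "https://example.org/about/contact",
]
is_spam = [1, 1, 1, 1, 0, 0, 0, 0]  # assumed encoding: 1 = spam, 0 = safe


def make_url_vectorizer():
    """Count tokens obtained by splitting a lowercased URL on non-alphanumerics."""
    return CountVectorizer(token_pattern=r"[A-Za-z0-9]+")


# Assumed default-style settings; the actual hyperparameters are not given.
url_classifiers = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Stochastic Gradient Descent": SGDClassifier(random_state=0),
    "Multinomial Naive Bayes": MultinomialNB(),
    "LinearSVC": LinearSVC(),
}

url_models = {
    name: make_pipeline(make_url_vectorizer(), clf).fit(urls, is_spam)
    for name, clf in url_classifiers.items()
}

for name, model in url_models.items():
    print(name, "->", model.predict(["http://example.net/free/prizes"])[0])
```

Raw token counts are used here rather than scaled features because Multinomial Naive Bayes expects non-negative inputs; the other three classifiers accept the same matrix.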
Success rates (%) for spam email detection:
* K-Nearest Neighbors: 93.89806173725772
* Random Forest: 97.05671213208902
* Decision Tree: 96.76956209619526
* Support Vector Machine: 97.98994974874373
Success rates (%) for spam URL detection:
* Decision Tree: 98.80245981227749
* LinearSVC: 98.52465206602655
* Stochastic Gradient Descent: 95.27457115114899
* Multinomial Naive Bayes: 91.10206063221491
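
The README does not state how the success rate was computed. A common reading, assumed here, is plain accuracy on held-out data expressed as a percentage: the number of correct predictions divided by the number of test samples, times 100. The function name and the toy labels below are hypothetical.

```python
def success_rate(y_true, y_pred):
    """Percentage of predictions that match the true labels (accuracy * 100)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


# Toy labels for illustration: 4 of the 5 predictions are correct.
actual = ["spam", "safe", "safe", "spam", "safe"]
predicted = ["spam", "safe", "spam", "spam", "safe"]
print(success_rate(actual, predicted))  # 80.0
```

Under this definition, a model that always answers "safe" would score roughly 83 on an evaluation set that mirrors the email data set's 83% safe share, so the figures are best read against that baseline.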