Message Spam Detection Using Machine Learning

Introduction

The Message Spam Detection project is designed to identify and classify messages as either spam or non-spam using machine learning techniques. The project utilizes the Naive Bayes classifier for its effectiveness in text classification tasks. Additionally, a web interface is developed using Flask to provide users with a seamless experience for interacting with the spam detection system.

Dataset

The dataset used for this project is the SMS Spam Collection dataset from the UCI Machine Learning Repository. The dataset contains 5,574 messages, of which 4,827 are non-spam and 747 are spam. The messages are labeled as either spam or non-spam, and the dataset is split into a training set and a test set.
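As a rough illustration of how the raw data can be read, the sketch below assumes the file keeps the tab-separated "label, then message" layout used by the SMS Spam Collection, with labels `ham` and `spam`. The sample rows and the `load_messages` helper are made up for this example and are not taken from the project.

```python
import csv
import io
from collections import Counter

# Made-up rows in the "label<TAB>message" layout; not real dataset entries.
SAMPLE = (
    "ham\tAre we still meeting at noon?\n"
    "spam\tWINNER! Claim your free prize now\n"
    "ham\tCall me when you get home\n"
)


def load_messages(handle):
    """Return (label, text) pairs from a tab-separated file object."""
    # QUOTE_NONE keeps quotation marks inside messages from confusing the parser.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [(row[0], row[1]) for row in reader if len(row) == 2]


rows = load_messages(io.StringIO(SAMPLE))
label_counts = Counter(label for label, _ in rows)
print(label_counts)
```

With the real file, the same helper could be given an open file handle in place of the in-memory sample.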

Preprocessing

The preprocessing of the dataset involves the following steps:

Tokenization: Splitting the messages into individual words
Removing stop words: Eliminating common words that do not provide meaningful information
Stemming: Reducing words to their root form
Vectorization: Converting the messages into numerical vectors
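The four steps above can be sketched with the standard library alone. This is only an illustration: the stop-word list is a tiny hand-picked set, `crude_stem` is a rough stand-in for a real stemmer such as the ones NLTK provides, and all function names are hypothetical rather than taken from the project notebook.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "to", "you", "your", "and", "of", "for"}


def tokenize(text):
    """Lowercase the message and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(word):
    """Strip a few common suffixes; a rough stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, drop stop words, then stem what is left."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def vectorize(messages):
    """Build a sorted vocabulary and one word-count vector per message."""
    token_lists = [preprocess(m) for m in messages]
    vocab = sorted({t for tokens in token_lists for t in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


vocab, vectors = vectorize(["Claim your prizes now", "Claiming the prize"])
print(vocab)    # ['claim', 'now', 'prize']
print(vectors)  # [[1, 1, 1], [1, 0, 1]]
```

Both messages map to overlapping vectors because stemming reduces "prizes" and "Claiming" to the same terms as "prize" and "Claim".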

Model

The Naive Bayes classifier is used to classify the messages as spam or non-spam. The model is trained on the training set and evaluated on the test set. The performance of the model is measured using metrics such as accuracy, precision, recall, and F1 score.
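A minimal sketch of the train-and-evaluate loop with scikit-learn is shown below. It assumes the multinomial variant of Naive Bayes on word counts, which this README does not confirm, and it uses a handful of made-up messages instead of the real dataset, so the scores it prints say nothing about the project's actual results.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Made-up toy messages; 1 = spam, 0 = non-spam.
texts = [
    "win a free prize now",
    "free cash winner claim today",
    "urgent claim your free reward",
    "winner winner free tickets",
    "are we meeting for lunch",
    "call me when you get home",
    "see you at the office tomorrow",
    "thanks for the lunch yesterday",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hold out a quarter of the messages, keeping the class balance in both parts.
X_train_text, X_test_text, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels
)

# Fit the vocabulary on the training messages only, then reuse it for the test set.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(X_train_text)
X_test = vectorizer.transform(X_test_text)

model = MultinomialNB()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
    "f1": f1_score(y_test, y_pred, zero_division=0),
}
print(metrics)
```

Fitting the vectorizer on the training split alone avoids leaking test-set vocabulary into the model.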

For detailed code, refer to the following link:

https://github.com/Abhigyann-Singh/Message-Phishing-ML-Detection/blob/main/Mlmodels/spamsms.ipynb

Web Interface

The web interface is developed using Flask, a Python web framework. The interface allows users to input a message and receive a prediction of whether the message is spam or non-spam. The interface also provides visualizations of the model's performance metrics.
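The prediction flow could look roughly like the following Flask sketch. The `/predict` route, the JSON payload shape, and the `predict_label` placeholder are assumptions made for illustration; the project's `app.py` may instead render HTML templates and call the trained model directly.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_label(message):
    """Placeholder for the trained model; it only flags a few trigger words."""
    triggers = {"free", "winner", "prize"}
    words = set(message.lower().split())
    return "spam" if words & triggers else "non-spam"


@app.route("/predict", methods=["POST"])
def predict():
    # Accept a JSON body such as {"message": "some text"}.
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        payload = {}
    message = str(payload.get("message", ""))
    if not message.strip():
        return jsonify({"error": "message is required"}), 400
    return jsonify({"message": message, "prediction": predict_label(message)})


# To serve locally during development, call app.run(debug=True).
```

In a real deployment, `predict_label` would be replaced by a call that vectorizes the message and passes it to the trained classifier.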

Conclusion

The Message Spam Detection project demonstrates the effectiveness of machine learning techniques in classifying messages as spam or non-spam. The Naive Bayes classifier achieves high accuracy in identifying spam messages, and the web interface provides a user-friendly experience for interacting with the spam detection system.

Acknowledgements

UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/sms+spam+collection
Flask: https://flask.palletsprojects.com/en/2.0.x/
Scikit-learn: https://scikit-learn.org/stable/
Matplotlib: https://matplotlib.org/
Pandas: https://pandas.pydata.org/
NLTK: https://www.nltk.org/
NumPy: https://numpy.org/
Seaborn: https://seaborn.pydata.org/

Nalin Angrish

LinkedIn: https://www.linkedin.com/in/nalin-angrish-7b5b3b1b3/

My Profile

Abhigyan Singh

LinkedIn: https://www.linkedin.com/in/abhigyan-singh-3995361ab/
