The objective of this project is to develop an intelligent email classification system using machine learning and deep learning models.
SmartMailGuard is a system designed to categorize emails using Naïve Bayes, LSTM, and Transformer architectures.
Training these models on datasets of different sizes lets us compare and grade their effectiveness on two classification tasks: binary (Spam/Not-Spam) and multiclass.
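To make the classification idea concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written with only the Python standard library. It is an illustration under stated assumptions, not the code from the project notebooks: the class name `TinyNaiveBayes`, the whitespace tokenizer, and the four toy emails are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a real pipeline would clean more."""
    return text.lower().split()


class TinyNaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Start from the log prior, then add per-word log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # skip words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, not taken from the project's Kaggle datasets.
train_texts = [
    "win a free prize now",
    "free money claim your prize",
    "meeting agenda for monday",
    "please review the project report",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = TinyNaiveBayes().fit(train_texts, train_labels)
print(model.predict("claim your free prize"))
print(model.predict("project meeting on monday"))
```

Because the classifier only counts words per label, the same sketch handles the multiclass case when more than two distinct labels are passed to `fit`.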
83k Dataset Link (For Binary Classification): Kaggle
3k Dataset Link (For Multiclass Classification): Kaggle
Dataset for AutoLabeler: Kaggle
├── Binary Classification
│ ├── Naive_Bayes_Final.ipynb
│ ├── Naive_Bayes_enron_dataset.ipynb
│ ├── Naive_Bayes_sklearn.ipynb
│ ├── lstmemailclassification.ipynb
│ └── RNN_spam_not_spam.ipynb
├── Coursera Notes
│ ├── Course1
│ ├── Course2
│ └── Course5
├── Multi Intent Classification
│ ├── Decision Tree
│ │ ├── decision-tree-grid-search.ipynb
│ │ └── decision-tree.ipynb
│ ├── Random Forest Classifier
│ │ ├── RandomForestClassifier-grid_search.ipynb
│ │ └── RandomForestClassifier.ipynb
│ ├── Support Vector Machine
│ │ ├── SVM_grid_search.ipynb
│ │ └── SVM_multiclass_classifier.ipynb
│ ├── AutoLabeler.ipynb
│ ├── Multiclass.ipynb
│ ├── multiclass-bert-Finaldataset.ipynb
│ ├── multiclass-bert-Finaldataset-from-scratch.ipynb
│ └── multinomial_combined.ipynb
├── SmartMailGuard Report
│ └── SmartMailGuard Report.pdf
└── README.md
- Install a recent Python 3 release that is supported by TensorFlow and PyTorch.
- Install Pip and verify its installation using the following terminal command:
pip --version
- Optional: Install Jupyter using the following command:
pip install jupyterlab
Alternatively, Google Colaboratory and Kaggle can also be used to run the notebooks (with some RAM limitations).
- Run the following command to install all the dependencies:
pip install pandas torch scikit-learn tensorflow transformers
- Clone the repository:
git clone https://github.com/aitwehrrg/SmartMailGuard.git
- Run any of the models (.ipynb) as Jupyter notebooks.
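After installing, a quick sanity check can confirm that the libraries are visible to the interpreter before opening a notebook. The sketch below is only a convenience and is not part of the repository; the helper name `check_modules` is hypothetical. Note that two import names differ from their package names: scikit-learn is imported as `sklearn` and PyTorch as `torch`.

```python
import importlib.util

# Import names, which differ from some package names:
# scikit-learn -> sklearn, PyTorch -> torch.
REQUIRED_MODULES = ["pandas", "torch", "sklearn", "tensorflow", "transformers"]


def check_modules(names):
    """Return a dict mapping each module name to True if it can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules(REQUIRED_MODULES).items():
        print(f"{name}: {'ok' if found else 'missing'}")
```

Any module reported as missing can be installed with pip and the check run again.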
- CoC and Project X for providing this opportunity.
- Course on Deep Learning Specialization by DeepLearning.AI
- Long Short-Term Memory
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Attention Is All You Need
- Kaggle datasets
- HuggingFace Transformer Models