Implement Email Classification Using NLP #79

darshbaxi · 2024-03-22T11:56:50Z

Dataset to be used: Spam_classification.csv (located in the "Round-2 Dataset" folder).

Tasks:

Data Preprocessing:
Tokenization: Split the text of each email into individual words or tokens.
Normalization: Convert all text to lowercase, remove punctuation, and handle special cases (like email addresses or URLs).
Stopword Removal: Remove common words that don't carry much meaning (e.g., "the", "is", "and").
Feature Extraction: Represent each email as a numerical vector using techniques like bag-of-words, TF-IDF (Term Frequency-Inverse Document Frequency), or word embeddings.
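
The preprocessing and feature extraction steps above could be sketched roughly as follows. This is a dependency-free illustration, not a required approach: the stopword list, the `urltoken`/`emailtoken` placeholders, the function names `preprocess` and `tfidf_vectors`, and the smoothed IDF formula are all assumptions chosen for the example, and nothing here depends on the actual columns of Spam_classification.csv.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "is", "and", "a", "an", "to", "of", "in", "for", "on", "at"}

# Simple patterns for the "special cases"; real emails may need stricter rules.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
EMAIL_RE = re.compile(r"\S+@\S+\.\S+")
TOKEN_RE = re.compile(r"[a-z0-9_]+")


def preprocess(text):
    """Normalize, tokenize, and remove stopwords from one email body."""
    text = text.lower()
    # Replace URLs and addresses with placeholder tokens (hypothetical names)
    # so that their presence is kept as a signal.
    text = URL_RE.sub(" urltoken ", text)
    text = EMAIL_RE.sub(" emailtoken ", text)
    # Keeping only alphanumeric runs drops punctuation implicitly.
    tokens = TOKEN_RE.findall(text)
    return [tok for tok in tokens if tok not in STOPWORDS]


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) for a list of tokenized documents.

    Uses relative term frequency and one common smoothed IDF variant:
    idf(t) = ln((1 + N) / (1 + df(t))) + 1.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors
```

For instance, `preprocess("Visit https://example.com/win NOW!!!")` yields the tokens `visit`, `urltoken`, and `now`. Terms that occur in fewer documents receive a larger IDF weight, so they contribute more to the vector than terms shared by every email.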

Model Creation:
Implement machine learning or deep learning models for email classification, and create a PR with the model that reaches the highest accuracy you can achieve.
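
As one possible baseline for this task, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written without external libraries. The class name `NaiveBayesClassifier`, the `accuracy` helper, the `"spam"`/`"ham"` label values, and the toy training data are all made up for illustration; they are not taken from Spam_classification.csv, and a real submission would likely use a library model and a proper train/test split.

```python
import math
from collections import Counter


class NaiveBayesClassifier:
    """Multinomial Naive Bayes over token counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {tok for doc in docs for tok in doc}
        # Per-class token counts.
        self.token_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.token_counts[label].update(doc)
        class_counts = Counter(labels)
        n = len(labels)
        self.log_prior = {c: math.log(class_counts[c] / n) for c in self.classes}
        self.total_tokens = {
            c: sum(self.token_counts[c].values()) for c in self.classes
        }
        return self

    def predict_one(self, doc):
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for tok in doc:
                if tok in self.vocab:  # skip tokens never seen in training
                    score += math.log(
                        (self.token_counts[c][tok] + 1)
                        / (self.total_tokens[c] + vocab_size)
                    )
            scores[c] = score
        return max(scores, key=scores.get)

    def predict(self, docs):
        return [self.predict_one(doc) for doc in docs]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, hand-written data (NOT from the competition dataset).
train_docs = [
    ["win", "free", "prize", "now"],
    ["free", "cash", "offer", "click"],
    ["claim", "prize", "click", "now"],
    ["meeting", "agenda", "tomorrow"],
    ["project", "report", "attached"],
    ["lunch", "tomorrow", "team"],
]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]
test_docs = [["free", "prize", "click"], ["team", "meeting", "report"]]
test_labels = ["spam", "ham"]

model = NaiveBayesClassifier().fit(train_docs, train_labels)
predictions = model.predict(test_docs)
print("accuracy:", accuracy(test_labels, predictions))
```

Accuracy on a held-out split is the metric the issue asks for. On an imbalanced spam dataset it may also be worth checking precision and recall while tuning, even if only accuracy is reported.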

Submission Format: A single Colab file containing only the best model, with accuracy as the reported metric (remove unnecessary models). Including your thought process (in comments or markdown cells), explaining why you took certain steps to increase the accuracy, would give you an edge.

darshbaxi added the labels hard (Hard Level Question worth 7 points), always open (Open throughout the competition), and round-2 on Mar 22, 2024

Invincible1602 linked a pull request Mar 23, 2024 that will close this issue

Implement Email Classification Using NLP #82

Open

NeonKazuha mentioned this issue Mar 23, 2024

Email Classification #79 #83

Open

Robinaditya1045 linked a pull request Mar 23, 2024 that will close this issue

Implemented a model for spam email detection. #88

Open
