Dataset to be Used - Spam_classification.csv (located within the "Round-2 Dataset" folder.)
Tasks:
Data Preprocessing:
Tokenization: Split the text of each email into individual words or tokens.
Normalization: Convert all text to lowercase, remove punctuation, and handle special cases (like email addresses or URLs).
Stopword Removal: Remove common words that don't carry much meaning (e.g., "the", "is", "and").
Feature Extraction: Represent each email as a numerical vector using techniques like bag-of-words, TF-IDF (Term Frequency-Inverse Document Frequency), or word embeddings.
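The four preprocessing steps above could be sketched as follows, using only the Python standard library. This is a minimal illustration, not a required approach: the short `STOPWORDS` set, the `urltoken`/`emailtoken` placeholders, and the helper names `preprocess` and `tfidf_vectors` are all assumptions made for this example, and the IDF formula is one common smoothed variant. Loading `Spam_classification.csv` is left out because its column names are not given here.

```python
import math
import re
import string
from collections import Counter

# Illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "is", "and", "a", "an", "to", "of", "in", "for", "you"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\S+@\S+\.\S+")
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text):
    """Normalize, tokenize, and remove stopwords from one email."""
    text = text.lower()
    # Special cases: collapse URLs and addresses into placeholder tokens
    # (hypothetical names) before punctuation is stripped.
    text = URL_RE.sub(" urltoken ", text)
    text = EMAIL_RE.sub(" emailtoken ", text)
    text = text.translate(PUNCT_TO_SPACE)
    return [tok for tok in text.split() if tok not in STOPWORDS]


def tfidf_vectors(docs):
    """Return (vocabulary, list of TF-IDF vectors) for a list of raw texts."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of emails that contain each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed inverse document frequency.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty emails
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors
```

Calling `preprocess` on a single email returns its cleaned token list, and `tfidf_vectors` on a list of emails returns one numeric vector per email, aligned with the returned vocabulary. Replacing URLs and addresses with placeholders rather than deleting them keeps their presence available as a feature.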
Model Creation:
Implement machine learning or deep learning models for email classification and create a PR with the maximum accuracy you can achieve.
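As one possible starting point, a multinomial Naive Bayes baseline with Laplace smoothing can be written in a few lines. This is a sketch under stated assumptions, not the expected solution: the class name `NaiveBayes`, the `accuracy` helper, and the four toy training messages are invented for illustration and are not taken from the dataset. Tokenization here is a plain `str.split` to keep the block self-contained.

```python
import math
from collections import Counter


class NaiveBayes:
    """Multinomial Naive Bayes over token lists, with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, token_lists, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for toks, label in zip(token_lists, labels):
            self.word_counts[label].update(toks)
        self.vocab = {t for c in self.classes for t in self.word_counts[c]}
        return self

    def predict_one(self, toks):
        n_docs = sum(self.class_counts.values())
        best_class, best_score = None, -math.inf
        for c in self.classes:
            denom = sum(self.word_counts[c].values()) + self.alpha * len(self.vocab)
            score = math.log(self.class_counts[c] / n_docs)  # log prior
            for t in toks:
                if t in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[c][t] + self.alpha) / denom)
            if score > best_score:
                best_class, best_score = c, score
        return best_class

    def predict(self, token_lists):
        return [self.predict_one(toks) for toks in token_lists]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)


# Toy data, made up for illustration only.
train_texts = ["win free prize now", "free cash offer click",
               "meeting at noon tomorrow", "lunch with team tomorrow"]
train_labels = ["spam", "spam", "ham", "ham"]
test_texts = ["free prize offer", "team meeting tomorrow"]
test_labels = ["spam", "ham"]

model = NaiveBayes().fit([t.split() for t in train_texts], train_labels)
preds = model.predict([t.split() for t in test_texts])
print("accuracy:", accuracy(test_labels, preds))
```

On the real dataset, accuracy should be measured on a held-out split that the model never sees during training; otherwise the reported number overstates how well the model generalizes.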
Submission Format - Single Colab file containing only the best model, with accuracy as the reported metric (remove unnecessary models). Including your thought process (in comments or markdown cells) on why you took certain steps to increase the accuracy would give you an edge.