A simple spam classifier built with Bayes' theorem.

Spam Classifier

Overview

As an assignment for my Introduction to Artificial Intelligence course (CSE 486) at Miami University, we were tasked with building a simple spam classifier using Bayes' theorem.
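
To illustrate how Bayes' theorem can be applied to spam filtering, here is a minimal sketch of a multinomial naive Bayes classifier with add-one (Laplace) smoothing. It is not the code from this repository: the names `tokenize`, `train`, and `classify`, the whitespace tokenizer, and the smoothing choice are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase the text and split on whitespace.
    return text.lower().split()


def train(messages):
    """Build word and class counts from (label, text) pairs (1 = spam, 0 = ham).

    Assumes both classes appear at least once in the training data.
    """
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for label, text in messages:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def classify(text, model):
    """Return the label (1 or 0) with the higher log-posterior score."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in (0, 1):
        # Start from the log prior, log P(class).
        score = math.log(class_counts[label] / total)
        # Add-one smoothing: every vocabulary word gets one extra count.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together, and the smoothing keeps a word seen in only one class from forcing the other class's probability to zero.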

Local Setup

If you would like to set up the project on your local machine, you can use the following instructions!

  1. Download the repo.
$ git clone git@github.com:Kyle-L/Spam-Classifier.git
  2. Install Pipenv using pip (install pip first if you haven't already).
$ pip install pipenv
  3. Set up a virtual environment (the command below uses Python's built-in venv module).
$ python -m venv env
  4. (on Windows) Start the virtual environment.
$ ./env/Scripts/activate
  5. (on Unix / Linux / macOS) Start the virtual environment.
$ source env/bin/activate
  6. Install the requirements.
$ pip install -r classifier/requirements.txt
  7. Run the classifier!
$ python classifier compare data/training_set_small.csv data/test_set.csv

Congrats! You are set up!

The Data Sets

The expected input is a tab-delimited file where the first column indicates whether or not a message is spam (1 = spam, 0 = ham) and the second column is the message. No header is expected.

An example input to the program is as follows:

0	Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat...
0	Ok lar... Joking wif u oni...
1	Free entry in 2 a wkly comp to win FA Cup final tkts 21st May 2005. Text FA to 87121 to receive entry question(std txt rate)T&C's apply 08452810075over18's
0	U dun say so early hor... U c already then say...

Provided in the repo are three data sets: a small training set, a large training set, and a small test set.
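
For reference, a file in the tab-delimited format described above could be read with Python's standard `csv` module. This is only a sketch: the function name `load_messages` is hypothetical, UTF-8 encoding is an assumption about the data files, and the repository's own loading code may differ.

```python
import csv


def load_messages(path):
    """Read (label, message) pairs from a headerless, tab-delimited file."""
    rows = []
    # Assumption: the file is UTF-8 encoded.
    with open(path, newline="", encoding="utf-8") as f:
        # QUOTE_NONE keeps quote characters inside messages as literal text.
        reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or malformed lines
            label = int(row[0])  # 1 = spam, 0 = ham
            # Rejoin in case the message itself contains tab characters.
            message = "\t".join(row[1:])
            rows.append((label, message))
    return rows
```

Calling `load_messages("data/training_set_small.csv")` would return a list such as `[(0, "Ok lar... Joking wif u oni..."), ...]`, given a file with the layout shown in the example.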

Remote Setup

If you would like to use the spam classifier remotely, you can use the online IDE Repl.it!

Simply select the following badge or visit the following link: https://repl.it/github/Kyle-L/Spam-Classifier

Run on Repl.it

Once it has opened, all you need to do is select Run!

License

The source code is licensed under the MIT License.
