NaiveBayes

Features

  • works with all types of tokens, not just text
  • allows purging of low-frequency tokens (for performance)
  • uses log probabilities to avoid underflow
  • allows prior distribution on classes to be assumed uniform
  • customizable constant value for Laplacian smoothing
  • allows for multiple categories
  • optional binarized mode
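
As a rough illustration of how the log-probability and smoothing features fit together, here is the textbook multinomial Naive Bayes formulation. This is a sketch under assumptions, not necessarily the exact computation this library performs. The symbols are introduced here for illustration only: k is the smoothing constant, |V| the vocabulary size, |C| the number of categories, and n(t, c) the number of times token t was seen in category c.

```latex
\begin{align*}
% Smoothed token likelihood: adding k keeps unseen tokens from zeroing the product
\hat{P}(t \mid c) &= \frac{n(t, c) + k}{\sum_{t'} n(t', c) + k\,|V|} \\
% Summing logs instead of multiplying probabilities avoids floating-point underflow
\mathrm{score}(c) &= \log P(c) + \sum_{i} \log \hat{P}(t_i \mid c) \\
% Normalizing the exponentiated scores gives values that sum to 1 across categories
P(c \mid t_1, \dots, t_m) &= \frac{\exp\bigl(\mathrm{score}(c)\bigr)}{\sum_{c'} \exp\bigl(\mathrm{score}(c')\bigr)}
\end{align*}
```

Under a uniform prior, P(c) = 1/|C| for every category, so the log P(c) term is the same constant for all categories and does not affect which one ranks highest.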

Usage

{:ok, nbayes} = NaiveBayes.new

# Lowercase the text, drop everything except letters and spaces, then split on whitespace
tokenize = fn s ->
  s |> String.downcase |> String.replace(~r/[^a-z ]/, "") |> String.split(~r/\s+/)
end

# Train with a list of tokens and the category it belongs to
nbayes |> NaiveBayes.train(tokenize.("You need to buy some Viagra"), "SPAM")
nbayes |> NaiveBayes.train(tokenize.("This is not spam, just a letter to Bob."), "HAM")
nbayes |> NaiveBayes.train(tokenize.("Hey Oasic, Do you offer consulting?"), "HAM")
nbayes |> NaiveBayes.train(tokenize.("You should buy this stock"), "SPAM")

# Classify an unseen message; the result maps each category to its probability
results = nbayes |> NaiveBayes.classify(tokenize.("Now is the time to buy Viagra cheaply and discreetly"))

IO.inspect results
# => %{"HAM" => 0.4832633319857435, "SPAM" => 0.5167366680142564}

See the docs or test/naive_bayes_test.ex for more examples.

Installation

Add naive_bayes to the dependencies in your mix.exs, then run mix deps.get:

def deps do
  [{:naive_bayes, "~> 0.1.3"}]
end