Clustering algorithms for Clojure

Two clustering algorithms for Clojure: k-means and hierarchical.

Usage

  
    (ns my-namespace (:use cluster))

k-means

Call kcluster to run the k-means algorithm:

  
    ;; kcluster -- arguments, in positional order:
    ;;   vectors     - a sequence of vectors which you want clustered
    ;;   count       - number of clusters to find
    ;;   range-start - lower limit for the randomized cluster nodes
    ;;   range-end   - upper limit for the randomized cluster nodes

    (kcluster [[1 2 3] [3 4 5] [5 6 7]] 2 0 7)

The range-start and range-end arguments may need a bit of clarification. A k-means algorithm works by
randomly placing a number of cluster nodes among the vectors you want clustered, then moving those nodes
until each one sits at the center of a cluster. The random starting positions need lower and upper limits.
Usually these are just the lowest and highest possible values for numbers in the vectors you’re clustering.
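
To make the procedure described above concrete, here is a minimal sketch of the same idea. It is not the
code in cluster.clj: every name in it (kmeans-sketch, random-node and so on) is made up for illustration,
the max-iters argument is an extra safety stop, and it measures closeness with squared Euclidean distance,
which is an assumption chosen for simplicity.

```clojure
;; Illustrative sketch only, not the library's implementation.
;; All names here are hypothetical, and closeness is measured with
;; squared Euclidean distance (an assumption made for simplicity).

(defn random-node
  "A random point with `dims` coordinates, each in [range-start, range-end)."
  [dims range-start range-end]
  (vec (repeatedly dims #(+ range-start (rand (- range-end range-start))))))

(defn sq-distance
  "Squared Euclidean distance between two equal-length vectors."
  [a b]
  (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b)))

(defn nearest
  "Index of the node closest to vector v."
  [nodes v]
  (first (apply min-key
                (fn [[_ node]] (sq-distance node v))
                (map-indexed vector nodes))))

(defn mean-vector
  "Component-wise mean of a non-empty sequence of vectors."
  [vs]
  (let [n (count vs)]
    (mapv #(/ % n) (apply map + vs))))

(defn kmeans-sketch
  "Returns [clusters nodes]: clusters holds the input indices assigned to
   each node, nodes holds the final node positions."
  [vectors k range-start range-end max-iters]
  (let [vectors (vec vectors)
        dims    (count (first vectors))]
    (loop [nodes (vec (repeatedly k #(random-node dims range-start range-end)))
           iter  0]
      (let [;; assign every input index to its nearest node
            groups (group-by #(nearest nodes (vectors %))
                             (range (count vectors)))
            ;; move each node to the mean of its members;
            ;; a node with no members stays where it is
            moved  (vec (map-indexed
                         (fn [i node]
                           (if-let [idxs (groups i)]
                             (mean-vector (map vectors idxs))
                             node))
                         nodes))]
        (if (or (= moved nodes) (>= iter max-iters))
          [(mapv #(groups % []) (range k)) moved]
          (recur moved (inc iter)))))))

(kmeans-sketch [[1 2 3] [3 4 5] [5 6 7]] 2 0 7 100)
```

Because the starting positions are random, the grouping can differ from run to run, and a node that starts
far from all the data may end up with an empty cluster.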

The return value of kcluster is a tuple. The first value is a sequence of vectors, one per cluster, each
containing the indices of the input vectors that ended up in that cluster. So if you passed in five vectors,
the first value might look like: [[0 3 4] [1 2]]. The second value contains the final vectors for the
cluster nodes.
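
Since the clusters come back as indices, you will usually want to map them back onto your input. A small
sketch of that step follows; cluster-members is a hypothetical helper, and result is a hand-written
stand-in shaped like the return value described above, not real output from kcluster.

```clojure
;; Hypothetical helper: swap each index for the input vector it refers to.
(defn cluster-members
  [vectors [clusters _nodes]]
  (mapv (fn [idxs] (mapv #(nth vectors %) idxs)) clusters))

(def input [[1 2 3] [7 8 9] [8 9 9] [2 2 3] [1 3 3]])

;; Hand-written stand-in for a kcluster return value (not real output).
(def result [[[0 3 4] [1 2]]
             [[4/3 7/3 3] [15/2 17/2 9]]])

(cluster-members input result)
;; => [[[1 2 3] [2 2 3] [1 3 3]] [[7 8 9] [8 9 9]]]
```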

Hierarchical

  
    ;; hcluster --
    ;;   :nodes - a sequence of maps in the form: { :vec [1 2 3] }

    (hcluster [{:vec [1 2 3]} {:vec [3 4 5]} {:vec [7 9 9]}])

The return value of hcluster is a tree of maps. For the above input, it might look something like this:

  
    {:vec (9/2 6 13/2)
     :right {:vec [7 9 9]},
     :left  {:right {:vec [3 4 5]}, 
             :left  {:vec [1 2 3]}, 
             :vec (2 3 4)}}
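
A tree like this is easy to walk with a short recursive function. The sketch below collects the original
input vectors from the leaves. The leaves function is hypothetical, and it assumes, based only on the
example output above, that a leaf is any node with neither a :left nor a :right key.

```clojure
;; Hypothetical helper: collect the :vec of every leaf, left to right.
;; Assumes a leaf is a node with neither :left nor :right.
(defn leaves
  [node]
  (if (or (:left node) (:right node))
    (concat (when-let [l (:left node)] (leaves l))
            (when-let [r (:right node)] (leaves r)))
    [(:vec node)]))

;; The example tree from above, with the list values quoted so it can be read as data.
(def tree
  {:vec   '(9/2 6 13/2)
   :right {:vec [7 9 9]}
   :left  {:right {:vec [3 4 5]}
           :left  {:vec [1 2 3]}
           :vec   '(2 3 4)}})

(leaves tree)
;; => ([1 2 3] [3 4 5] [7 9 9])
```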

Known Bugs

Passing vectors whose elements are all the same number (such as [2 2 2]) to either clustering function will
cause a division-by-zero error, because my Pearson implementation does not handle that case correctly.
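
For reference, the denominator of a Pearson correlation is zero whenever either vector has no variance,
which is exactly what a vector of identical numbers gives you. Below is one possible way to guard against
it. This is a sketch, not the code in cluster.clj, and returning 0.0 for the degenerate case is an
arbitrary choice made here for illustration.

```clojure
;; Sketch of a Pearson correlation with a zero-variance guard.
;; Not the library's code; returning 0.0 in the degenerate case is an assumption.
(defn pearson
  [xs ys]
  (let [n      (count xs)
        sum-x  (reduce + xs)
        sum-y  (reduce + ys)
        sum-xx (reduce + (map * xs xs))
        sum-yy (reduce + (map * ys ys))
        sum-xy (reduce + (map * xs ys))
        num    (- sum-xy (/ (* sum-x sum-y) n))
        ;; product of the two (unnormalized) variances; zero when either
        ;; vector is constant. The check is exact for integers and ratios;
        ;; floating-point input may need a small tolerance instead.
        den-sq (* (- sum-xx (/ (* sum-x sum-x) n))
                  (- sum-yy (/ (* sum-y sum-y) n)))]
    (if (zero? den-sq)
      0.0
      (/ num (Math/sqrt (double den-sq))))))

(pearson [1 2 3] [3 4 5]) ;; => 1.0
(pearson [2 2 2] [1 2 3]) ;; => 0.0
```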

To Do

Fix Pearson
Add more similarity functions and allow the user to choose which one to use
