Skip to content
/ mixturs Public

Unofficial implementation of Dirichlet Process Mixture Model Split/Merge algorithm.

License

Notifications You must be signed in to change notification settings

egordm/mixturs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

69 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

mixturs

Crates.io License Docs Coverage

Unofficial implementation of Dirichlet Process Mixture Model Sub-Clusters algorithm.

FeaturesUsageExamples

Image Demo

To use as a library, add the following to your Cargo.toml. Executable builds can be found at https://github.com/EgorDm/mixturs/releases.

[dependencies]
mixturs = "0.1"

Features

  • Cluster points without knowing the number of clusters in advance
  • Fastest CPU implementation of the algorithm
  • Python bindings to cluster numpy data
  • Command line tool for generating segmented images from JPG/PNG input files

Examples

// Load data into a col major matrix
let x: DMatrix<f64> = ...;

// Set model options
let mut model_options = ModelOptions::<NIW>::default(dim);
model_options.alpha = 100.0;
model_options.outlier = None;

// Set fit options
let mut fit_options = FitOptions::default();
fit_options.init_clusters = 1;

// Configure callbacks
let mut callback = MonitoringCallback::from_data(
    EvalData::from_sample(&x, Some(&y), 1000)
);
callback.add_metric(AIC);
callback.add_callback(PlotCallback::new(
    3,
    "examples/data/plot/synthetic_2d".into(),
    EvalData::from_sample(&x, None, 1000)
));
callback.set_verbose(true);

// Fit the model
let mut model = Model::from_options(model_options);
model.fit(
    x.clone_owned(),
    &fit_options,
    Some(callback),
);
import numpy as np
from mixtupy import *

# Load data
x = ...

# Configure model
mo = ModelOptions(2)
model = Model(mo)

# Fit model
fo = FitOptions()
fo.iters = 200
fo.aic = True
model.fit(x, fo, y=y)

# Predict data point labels
probs, labels = model.predict(x)

# Extract cluster parameters
print(model.cluster_weights())
print(model.cluster_params())

Reference

[1] J. Chang and J. W. Fisher III, “Parallel Sampling of DP Mixture Models using Sub-Cluster Splits,” in Advances in Neural Information Processing Systems, 2013.

[2] O. Dinari, A. Yu, O. Freifeld, and J. Fisher, “Distributed MCMC Inference in Dirichlet Process Mixture Models Using Julia,” in 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID).

About

Unofficial implementation of Dirichlet Process Mixture Model Split/Merge algorithm.

Resources

License

Stars

Watchers

Forks