
The factorial single-cell latent variable model (slalom)

+++ Note that this repository is deprecated. Please move to our new implementation in Pyro. +++

MuVi generalizes slalom to the multiview case, scales better, and recovers drivers of variability more reliably thanks to a new implementation of the sparsity-inducing prior. The paper was published at AISTATS 2023 (pdf here); the single-view version (the slalom update) is described here and was published at KDD Health '21.

What is slalom?

slalom is a scalable modelling framework for single-cell RNA-seq data that uses gene set annotations to dissect single-cell transcriptome heterogeneity. This allows users to identify biological drivers of cell-to-cell variability and to model confounding factors.

Philosophy

Observed heterogeneity in single-cell profiling data is multi-factorial. slalom provides an efficient framework for unravelling this heterogeneity by simultaneously inferring latent factors that reflect annotated factors from pathway databases, as well as unannotated factors that capture variation outside the annotation. slalom builds on sparse factor analysis models, for which this implementation provides efficient approximate inference using Variational Bayes, allowing the application of slalom to very large datasets containing up to 100,000 cells.
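To make the structure described above more concrete, the following is a minimal sketch of a toy sparse factor model in which annotated factors may only load on the genes in their gene set, while unannotated factors may load on any gene. It is an illustration only and not the slalom API: the function `simulate_annotated_factor_data`, its parameters, and the Gaussian weights and noise are assumptions made for this example. The actual model, its sparsity prior, and the Variational Bayes inference are described in [1].

```python
import numpy as np

def simulate_annotated_factor_data(n_cells=200, n_genes=50, n_annotated=3,
                                   n_unannotated=1, genes_per_set=10,
                                   noise_sd=0.1, seed=0):
    """Simulate expression data from a toy sparse factor model (illustrative only)."""
    rng = np.random.default_rng(seed)
    n_factors = n_annotated + n_unannotated

    # Binary annotation mask: mask[g, k] = 1 if gene g belongs to gene set k.
    mask = np.zeros((n_genes, n_factors), dtype=int)
    for k in range(n_annotated):
        members = rng.choice(n_genes, size=genes_per_set, replace=False)
        mask[members, k] = 1
    # Unannotated factors are allowed to load on every gene.
    mask[:, n_annotated:] = 1

    # Weights are non-zero only where the mask allows a loading.
    weights = rng.normal(size=(n_genes, n_factors)) * mask
    # Latent factor activations, one row per cell.
    factors = rng.normal(size=(n_cells, n_factors))
    # Observed expression = factors x weights^T + Gaussian noise.
    noise = rng.normal(scale=noise_sd, size=(n_cells, n_genes))
    expression = factors @ weights.T + noise
    return expression, factors, weights, mask

Y, X, W, I = simulate_annotated_factor_data()
print(Y.shape, X.shape, W.shape, I.shape)
```

In this sketch the inference problem that slalom addresses is the reverse direction: given the expression matrix and the annotation mask, recover the factor activations and decide which factors are relevant.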

Implementations

We provide two implementations of the slalom model: an R/C++ implementation that is available on Bioconductor and a Python implementation. Both the R and Python packages implement the model described in the accompanying publication [1].

Software by Florian Buettner, Davis McCarthy and Oliver Stegle.

R implementation

The slalom R package is available on Bioconductor, so the most reliable way to install the package is to use the usual Bioconductor method:

## biocLite() was retired in Bioconductor 3.8; current releases install via BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("slalom")

The source code for the R package can be found in the R_package folder of this repository.

The vignette supplied with the R package provides an overview of usage of that implementation of slalom.

Python implementation

Installation requirements for the Python implementation:

slalom requires Python 2.7 or newer with the following packages:

  • scipy, h5py, numpy, matplotlib, scikit-learn, re (re is part of the Python standard library)

slalom can be installed via pip with pip install slalom. For best results, we recommend the Anaconda Python distribution.
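Before installing, it can help to check that the dependencies listed above are importable in the current environment. The following is a small sketch that assumes Python 3.4 or newer (it relies on `importlib.util.find_spec`); the helper `missing_packages` is a hypothetical name for this example and is not part of slalom.

```python
import importlib.util

# Import names for the packages listed above; scikit-learn is imported as "sklearn".
REQUIRED = ["scipy", "h5py", "numpy", "matplotlib", "sklearn"]

def missing_packages(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed dependencies are importable.")
```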

How to use slalom?

The current software version should be considered as beta. More extensive documentation, tutorials and examples will be available soon.

To illustrate how slalom can be applied to the mESC data considered in Buettner et al. [1], we have prepared a notebook. Together with the other notebooks, it shows example analyses and workflows with slalom that you can read, download and adapt for your own analyses. These notebooks can be viewed and downloaded from here or here.

Documentation of the code can be found here.

References:

[1] Buettner, F., Pratanwanich, N., McCarthy, D.J., Marioni, J., Stegle, O. f-scLVM: scalable and versatile factor analysis for single-cell RNA-seq. Genome Biology, 2017.

License

See Apache License (Version 2.0, January 2004).