thunder-extraction

algorithms for feature extraction from spatio-temporal data

Source or feature extraction is the process of identifying spatial features of interest from data that varies over space and time. It can be either unsupervised or supervised, and is common in biological data analysis problems, like identifying neurons in calcium imaging data.

This package contains a collection of approaches for solving this problem. It defines a set of algorithms in the scikit-learn style, each of which can be fit to data and returns a model that can be used to transform new data. Compatible with Python 2.7+ and 3.4+. Works well alongside thunder and supports parallelization via Spark, but can be used as a standalone package on local numpy arrays.

installation

pip install thunder-extraction

example

# generate data
from extraction.utils import make_gaussian
data = make_gaussian()

# fit a model
from extraction import NMF
model = NMF().fit(data)

# extract sources by transforming data
sources = model.transform(data)

usage

Analysis starts by importing and constructing an algorithm

from extraction import NMF
algorithm = NMF(k=10)

Algorithms can be fit to data in the form of a thunder images object or a t,x,y(,z) numpy array

model = algorithm.fit(data)

The model is a collection of identified features that can be used to extract temporal signals from new data

signals = model.transform(data)

api

algorithms

All algorithms have the following methods

`algorithm.fit(data, opts)`

Fits the algorithm to the data, which should be a collection of time-varying images. It can either be a thunder images object, or a numpy array with shape t,x,y(,z).

For many algorithms, fit will take the optional arguments chunk_size and padding, which allow the algorithm to be run on smaller chunks of the data, either in serial (if running locally) or in parallel (if running on a cluster).

A chunk is defined as a subset of the image in space, including all time points. The chunk_size is the size of each chunk in pixels, and padding is the amount by which to pad the chunks in each dimension. For example, given a data set of 100 x 100 pixel images over 500 time points, we could set chunk_size=(50,50), resulting in four chunks, each of which covers 50 x 50 pixels across all 500 time points.
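To make the chunking concrete, here is a minimal NumPy sketch of splitting a t,x,y array into spatial chunks that keep every time point. The helper `spatial_chunks` is hypothetical and written only for illustration. It is not part of this package, and the way the package itself applies padding may differ from the clipping assumed here.

```python
import numpy as np

def spatial_chunks(data, chunk_size, padding=(0, 0)):
    """Yield ((x_slice, y_slice), chunk) pairs for a t,x,y array.

    Hypothetical helper for illustration; not part of thunder-extraction.
    """
    t, nx, ny = data.shape
    cx, cy = chunk_size
    px, py = padding
    for x0 in range(0, nx, cx):
        for y0 in range(0, ny, cy):
            # extend each chunk by the padding, clipped to the image bounds
            # (an assumption about how padding behaves at the edges)
            xs = slice(max(x0 - px, 0), min(x0 + cx + px, nx))
            ys = slice(max(y0 - py, 0), min(y0 + cy + py, ny))
            # every chunk keeps the full time axis
            yield (xs, ys), data[:, xs, ys]

# 500 time points of 100 x 100 pixel images
data = np.zeros((500, 100, 100))

chunks = list(spatial_chunks(data, chunk_size=(50, 50)))
shapes = [chunk.shape for _, chunk in chunks]

# the same split with 5 pixels of padding on each side
padded_shapes = [c.shape for _, c in spatial_chunks(data, (50, 50), padding=(5, 5))]
```

Each chunk can then be processed independently, which is what makes it possible to run the work serially on one machine or to distribute it across a cluster.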

model

The result of fitting an algorithm is a model. Every model has the following properties and methods.

`model.regions`

The spatial regions identified during fitting.

`model.transform(data)`

Transform a new data set using the model, by averaging pixels within each of the regions. As with fitting, data can either be a thunder images object, or a numpy array with shape t,x,y(,z). It will return a thunder series object, which can be converted to a numpy array by calling toarray().
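The averaging step can be sketched in plain NumPy as follows. This is only an illustration of the idea, assuming regions are represented as boolean masks over the image. The function `average_within_regions` is hypothetical, and the package's own region representation and return type (a thunder series object) are different.

```python
import numpy as np

def average_within_regions(data, masks):
    """Return an (n_regions, t) array of mean pixel values per time point.

    Hypothetical helper for illustration; `data` has shape t,x,y and each
    mask is a boolean x,y array marking the pixels of one region.
    """
    t = data.shape[0]
    flat = data.reshape(t, -1)  # one row per time point, one column per pixel
    signals = [flat[:, mask.ravel()].mean(axis=1) for mask in masks]
    return np.array(signals)

# two time points of a 2 x 3 image
data = np.arange(12, dtype=float).reshape(2, 2, 3)

mask_a = np.zeros((2, 3), dtype=bool)
mask_a[0, 0:2] = True   # two pixels in the top row
mask_b = np.zeros((2, 3), dtype=bool)
mask_b[1, 2] = True     # a single pixel in the bottom row

signals = average_within_regions(data, [mask_a, mask_b])
```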

`model.merge(overlap=0.5, max_iter=2, k_nearest=10)`

Merge overlapping regions in the model by greedily comparing nearby regions and merging those whose similarity exceeds the specified overlap. The greedy merging process repeats max_iter times, and only the k_nearest neighbors of each region are considered, to speed up computation.
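As a rough sketch of what greedy merging can look like, the example below merges boolean masks whose overlap exceeds a threshold. Several things here are assumptions rather than the package's implementation: the overlap measure (shared pixels divided by the size of the smaller region), the hypothetical function names, and the omission of the k_nearest restriction, so every pair of regions is compared.

```python
import numpy as np

def overlap_fraction(a, b):
    """Shared pixels divided by the size of the smaller mask (assumed metric)."""
    smaller = min(a.sum(), b.sum())
    return np.logical_and(a, b).sum() / smaller if smaller else 0.0

def greedy_merge(masks, overlap=0.5, max_iter=2):
    """Greedily union masks that overlap by more than `overlap`.

    Hypothetical helper for illustration; compares all pairs instead of
    only the nearest neighbors.
    """
    masks = list(masks)
    for _ in range(max_iter):
        merged, used = [], set()
        for i, current in enumerate(masks):
            if i in used:
                continue
            for j in range(i + 1, len(masks)):
                if j not in used and overlap_fraction(current, masks[j]) > overlap:
                    current = np.logical_or(current, masks[j])
                    used.add(j)
            merged.append(current)
        if len(merged) == len(masks):
            break  # nothing was merged in this pass
        masks = merged
    return masks

# three regions on a 1 x 8 image: a and b share 3 of 4 pixels, c is separate
a = np.zeros((1, 8), dtype=bool); a[0, 0:4] = True
b = np.zeros((1, 8), dtype=bool); b[0, 1:5] = True
c = np.zeros((1, 8), dtype=bool); c[0, 6:8] = True

result = greedy_merge([a, b, c], overlap=0.5, max_iter=2)
sizes = [int(m.sum()) for m in result]
```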

list of algorithms

Here are all the algorithms currently available.

`NMF(k=5, max_iter=20, max_size='full', min_size=20, percentile=95, overlap=0.1)`

Local non-negative matrix factorization followed by thresholding to yield binary spatial regions. Applies factorization either to image blocks or to the entire image.

The algorithm takes the following parameters.

- `k` number of components to estimate per block
- `max_size` maximum size of each region
- `min_size` minimum size for each region
- `max_iter` maximum number of algorithm iterations
- `percentile` value for thresholding (higher means more thresholding)
- `overlap` value for determining whether to merge (higher means fewer merges)
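To illustrate how the percentile and min_size parameters might interact, here is a minimal sketch of thresholding one spatial component into a binary region. The function `threshold_component` is hypothetical, and the exact thresholding and size-filtering rules used by the package are assumptions here.

```python
import numpy as np

def threshold_component(component, percentile=95, min_size=20):
    """Binarize one spatial component; return None if the region is too small.

    Hypothetical helper for illustration; not part of thunder-extraction.
    """
    cutoff = np.percentile(component, percentile)
    mask = component > cutoff  # keep only pixels above the percentile cutoff
    return mask if mask.sum() >= min_size else None

# synthetic 20 x 20 component with a bright 5 x 5 patch
component = np.zeros((20, 20))
component[5:10, 5:10] = 1.0

mask = threshold_component(component, percentile=90, min_size=20)
too_small = threshold_component(component, percentile=90, min_size=30)
```

A higher percentile raises the cutoff and so keeps fewer pixels, while min_size discards regions that end up too small to be plausible sources.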

The fit method takes the following options.

- `block_size` a size in megabytes like 150 or a size in pixels like (10,10); if None, the full image is used
