Skip to content

Commit

Permalink
defaulting to soft signal extraction without overlaps. Updated readme…
Browse files Browse the repository at this point in the history
… file.
  • Loading branch information
marius10p committed Mar 27, 2017
1 parent 385ec15 commit 550fefb
Show file tree
Hide file tree
Showing 7 changed files with 244 additions and 40 deletions.
59 changes: 30 additions & 29 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,9 +1,6 @@
UPDATE (Christmas 2016): number of clusters determined automatically, but do specify the "diameter" of an average cell for best results. You can do this with
either db(iexp).diameter or ops.diameter.

# Suite2p: fast, accurate and complete two-photon pipeline#

Registration, cell detection, spike extraction and manual GUI.
Registration, cell detection, spike extraction and visualization GUI.

Details in [http://biorxiv.org/content/early/2016/06/30/061507](http://biorxiv.org/content/early/2016/06/30/061507).

Expand All @@ -13,25 +10,47 @@ This code was written by Marius Pachitariu and members of the lab of Kenneth Har

### I. Introduction ###

This is a complete automated pipeline for processing two-photon Calcium imaging recordings. It is very simple, very fast and yields a large set of active ROIs. A GUI further provides point-and-click capabilities for refining the results in minutes. The pipeline includes the following steps
This is a complete, automated pipeline for processing two-photon Calcium imaging recordings. It is simple, fast and yields a large set of active ROIs. A GUI further provides point-and-click capabilities for refining the results in minutes. The pipeline includes the following steps


1) X-Y subpixel registration --- using a version of the phase correlation algorithm and subpixel translation in the FFT domain. If a GPU is available, this completes in 20 minutes per 1h of recordings at 30Hz and 512x512 resolution.

2) SVD decomposition --- this provides the basis for a number of pixel-level visualization tools.
2) SVD decomposition --- this provides the input to cell detection and accelerates the algorithm.

3) Cell detection --- using clustering methods in a low-dimensional space. The clustering algorithm provides a positive mask for each ROI identified, and allows for overlaps between masks.

3) Cell detection --- using clustering methods in a low-dimensional space of the fluorescence activity. The pixel clustering algorithm directly provides a positive mask for each ROI identified and stands in contrast to more "black-box" methods that have been previously proposed. The mask is used to weigh the pixels of the ROI before taking the average over pixel ROIs.
4) Signal extraction --- by default, all overlapping pixels are discarded when computing the signal inside each ROI, to avoid using "demixing" approaches, which can be biased. The neuropil signal is also computed independently for each ROI, as a weighted pixel average, pooling from a large area around each ROI, but excluding all pixels assigned to ROIs during cell detection. The neuropil subtraction coefficient is estimated by maximizing the skewness of F - coef*Fneu (F is the ROI signal, Fneu is the neuropil signal). The user is encouraged to also try varying this coefficient, to make sure that any scientific results do not depend crucially on it.

4) Manual curation --- the output of the cell detection algorithm can be visualized and further refined using the included GUI. The GUI is designed to make cell sorting a fun and enjoyable experience.
5) Manual curation --- the output of the cell detection algorithm can be visualized and further refined using the included GUI. The GUI is designed to make cell sorting a fun and enjoyable experience. It also includes an automatic classifier that gradually refines itself based on the manual labelling provided by the user. This allows the automated classifier to adapt for different types of data, acquired under different conditions.

5) Spike deconvolution --- cell and neuropil traces are further processed to obtain an estimate of spike times and spike "amplitudes". The amplitudes here are understood as proportional to the number of spikes in a burst/bin. Under low SNR conditions, the deconvolution is still useful for temporally-localizing responses, and for providing an estimate of the neuropil contamination coefficient (estimated together, see paper for how this works).
6) Spike deconvolution --- cell and neuropil traces are further processed to obtain an estimate of spike times and spike "amplitudes". The amplitudes are proportional to the number of spikes in a burst/bin. Even under low SNR conditions, where transients might be hard to identify, the deconvolution is still useful for temporally-localizing responses. The cell traces are corrected using the ICA-derived neuropil contamination coefficients, and baselined using the minimum of the overly-smoothed trace.

### II. Getting started ###

The toolbox runs in Matlab (+ one mex file) and currently only supports tiff file inputs. To begin using the toolbox, you will need to make local copies (in a separate folder) of two included files: master_file and make_db. It is important that you make local copies of these files, otherwise updating the repository will overwrite them (and you can lose your files). The make_db file assembles a database of experiments that you would like to be processed in batch. It also adds per-session specific information that the algorithm requires such as the number of imaged planes and channels. The master_file sets general processing options that are applied to all sessions included in make_db, UNLESS the option is over-ridden in the make_db file. The global and session-specific options are described in detail below.

For spike deconvolution, you need to run mex -largeArrayDims SpikeDetection/deconvL0.c (or .cpp under Linux/Mac)

Below we describe the outputs of the pipeline first, and then describe the options for setting it up, and customizing it. Importantly, almost all options have pre-specified defaults. Any options specified in master_file in ops0 overrides these defaults. Furthermore, any option specified in the make_db file (experiment specific) overrides both the defaults and the options set in master_file. This allows for flexibility in processing different experiments with different options. The only critical option that you'll need to set is ops0.diameter, or db(N).diameter. This gives the algorithm the scale of the recording, and the size of ROIs you are trying to extract. We recommend as a first run to try the pipeline after setting the diameter option. Depending on the results, you can come back and try changing some of the other options.

Note: some of the options are not specified in either the example master_file or the example make_db file. These are usually more specialized features.

### III. Outputs. ###

The output is a struct called dat which is saved into a mat file in ResultsSavePath using the same subfolder structure, under a name formatted like F_M150329_MP009_2015-04-29_plane1. It contains all the information collected throughout the processing, and contains the ROI and neuropil traces in Fcell and FcellNeu, and whether each ROI j is a cell or not in stat(j).iscell. stat(j) contains information about each ROI j and can be used to recover the corresponding pixels for each ROI in stat.ipix. The centroid of the ROI is specified in stat as well. Here is a summary of where the important results are:

cell traces are in dat.Fcell
neuropil traces are in dat.FcellNeu
manual, GUI overwritten "iscell" labels are in dat.cl.iscell

stat(icell) contains all other information:
iscell: automated label, based on anatomy
neuropilCoefficient: neuropil subtraction coefficient, based on maximizing the skewness of the corrected trace (ICA)
st: are the deconvolved spike times (in frames)
c: are the deconvolved amplitudes
kernel: is the estimated kernel


### III. Input-output file paths ###

RootStorage --- the root location where the raw tiff files are stored.
Expand All @@ -48,19 +67,6 @@ All of these filepaths are completed with separate subfolders per animal and exp

\RootStorage\mouse_name\session\block\*.tif(f)

The output is a struct called dat which is saved into a mat file in ResultsSavePath using the same subfolder structure, under a name formatted like F_M150329_MP009_2015-04-29_plane1_Nk650. It contains all the information collected throughout the processing, and contains the fluorescence traces in dat.F.Fcell and whether a given ROI is a cell or not in dat.F.iscell. dat.stat contains information about each ROI and can be used to recover the corresponding pixels for each ROI in dat.stat.ipix. The centroid of the ROI is specified in dat.stat as well. Here is a summary of where the important results are:

cell traces are in dat.F.Fcell
neuropil traces are in dat.F.FcellNeu
neuropil subtraction coefficient is dat.cl.dcell{i}.B(3)
baseline is dat.cl.dcell{i}.B(2)
anatomical cell criterion is in dat.cl.isroi
manual overwritten cell labels are in dat.cl.iscell
dat.cl.dcell{i}.st are the deconvolved spike times (in frames)
dat.cl.dcell{i}.c are the deconvolved amplitudes
dat.cl.dcell{i}.kernel is the estimated kernel


### IV. Options for registration ###

showTargetRegistration --- whether to show an image of the target frame immediately after it is computed.
Expand Down Expand Up @@ -139,13 +145,6 @@ MaxNpix --- maximum number of pixels per ROI. Automatically set by diameter, if

MinNpix --- minimum number of pixels per ROI. Automatically set by diameter, if provided.

Compact --- a compactness criterion for how close pixels are to the center of the ROI. 1 is the lowest possible value, achieved by perfect disks. Best to leave this to a high value (i.e. 2) before the manual sorting stage.

parent --- these are criteria imposed on the parent cluster (before separating connected regions). These are not currently used during the automated step, but are available in the GUI.

parent.minPixRelVar --- significant regions need to have at least >1/10 the mean variance of all regions

parent.MaxRegions --- if there are more non-significant regions than this number, this parent ROI is probably very spread out over many small components and its connected regions are not good cells: it will be discarded.

### IX. Example database entry ###

Expand All @@ -158,3 +157,5 @@ db(i).expts = [5 6]; % which experiments to process together

Other (hidden) options are described in make_db_example.m, and at the top of run_pipeline.m (set to reasonable defaults), and get_signals_and_neuropil.m (neuropil "surround" option).



4 changes: 3 additions & 1 deletion cellDetection/getNeuropilBasis.m
Original file line number Diff line number Diff line change
@@ -1,7 +1,9 @@
function S = getNeuropilBasis(ops, Ly, Lx, type)

ops.neuropilRange = getOr(ops, 'neuropilRange', 3);

TileFactor = getOr(ops, {'TileFactor'}, 1); % this option can be overwritten by the user
nTiles = ceil(TileFactor * (Ly+Lx)/2 / (3 * ops.diameter)); % neuropil is modelled as nTiles by nTiles
nTiles = ceil(TileFactor * (Ly+Lx)/2 / (ops.neuropilRange * ops.diameter)); % neuropil is modelled as nTiles by nTiles

S = zeros(Ly, Lx, nTiles, nTiles, 'single');

Expand Down
2 changes: 1 addition & 1 deletion cellDetection/sourcery.m
Original file line number Diff line number Diff line change
Expand Up @@ -134,7 +134,7 @@
%
% ADD NEUROPIL INTO REGRESSION HERE
LtL = full(L'*L);
codes = ([LtL LtS; LtS' StS]+ 0e-3 * eye(icell+nBasis))\[LtU; StU];
codes = ([LtL LtS; LtS' StS]+ 1e-3 * eye(icell+nBasis))\[LtU; StU];
neu = codes(icell+1:end,:);
codes = codes(1:icell,:);
% codes = (LtL+ 1e-3 * eye(icell))\LtU;
Expand Down
16 changes: 9 additions & 7 deletions master_file_example.m
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@

ops0.useGPU = 1; % if you can use an Nvidia GPU in matlab this accelerates registration approx 3 times. You only need the Nvidia drivers installed (not CUDA).
ops0.fig = 1; % turn off figure generation with 0
% ops0.diameter = 12; % most important parameter. Set here, or individually per experiment in make_db file

% root paths for files and temporary storage (ideally an SSD drive. my SSD is C:/)
ops0.RootStorage = '//zserver4/Data/2P'; % Suite2P assumes a folder structure, check out README file
Expand All @@ -44,6 +45,7 @@
ops0.sig = 0.5; % spatial smoothing length in pixels; encourages localized clusters
ops0.nSVDforROI = 1000; % how many SVD components for cell clustering
ops0.NavgFramesSVD = 5000; % how many (binned) timepoints to do the SVD based on
ops0.signalExtraction = 'raw'; % how to extract ROI and neuropil signals: 'raw', 'regression'

% spike deconvolution options
ops0.imageRate = 30; % imaging rate (cumulative over planes!). Approximate, for initialization of deconvolution kernel.
Expand Down Expand Up @@ -82,10 +84,10 @@
% cell traces are in dat.Fcell
% neuropil traces are in dat.FcellNeu
% manual, GUI overwritten "iscell" labels are in dat.cl.iscell

% stat(icell) contains all other information
% autoamted iscell label, based on anatomy
% neuropil subtraction coefficient
% st are the deconvolved spike times (in frames)
% c are the deconvolved amplitudes
% kernel is the estimated kernel
%
% stat(icell) contains all other information:
% iscell: automated label, based on anatomy
% neuropilCoefficient: neuropil subtraction coefficient, based on maximizing the skewness of the corrected trace (ICA)
% st: are the deconvolved spike times (in frames)
% c: are the deconvolved amplitudes
% kernel: is the estimated kernel
9 changes: 7 additions & 2 deletions run_pipeline.m
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ function run_pipeline(db, ops0)
ops0.getROIs = getOr(ops0, {'getROIs'}, 1); % whether to run the optimization
ops0.getSVDcomps = getOr(ops0, {'getSVDcomps'}, 0); % whether to save SVD components to disk for later processing
ops0.nSVD = getOr(ops0, {'nSVD'}, 1000); % how many SVD components to save to disk

ops0.signalExtraction = getOr(ops0, 'signalExtraction', 'raw');

ops = build_ops3(db, ops0);
if isfield(ops, 'numBlocks') && ~isempty(ops.numBlocks)
Expand Down Expand Up @@ -82,7 +82,12 @@ function run_pipeline(db, ops0)
[ops, stat, model] = sourcery(ops,U, model);

% extract dF
[ops, stat, Fcell, FcellNeu] = extractSignals(ops, model, stat);
switch ops.signalExtraction
case 'raw'
[ops, stat, Fcell, FcellNeu] = extractSignalsNoOverlaps(ops, model, stat);
case 'regression'
[ops, stat, Fcell, FcellNeu] = extractSignals(ops, model, stat);
end

% apply user-specific clustrules to infer stat.iscell
stat = classifyROI(stat, ops.clustrules);
Expand Down
182 changes: 182 additions & 0 deletions signalExtraction/extractSignalsNoOverlaps.m
Original file line number Diff line number Diff line change
@@ -0,0 +1,182 @@
function [ops, stat, Fcell, FcellNeu] = extractSignalsNoOverlaps(ops, m, stat)

ops.saveNeuropil = getOr(ops, 'saveNeuropil', 0);

Nk = numel(stat);

Ly = numel(ops.yrange);
Lx = numel(ops.xrange);

% make new set of basis functions (larger coverage)
ops.neuropilRange = 10;

S = getNeuropilBasis(ops, Ly, Lx, 'raisedcosyne'); % 'raisedcosyne', 'Fourier'
S = normc(S);

% S = m.S;

S = reshape(S, [], size(S, ndims(S)));
nBasis = size(S,2);

% initialize mask
maskNeu = ones(size(S,1), 1);

stat = getNonOverlapROIs(stat, Ly, Lx);

for k = 1:Nk
ix = stat(k).ipix(~stat(k).isoverlap);
maskNeu(stat(k).ipix)= 0;
if numel(ix)==0
LtS(k,:) = 0;
else
LtS(k,:) = stat(k).lam(~stat(k).isoverlap)' * S(ix, :);
end
end

% add all pixels within X um
if isfield(ops, 'exclFracCell') && ops.exclFracCell>0
H = fspecial('disk', round(ops.diameter * ops.exclFracCell));
maskNeu = reshape(maskNeu, Ly, Lx);
maskNeu = imfilter(maskNeu, H, 'replicate');
maskNeu = single(maskNeu(:) > 1-1e-3);
end
%% get signals
S = bsxfun(@times, S, maskNeu(:));
StS = S' * S;
StS = StS + 1e-2 * eye(size(StS));

nimgbatch = 2000;

ix = 0;
fclose all;
fid = fopen(ops.RegFile, 'r');

tic
F = zeros(Nk, sum(ops.Nframes), 'single');
Fneu = zeros(Nk, sum(ops.Nframes), 'single');
if ops.saveNeuropil
Ntraces = zeros(nBasis, sum(ops.Nframes), 'single');
end
% S = bsxfun(@times, S, maskNeu(:));

[Ly Lx] = size(ops.mimg1);

mimg1 = ops.mimg1(ops.yrange, ops.xrange);
% find indices of good clusters
while 1
data = fread(fid, Ly*Lx*nimgbatch, '*int16');
if isempty(data)
break;
end
data = reshape(data, Ly, Lx, []);
data = data(ops.yrange, ops.xrange, :);
data = single(data);
NT = size(data,3);

% process the data
data = bsxfun(@minus, data, mimg1);
data = my_conv2(data, ops.sig, [1 2]);
data = bsxfun(@rdivide, data, m.sdmov);
data = single(reshape(data, [], NT));

%
Ftemp = zeros(Nk, NT, 'single');
for k = 1:Nk
ipix = stat(k).ipix(~stat(k).isoverlap)';
if ~isempty(ipix)
Ftemp(k,:) = stat(k).lam(~stat(k).isoverlap)' * data(ipix,:);
end
end
F(:,ix + (1:NT)) = Ftemp;

Tneu = StS\(S' * data);
Ftemp2 = LtS * Tneu;
Fneu(:,ix + (1:NT)) = Ftemp2;

% Fneu(:,ix + (1:NT)) = m.LtS * Fdeconv(1+Nk:end, :); % estimated neuropil
% F(:,ix + (1:NT)) = Fneu(:,ix + (1:NT)) + Fdeconv(1:Nk, :); % estimated ROI signal

if ops.saveNeuropil
Ntraces(:,ix + (1:NT)) = Tneu;
end

ix = ix + NT;
if rem(ix, 3*NT)==0
fprintf('Frame %d done in time %2.2f \n', ix, toc)
end
end
fclose(fid);
%% add the means back in to both neuropil and total
data = my_conv2(mimg1, ops.sig, [1 2]);
data = bsxfun(@rdivide, data, m.sdmov);
data = single(reshape(data, [], 1));

scalefactors = nan(numel(stat),1);
Ftemp = zeros(Nk, 1, 'single');
for k = 1:Nk
ipix = stat(k).ipix(~stat(k).isoverlap)';
if ~isempty(ipix)
Ftemp(k,:) = stat(k).lam(~stat(k).isoverlap)' * data(ipix,1);
scalefactors(k) = mean(m.sdmov(ipix));
end
end

Tneu = StS\(S' * data);
Ftemp2 = LtS * Tneu;

Fneu = bsxfun(@plus, Fneu, Ftemp2); % estimated neuropil
F = bsxfun(@plus, F, Ftemp);

Fneu = bsxfun(@times, Fneu, scalefactors); % estimated neuropil
F = bsxfun(@times, F, scalefactors);

%% get activity stats
indNoNaN = find(~ops.badframes);
ix = cumsum(~ops.badframes) + 1;
ix = ix(ops.badframes);
ix(ix>numel(indNoNaN)) = numel(indNoNaN);

F(:, ops.badframes) = F(:, indNoNaN(ix));
Fneu(:, ops.badframes) = Fneu(:, indNoNaN(ix));

% figure out the ICA coefficients here
ops.fs = getOr(ops, 'fs', ops.imageRate/ops.nplanes);
%
[coefNeu, inomax] = my_ica(F', Fneu', ops.fs, 0.7);
dF = F - bsxfun(@times, Fneu, coefNeu(:));

% dF = F - Fneu;

sd = std(dF, [], 2);
sdN = std(Fneu, [], 2);

sk(:, 1) = skewness(dF, [], 2);
sk(:, 2) = sd./sdN;
sk(:, 3) = (max(dF, [], 2)-median(dF, 2))./sd;
sk(:, 4) = (prctile(dF, 95, 2)-median(dF, 2))./sd;

for j = 1:numel(stat)
stat(j).dFstat = sk(j,:);
stat(j).skew = sk(j,1);
stat(j).std = sk(j,2);
stat(j).maxMinusMed = sk(j,3);
stat(j).top5pcMinusMed = sk(j,4);
stat(j).blockstarts = [0 cumsum(ops.Nframes)];
stat(j).iplane = ops.iplane;
stat(j).neuropilCoefficient = coefNeu(j);
end

%%
csumNframes = [0 cumsum(ops.Nframes)];
Fcell = cell(1, length(ops.Nframes));
FcellNeu = cell(1, length(ops.Nframes));
for i = 1:length(ops.Nframes)
Fcell{i} = F(:, csumNframes(i) + (1:ops.Nframes(i)));
FcellNeu{i} = Fneu(:, csumNframes(i) + (1:ops.Nframes(i)));
end

if getOr(ops, 'saveNeuropil', 0)
S = reshape(S, numel(ops.yrange), numel(ops.xrange), Nbasis);
save(sprintf('%s/NEU_%s_%s_plane%d.mat', ops.ResultsSavePath, ...
ops.mouse_name, ops.date, ops.iplane), 'ops', 'S', 'Ntraces', '-v7.3')
end
12 changes: 12 additions & 0 deletions signalExtraction/getNonOverlapROIs.m
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
function stat = getNonOverlapROIs(stat, Ly, Lx)


Mask = zeros(Ly, Lx);

for k = 1:numel(stat)
Mask(stat(k).ipix) = Mask(stat(k).ipix) + 1;
end

for k = 1:numel(stat)
stat(k).isoverlap = Mask(stat(k).ipix)>1;
end

0 comments on commit 550fefb

Please sign in to comment.