Addition of a Template object and refactor of matching engines #1527
Conversation
That looks OK to me! I think the main reason it was not implemented that way in the first place was to run this implementation against the original ones, which use template arrays. Maybe we can make both options available? Either pass a waveform extractor, or the templates directly.
The reason why I used templates as an input instead of the waveform extractor is that the wobble algorithm can support multiple templates per unit, which would be hard (impossible?) to define in a waveform extractor, depending on how you want the templates to look for each unit. I'm not totally sure what the use case is for having multiple templates per unit, but @cwindolf should be able to explain. I agree that all the template matching algorithms should have the same mandatory inputs (though obviously not the same optional parameters), but I think the more fundamental input is the templates themselves rather than the waveform extractor. In general, one might want to perform template matching with templates from another recording, with some alterations, smoothing, etc. So I think we should define template matching algorithms as starting with the templates as a given, rather than extracting them in some way from the waveform extractor.
Yes, there are multiple reasons for not using a waveform extractor.
OK, as I said, I can be convinced. The safe option would be to support both: either a waveform extractor (with sparsity methods that could be standardized among engines), or templates with nbefore/nafter. I'll work on that.
Actually, here is what I'll do. All methods will always receive a waveform_extractor (so that nbefore and nafter can be inferred) and a 'sparsity' field: if the waveform_extractor is not sparse, it will be sparsified via the core methods of spikeinterface using the params in this 'sparsity' field (this will allow finer comparisons between engines, because currently each of them has its own way of sparsifying). In addition, methods will also have extra fields called templates and sparsity_mask. If these fields are provided, then the templates will be taken from there (bypassing the waveform_extractor), and the same goes for the sparsity_mask. Will that be fine for everybody?
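As a rough sketch of that precedence (the helper and the sparsification rule below are illustrative assumptions, not the actual code in this PR; `get_template` and `unit_ids` are assumed WaveformExtractor attributes):

```python
import numpy as np


def resolve_templates(method_kwargs, relative_threshold=0.1):
    """Hypothetical helper (sketch only): explicit 'templates'/'sparsity_mask'
    entries take precedence; otherwise they are derived from the
    'waveform_extractor' entry, with one sparsification rule shared by all engines."""
    templates = method_kwargs.get("templates")
    sparsity_mask = method_kwargs.get("sparsity_mask")

    if templates is None:
        we = method_kwargs["waveform_extractor"]
        # Assumed WaveformExtractor-like API: one average template per unit,
        # each of shape (num_samples, num_channels).
        templates = np.stack([we.get_template(unit_id) for unit_id in we.unit_ids])

    if sparsity_mask is None:
        # Placeholder for a shared sparsification rule: keep channels whose
        # peak-to-peak amplitude exceeds a fraction of the unit's maximum.
        ptp = np.ptp(templates, axis=1)  # (num_units, num_channels)
        sparsity_mask = ptp >= relative_threshold * ptp.max(axis=1, keepdims=True)

    return templates, sparsity_mask
```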
@yger @pauladkisson I think that there is an even better long-term solution, which is to allow instantiating a WaveformExtractor directly from templates.
Yes, the best would be to have a from_templates() method and thus a single waveform_extractor, with the extra feature that such an object should be able to handle several templates per unit. But OK for the green light, I'll start with my solution for now.
What do you mean by "factorize" above? My two cents: I am concerned about coupling. Don't we have standalone sparsity objects to accomplish this? It seems like overkill, plus an unnecessary demand on users, to pass an instance of the waveform extractor if all we need is the templates, their temporal alignment (nbefore/nafter), and possibly a sparsity mask.
An intermediate approach would be to have a template class that has the temporal information (which is an attribute of the templates) and possibly the sparsity as attributes. The WaveformExtractor can then be composed with this template class to achieve code deduplication. How would that look? For people who don't want to deal with the WaveformExtractor:
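Something like this, as a rough sketch (the `Templates` class and its field names are illustrative, not an existing spikeinterface API):

```python
import numpy as np
from dataclasses import dataclass
from typing import Optional


@dataclass
class Templates:
    """Sketch of a standalone template container (hypothetical)."""
    data: np.ndarray                            # (num_units, num_samples, num_channels)
    nbefore: int                                # samples before the alignment peak
    nafter: int                                 # samples after the alignment peak
    sparsity_mask: Optional[np.ndarray] = None  # (num_units, num_channels) booleans


# Built directly from arrays, without any WaveformExtractor
# (placeholder data; real template arrays would go here):
templates = Templates(data=np.zeros((10, 60, 32), dtype="float32"), nbefore=30, nafter=30)
```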
And for people who want to use the WaveformExtractor:
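For example (again a sketch; `get_template`, `unit_ids`, `nbefore`, and `nafter` are assumed WaveformExtractor attributes, and the constructor reuses the `Templates` container sketched above):

```python
import numpy as np


def templates_from_waveform_extractor(we):
    """Hypothetical adapter (sketch): pull per-unit average templates and
    temporal metadata out of a WaveformExtractor-like object."""
    data = np.stack([we.get_template(unit_id) for unit_id in we.unit_ids])
    # Reusing the Templates container sketched above; a plain dict would also do.
    return Templates(data=data, nbefore=we.nbefore, nafter=we.nafter)
```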
This covers the two use cases, reduces duplication, and avoids excessive coupling between this module and the multi-responsibility, harder-to-understand WaveformExtractor. In brief, the current proposal further increases the responsibilities of the WaveformExtractor (reducing cohesion) and creates more coupling in our code. Plus, it forces our users to deal with that object when they don't strictly need it. I think we can do better. Am I missing something?
No, I guess you are right. I just wanted to make use of existing objects, and because the waveform_extractor has a lot of functions/attributes used by the engines, I thought about reusing it. But we'll see. The code here now works with the behaviour explained before: if templates/sparsity_mask are provided manually, they are used directly. Otherwise, if a waveform_extractor is passed, templates are obtained from it and, if needed, sparsified with the same method for all sorters (which can be specified). This is important, because currently every engine handles sparsification in its own way, so we cannot really compare them.
Back in the game! Actually, we can discuss that during the next call. The thing is that for some methods you might want a waveform_extractor as an input (as Sam told me), because matching methods sometimes need individual waveforms to estimate the noise around the centroids. So I don't know. To me, the best solution would then be methods that accept either a Template object (like the one you mentioned, with templates, a sparsity mask (?), and the temporal information n_before/n_after) or a WaveformExtractor object that can be sparse or not.
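A compact sketch of that dual-input idea (all names here are assumptions, not the module's actual code):

```python
import numpy as np


def normalize_matching_input(source):
    """Hypothetical adapter: accept either a Templates-like container or a
    WaveformExtractor-like object and return (templates, nbefore, nafter)."""
    if hasattr(source, "data"):
        # Templates-like container as sketched above
        return np.asarray(source.data), source.nbefore, source.nafter
    # Otherwise assume a WaveformExtractor-like object exposing
    # get_template(unit_id), unit_ids, nbefore and nafter.
    templates = np.stack([source.get_template(unit_id) for unit_id in source.unit_ids])
    return templates, source.nbefore, source.nafter
```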
I can't find anything in tridesclous's matching that uses that information (maybe I'm missing it, or maybe Sam is talking about some other method?). In either case, it should be relatively straightforward to compute that sort of thing outside the matching method and then pass it in as a method-specific parameter. That would also add flexibility to accommodate template matching procedures that use templates not associated with specific spike trains.
This is used in Circus (non-omp). In fact, for every centroid, we use several waveform snippets to estimate the noise and optimize the range of allowed amplitudes. But since omp is much better, I'm not sure I want to maintain that; I'm just making the point.
Thanks, I see where circus uses the waveform snippets. I still think this function could be pretty easily extracted out of the matching, passing the 'optimal amplitudes' into circus's matching algorithm. Maybe we could have another file in the matching submodule (or its own submodule?) for this kind of pre-computation. In any case, I look forward to discussing in more detail on Thursday.
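A hedged sketch of what that extraction might look like: estimate per-unit amplitude bounds from waveform snippets outside the engine, then hand the result in as a method-specific parameter. Function and parameter names are illustrative, not circus's actual implementation.

```python
import numpy as np


def estimate_amplitude_bounds(waveforms_per_unit, templates, n_mads=5.0):
    """Illustrative only: per-unit (a_min, a_max) derived from the scalar
    projection of each waveform snippet onto its unit's template."""
    bounds = {}
    for unit_index, waveforms in waveforms_per_unit.items():
        template = templates[unit_index]                 # (num_samples, num_channels)
        norm = np.sum(template ** 2)
        # scalar projection of each snippet (n, s, c) onto the template (s, c)
        amplitudes = np.einsum("nsc,sc->n", waveforms, template) / norm
        med = np.median(amplitudes)
        mad = np.median(np.abs(amplitudes - med)) + 1e-12
        bounds[unit_index] = (med - n_mads * mad, med + n_mads * mad)
    return bounds


# The result could then be passed to the engine as a plain parameter,
# e.g. method_kwargs={"amplitude_bounds": bounds}, instead of being
# recomputed inside the matching code itself.
```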
@alejoe91 @samuelgarcia I've tried to add a TemplatesDictionary object; feel free to let me know what you think about it. All the matching methods are now working with it, and the best option might be to give the waveform_extractor a way to directly export such an object. @pauladkisson the sparsification performed internally by wobble could/should then be removed. For now I've left it there, but the plan is to perform sparsification in a common framework before sending templates to the engines.
From the diff under review:

```python
from spikeinterface.core.job_tools import ChunkRecordingExecutor, fix_job_kwargs
from spikeinterface.core import get_chunk_with_margin


def find_spikes_from_templates(recording, method="naive", method_kwargs={}, extra_outputs=False, **job_kwargs):
    ...


class TemplatesDictionary(object):
    ...
```
This class should live in core, see #1982
Here are some modifications for the Wobble algorithm, which I'll try to proofread intensively (and maybe steal ideas for the omp :-)). However, when running the algorithm out of the box with default params on a toy benchmark (32 channels), I do not get results as good as the ones you showed last time (see results below). Is that expected? Should I tweak params? I'll try with Neuropixels-like recordings. For now, all the engines accept a waveform_extractor as input. This is convenient, because we could also use its intrinsic sparsity to factorize this step among engines.
I guess we should also have more timing benchmarks, and more unit tests, as mentioned in #1514.