Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Waveform tools speedup #1799

Merged
merged 18 commits into from
Sep 13, 2023

Conversation

samuelgarcia
Copy link
Member

@samuelgarcia samuelgarcia commented Jul 7, 2023

Waveforms extarctor was extracting waveforms into one file (or one sharemem) per unit.

This lead to "too many file open" in many situtation.
Here I propose another implementation into a unique file.

PROS:

  • easier
  • faster
  • no more "too many" files
  • waveforms folder will be more easy to manipulate because few files.

CONS:

  • need to make bool mask allong spike axis 0 to get waveforms for a unit
  • when sparsity channel are not align inter units.

The idea of this PR is only to add the underlying machinery.
This PR do not change the WaveformExtactor internals.
It will be done in a separted PR using this new extract_waveforms_to_single_buffer() to fix the "too many file open"

@alejoe91 alejoe91 added enhancement New feature or request core Changes to core module labels Jul 18, 2023
…waveform_tools_speedup

# Conflicts:
#	src/spikeinterface/core/tests/test_waveform_tools.py
@samuelgarcia samuelgarcia marked this pull request as ready for review July 26, 2023 15:59
@samuelgarcia
Copy link
Member Author

@alejoe91 @h-mayorquin : if you would have time to review this it would be super usefull for me to continue work working on tridesclous2 and spykingcircus2

Copy link
Member

@alejoe91 alejoe91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @samuelgarcia

A couple of small comments:

  • ditribute -> distribute
  • I would use get_traces_with_margin instead of get_traces. I think it makes the margin handling more readable

src/spikeinterface/core/waveform_tools.py Outdated Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Outdated Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Outdated Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Outdated Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Outdated Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Outdated Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Outdated Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Outdated Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Outdated Show resolved Hide resolved
@h-mayorquin
Copy link
Collaborator

I started reviewing this, I will finish in the following days.

@samuelgarcia
Copy link
Member Author

@yger : when this will be merge could you rewrite this function here https://github.com/SpikeInterface/spikeinterface/blob/main/src/spikeinterface/sortingcomponents/tools.py#L21
with the use of this new extract_waveforms_to_single_buffer() so that you can explicitly give a sparse mask which is the diagonal instead of a radius.

@samuelgarcia
Copy link
Member Author

@alejoe91 I put back the test of sharemem on window and now it is working!!!

Copy link
Collaborator

@h-mayorquin h-mayorquin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more comments.

src/spikeinterface/core/waveform_tools.py Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Outdated Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Show resolved Hide resolved
dtype=None,
sparsity_mask=None,
copy=False,
job_name=None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't job_name a part of job_kwargs ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

job_name is for ChunkRecordingExecutor and is not part of job_kwargs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouln't it be?
job_kwargs is basically al the keyword arguments of ChunkRecordingExecutor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. certainlly. but not in this PR

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, here you go:
#1986

src/spikeinterface/core/waveform_tools.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@h-mayorquin h-mayorquin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Thanks for bearing with me, @samuelgarcia. I learned while doing this PR and I know that that's taxing in your time but this is when I can pick your brain appart and learn about the design decisions that you are making.

Some pending issues for future PR or issues:

  • I still think it will be a good idea to refer to spikes as spike_vector whenever they are, well, spike_vectors.
  • Isn't the job_name already part of job_kwargs?
  • I wish we had a test for not extracting spikes whose support would go outside of the borders.

src/spikeinterface/core/waveform_tools.py Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Show resolved Hide resolved
src/spikeinterface/core/waveform_tools.py Show resolved Hide resolved
@samuelgarcia
Copy link
Member Author

I think it should just return what the recorder returns by default. Do you think that would be wrong?

When we want to return_scale=True we need another dtype so it is good to control this externally.

@samuelgarcia samuelgarcia merged commit 029fd54 into SpikeInterface:main Sep 13, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Changes to core module enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants