Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change the signature on kilosort's delete intermediate files parameters. #1908

Conversation

JoeZiminski
Copy link
Collaborator

This PR extends cleanup after kilosort sorting, closes #1896.

The signature for deleting intermediate files is changed slightly and is now a Tuple with optional entries, recording.dat, temp_wh.dat and matlab_files. The reason for this extension is to give a little more granularity over the selection of what to delete, and to not delete temp_wh.dat by default as it is used for Phy.

For now, have default to delete only matlab_files because my understanding is that recording.dat is not deleted for other sorters and so this might be confusing if the behaviour is different across sorters.

Do you think it is worth pushing up this functionality to the run_sorter level, so that for any sorter recording.dat can be deleted (in the cases it is created). The issue with that approach would be that some intermerdiate files are sorter-specific. Alternatively, each sorter has it's own delete_intermediate_files parameter that takes keywords specific to it's output, and these can be accessed with run_sorter's kwargs? (in which case, happy to extend this functionality to other sorters, I am just not very familiar with the intermediate files they all create).

@JoeZiminski JoeZiminski changed the title Change the signature on kilosort's delete intermediate files. Change the signature on kilosort's delete intermediate files parameters. Aug 3, 2023
@alejoe91 alejoe91 added enhancement New feature or request sorters Related to sorters module labels Aug 22, 2023
@samuelgarcia
Copy link
Member

Hi Joe.
I am not against this but I have the feeling that the the cost of breaking the compatibility is a bit high (and we need to rebuild all the docker images!!).
@alejoe91 what do you think ?

@alejoe91
Copy link
Member

We can rebuild the docker images, but it would be great if we can make this bakward-compatible :)

@JoeZiminski
Copy link
Collaborator Author

That's a good point about backward compatibility!

If the below makes sense, I could rework this PR to:

  1. Revert the kilosort delete_recording_dat signature to how it was previously.

  2. Add delete_tmp_files argument back and allow it to accept True, False, or a Tuple of keywords to delete (as in the current PR). This way it will be backwards compatible but also allow more refined deletions. The only real use-case of this more refined option would be if someone wanted to inspect the matlab files but didn't want to re-write the temp_wh.dat, which seems unlikely. In which case, it may not be worth the additional, slightly confusing complexity in the function arguments.

with backward compatibility considerations, I think the key thing is really to set the delete delete_tmp_files=False so temp_wh.dat is available for Phy.

I see that for other sorters, in some cases the binary is re-written. From searching recording.dat in spikeinterface, this is true for klusta and pykilosort while cominato writes s recording/data_recording.h5`. Do you think it is worth adding similar arguments to remove these intermediate files also? (in different PRs, happy to open).

@alejoe91
Copy link
Member

Thanks @JoeZiminski

I think your proposed solution is perfect. Note that passing a tuple for delete_tmp_files will not be compatible with running on containers until a new version of SI is released.

@samuelgarcia note that the docker images wouldn't need to be rebuilt, since we're only touching the Python wrappers.

@JoeZiminski
Copy link
Collaborator Author

Thanks @alejoe91 @samuelgarcia I have made the discussed changes, let me know if anything further is required. Are there are quick sorter-tests where testing these options could be slotted in?

@zm711
Copy link
Collaborator

zm711 commented Sep 14, 2023

@JoeZiminski

Just as an FYI and to transfer our convo over from #1896. I was just helping someone on the Phy issue tracker (running KS2 + Phy outside of SI). And they tried both the raw file *.dat and the temp_wh.dat and they said that it looked much better with the raw data rather the whitened data for KS2. I haven't confirmed this myself, but I wanted a record of at least one person voting on using raw for KS2.

edited: bold to make clear I haven't tested

@alejoe91
Copy link
Member

@JoeZiminski

Just as an FYI and to transfer our convo over from #1896. I was just helping someone on the Phy issue tracker (running KS2 + Phy outside of SI). And they tried both the raw file *.dat and the temp_wh.dat and they said that it looked much better with the raw data rather the whitened data for KS2. I haven't confirmed this myself, but I wanted a record of at least one person voting on using raw for KS2.

edited: bold to make clear I haven't tested

That's a good point. While spatial whitening is needed by Kilosort, it kind of messes up the spatial structure of the waveforms, making them quite less interpretable. I also prefer looking at "real" signals :)

@JoeZiminski
Copy link
Collaborator Author

Thanks @zm711 and @alejoe91! I have moved my last comment and this discussion to #1954.

I added a final assert check please let me know if anything can be improved, and tested manually. Are there any parameter tests on KS sorting a test for this could be slotted into?

@JoeZiminski
Copy link
Collaborator Author

Hey @zm711, @alejoe91, @samuelgarcia do you think this is good to go? happy to add anything required on the testing end.

Copy link
Collaborator

@zm711 zm711 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple typo fixes in the docstrings for me @JoeZiminski,

but I'll let @alejoe91 and @samuelgarcia let you know about any tests they want added.

src/spikeinterface/sorters/external/kilosort.py Outdated Show resolved Hide resolved
src/spikeinterface/sorters/external/kilosort2_5.py Outdated Show resolved Hide resolved
src/spikeinterface/sorters/external/kilosort3.py Outdated Show resolved Hide resolved
@JoeZiminski
Copy link
Collaborator Author

thanks @zm711 !

@alejoe91
Copy link
Member

Looking great @JoeZiminski

Merging :)

@alejoe91 alejoe91 merged commit f55dfcc into SpikeInterface:main Oct 17, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request sorters Related to sorters module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cleanup of kilosort temporary files causing issues with Phy
4 participants