Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow relative path for export_to_phy #2041

Merged
merged 9 commits into from
Sep 27, 2023
Merged

Conversation

zm711
Copy link
Collaborator

@zm711 zm711 commented Sep 25, 2023

This adds another argument to export_to_phy to allow for just the recording.dat to be given as the dat_path (ie the behavior found in KS1 and KS2) with a default of False (current SI behavior as well as the standard behavior of KS2.5 and KS3).

For #2021.

@h-mayorquin
Copy link
Collaborator

Will this have the same behavior as our relative_to @alejoe91 ?

def to_dict(
self,
include_annotations: bool = False,
include_properties: bool = False,
relative_to: Union[str, Path, None] = None,
folder_metadata=None,
recursive: bool = False,
) -> dict:
"""
Make a nested serialized dictionary out of the extractor. The dictionary produced can be used to re-initialize
an extractor using load_extractor_from_dict(dump_dict)
Parameters
----------
include_annotations: bool
If True, all annotations are added to the dict, by default False
include_properties: bool
If True, all properties are added to the dict, by default False
relative_to: str, Path, or None
If not None, files and folders are serialized relative to this path, by default None
Used in waveform extractor to maintain relative paths to binary files even if the
containing folder / diretory is moved

If that's the case, maybe the name and behavior should be the same?

Or, if it is different, maybe we should add a note in the docstring indicating how it behaves differently from relative_to.

@zm711
Copy link
Collaborator Author

zm711 commented Sep 26, 2023

@alejoe91 just suggested propagating the idea from the WaveformExtractor in #2021.

https://github.com/SpikeInterface/spikeinterface/blob/ff1b8cdb59f0febb21354cb14a3ea2fc7f577b202/src/spikeinterface/core/waveform_extractor.py#L241-L251

So that's why I did that, but happy to change. The only change I made is to change the Path for the binary file for Phy to look for. I don't think it should actually affect any SpikeInterface machinery, but definitely correct me I'm happy to add more documentation or change the argument name.

@alejoe91
Copy link
Member

@zm711 @h-mayorquin let's use use_relative_path for functions exposed to the users. Note that it's different than relative_to, since the latter expects any Path, while use_relative_path is relative to a known folder (waveform folder for WaveformExtractor, phy folder for export_to_phy)

@alejoe91 alejoe91 added the exporters Related to exporters module label Sep 26, 2023
@@ -64,6 +65,8 @@ def export_to_phy(
Dtype to save binary data
verbose: bool
If True, output is verbose
use_relative_path : bool, default: False
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you exapnd on this. When you say "saves the dat_path as a relative path, ... "

I am kind of expecting that there is something after, relative to what?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@h-mayorquin. Great point, that is a lazy docstring. I'll fix that shortly just got to lab so need to set up for the day.

@h-mayorquin
Copy link
Collaborator

@alejoe91 @zm711
I was not aware that we had this functionality / keyword in waveform extractor. I basically just was wondering how it was different and I wanted to have that in the documentation so I myself would not be confused about it and know how is different. But maybe the docstring is not the appropriate place for that to be.

A general question I have here is:
Would using use_relative_path=True as a default cause any other incovenience. If I understand, the user passes a folder (absolute) and all the stuff that we put in that folder is relative to that folder. If that is the case, I am wonderingi f there is any downside to just use relative by default as the only thing that we expose is the folder.

@zm711
Copy link
Collaborator Author

zm711 commented Sep 26, 2023

@h-mayorquin
Here's a copy of the phy folder (a recent running with the new synchrony thanks @alejoe91 :) )
image

So the way phy works is it needs to extract waveforms from a binary file. In this case we have the binary file recording.dat. The absolute path is given as (on my Windows :( a r'c:\user\zm\...\recording.dat'), but this means that if I go to move this file to our storage server the absolute dat_path will no longer work with Phy since it won't find the recording.dat in the right place. Relative just means I save it as dat_path = 'recording.dat' so when I try to activate Phy it just looks in the same folder as the params.py file. Does that make sense? One big annoyance for the relative version of the file is that I need a copy of the recording.dat in the folder which means I need to make another copy (another big file eating up storage). (One other point in case you don't use Phy a lot the way you activate phy is point its template-gui to the params.py which allows Phy to generate its internal file tree, sampling rate, etc. So we only need to write the absolute or relative to the params.py

I'll let @alejoe91 answer:

Would using use_relative_path=True as a default cause any other incovenience. If I understand, the user passes a folder (absolute) and all the stuff that we put in that folder is relative to that folder. If that is the case, I am wonderingi f there is any downside to just use relative by default as the only thing that we expose is the folder.

@alejoe91
Copy link
Member

@h-mayorquin As we move more into server/cloud/remote computing patforms, I think it makes total sense to make use_relative_path=True by default.

Copy link
Member

@alejoe91 alejoe91 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zm711 I noticed one thing. In case copy_binary is False, but the recording is already binary, we should get the relative path to the phy output folder. Right?

@h-mayorquin
Copy link
Collaborator

@zm711 Thanks a lot for the explanation about how phy works at that level. This was very useful for me.

@h-mayorquin
Copy link
Collaborator

Any way we can test this by the way?

@zm711
Copy link
Collaborator Author

zm711 commented Sep 26, 2023

phy does have a command something like template-describe that won't activate the gui. So maybe that way. I can do some research?

@zm711
Copy link
Collaborator Author

zm711 commented Sep 26, 2023

@zm711
Copy link
Collaborator Author

zm711 commented Sep 26, 2023

I think what we could do is actually run the test using phylib. We can read in all the export_to_phy into the model in phylib and create a test based on that. phylib should be all the calculations etc without any issues with the gui. I don't know if I have the bandwidth for that this week, but maybe I could draft something soon and then get your feedback later @h-mayorquin ?

@h-mayorquin
Copy link
Collaborator

So here is the test they use on phy for template-describe https://github.com/cortex-lab/phy/blob/8bf8cf751fcbd44210b3c5bbdba8c34a7b2cb1ce/phy/apps/template/tests/test_gui.py#L39-L43

I think that is better if @alejoe91 decides how to handle this as I am unfamiliar with that module. He probably knows what standards and type of tesing we want to (and can) have there.

@alejoe91
Copy link
Member

Yeah I wouldn't start messing with qt dependencies in tests. Let's leave it as is for now and maybe open an issue to add phylib tests later on!

@zm711
Copy link
Collaborator Author

zm711 commented Sep 27, 2023

@alejoe91 The test failure is a problem with compute_synchrony_metrics

for synchrony_size in synchrony_sizes:
>                   synchrony_counts[synchrony_size][unit_id] += np.count_nonzero(spike_complexity >= synchrony_size)
E                   IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

It's also occurred in export_report and not in export_to_phy

@alejoe91
Copy link
Member

@zm711 let me fix the failing test real quick! sorry about that

@zm711
Copy link
Collaborator Author

zm711 commented Sep 27, 2023

No worries. :) Better to find failing tests sooner rather than later. Thanks for the fix.

@alejoe91 alejoe91 merged commit 3826a2d into SpikeInterface:main Sep 27, 2023
8 checks passed
@zm711 zm711 deleted the rel-path-phy branch September 27, 2023 17:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exporters Related to exporters module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants