Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace load_extractor with load_function from loading.py #3613

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

alejoe91
Copy link
Member

@alejoe91 alejoe91 commented Jan 13, 2025

Created a new loading.py file and moved load function there (kept load_extractor for back compatibility)

The new function will be a magic load of all SI objects: Recording/Sorting/SortingAnalyzer/(WaveformExtractor)

@alejoe91 alejoe91 added the core Changes to core module label Jan 13, 2025
@alejoe91 alejoe91 requested a review from samuelgarcia January 13, 2025 15:10
@alejoe91 alejoe91 changed the title Remote load_extractor and use_times in get_duration/`get_total Remote load_extractor and use_times in get_duration/get_total_duration Jan 13, 2025
readthedocs.yml Outdated Show resolved Hide resolved
Copy link
Collaborator

@zm711 zm711 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again mostly cosmetic things on my part.

@@ -761,62 +761,71 @@ def load(file_path: Union[str, Path], base_folder: Optional[Union[Path, str, boo
* save (...) a folder which contain data + json (or pickle) + metadata.

"""
print(file_path, is_path_remote(file_path))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these be prints or warning? in general?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

with open(file_path, "rb") as f:
d = pickle.load(f)
else:
raise ValueError(f"Impossible to load {file_path}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we provide any other info here to help the user. Impossible to load xx. Do we know the most common reason that we could say "try this instead" or is this impossible to know the problem?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I unified all error messages now :)

f = folder / f"cached.{dump_ext}"
extractor = read_zarr(folder)
else:
# the is spikeinterface<=0.94.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we fix this comment while we are adjusting things :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

raise ValueError(f"This folder is not a cached folder {file_path}")
extractor = BaseExtractor.load(file, base_folder=folder)
else:
error_msg = (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a great error message as it says the problem then gives the solution :)

return extractor
extractor = read_zarr(file_path)
else:
raise NotImplementedError("Only zarr format is supported for remote files")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to say. To convert your spikeinterface object to zarr do a xx with format=zarr? That might be getting to specific if the plan is to extend this to more formats in the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @zm711

I don't understand this comment. I think it will be easy to extend the comment if and when we support more remote storage formats. What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean something like

NotImpletmentedError("We only currently support zarr format for remote files. Your file is format xx. You can convert this file to zarr by doing extractor.save(format=zarr) with necessariy arguments.")

but if the plan is to support a bunch of other formats then it wouldn't make sense to overemphasize zarr in the error message because the message won't stay true once we support more format. If our goal is only to use zarr for remote connection then it might be nice to make a message that states the problem and gives the solution like above. does that make sense?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this branch is only for remote storage, so we would also need to add how to upload to a cloud bucket? :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we triggered the remote branch then that would mean they were successful in getting it to the cloud right? because we check that it is remote. But this final check is that they uploaded the wrong file type right? Or am I misunderstanding. I think explaining the cloud would be overkill for this message it would be better to just say (see our cloud docs--and then make some docs on si in the cloud :) )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh I see your point! Ok let me give it a try :)

src/spikeinterface/core/baserecording.py Outdated Show resolved Hide resolved
@samuelgarcia
Copy link
Member

I understand the need but I do not like the design.
Having if/else inside a base class that import subclass (zarr extractor) for specific case is weird.
Let me think of more global approach.

@alejoe91 alejoe91 changed the title Remote load_extractor and use_times in get_duration/get_total_duration Replace load_extractor with load_function from loading.py Jan 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Changes to core module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants