Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow to postprocess on read-only waveform folders #1957

Merged
merged 10 commits into from
Sep 19, 2023

Conversation

alejoe91
Copy link
Member

@alejoe91 alejoe91 commented Sep 4, 2023

In some cases a waveform extractor folder can be loaded in read-only mode (e.g., precomputed waverorms on the cloud, attached as data).

The waveform extensions always inherit the format from the parent WaveformExtractor. However, if the parent is read-only all postprocessing crashes because it tries to add folders to the parent folder.

This PR adds the is_read_only() method to the WaveformExtractor. In case a waveform extractor is read-only, extensions can be computed, but they will be in memory and not persistent to disk.

EDIT:

Added a small fix to load the similarity extension in the export_to_phy if that's already available!

@alejoe91 alejoe91 added core Changes to core module postprocessing Related to postprocessing module labels Sep 4, 2023
@alejoe91 alejoe91 changed the title Allow to postprocess on read-only waveform folders (WIP) Allow to postprocess on read-only waveform folders Sep 5, 2023
@alejoe91 alejoe91 changed the title (WIP) Allow to postprocess on read-only waveform folders Allow to postprocess on read-only waveform folders Sep 6, 2023
Copy link
Collaborator

@h-mayorquin h-mayorquin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an important feature to support.

I have some comments about nesting that come from clomatic complexityh which is related to code complexity / hader-to-read more bugs:

https://en.wikipedia.org/wiki/Cyclomatic_complexity

This is why I try to decrease this when I see it.

As a more general comment I wonder if there is another way of accomplishing (supporting operating in read only waveforms) without having to keep state and sacrifice caching.

Could we just store the extension folder somewhere else instead of the folder where we usually do? After all, that is like or own in-house format so we can do this right? Like, we can use the local caching of the user (temp) and save that as state so the user can still benefit from caching and we keep the state as the folder of the extensions instead of a flag to see if the original folder of the waveforms is read only.

Maybe there are obvious downsides to this that I had not considered. Just throwing out some other possible solutions.

@@ -399,6 +403,9 @@ def return_scaled(self) -> bool:
def dtype(self):
return self._params["dtype"]

def is_read_only(self) -> bool:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another comment.

An advice that I have read is that logical / boolean flags should be designed in such a way that most of the testing you do will test for True.

Reason being, that putting not in front of flags is a little bit harder to read as you need to keep two things in working memory instead of one (like remembering the sign of an expression that people usually mess up in algebraic high school operations).

Here most of the tests are for not is_read_only. On the other hand, "is_read_only" is the common category / flag that the OS usually uses so that seem more important here.

Just a comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do is_not_read_only instead! That's a good point!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@h-mayorquin I simplified the if-else statements!

@alejoe91
Copy link
Member Author

@h-mayorquin @zm711 I applied the suggested changes!

Copy link
Collaborator

@h-mayorquin h-mayorquin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one last question about the tests and this should be good to go.

# Extensions already loaded in memory
if extension_name in self._loaded_extensions:
return True
else:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need this else. This will allow you to get rid of the nesting.

@@ -517,15 +524,19 @@ def is_extension(self, extension_name) -> bool:
if self.folder is None:
return extension_name in self._loaded_extensions
else:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need this else either. This will allow you to get rid of the nesting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -76,6 +77,16 @@ def setUp(self):
overwrite=True,
)
self.we2 = we2

# make we read-only
if platform.system() != "Windows":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the test works in windows if this statements is not run? I am wondering if what should be skipped is the test itself if we rely chmod. As @zm711 has pointed out that this is not reliable on windows.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Altought the core tests are running on windows so...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is not that it won't work just that it may introduce a hard to detect bug in the future since it does not behave as the user intends. So if you ever have a test that address writability, Windows behavior would diverge from Linux/Mac. If all the tests are only testing the readability, then they wouldn't "experience" the difference in the file identities. So that's why I said future problem since it may in the future add a hard to detect bug, but won't definitely introduce one.

@alejoe91
Copy link
Member Author

@h-mayorquin updated with the suggestion.

@samuelgarcia only waiting for your review

@samuelgarcia samuelgarcia merged commit 17b0c03 into SpikeInterface:main Sep 19, 2023
8 checks passed
@alejoe91 alejoe91 deleted the postprocessing-read-only branch October 17, 2023 08:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Changes to core module postprocessing Related to postprocessing module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants