Zarr: extract time vector once and for all! #2828
@@ -72,7 +72,7 @@ def __init__(self, folder_path: Path | str, storage_options: dict | None = None)
 time_kwargs = {}
 time_vector = self._root.get(f"times_seg{segment_index}", None)
 if time_vector is not None:
-    time_kwargs["time_vector"] = time_vector
+    time_kwargs["time_vector"] = time_vector[:]
Maybe I'm misunderstanding the compression-decompression, but how does switching to a view change this?
Without the [:] the time_vector is set to a zarr.Array object. This object performs a decompression/retrieval from file upon slicing, so if I call get_times() 100 times, the data will be accessed from the file (and optionally decompressed) 100 times.
This simple change slices the array, so that time_kwargs["time_vector"] is now a numpy array and not a zarr object anymore.
Hope this is clear :)
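To make the difference concrete, here is a minimal sketch using only the standard library. `LazyCompressedArray` and `Segment` are hypothetical stand-ins (not zarr or SpikeInterface classes): the first mimics an object that decompresses its data on every slice, the second mimics a segment whose `get_times()` slices the stored time vector. The 30 kHz sampling rate is also just an assumption for illustration.

```python
import struct
import zlib


class LazyCompressedArray:
    """Hypothetical stand-in for a lazily read, compressed array."""

    def __init__(self, values):
        self._n = len(values)
        self._blob = zlib.compress(struct.pack(f"{self._n}d", *values))
        self.reads = 0  # how many times the data has been decompressed

    def __getitem__(self, key):
        # Every indexing operation decompresses the whole buffer again.
        self.reads += 1
        data = list(struct.unpack(f"{self._n}d", zlib.decompress(self._blob)))
        return data[key]


class Segment:
    """Hypothetical segment holding a time vector (illustration only)."""

    def __init__(self, time_vector):
        self.time_vector = time_vector

    def get_times(self):
        return self.time_vector[:]


lazy = LazyCompressedArray([i / 30000.0 for i in range(1000)])

# Before the change: the lazy object is stored, so each call decompresses.
seg_lazy = Segment(lazy)
for _ in range(100):
    seg_lazy.get_times()
reads_lazy = lazy.reads

# After the change: slice once at construction; later calls stay in memory.
lazy.reads = 0
seg_eager = Segment(lazy[:])
for _ in range(100):
    seg_eager.get_times()
reads_eager = lazy.reads

print(reads_lazy, reads_eager)
```

In this sketch the lazy variant decompresses once per call, while the eager variant decompresses only once, at construction. The trade-off is that the full time vector is held in memory for the lifetime of the object.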
Oh yeah that makes sense. Cool.
As discussed with Alessio this morning.
If the zarr folder has timestamps, they were re-extracted and decompressed every time get_times() was called. This PR loads the zarr array as a numpy array at instantiation to speed things up.