[Feature]: write xarray-compatible Zarr files
#176
Comments
from xarray import DataArray
import numpy as np
from pynwb.testing.mock.ecephys import mock_ElectricalSeries
from h5py import Dataset
from pynwb import get_type_map
import json

dset_types = (np.ndarray, Dataset)  # etc.


def get_dimension_labels(cls, ndims, dset_name):
    # Look up the dimension names for this dataset in the schema spec.
    spec = get_type_map().namespace_catalog.get_spec(cls.namespace, cls.neurodata_type)
    data = next(x for x in spec["datasets"] if x["name"] == dset_name)
    dims = data["dims"]
    if isinstance(dims[0], str):  # only one shape spec
        return dims
    # Several shapes are allowed: pick the one matching the array rank.
    for i_dims in dims:
        if len(i_dims) == ndims:
            return i_dims


def load_dset_as_xarray(obj, dset_name):
    dset = obj.fields[dset_name]
    cls = obj.__class__
    ndims = len(dset.shape)
    dim_labels = get_dimension_labels(cls, ndims, dset_name)
    # Use the object passed in rather than the module-level electrical_series.
    coords = dict(num_channels=obj.electrodes.data)
    if obj.timestamps is not None:
        coords.update(num_times=obj.timestamps)
    # Everything that is not an array-like dataset becomes an attribute.
    attrs = {k: v for k, v in obj.fields.items() if not isinstance(v, dset_types)}
    return DataArray(dset, dims=dim_labels, coords=coords, attrs=attrs)


electrical_series = mock_ElectricalSeries(timestamps=np.arange(10), rate=None)
load_dset_as_xarray(electrical_series, "data")
^ Code that solves a related problem and might be helpful
I think this could be useful to have in some form available in PyNWB. Maybe as a utility method and/or as a method on
Adding the
In terms of implementation, I think this will require changes in HDMF as well. Here is a rough plan of how this could be implemented:
hdmf-zarr/src/hdmf_zarr/backend.py Line 954 in 6e946da
@rly does that plan sound reasonable, or would this also require changes in the
That sounds reasonable, except that all the building / creation of
@mavaylon1 could you take a look at this?
I'll work on the HDMF side
Can do
What would you like to see added to HDMF-ZARR?
Xarray supports the Zarr backend, but requires the _ARRAY_DIMENSIONS attribute to be set with a list of names for the array dimensions (e.g. [samples, channels]); see https://docs.xarray.dev/en/stable/internals/zarr-encoding-spec.html#zarr-encoding
It would be great to add these attributes as default for known data types (e.g. ElectricalSeries).
@jsiegle
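To make the request concrete, here is a minimal sketch of how the attribute could be derived from a schema-style list of dimension names. It uses only the standard library and does not touch HDMF-ZARR itself. The helper names (pick_dimension_labels, with_array_dimensions), the EXAMPLE_DIMS values, and the "unit" attribute are hypothetical and only illustrate the idea: the real dimension names would come from the NWB schema.

```python
import json

# Hypothetical "dims" entry, shaped like a dataset spec that allows several ranks.
# These names are illustrative assumptions, not copied from the NWB schema.
EXAMPLE_DIMS = [
    ["num_times"],
    ["num_times", "num_channels"],
    ["num_times", "num_channels", "num_samples"],
]


def pick_dimension_labels(dims, ndims):
    """Return the dimension names matching the array rank, or None if none match."""
    if dims and isinstance(dims[0], str):
        # The spec allows a single shape only.
        return list(dims) if len(dims) == ndims else None
    for option in dims:
        if len(option) == ndims:
            return list(option)
    return None


def with_array_dimensions(attrs, dims, shape):
    """Return a copy of attrs with _ARRAY_DIMENSIONS added when labels are known."""
    out = dict(attrs)
    labels = pick_dimension_labels(dims, len(shape))
    if labels is not None:
        out["_ARRAY_DIMENSIONS"] = labels
    return out


# A 2D array (10 time points, 5 channels) gets the two-name option.
attrs = with_array_dimensions({"unit": "volts"}, EXAMPLE_DIMS, shape=(10, 5))
print(json.dumps(attrs, indent=2))
```

Array attributes are stored as JSON in the Zarr v2 format, so a dictionary like this is what would end up next to each array. Leaving the attribute out when no labels match keeps unknown datasets unchanged instead of guessing names.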
Is your feature request related to a problem?
NWB-Zarr files cannot be opened by xarray.open_zarr
What solution would you like?
Adding the _ARRAY_DIMENSIONS attributes to all "known" neurodata_types
Do you have any interest in helping implement the feature?
Yes.