Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Xarray Support #207

Merged
merged 8 commits into from
Aug 12, 2024
Merged

Xarray Support #207

merged 8 commits into from
Aug 12, 2024

Conversation

mavaylon1
Copy link
Contributor

@mavaylon1 mavaylon1 commented Jul 24, 2024

Motivation

Fix #176

What was the reasoning behind this change? Please explain the changes briefly.

How to test the behavior?

Show how to reproduce the new behavior (can be a bug fix or a new feature)

Checklist

  • Did you update CHANGELOG.md with your changes?
  • Have you checked our Contributing document?
  • Have you ensured the PR clearly describes the problem and the solution?
  • Is your contribution compliant with our coding style? This can be checked running ruff from the source directory.
  • Have you checked to ensure that there aren't other open Pull Requests for the same change?
  • Have you included the relevant issue number using "Fix #XXX" notation where XXX is the issue number? By including "Fix #XXX" you allow GitHub to close issue #XXX when the PR is merged.

@codecov-commenter
Copy link

codecov-commenter commented Jul 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.13%. Comparing base (8ca5787) to head (91a3e7a).

Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #207      +/-   ##
==========================================
+ Coverage   87.11%   87.13%   +0.02%     
==========================================
  Files           5        5              
  Lines        1172     1174       +2     
  Branches      286      287       +1     
==========================================
+ Hits         1021     1023       +2     
  Misses        100      100              
  Partials       51       51              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mavaylon1 mavaylon1 marked this pull request as ready for review July 29, 2024 20:54
@mavaylon1 mavaylon1 requested a review from stephprince July 29, 2024 20:57
@alejoe91
Copy link
Collaborator

alejoe91 commented Aug 1, 2024

@mavaylon1 thanks for moving on with this request. One question: will electrical series (for example) automatically set their own dimension labels automatically? Or is there any action to be taken by the user to enable this Xarray support?

@mavaylon1
Copy link
Contributor Author

@alejoe91 The labels just need to be in the schema. If it is there, then it will be handled automatically. Since ElectricalSeries has labels as dims, it will be there without user input.

@stephprince
Copy link
Contributor

This looks good on the hdmf-zarr side, but I tried testing out opening an ElectricalSeries Zarr dataset with xarray and I think _ARRAY_DIMENSIONS is not getting added to all datasets.

I've also not worked much with Zarr/xarray so I might be just setting this up incorrectly. From looking into it briefly, I think this call to DatasetBuilder does not have the dimension_labels argument added.

import os
import numpy as np
import xarray

from numcodecs import Blosc
from pynwb.testing.mock.ecephys import mock_Device, mock_ElectrodeGroup, mock_ElectrodeTable
from pynwb.testing.mock.file import mock_NWBFile
from pynwb.ecephys import ElectricalSeries
from hdmf.common import DynamicTableRegion
from hdmf_zarr.nwb import NWBZarrIO
from hdmf_zarr import ZarrDataIO

# create nwb file
device = mock_Device()
group = mock_ElectrodeGroup(device=device)
electrodes_table = mock_ElectrodeTable(group=group)
electrodes = DynamicTableRegion(name="electrodes", data=list(range(5)), table=electrodes_table, description="electrodes")
nwbfile = mock_NWBFile(electrode_groups=[group], 
                       electrodes=electrodes_table, 
                       devices=[device])

data_with_zarr_data_io = ZarrDataIO(
    data=np.random.randn(100, 5),
    chunks=(10, 10),
    fillvalue=0,
    compressor=Blosc(cname='zstd', clevel=3, shuffle=Blosc.SHUFFLE)
)

nwbfile.add_acquisition(
    ElectricalSeries(
        name="ElectricalSeries",
        data=data_with_zarr_data_io,
        electrodes=electrodes,
        rate=30.0
    )
)

# write file
path = "zarr_testing.nwb.zarr"
absolute_path = os.path.abspath(path)
with NWBZarrIO(path=path, mode="w") as io:
    io.write(nwbfile)

# open with xarray
xarray.open_zarr(path)  # fails
xarray.open_zarr("zarr_testing.nwb.zarr/general/extracellular_ephys/electrodes/")  # works
xarray.open_zarr("zarr_testing.nwb.zarr/acquisition/ElectricalSeries")  # fails

Copy link
Contributor

@stephprince stephprince left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving this PR since this is working on the hdmf-zarr side if dimension_labels are stored in the builder.

A related issue to add dimension_labels support to the section of the ObjectMapper used when adding ElectricalSeries datasets is already open here: hdmf-dev/hdmf#1163

@mavaylon1 mavaylon1 merged commit 7283d9f into dev Aug 12, 2024
24 checks passed
@mavaylon1 mavaylon1 deleted the xarray branch August 12, 2024 14:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature]: write xarray-compatible Zarr files
4 participants