Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix handling of datasets with shape (0,0,...) #77

Merged
merged 2 commits into from
May 29, 2024
Merged

Conversation

rly
Copy link
Contributor

@rly rly commented May 29, 2024

Fix #76.

For an HDF5 dataset with shape (0,0,0), no chunks should be written. This was the previous behavior before #68 and seems reasonable.

For an HDF5 dataset with shape (0,0,0), the chunks key in .zarray was being set to (0,0,0). However, that is not allowed in Zarr. Accessing an array with such chunking results in a ZeroDivisionError: division by zero when trying to access the array. I set it to (1,1,1) instead which is what Zarr does for an array of that shape.

>>> import zarr
>>> a = zarr.empty(shape=(0,0,0))
>>> a.shape
(0, 0, 0)
>>> a.chunks
(1, 1, 1)
>>> a[:]
array([], shape=(0, 0, 0), dtype=float64)
>>> a = zarr.empty(shape=(0,0,0), chunks=(0,0,0))
>>> a.chunks
(0, 0, 0)
>>> a[:]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/rly/mambaforge/envs/test4/lib/python3.11/site-packages/zarr/core.py", line 800, in __getitem__
    result = self.get_basic_selection(pure_selection, fields=fields)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rly/mambaforge/envs/test4/lib/python3.11/site-packages/zarr/core.py", line 926, in get_basic_selection
    return self._get_basic_selection_nd(selection=selection, out=out, fields=fields)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rly/mambaforge/envs/test4/lib/python3.11/site-packages/zarr/core.py", line 966, in _get_basic_selection_nd
    indexer = BasicIndexer(selection, self)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rly/mambaforge/envs/test4/lib/python3.11/site-packages/zarr/indexing.py", line 339, in __init__
    dim_indexer = SliceDimIndexer(dim_sel, dim_len, dim_chunk_len)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rly/mambaforge/envs/test4/lib/python3.11/site-packages/zarr/indexing.py", line 181, in __init__
    self.nchunks = ceildiv(self.dim_len, self.dim_chunk_len)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/rly/mambaforge/envs/test4/lib/python3.11/site-packages/zarr/indexing.py", line 167, in ceildiv
    return math.ceil(a / b)
                     ~~^~~
ZeroDivisionError: division by zero

I also fixed an issue flagged by pyright in LindiReferenceFileSystemStore.__contains__:

error: Method "__contains__" overrides class "Mapping" in an incompatible manner
    Parameter 2 type mismatch: base parameter is type "object", override parameter is type "str"`

@rly rly requested a review from magland May 29, 2024 14:12
@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 91.66667% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 79.47%. Comparing base (731fcb5) to head (ffea54a).

Files Patch % Lines
...ndi/LindiH5pyFile/LindiReferenceFileSystemStore.py 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #77      +/-   ##
==========================================
+ Coverage   79.43%   79.47%   +0.03%     
==========================================
  Files          30       30              
  Lines        2256     2265       +9     
==========================================
+ Hits         1792     1800       +8     
- Misses        464      465       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@magland
Copy link
Collaborator

magland commented May 29, 2024

Looks good to me!

@magland magland merged commit 93487db into main May 29, 2024
6 checks passed
@magland magland deleted the fix_zero_shape branch May 29, 2024 15:26
@rly
Copy link
Contributor Author

rly commented May 29, 2024

Thanks. Released in version 0.3.9.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Data with shape (0,0,0) triggering assertion error after get chunk refactor
3 participants