Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

76 sharding in zarr #125

Closed
wants to merge 50 commits into from

Conversation

aliddell
Copy link
Member

@aliddell aliddell commented Sep 21, 2023

Implements Zarr v3 with the sharding storage transformer. (Not the sharding codec, which is WIP, but I'm keeping an eye on it.)

Supersedes #101. Supersedes #113. Supersedes #117.
Closes #76. Closes #111. Closes #112. Closes #114. Closes #115.

Changes

Added

Changed

  • Upgrades C-Blosc from v1.21.4 to v1.21.5.

Fixed

  • A bug where enabling multiscale without specifying the tile size would cause an error.
  • Exceptions thrown off the main thread are now caught and logged, and Zarr throws an error in append.
  • Job queue is now cleared after every operation.

@aliddell aliddell marked this pull request as ready for review September 21, 2023 22:06
@aliddell aliddell requested a review from nclack September 21, 2023 22:06
Copy link

@andy-sweet andy-sweet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still making my way through this, but haven't seen anything blocking so far. About to travel for ~45 minutes where I can probably review this with Nathan.

tests/write-zarr-with-defaults.cpp Outdated Show resolved Hide resolved
common::sample_type_to_dtype(SampleType t)

{
static const char* table[] = { "u1", "u2", "i1", "i2",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: slightly risky to look up the sampletype -> dtype correspondence by index rather than having an explicit map or switch/case and performance doesn't seem critical here but I don't feel too strongly.

}

const char*
common::sample_type_to_string(SampleType t) noexcept

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: should probably be defined in acquire-core-libs, but no problem being here for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

src/writers/blosc.writer.cpp Outdated Show resolved Hide resolved
src/writers/blosc.writer.hh Outdated Show resolved Hide resolved
bytes_per_tile(const ImageDims& tile_shape, const SampleType& type);

size_t
frames_per_chunk(const ImageDims& tile_shape,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: I know this name isn't new, but I wonder if tiles_per_chunk would make these things and their calculations a little easier to reason about.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a tossup for me. If Nathan seconds this, I'll change it.

aliddell added a commit to aliddell/acquire-driver-zarr that referenced this pull request Oct 19, 2023
@aliddell aliddell mentioned this pull request Oct 19, 2023
aliddell added a commit to aliddell/acquire-driver-zarr that referenced this pull request Oct 19, 2023
aliddell added a commit to acquire-project/acquire-core-libs that referenced this pull request Oct 25, 2023
)

Came up
[here](acquire-project/acquire-driver-zarr#125 (comment))
in conversation on the sharding PR. These would be pretty handy to have
everywhere.
aliddell added a commit that referenced this pull request Oct 26, 2023
First half of #125.

Supersedes #113. Supersedes #117.
Closes #112. Closes #114. Closes #115.

## Changes

### Added

- Ship debug libs for C-Blosc on Linux and Mac.

### Changed

- Upgrades C-Blosc from v1.21.4 to v1.21.5.

### Fixed

- A bug where enabling multiscale without specifying the tile size would
cause an error.
- Exceptions thrown off the main thread are now caught and logged, and
Zarr throws an error in `append`.
@aliddell aliddell closed this Nov 7, 2023
aliddell added a commit that referenced this pull request Nov 15, 2023
Implements Zarr v3 with the [sharding storage
transformer](https://web.archive.org/web/20230213221154/https://zarr-specs.readthedocs.io/en/latest/extensions/storage-transformers/sharding/v1.0.html).
(Not the [sharding
codec](https://zarr-specs.readthedocs.io/en/latest/v3/codecs/sharding-indexed/v1.0.html),
which is WIP, but I'm keeping an eye on it.)

Supersedes #101. Supersedes #125.
Closes #76. Closes #111.

## Changes

### Added

- Support for [Zarr
v3](https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html).
- Support for the [sharding storage
transformer](https://web.archive.org/web/20230213221154/https://zarr-specs.readthedocs.io/en/latest/extensions/storage-transformers/sharding/v1.0.html)
in Zarr v3.

## Testing

Since napari doesn't yet support Zarr v3, we can use zarr-python with
matplotlib to do a sanity check. Write out a Zarr v3 dataset using
either `write-zarr-v3-raw` or `write-zarr-v3-compressed` in the `tests`
folder, then run the following Python script:

```python
import os
import numpy as np
import matplotlib.pyplot as plt

# these MUST come before importing zarr
os.environ["ZARR_V3_EXPERIMENTAL_API"] = "1"
os.environ["ZARR_V3_SHARDING"] = "1"

import zarr

def plot_array(input_zarr):
    store3 = zarr.DirectoryStoreV3(input_zarr)
    z3 = zarr.open(store=store3, mode="r")

    for (k3, a3) in z3.arrays():
        for i in range(a3.shape[0]):
            plt.imshow(a3[i, 0, :, :])
            plt.show()

if __name__ == "__main__":
    plot_array("C:/testing/acquire-driver-zarr-write-zarr-v3-compressed.zarr") # change this to point to the dataset you wrote out
```

For best results, you'll want to change the simulated camera used to
something more intelligible, e.g., radial sin.
@aliddell aliddell deleted the 76-sharding-in-zarr branch July 25, 2024 12:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants