Error decoding v2 array #2505

Open
jeromekelleher opened this issue Nov 19, 2024 · 2 comments
Labels
bug Potential issues with the zarr-python library

Comments

@jeromekelleher
Member

Zarr version

3.0.0a5

Numcodecs version

0.13.1

Python Version

3.10.12

Operating System

Linux

Installation

pip install

Description

Reading a v2 array written with Zarr Python 2 gives an error when creating the Blosc codec.

I've attached the .zarray and one chunk file below to reproduce.

import zarr

z = zarr.open("zarr_bug")
print(z)
print(z[:])

Gives

<Array file://zarr_bug shape=(204714,) dtype=int32>
Traceback (most recent call last):
  File "/home/jk/work/github/vcf-zarr-publication/zarr_bug.py", line 6, in <module>
    print(z[:])
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/core/array.py", line 919, in __getitem__
    return self.get_orthogonal_selection(pure_selection, fields=fields)
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/_compat.py", line 43, in inner_f
    return f(*args, **kwargs)
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/core/array.py", line 1361, in get_orthogonal_selection
    return sync(
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/core/sync.py", line 91, in sync
    raise return_result
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/core/sync.py", line 50, in _runner
    return await coro
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/core/array.py", line 476, in _get_selection
    await self.codec_pipeline.read(
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/codecs/pipeline.py", line 427, in read
    await concurrent_map(
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/core/common.py", line 53, in concurrent_map
    return await asyncio.gather(*[func(*item) for item in items])
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/codecs/pipeline.py", line 260, in read_batch
    chunk_array_batch = await self.decode_batch(
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/codecs/pipeline.py", line 177, in decode_batch
    chunk_array_batch = await ab_codec.decode(
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/abc/codec.py", line 125, in decode
    return await _batching_helper(self._decode_single, chunks_and_specs)
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/abc/codec.py", line 409, in _batching_helper
    return await concurrent_map(
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/core/common.py", line 53, in concurrent_map
    return await asyncio.gather(*[func(*item) for item in items])
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/abc/codec.py", line 422, in wrap
    return await func(chunk, chunk_spec)
  File "/home/jk/.local/lib/python3.10/site-packages/zarr/codecs/_v2.py", line 30, in _decode_single
    compressor = numcodecs.get_codec(self.compressor)
  File "/home/jk/.local/lib/python3.10/site-packages/numcodecs/registry.py", line 42, in get_codec
    config = dict(config)
TypeError: 'Blosc' object is not iterable

Steps to reproduce

Get the attached tar file (renamed to .txt to work around annoying file type restrictions...)

zarr_bug.txt

  • mv zarr_bug.txt zarr_bug.tar
  • tar -xf zarr_bug.tar
  • Run code above

I've included one chunk here and the .zarray as hopefully the simplest way to reproduce.

Additional output

No response

@jeromekelleher jeromekelleher added the bug Potential issues with the zarr-python library label Nov 19, 2024
@jeromekelleher
Member Author

Including the .zarray here for ease of reference:

{
    "chunks": [
        10000
    ],
    "compressor": {
        "blocksize": 0,
        "clevel": 7,
        "cname": "zstd",
        "id": "blosc",
        "shuffle": 0
    },
    "dimension_separator": "/",
    "dtype": "<i4",
    "fill_value": -1,
    "filters": null,
    "order": "C",
    "shape": [
        204714
    ],
    "zarr_format": 2
}
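For what it's worth, the compressor entry above is a plain JSON object with an "id" key, which looks like the dict form that numcodecs.get_codec tries to build with dict(config) in the traceback. A small stdlib-only check of that, with the metadata inlined as a string rather than read from the attached file (the variable names are just for illustration):

```python
import json
import math

# The .zarray content from this comment, inlined so the snippet is self-contained.
ZARRAY_TEXT = """
{
    "chunks": [10000],
    "compressor": {
        "blocksize": 0,
        "clevel": 7,
        "cname": "zstd",
        "id": "blosc",
        "shuffle": 0
    },
    "dimension_separator": "/",
    "dtype": "<i4",
    "fill_value": -1,
    "filters": null,
    "order": "C",
    "shape": [204714],
    "zarr_format": 2
}
"""

meta = json.loads(ZARRAY_TEXT)

# On disk the compressor is a mapping keyed by "id", not a codec object.
compressor_config = meta["compressor"]
is_mapping = isinstance(compressor_config, dict)
codec_id = compressor_config["id"]

# Number of chunks along the single dimension (the last one is partial).
n_chunks = math.ceil(meta["shape"][0] / meta["chunks"][0])

print(is_mapping, codec_id, n_chunks)
```

So the stored metadata is already in dict form; the Blosc instance that reaches get_codec is presumably created somewhere after the metadata is parsed.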

@d-v-b
Contributor

d-v-b commented Nov 19, 2024

thanks for reporting this bug. looking at the last few lines of the traceback:

  File "/home/jk/.local/lib/python3.10/site-packages/zarr/codecs/_v2.py", line 30, in _decode_single
    compressor = numcodecs.get_codec(self.compressor)
  File "/home/jk/.local/lib/python3.10/site-packages/numcodecs/registry.py", line 42, in get_codec
    config = dict(config)

I'm wondering if the problem comes from calling get_codec(self.compressor) instead of get_codec(make_this_a_dict(self.compressor)).
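A rough sketch of what such a conversion could look like, not the actual zarr-python fix. make_this_a_dict is only the placeholder name from the sentence above, and FakeBlosc is a hypothetical stand-in class so the snippet runs without numcodecs installed; the one thing assumed about real numcodecs codecs is that they expose get_config(), which returns a dict including "id".

```python
from collections.abc import Mapping
from typing import Any


class FakeBlosc:
    """Hypothetical stand-in for a numcodecs codec instance (illustration only)."""

    codec_id = "blosc"

    def __init__(self, cname="zstd", clevel=7, shuffle=0, blocksize=0):
        self.cname = cname
        self.clevel = clevel
        self.shuffle = shuffle
        self.blocksize = blocksize

    def get_config(self) -> dict:
        # Mirrors the shape of the "compressor" entry in .zarray.
        return {
            "id": self.codec_id,
            "cname": self.cname,
            "clevel": self.clevel,
            "shuffle": self.shuffle,
            "blocksize": self.blocksize,
        }


def make_this_a_dict(compressor: Any) -> dict:
    """Return a codec config dict from either a mapping or a codec-like object."""
    if isinstance(compressor, Mapping):
        return dict(compressor)
    get_config = getattr(compressor, "get_config", None)
    if callable(get_config):
        return dict(get_config())
    raise TypeError(
        f"cannot derive a codec config from {type(compressor).__name__}"
    )


# Calling dict() on the instance fails the same way as in the traceback.
try:
    dict(FakeBlosc())
except TypeError as exc:
    error_message = str(exc)

# Both input forms normalise to the same config dict.
config_from_instance = make_this_a_dict(FakeBlosc())
config_from_dict = make_this_a_dict(
    {"id": "blosc", "cname": "zstd", "clevel": 7, "shuffle": 0, "blocksize": 0}
)

print(error_message)
print(config_from_instance == config_from_dict)
```

If self.compressor is already a codec instance at that point, another option might be to use it directly and skip get_codec altogether, but that depends on what _v2.py expects to hold there.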
