Tiles are missing when applying zappend to zarr files stored in s3 #104

Open
konstntokas opened this issue Oct 28, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@konstntokas

Describe the bug
zappend was used to append multiple Zarr datasets along the time axis. In the resulting cube, seemingly random blocks are filled with NaN values. When the run is repeated, different blocks are missing each time.
Screenshot from 2024-10-28 09-12-43

To Reproduce
A data ID looks like cubes/aux/era5_small/2016_11.zarr.
Screenshot from 2024-10-28 09-15-25

The following zappend config was used.

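import os

from xarray import open_zarr  # assumption: open_zarr is xarray's function
from zappend.api import zappend

config = {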
    "target_dir": f"s3://{os.environ['S3_USER_STORAGE_BUCKET']}/{data_id_era5}",
    "target_storage_options": {
        "key": os.environ["S3_USER_STORAGE_KEY"],
        "secret": os.environ["S3_USER_STORAGE_SECRET"],
    },
    "slice_storage_options": {
        "key": os.environ["S3_USER_STORAGE_KEY"],
        "secret": os.environ["S3_USER_STORAGE_SECRET"],
    },
    "force_new": True,
    "disable_rollback": True,
    "logging": "DEBUG",
}



zappend(
    (
        open_zarr(f"s3://{os.environ['S3_USER_STORAGE_BUCKET']}/{data_id}")
        for data_id in data_ids[3]
    ),
    config=config,
)

Expected behavior
zappend should simply append the data cubes along the given dimension, without errors and without missing blocks.

Python Environment

  • operating system: DeepESDL (user environment built with conda store in DeepESDL)
  • zappend version, output of zappend --version: 0.8.0
konstntokas added the bug label Oct 28, 2024
@forman
Member

forman commented Nov 11, 2024

I don't think this is actually a bug in the zappend code: zappend does not deal with individual chunks on its own. That is left to the preprocessing chain and the potential use of zarr / dask.
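
One way to narrow this down at the chunk level might be to compare the chunk objects actually present in the target store with the ones a fully written array should have. The sketch below assumes the Zarr v2 layout with the default "." separator in chunk keys. The helper names and the variable name `t2m` are hypothetical and not part of zappend. The set of existing keys would have to come from listing the S3 prefix of the array. Depending on the zarr settings, chunks that contain only the fill value may legitimately not be stored, so a missing key is a hint rather than proof of a failed write.

```python
import itertools
import math


def expected_chunk_keys(array_path, shape, chunks, separator="."):
    """Return the chunk keys a fully written Zarr v2 array would have."""
    # Number of chunks along each dimension (the last chunk may be partial).
    grid = [math.ceil(size / chunk) for size, chunk in zip(shape, chunks)]
    keys = set()
    for index in itertools.product(*(range(n) for n in grid)):
        keys.add(f"{array_path}/" + separator.join(str(i) for i in index))
    return keys


def missing_chunk_keys(array_path, shape, chunks, existing_keys, separator="."):
    """Return the sorted chunk keys that are expected but not in the store."""
    expected = expected_chunk_keys(array_path, shape, chunks, separator)
    return sorted(expected - set(existing_keys))


# Hypothetical example: a (4, 4, 4) array with (2, 2, 4) chunks has a
# 2 x 2 x 1 chunk grid, of which only two chunks exist in the store.
existing = {"t2m/.zarray", "t2m/0.0.0", "t2m/1.1.0"}
print(missing_chunk_keys("t2m", (4, 4, 4), (2, 2, 4), existing))
# prints ['t2m/0.1.0', 't2m/1.0.0']
```

If the keys reported here differ from run to run, that would point to chunk writes getting lost somewhere between dask / zarr and S3 rather than to the append logic itself.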
