
How to recover previous zarr store if append_dim fails (e.g. during cluster computation) but metadata has already been written? #8956


Hi @harryC-space-intelligence!

I understand exactly what you're facing here. This is a challenge many of us hit as we scale out our Zarr data processing workflows. In distributed pipelines beyond a certain size, some failures are inevitable for any number of reasons, and a failure partway through a write can leave the dataset in a corrupted state.
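For the specific failure mode in the question title (the append along `append_dim` enlarged the array's metadata, but some chunks never landed), one hedged recovery sketch is to roll the array's logical shape back to its pre-append size. This assumes a Zarr v2 directory store where each array's shape lives in a `.zarray` JSON file; the function name `rollback_append` and the sizes below are illustrative, not part of any Zarr API:

```python
import json
import tempfile
from pathlib import Path

def rollback_append(array_dir: str, old_len: int, axis: int = 0) -> None:
    """Reset the array's logical length along `axis` to its pre-append value
    by rewriting the Zarr v2 .zarray metadata. (Illustrative helper.)"""
    meta_path = Path(array_dir) / ".zarray"
    meta = json.loads(meta_path.read_text())
    meta["shape"][axis] = old_len
    meta_path.write_text(json.dumps(meta, indent=2))
    # Stale chunk files beyond old_len are then ignored by readers; if the
    # store uses consolidated metadata (.zmetadata), regenerate it afterwards.

# Example: a failed append grew the first axis from 100 to 150 rows.
with tempfile.TemporaryDirectory() as d:
    (Path(d) / ".zarray").write_text(
        json.dumps({"shape": [150, 10], "chunks": [50, 10]})
    )
    rollback_append(d, old_len=100)
    print(json.loads((Path(d) / ".zarray").read_text())["shape"])  # [100, 10]
```

This only undoes the shape change; any chunks the failed job did manage to overwrite inside the old extent cannot be recovered this way.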

One solution is to stage all of the changes locally first, either in memory or on disk, and then copy them up in bulk once complete. That works for small updates, but, as you noted, it is neither efficient nor compatible with a large distributed pipeline.
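The stage-then-copy pattern above can be sketched with only the standard library. This is a minimal illustration, not Zarr-specific code: `stage_and_publish` is a hypothetical helper, and in practice the staging write would be something like `ds.to_zarr(staging_path)` and the bulk copy an upload to object storage:

```python
import shutil
import tempfile
from pathlib import Path

def stage_and_publish(write_fn, dest: str) -> None:
    """Run write_fn against a local staging directory; copy the result to
    dest in bulk only if write_fn succeeds. If it raises, dest is untouched."""
    with tempfile.TemporaryDirectory() as staging:
        write_fn(staging)  # may fail; nothing has touched dest yet
        shutil.copytree(staging, dest, dirs_exist_ok=True)  # bulk "upload"

# Example: write one fake chunk file into staging, then publish it.
with tempfile.TemporaryDirectory() as root:
    dest = str(Path(root) / "store.zarr")
    stage_and_publish(lambda d: (Path(d) / "0.0").write_bytes(b"chunk"), dest)
    print((Path(dest) / "0.0").read_bytes())  # b'chunk'
```

The point of the pattern is that the destination store only ever sees a completed update, which is exactly what makes it hard to parallelize across many workers.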

It's important to recognize that what you are asking for is a common feature of actual database systems: what …

Answer selected by harryC-space-intelligence