PR #1705 comments out computing the zarr channel "window" metadata as part of the sink. The existing code can use arbitrarily large amounts of memory because it loads an entire channel and computes np.min and np.max on it. In a large dataset (a real customer use case), this tried to load ~200 GB per channel, which caused an OOM kill.
Since this was a work-interrupting bug, the PR just comments out the code so that a release could be made. If this value should still be computed, it either needs to be collected as tiles are added (tricky, since overwriting previously written pixels can cause the true min and max to contract, invalidating a running min/max) or we need to use a tile and frame iterator to walk the data in chunks.
large_image doesn't use these "window" values anywhere; we could also delete the code or make it an option (and the option would need to use a tile iterator).
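A minimal sketch of the tile-iterator approach, assuming the finished sink can be reopened as a large_image tile source and that channels map to frames via the frames' `IndexC` metadata (the path and that mapping are assumptions here, not something the PR specifies):

```python
import numpy as np
import large_image
from large_image.constants import TILE_FORMAT_NUMPY

# Hypothetical path to the written zarr output.
source = large_image.open('sink_output.zarr')
frames = source.getMetadata().get('frames') or [{}]

window = {}  # channel index -> (min, max)
for frameIdx, frame in enumerate(frames):
    channel = frame.get('IndexC', 0)
    lo, hi = window.get(channel, (np.inf, -np.inf))
    # Walk the frame tile by tile so only one chunk is resident at a time.
    for tile in source.tileIterator(frame=frameIdx, format=TILE_FORMAT_NUMPY):
        data = tile['tile']
        lo = min(lo, float(data.min()))
        hi = max(hi, float(data.max()))
    window[channel] = (lo, hi)

# window[c] could then populate the omero "window" metadata for channel c.
```

Memory stays bounded by a single tile rather than a whole channel, at the cost of re-reading the data after the sink is written; if this becomes an opt-in feature, the same loop could run inside the option's code path.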