idea: point on-disk representation using OME-NGFF #789

Open
keller-mark opened this issue Nov 14, 2024 · 0 comments

keller-mark commented Nov 14, 2024

I made this diagram to try to illustrate the point storage idea I was attempting to explain on Wednesday at the Basel Hackathon.

The idea would be to use OME-NGFF for spatially-arranged point storage.

Diagram corresponding to a single chunk in space 0.0.0 (z,y,x)

diagram

Notes

One assumption here is that the Zarr array chunks will not be reshaped frequently. Under that assumption, X-Y coordinates can be stored as offsets relative to the chunk edges using a small dtype (e.g., uint8 for 256x256 chunks), which would make them very efficient to load (e.g., over a network) for a large image. Another assumption is that Zarr's built-in compression will largely negate the on-disk cost of the null-value padding in chunks that are not completely filled (and that there could be a mechanism to annotate, somewhere, how far into each chunk the non-null values extend).
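To make the two assumptions concrete, here is a minimal sketch in plain Python (no Zarr involved). It assumes 256x256 chunks, integer (y, x) point coordinates, and a fixed number of point slots per chunk. The names `encode_points`, `decode_points`, `pad_chunk` and the `CAPACITY` value are hypothetical and only for illustration, and `zlib` stands in for whatever compressor the Zarr array would actually be configured with.

```python
import zlib
from collections import defaultdict

CHUNK = 256      # assumed chunk edge length (y and x), so offsets fit in uint8
CAPACITY = 1024  # hypothetical fixed number of point slots per chunk


def encode_points(points, chunk=CHUNK):
    """Group integer (y, x) points by chunk index, keeping only in-chunk offsets."""
    buffers = defaultdict(bytearray)
    for y, x in points:
        cy, oy = divmod(y, chunk)  # chunk index and offset from the chunk edge
        cx, ox = divmod(x, chunk)
        buffers[(cy, cx)] += bytes((oy, ox))  # one byte per offset
    return {key: bytes(buf) for key, buf in buffers.items()}


def decode_points(encoded, chunk=CHUNK):
    """Rebuild absolute (y, x) coordinates from chunk indices and offsets."""
    points = []
    for (cy, cx), buf in encoded.items():
        for i in range(0, len(buf), 2):
            points.append((cy * chunk + buf[i], cx * chunk + buf[i + 1]))
    return points


def pad_chunk(buf, capacity=CAPACITY):
    """Zero-pad to a fixed size; return (padded bytes, number of filled slots)."""
    n_points = len(buf) // 2
    return buf + bytes(2 * (capacity - n_points)), n_points


points = [(3, 250), (255, 0), (256, 511), (700, 42)]
encoded = encode_points(points)
assert sorted(decode_points(encoded)) == sorted(points)  # lossless round trip

# A mostly empty chunk: the zero padding should compress to very little.
padded, n_filled = pad_chunk(encoded[(0, 0)])
compressed = zlib.compress(padded)
print(len(padded), len(compressed), n_filled)
```

Because an offset of 0 is also a valid coordinate, the padding cannot be told apart from real points by value alone. That is why the sketch returns `n_filled` alongside the padded buffer, mirroring the "how far into the chunk is non-null" annotation mentioned above.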

With a MERFISH dataset, one idea would be for the point NGFF image to have X and Y dimensions at least as large as those of the underlying microscope image that the points originated from (i.e., the image prior to point detection). Point x/y coordinates could then be stored as integers without loss of information. This does not necessarily need to be the case, though.
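As a small illustration of why the grid size matters, the sketch below round-trips coordinates through a grid at full resolution and at half resolution. It assumes the detected points sit at integer pixel positions of the source image. The helpers `to_grid` and `from_grid` and the `scale` parameter are hypothetical names, not part of OME-NGFF or any library.

```python
def to_grid(coord, scale):
    """Map a coordinate in source-image pixels to an integer grid index."""
    return int(coord // scale)


def from_grid(index, scale):
    """Map a grid index back to source-image pixel units."""
    return index * scale


xs = [0, 1, 2, 3, 1023]

# Grid as large as the source image (scale 1): integers survive unchanged.
full = [from_grid(to_grid(x, 1), 1) for x in xs]
assert full == xs

# Grid half the size (scale 2): odd coordinates collapse onto their neighbours.
half = [from_grid(to_grid(x, 2), 2) for x in xs]
assert half == [0, 0, 2, 2, 1022]
```

With a coarser grid, the lost precision would have to be stored some other way (or accepted), which is the trade-off behind "does not necessarily need to be the case".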

This is currently just an idea and could be benchmarked against conceptually simpler formats to check whether there is a performance benefit to justify it. I am not sure whether there would be drawbacks or benefits for operations such as querying.
