
Python image file formats #462

Draft
wants to merge 11 commits into
base: master

Conversation

grinic
Collaborator

@grinic grinic commented Mar 2, 2023

No description provided.

@tischi
Collaborator

tischi commented Mar 9, 2023

@grinic I would say don't worry about the xml/hdf5; I feel that should be removed from this module anyway and rather be discussed here.

@tischi
Collaborator

tischi commented Mar 29, 2023

@grinic @manerotoni @tibuch @sebgoti @bugraoezdemir @k-dominik

My current take on this is:

  1. Let's avoid putting all those complex bioformats dependencies into our napari env for the course
  2. Let's explain in this module how one would do this (aicsimageio, bioformats), but maybe not teach it to beginners at all and flag it as advanced material
  3. Let's convert all images that we need for teaching a beginners course to OME-Zarr and use that in all the python modules

This way we (teachers and students) learn something forward-looking and useful, namely how to read and write OME-Zarr, and we don't waste our time and energy fixing this file format mess.

As an alternative to (3), we could try to find one TIFF flavour that does the job (see #471), but right now, personally, I'd be more motivated to go for OME-Zarr.

It would be great to have your opinions, because I think we need to make a decision very soon.

@sebgoti
Collaborator

sebgoti commented Mar 29, 2023

Sounds good. The other day I tried creating the environment as described in the instructions and (1) it took very long and (2) it ended with an error :P

@grinic
Collaborator Author

grinic commented Mar 29, 2023

Hi all,
all points sound good to me. I agree we don't have to find a solution that works for all file formats.
I am not sure how to extract voxel size information from OME-ZARR format in Python.
But, following a suggestion by @k-dominik, I could read voxel size from a tif file with the following:

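import imageio.v3 as iio

image_file = 'path/to/image.tif'  # placeholder path to the tif file
props = iio.improps(image_file)   # standardized image properties
meta = iio.immeta(image_file)     # format-specific metadata
# extract voxel size in format ZYX: Z from the metadata, Y and X as reciprocals of props.spacing
voxelSize = [meta['spacing'], 1./props.spacing[1], 1./props.spacing[0]]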
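
To spell out the arithmetic in that last line, here is a minimal sketch of the conversion as a stand-alone helper. It assumes, as the reciprocals above imply, that the two in-plane values are resolutions in pixels per physical unit, ordered (X, Y), and that the Z spacing is already a physical distance. The function name `voxel_size_zyx` and the numbers are made up for illustration.

```python
def voxel_size_zyx(z_spacing, xy_resolution):
    """Return the voxel size as [Z, Y, X] in physical units.

    z_spacing:     distance between planes (assumed to be in physical units already)
    xy_resolution: (x, y) resolution in pixels per physical unit (assumption)
    """
    x_res, y_res = xy_resolution
    if x_res <= 0 or y_res <= 0:
        raise ValueError("resolution must be positive")
    # pixels per unit -> units per pixel
    return [float(z_spacing), 1.0 / y_res, 1.0 / x_res]


# Made-up numbers: 4 px/um in X, 5 px/um in Y, planes 2 um apart
example = voxel_size_zyx(2.0, (4.0, 5.0))
```

If the file stores the in-plane values as pixel sizes rather than resolutions, the reciprocal must be dropped, so it is worth checking one image with known calibration first.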

@tischi
Collaborator

tischi commented Mar 29, 2023

I am not sure how to extract voxel size information from OME-ZARR format in Python.

This is what we have to learn.
@sebgoti @bugraoezdemir do you know?

@sebgoti
Collaborator

sebgoti commented Mar 29, 2023

I am no expert in OME-ZARR, so I would need to do some digging. @grinic, your suggestion works for reading TIFF, but after going through imageio's documentation I could not find a way to save images with that package while preserving the metadata. I think there is a more cumbersome way of doing it, similar to this discussion.

@bugraoezdemir
Collaborator

bugraoezdemir commented Mar 29, 2023

For me, the easiest way would be something like:

import json
import os

rootpath = 'path/to/OME-Zarr'  # if OME-Zarr was created with the series hierarchy, rootpath should be 'path/to/OME-Zarr/0'
multisclpath = os.path.join(rootpath, '.zattrs')  # multiscales metadata of the image
zarraypath = os.path.join(rootpath, '0', '.zarray')  # full resolution level

with open(zarraypath) as f:
    zarray = json.load(f)

with open(multisclpath) as f:
    multiscl = json.load(f)['multiscales'][0]

shape = zarray['shape']
axes = multiscl['axes']
scale = multiscl['datasets'][0]['coordinateTransformations'][0]['scale']  # the scale for the full resolution level
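
The same lookup could be wrapped into a small function that pairs each axis name with its scale value, which makes the voxel size easier to read off. This is only a sketch: it assumes the metadata layout used above (a `.zattrs` file with a `multiscales` entry whose axes are dicts with a `name` key), and the function name `read_ome_zarr_scale` is hypothetical.

```python
import json
import os


def read_ome_zarr_scale(rootpath, level=0):
    """Return {axis name: scale} for one resolution level of an OME-Zarr image."""
    with open(os.path.join(rootpath, ".zattrs")) as f:
        multiscale = json.load(f)["multiscales"][0]

    # Assumption: each axis is a dict such as {"name": "z", "type": "space", ...}
    axis_names = [axis["name"] for axis in multiscale["axes"]]

    # Pick the first transformation of type "scale" instead of relying on index 0
    transforms = multiscale["datasets"][level]["coordinateTransformations"]
    scale = next(t["scale"] for t in transforms if t["type"] == "scale")

    return dict(zip(axis_names, scale))
```

For a ZYX image, `read_ome_zarr_scale('path/to/OME-Zarr')` would return a dict with the keys `z`, `y` and `x`, so the order of the axes no longer has to be remembered.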

@bugraoezdemir
Collaborator

One can also use the zarr library, of course.

@bugraoezdemir
Collaborator

The same with the zarr library:

import zarr

rootpath = 'path/to/OME-Zarr' # if the OME-Zarr was created with series hierarchy, rootpath should be 'path/to/OME-Zarr/0'
rootgr = zarr.open_group(rootpath, mode='r') # open read-only; zarr.group() is meant for creating groups
multiscale_zattrs = rootgr.attrs['multiscales'][0]

shape = rootgr['0'].shape # the shape of the full resolution array
axes = multiscale_zattrs['axes']
scale = multiscale_zattrs['datasets'][0]['coordinateTransformations'][0]['scale'] # the scale for the full resolution

I think the json way is cheaper for this purpose.
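
Either way, the `multiscales` entry ends up as a plain dict, so the scale of every resolution level (not only the full resolution one) can be listed without touching the pixel data. A minimal sketch, assuming the same metadata layout as above; the function name `scales_per_level` and the example metadata are made up and not taken from a real file.

```python
def scales_per_level(multiscale):
    """Map each dataset path (e.g. '0', '1', ...) to its list of scale factors."""
    result = {}
    for dataset in multiscale["datasets"]:
        for transform in dataset.get("coordinateTransformations", []):
            if transform.get("type") == "scale":
                result[dataset["path"]] = transform["scale"]
                break  # use the first scale transformation of this level
    return result


# Hand-written metadata for illustration only
example_multiscale = {
    "axes": [{"name": "z"}, {"name": "y"}, {"name": "x"}],
    "datasets": [
        {"path": "0", "coordinateTransformations": [{"type": "scale", "scale": [2.0, 0.5, 0.5]}]},
        {"path": "1", "coordinateTransformations": [{"type": "scale", "scale": [2.0, 1.0, 1.0]}]},
    ],
}
example_scales = scales_per_level(example_multiscale)
```

The function accepts the dict from the json snippet (`multiscl`) as well as the one from the zarr snippet (`multiscale_zattrs`).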

@manerotoni
Collaborator

Hello,
I am a little reluctant about using pure ome-zarr for the material. Not many people are familiar with it, and it is unclear whether it is the one ring to rule them all. Does ome-zarr now open without problems in ImageJ? Can I open ome-zarr in Imaris, CellProfiler, etc.?
I would leave some kind of common tif format with metadata to be on the safe side. We only need the pixel dimensions, anything else is not needed.

I think one important workflow is to get the metadata (and possibly convert the data) from IJ and enter it in your Python script.

Antonio
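
As a rough sketch of that workflow: the voxel size is read off in ImageJ (for instance in the Image > Properties dialog) and typed into the script as a constant, which is then used to convert voxel coordinates into physical distances. The numbers, the constant name and the function name below are made up for illustration.

```python
import math

# Typed in by hand after looking up the calibration in ImageJ (made-up values)
VOXEL_SIZE_ZYX_UM = (2.0, 0.5, 0.5)


def physical_distance(p, q, voxel_size=VOXEL_SIZE_ZYX_UM):
    """Euclidean distance between two ZYX voxel coordinates, in micrometres."""
    return math.sqrt(sum(((a - b) * s) ** 2 for a, b, s in zip(p, q, voxel_size)))


# Two points in the same plane, 3 pixels apart in Y and 4 pixels apart in X
d = physical_distance((0, 0, 0), (0, 3, 4))
```

This keeps the course material independent of any metadata reader, at the cost of one manual step per image.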

@tischi
Collaborator

tischi commented Mar 29, 2023

I would leave some kind of common tif format with metadata to be on the safe side. We only need the pixel dimensions, anything else is not needed.

I agree, but @manerotoni do you know whether this is possible??!! We started experimenting with OME-TIFF and that does not even behave consistently within Java, so that is out.

@grinic is now looking into it... (let's collect progress here: #471)

@manerotoni
Collaborator

If, after so many years, the community and the companies have not agreed on a format for the metadata, I doubt that ome-zarr will change this. Bio-Formats helped a lot, but the problem still remains; it was just hidden by Bio-Formats.

I agree that we should make sure that the data used in the python course is consistent in order to avoid problems. I would go for a specification of tif and ome-zarr.
That the whole field is complicated could be a nice point for discussion with students, for instance by providing examples that may not work, or where they have to dig out the metadata, install additional software, etc.

My experience after working more with Python is that I first convert proprietary data to a more convenient format.

@tischi
Collaborator

tischi commented Mar 31, 2023

@grinic we changed the image_file_formats module quite a lot, so you may now have issues merging your branch into master. You can also just give me the Python file and I can add it for you.

@grinic
Collaborator Author

grinic commented Mar 31, 2023

Ok, here are the python files that were in that branch:
image_file_formats.zip

The additional packages used are aicsimageio and BioformatsReader


6 participants