
Support single z-stack tif file for input #67

Merged 6 commits on May 3, 2024

Conversation

@matham (Contributor) commented Apr 10, 2024

codecov bot commented Apr 10, 2024

Codecov Report

Attention: Patch coverage is 76.92308%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 90.56%. Comparing base (17e5ebc) to head (befb748).
Report is 4 commits behind head on main.

❗ Current head befb748 differs from pull request most recent head ed40e38. Consider uploading reports for the commit ed40e38 to get more accurate results

Files Patch % Lines
brainglobe_utils/image_io/load.py 75.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #67      +/-   ##
==========================================
+ Coverage   89.07%   90.56%   +1.49%     
==========================================
  Files          35       35              
  Lines        1355     1389      +34     
==========================================
+ Hits         1207     1258      +51     
+ Misses        148      131      -17     

☔ View full report in Codecov by Sentry.

@K-Meech (Contributor) left a comment:


Looks good to me! I just put a few small comments below.
Ideally this would also have a corresponding test that calls get_size_image_from_file_paths with a single-file tif stack, similar to the one for getting size from a dir. Would you be happy to add this? Otherwise I can add a test after this is merged.

brainglobe_utils/image_io/load.py (outdated, resolved):
    # read just the metadata
    tiff = tifffile.TiffFile(file_path)
    if not len(tiff.series):
        raise ValueError(
(Contributor):

Ideally these would use ImageIOLoadException rather than ValueError. This would need some restructuring of ImageIOLoadException though - so happy to leave as-is for now, and I'll look into this once it's all merged.

@matham (Contributor Author):

Sounds good. I left it as is then.

@@ -683,6 +684,30 @@ def get_size_image_from_file_paths(file_path, file_extension="tif"):
        Dict of image sizes.
    """
    file_path = Path(file_path)
    if file_path.name.endswith(".tif") or file_path.name.endswith(".tiff"):
        # read just the metadata
        tiff = tifffile.TiffFile(file_path)
@K-Meech (Contributor) commented Apr 15, 2024:

Could you use with TiffFile(file_path) as tiff: for this section? This will ensure the connection is closed correctly, as their docs recommend.

@matham (Contributor Author):

Applied the change.
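For readers unfamiliar with the pattern, here is a minimal sketch of why the with-block form guarantees cleanup. The TiffFile class below is a hypothetical stand-in for tifffile.TiffFile, not the real library:

```python
class TiffFile:
    """Hypothetical stand-in for tifffile.TiffFile, showing that a
    `with` block always releases the file handle."""

    def __init__(self, path):
        self.path = path
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Called on normal exit AND when an exception propagates out.
        self.close()
        return False  # do not suppress exceptions

    def close(self):
        self.closed = True


tiff = TiffFile("image.tif")
with tiff:
    pass  # read metadata here
assert tiff.closed  # handle released without an explicit close()
```

The same guarantee holds if the body raises, which is the case a bare `TiffFile(...)` without `with` would miss.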

            f"Found {axes} axes with shape {shape}"
        )

    image_shape = {name: shape[i] for name, i in indices.items()}
(Contributor):

I think you could remove the indices = {ax: i for i, ax in enumerate(axes)} line above, and do this in one line with something like: image_shape = {ax: sh for ax, sh in zip(axes, shape)}

@matham (Contributor Author) commented Apr 22, 2024:

Good catch! I applied the change.
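The two variants produce identical results; a quick sketch with hypothetical axes and shape values:

```python
axes = "zyx"
shape = (4, 5, 6)

# Original: build an index map, then look up each axis size
indices = {ax: i for i, ax in enumerate(axes)}
image_shape_old = {name: shape[i] for name, i in indices.items()}

# Suggested: pair axis names with sizes directly
image_shape_new = {ax: sh for ax, sh in zip(axes, shape)}

assert image_shape_old == image_shape_new == {"z": 4, "y": 5, "x": 6}
```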

@matham (Contributor Author) commented Apr 22, 2024

I added a test as requested.

I had to modify the to_tiff function because by default it seems to set the axes to "qyx", so the test failed with the error:

ValueError: Attempted to load ...\pytest-12\test_image_size_tiff_stack0\image_size_tiff_stack.tif but didn't find a xyz-stack. Found qyx axes with shape (4, 4, 4).

My tiff files come from BigStitcher via Fiji, and so have the correct axes set. I'm not sure if that's common. If not, and typical imaging tiff files don't have the right axes set, we may have to use the orientation input parameter. Perhaps we should rely on that exclusively instead of testing the axes names as I did?

Also, perhaps get_size_image_from_file_paths is not the function brainreg should call to get the image metadata (the name is certainly not correct anymore, as it could be one path 😅), and instead there should be corresponding load_xxx_metadata functions and a load_any_metadata.
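The axes check under discussion can be sketched as follows (assuming an axes string like the one tifffile reports for a series; the helper name is_xyz_stack is made up for illustration):

```python
def is_xyz_stack(axes: str) -> bool:
    """Return True only if the axes are exactly x, y and z in some order."""
    return sorted(axes.lower()) == ["x", "y", "z"]


assert is_xyz_stack("ZYX")
assert is_xyz_stack("xyz")
assert not is_xyz_stack("QYX")   # the default axes seen in the failing test
assert not is_xyz_stack("ZCYX")  # extra channel axis also fails
```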

@adamltyson (Member):

> My tiff files come from BigStitcher via Fiji, and so have the correct axes set. I'm not sure if that's common. If not, and typical imaging tiff files don't have the right axes set, we may have to use the orientation input parameter. Perhaps we should rely on that exclusively instead of testing the axes names as I did?

I think in general it's rare that this type of data comes with correct metadata. There's a lot of homebuilt systems that output files without any metadata.

@K-Meech (Contributor) commented Apr 22, 2024

I'd agree that the metadata often isn't set correctly in tiff files. The issue here is that it needs to return the image shape in the style: {"x": x_shape, "y": y_shape, "z": z_shape}, so it needs some way to determine the input axis order. I'm not familiar with the orientation parameter - maybe @alessandrofelder has suggestions?

@matham (Contributor Author) commented Apr 22, 2024

> I'd agree that the metadata often isn't set correctly in tiff files.

Sounds like I should remove the check then, and not check the metadata for the order of axes.

> The issue here is that it needs to return the image shape in the style: {"x": x_shape, "y": y_shape, "z": z_shape}, so it needs some way to determine the input axis order. I'm not familiar with the orientation parameter

I meant this parameter that we provide to brainmapper: https://brainglobe.info/documentation/setting-up/image-definition.html.

However, if we're calling directly into cellfinder core through Python, this parameter is not present (assuming in the future we'll use these functions to load cellfinder data - moving cellfinder loading functions into this repo)!? Should the load_xxx functions take an input orientation parameter and re-order their output to match the order cellfinder internally expects?

cellfinder internally expects a certain axis order, but I don't think it's documented anywhere. In fact, it took me a while to find this, and it's still not quite clear to me why we do it: https://github.com/brainglobe/cellfinder/blob/e6a887f1721b89d51328214b108e0da5401a24a9/cellfinder/core/detect/filters/plane/plane_filter.py#L63. I do know that when I tried removing the transpose, the subsequent code in cell splitting failed.

@adamltyson (Member):

@alessandrofelder do you know why cellfinder seems to need a specific orientation? The orientation (particularly in plane) shouldn't matter at all. The orientation parameters should only be used for atlas registration.

@adamltyson (Member):

Does anybody know what we need to get this PR merged? I'm slightly confused about the axis thing. As far as I know, we don't use image metadata anywhere in BrainGlobe. In particular, we don't read x/y/z - they are pretty meaningless because everyone has a different idea about what they mean.

@alessandrofelder (Member):

I agree with @adamltyson 's assertions that AFAIK

  • we never read image metadata
  • orientation shouldn't matter in cellfinder.

> I do know that when I tried removing the transpose, the subsequent code in cell splitting failed.

It would be interesting to know how this failed 🤔 I don't see why removing the transpose would make any difference 😬

@alessandrofelder (Member):

> Does anybody know what we need to get this PR merged?

I'd suggest making the transpose question a new separate issue, and merging this?

@adamltyson (Member):

> I'd suggest making the transpose question a new separate issue, and merging this?

Does this affect cellfinder though?

@alessandrofelder (Member) commented May 1, 2024

> Does this affect cellfinder though?

Not sure I understand what you mean 🤔
We should reproduce and understand why not transposing the plane in the plane filter seems to break cellfinder (I currently don't), but that's independent of this PR, so it should be a different issue?

@adamltyson (Member):

I just meant: does this PR have any impact on cellfinder? I.e. is there a possibility that merging (and releasing) this causes cellfinder problems for anyone? Just a naive question based on the comments above.

If not, let's get it merged.

@alessandrofelder (Member):

OK, I follow now, thanks, and good point.
We should weaken the error-raising code added in this PR to a warning or a log message before releasing, to avoid people with bad metadata in their images getting stuck.
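A sketch of what that weakening could look like, using the standard-library warnings module (the helper name and message are hypothetical, not the code that was actually merged):

```python
import warnings


def check_axes(axes: str) -> None:
    """Warn, instead of raising, when tiff axis metadata is unexpected."""
    if sorted(axes.lower()) != ["x", "y", "z"]:
        warnings.warn(
            f"Expected an xyz stack but found axes {axes!r}; "
            "falling back to assuming z, y, x order.",
            UserWarning,
        )


# Bad metadata produces a warning but the load can continue.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_axes("QYX")

assert len(caught) == 1
assert issubclass(caught[0].category, UserWarning)
```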

@alessandrofelder (Member) commented May 3, 2024

@K-Meech @adamltyson in cases where we have a single tiff file containing a stack whose metadata is not set correctly, should we

  1. assume axis order x-y-z (Cartesian), or
  2. assume axis order z-y-x (numpy)

?

@adamltyson (Member):

Do we need to set anything? We don't use the metadata (we define it elsewhere).

@alessandrofelder (Member) commented May 3, 2024

The function get_size_image_from_file_paths returns a dict with keys "x", "y", and "z".

IIUC when we read a 3D tiff we can get the shape of the stack through tiff.series[0].shape which will return a 3-tuple.

If there is metadata that contains xyz in any order, we can use that information to decide which shape dimension we assign to each key for the dict we return.

If there is no axis metadata, or the axis metadata is not xyz in any order, we have to make a decision based only on the shape. My question is: what should that decision be?
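One way that decision could be sketched, assuming a z, y, x (numpy-style) fallback; the helper name shape_to_dict is hypothetical:

```python
def shape_to_dict(shape, axes=None):
    """Map a 3-tuple shape to {"x": ..., "y": ..., "z": ...}.

    Uses the axis metadata when it is exactly xyz in some order;
    otherwise falls back to assuming z, y, x (numpy) order.
    """
    if axes is not None and sorted(axes.lower()) == ["x", "y", "z"]:
        return {ax: sh for ax, sh in zip(axes.lower(), shape)}
    return dict(zip("zyx", shape))


# Trusted metadata is used as-is:
assert shape_to_dict((10, 20, 30), "ZYX") == {"z": 10, "y": 20, "x": 30}
assert shape_to_dict((10, 20, 30), "XYZ") == {"x": 10, "y": 20, "z": 30}
# Unusable metadata falls back to z, y, x:
assert shape_to_dict((10, 20, 30), "QYX") == {"z": 10, "y": 20, "x": 30}
assert shape_to_dict((10, 20, 30)) == {"z": 10, "y": 20, "x": 30}
```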

@adamltyson (Member):

Based on usage here and here I think we assume zyx as the default.

As an aside, I'm not sure what we should do with axis in BrainGlobe. It's currently not causing any (major) issues, but we have hundreds of references to axis order and they're not consistent.

I think we should go full numpy and just have axis_0, axis_1 and axis_2. Occasionally if needed we could have e.g. depth to illustrate some specific aspect of the data. However converting everything across 20 repos is going to take ages, for little gain.

Perhaps we should decide a convention, and aim to gradually adopt it?

@alessandrofelder (Member):

I will assume z,y,x ordering for the shape in that case, consistent with Fiji opening an example tif stack and the values for shape returned by tifffile:
[screenshot: Fiji and tifffile reporting the same z, y, x shape for an example tif stack]

@alessandrofelder (Member):

yep, let's gradually aim to go full numpy, with axis_0 as the "depth" and as the number of 2D tiff files in a folder (when we load that kind of data).

@alessandrofelder (Member) left a comment:

I think this is ready to be merged now - thanks @matham !
(I made a slight change to accommodate cases where 3D tiff axis metadata is not set)

@alessandrofelder (Member):

Merging as AFAICT @K-Meech 's requested changes have been addressed.

@alessandrofelder alessandrofelder merged commit 98b02a1 into brainglobe:main May 3, 2024
11 checks passed
@matham matham deleted the io branch May 3, 2024 17:51
@matham (Contributor Author) commented May 3, 2024

If there's a standard order, it should probably be documented prominently in the code/docs that this is the order used internally!? It took me a while to work out even the order that the input to detection expects. E.g. the docs in detect.py, for main, simply say:

signal_array : numpy.ndarray
        3D array representing the signal data.

But that is probably the first entry point for users who pass data, and from it the expected order isn't clear.

It also wasn't clear in the docs how this all interacts with the brainmapper orientation parameter. Maybe I should open an issue for this!?

@adamltyson (Member):

> If there's a standard order, it should probably be documented prominently in the code/docs that this is the order used internally!?

There is, but only sort of. Because cellfinder loads the data plane by plane, there is an assumption that one axis is lower resolution and slower to load. This, however, isn't necessary. It should probably be documented, though.

> It also wasn't clear in the docs how this all interacts with the brainmapper orientation parameter. Maybe I should open an issue for this!?

It doesn't interact at all, only in use via brainmapper. I'm not sure we should confuse the docs by adding in something that's unrelated?
