Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Regression handling RGB in XC modality #4382

Open
fedorov opened this issue Sep 18, 2024 · 83 comments
Open

[Bug] Regression handling RGB in XC modality #4382

fedorov opened this issue Sep 18, 2024 · 83 comments
Assignees
Labels
Awaiting Reproduction Can we reproduce the reported bug? IDC:priority Items that the Imaging Data Commons wants to help sponsor

Comments

@fedorov
Copy link
Member

fedorov commented Sep 18, 2024

Describe the Bug

This sample is no longer rendered correctly: https://viewer.imaging.datacommons.cancer.gov/v3/viewer/?StudyInstanceUIDs=1.3.6.1.4.1.5962.1.2.0.1677425356.1733

image

For comparison, see how it used to work here: #3354 (comment):

image

To download the files in that study (note - it is ~233GB!), assuming you have python and pip on your system, do the following:

pip install --upgrade idc-index
idc download 1.3.6.1.4.1.5962.1.2.0.1677425356.1733

Steps to Reproduce

Load the sample study referenced in the above into the viewer.

The current behavior

First series shows grayscale with multiple slices arranged in a mosaic. Second does not load at all.

The expected behavior

Both should show up in color as used to work in the past given the screenshot in #3354 (comment)

OS

macOS

Node version

n/a

Browser

Chrome

@fedorov fedorov added IDC:priority Items that the Imaging Data Commons wants to help sponsor Awaiting Reproduction Can we reproduce the reported bug? labels Sep 18, 2024
@sedghi
Copy link
Member

sedghi commented Sep 18, 2024

For instance, if I want to debug this specific study without downloading it, how can I do that in my local OHIF since you said the DICOMweb server is public?

CC @igoroctaviano

@fedorov
Copy link
Member Author

fedorov commented Sep 18, 2024

@sedghi I am not an OHIF developer, so I cannot answer how to do it in OHIF configuration (@igoroctaviano can), but IDC public DICOMweb endpoint is this: https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb (documented in https://learn.canceridc.dev/portal/proxy-policy). Note that the "no downloads" wording in the URL is legacy, and we do not restrict that DICOMweb endpoint for downloads - as discussed in the link above.

@fedorov
Copy link
Member Author

fedorov commented Sep 26, 2024

DICOM PlanarConfiguration will indicate whether color image is ordered as RGBRGB or RRGGBB. Need to investigate.

@fedorov
Copy link
Member Author

fedorov commented Oct 22, 2024

@dclunie I checked, and for Visible Human, we have both options for PlanarConfiguration. Neither one works.

SELECT distinct(PatientID), StudyInstanceUID, SeriesInstanceUID, PlanarConfiguration 
FROM `bigquery-public-data.idc_current.dicom_all` 
where Modality = "XC"
PatientID	StudyInstanceUID	SeriesInstanceUID	PlanarConfiguration
VHP-M	1.3.6.1.4.1.5962.1.2.0.1672334394.26545	1.3.6.1.4.1.5962.1.3.0.2.1672334394.26545	0
VHP-F	1.3.6.1.4.1.5962.1.2.0.1677425356.1733	1.3.6.1.4.1.5962.1.3.0.1.1677425356.1733	1
VHP-M	1.3.6.1.4.1.5962.1.2.0.1672334394.26545	1.3.6.1.4.1.5962.1.3.0.1.1672334394.26545	1
VHP-F	1.3.6.1.4.1.5962.1.2.0.1677425356.1733	1.3.6.1.4.1.5962.1.3.0.2.1677425356.1733	0

At the same time, I checked the following 2 US series, with different options

1 | LiverUS-04 | 1.3.6.1.4.1.14519.5.2.1.1188.2803.279513797246517539833297938175 | 1.3.6.1.4.1.14519.5.2.1.1188.2803.240884965319239940503812066809 | 0 |  
2 | C3N-02436 | 1.3.6.1.4.1.14519.5.2.1.7085.2626.149644418574806774239324529564 | 1.3.6.1.4.1.14519.5.2.1.7085.2626.252950437956329338256830139567 |  

and both work fine:

https://viewer.imaging.datacommons.cancer.gov/v3/viewer/?StudyInstanceUIDs=1.3.6.1.4.1.14519.5.2.1.1188.2803.279513797246517539833297938175&SeriesInstanceUIDs=1.3.6.1.4.1.14519.5.2.1.1188.2803.240884965319239940503812066809

https://viewer.imaging.datacommons.cancer.gov/v3/viewer/?StudyInstanceUIDs=1.3.6.1.4.1.14519.5.2.1.7085.2626.149644418574806774239324529564&SeriesInstanceUIDs=1.3.6.1.4.1.14519.5.2.1.7085.2626.252950437956329338256830139567

So it must be something else, or combination with something else...

@pedrokohler pedrokohler self-assigned this Oct 22, 2024
@dclunie
Copy link

dclunie commented Oct 23, 2024

Must be something else then. I get the impression that color image handling in OHIF is extremely fragile (whether compressed or uncompressed).

@pedrokohler
Copy link
Collaborator

@dclunie what could be causing the imagine not to show as a single image in current v3 version, but rather a bunch of images, while it's shown as a single image in v2?

@igoroctaviano
Copy link
Contributor

@dclunie what could be causing the imagine not to show as a single image in current v3 version, but rather a bunch of images, while it's shown as a single image in v2?

@pedrokohler, this is typically related to a decoding issue. I'd suggest rolling back to commits before any recent RGB/decoding changes, testing them one by one.

@sedghi, it might be a good idea to open an issue to add snapshot testing in CS3D to help prevent these kinds of regressions.

Here’s a similar issue we encountered previously: cornerstonejs/cornerstone3D#829

@sedghi
Copy link
Member

sedghi commented Oct 23, 2024

There's a widespread problem with DICOM color images. Here's the core issue:

  • When dicom servers sends DICOM color images, they can convert the pixel data from one color space (like YBR422) to another (like RGB) as we know, however, these servers frequently don't update the metadata to reflect this conversion. So you end up with a mismatch - the metadata might say "YBR422" even though the actual pixel data has been converted to RGB.
  • Apparently the DICOM standard suggests that the Photometric Interpretation field should reflect the format of the stored data (am I right @dclunie ?). But this raises a question - what should happen when the data gets converted?
  • Due to this widespread inconsistency, we can't reliably ONLY trust the Photometric Interpretation value in the metadata, we should look at the pixelData too. To work around this, we developed a solution that tries to detect the actual pixel data format by analyzing the values themselves (implemented here). While this approach improved our ultrasound image support and made things more consistent, we're still encountering some issues with it as you see in this issue

Take a look at these discussions for instance

https://discourse.orthanc-server.org/t/orthanc-convert-ybr-to-rgb-but-does-not-change-metadata/3533/3

I need help fixing the color settings permanently.

@fedorov
Copy link
Member Author

fedorov commented Oct 23, 2024

@sedghi if you can provide a reproducible example demonstrating DICOM non-compliance of Google Healthcare API DICOMweb implementation, we can escalate through our collaborators at Google to try to get it fixed. I do not know enough about this issue to reproduce it myself.

@sedghi
Copy link
Member

sedghi commented Oct 23, 2024

Can @dclunie clarify the DICOM standard's requirements for Photometric Interpretation? Should we rely on that? since we are seeing inconsistencies with that tag and what actually the server responds in pixelData

@pedrokohler
Copy link
Collaborator

@sedghi @fedorov after lots of testing, I found that this PR was the responsible for making it fail: https://github.com/cornerstonejs/cornerstone3D/pull/829/files

I'll investigate further to see what exactly is happening there

@fedorov
Copy link
Member Author

fedorov commented Oct 23, 2024

@pedrokohler thank you, but let's wait to hear from @dclunie about the question asked by Alireza first!

@dclunie
Copy link

dclunie commented Oct 23, 2024

Very long ... sorry ...

Here are few points to consider wrt. what the standard says about this and how it relates to @sedghi's description of the recent changes, and what might be needed to make this more robust going forwards:

  • when an image is encoded in an uncompressed transfer syntax, all of the DICOM attributes in the Image Pixel Module need to accurately describe the binary contents of the uncompressed PixelData attribute, including PhotometricInterpretation - e.g., if the pixels are RGB, then that's what it needs to say, if they really are YBR (e.g., were encoded that way be the ultrasound machine) then again, that's what PhotometricInterpretation needs to say (usually YBR_FULL); see "Native" format in PS3.5 and Photometric Interpretation in PS3.3.
  • if an image is decompressed, whether from a lossless or lossy transfer syntax, and re-encoded in DICOM, obviously the decompressed file is required to have the correct description of the decompressed PixelData in the Image Pixel Module attributes - the classic example being when a JPEG image in YBR_FULL_422 is color component converted into RGB during decompression; what the standard requires in this respect is obvious; problems arise when the encoder of the decompressed image fails to set Photometric Interpretation correctly - viewers may try to apply heuristics to work around this (e.g., it's YBR_FULL_422 but decompressed so its probably wrong so I will assume RGB) but like any heuristics, they may fail, as we are apparently seeing here
  • if an image is compressed, whether losslessly or lossy, then the image pixel data characteristics (including the meaning of the color components +/- downsampling of the chrominance channels) may or may not be expressed in metadata within the compressed bit stream, but if it is, that is always to be used "to control the decoding"; that said, the DICOM Image Pixel Module attributes are still required to be "consistent with the characteristics of the compressed data stream" (even though in practice they may sometimes not be); for lossy baseline JPEG, the compressed bitstream might not describe the color components, so DICOM Photometric Interpretation of YBR_FULL_422 is really important to signal this, and RGB for compressed data is intended to indicated that the color components have NOT been transformed (as is the case with Leica/Aperio WSI, for example); the presence of a JFIF header is not required in the compressed bitstream, but if present, may signal YCbCr components rather than untransformed RGB components

From an application implementation perspective this can be even more challenging depending on what the underlying toolkits and libraries and codecs do and how much control there is over what they do (e.g., a lossy baseline JPEG decoder might always attempt YCbCr to RGB conversion even if they really were RGB, or always do it if a JFIF header is present, or check for the Adobe marker segment that signals RGB, or whatever).

To restate the obvious, the only way to mitigate against unexpected code changes causing failures is to test - have a good range of test inputs representing as many known good (and bad) variants as possible, and a decoded result to expect, and automate the comparison in regression tests. For properly encoded DICOM images (uncompressed, compressed and decompressed/re-encoded) there should be no question about the correct result. For "bad" images (e.g., RGB Photometric Interpretation for a YCbCr encoded lossy JPEG bitstream signaled by a JFIF header, or a decompressed JPEG that still has YBR_FULL_422 for the Photometric Interpretation) some heuristics may be implemented to work around this and need to be tested too but the heuristic should never prevent the correct handling of a correct image.

In the case of the Cornerstone PR #829 that was mentioned as responsible for the problem, there is a new function isColorConversionRequired() that is supposed to work around an [Orthanc server bug "Orthanc convert YBR to RGB but does not change metadata"](https://discourse.orthanc-server.org/t/orthanc-convert-ybr-to-rgb-but-does-not-change-metadata/3533/17), and which @sedghi also referred to. Unfortunately it seems that insufficient testing was done to assure that this workaround did not undermine handling of "correct" images in which the codec had handled the color component transformation (and chrominance channel upsampling; repeating it would screw things up, if that is what is actually happening.

The heuristic in isColorConversionRequired() seems to be trying to detect uncompressed chrominance downsampling based on the downsampling description in Photometric Interpretation matching the decompressed pixel data length, which is a whole other thing (and as @sedghi described, is often related to funky ultrasound images; some uncompressed or decompressed RLE ultrasound images really do have uncompressed color component downsampling). I don't actually see how the function (and its consequences if it returns true) actually specifically target the Orthanc defect anyway, other than by assuming that if it says YBR_FULL_422 but the decompressed pixel data length is not downsampled (i.e., it just equals columns * rows * samplesperpixel * bytespersample; I think the imageFrame must already exclude the padding byte to even judging by other code therein).

However, since our problem images should be Photometric Interpretation RGB and uncompressed, I am not sure how this PR is causing our images to fail, since isColorConversionRequired() should return false by my reading of it.

I assume there is no issue with the DICOMweb side of things, in that the image requested was returned uncompressed. If it had been requested with an Accept header of image/jpeg rather than application/octet-stream, and the server has lossy JPEG compressed them to satisfy the frame request (perhaps even retrieve rendered), and of course while doing that would very likely color space transform them, with no explicit means of signaling that (except perhaps a JFIF header included in the compressed bitstream). See also the warning about this sort of thing in the description in PS3.18. I.e., the metadata response and the bulk pixel data response would be inconsistent in such cases wrt. the characteristics of the pixel data. This is purely speculative on my part since I have now idea how OHIF is retrieving these images, or under what circumstances the routines in the PR get invoked.

I also see a bunch of stuff in the PR about Palette Color Lookup Tables, but that should not be relevant to us since they won't be in the problem image.

@fedorov
Copy link
Member Author

fedorov commented Oct 24, 2024

@pedrokohler you should also use samples mentioned in #2934! Also samples from DICOM FTP server: ftp://medical.nema.org/MEDICAL/Dicom/DataSets/WG12/.

@pedrokohler
Copy link
Collaborator

@fedorov @dclunie the problem is the isColorConversionRequired function of the PR I sent above.

Once I force it to return true for the dataset above, it renders the image correctly.

@dclunie what else could be checked inside this function so that we would be able to tell that this image requires conversion? For instance, I could check to see if the modality is XC and always return true in this case - note that I don't know if this makes any sense, I'm just giving an example.

@dclunie
Copy link

dclunie commented Oct 25, 2024

Once I force it to return true for the dataset above, it renders the image correctly.

Very interesting observation.

It makes little sense though, since this really is an RGB image, so by definition should not require "color conversion".

I can only assume that whatever action is being (needs to be) triggered by isColorConversionRequired() returning true is doing more than "color conversion" - perhaps related to changing PlanarConfiguration of 1 (color-by-plane) to 0 (color-by-pixel)?

I could check to see if the modality is XC and always return true in this case

Please do not do that - you need to understand the root cause of the problem with what is perfectly good input data and fix it. There is nothing special about the modality or any other attribute that is not directly related to the description of the encoding of the pixel data (i.e., Image Pixel Module attributes and Transfer Syntax).

I think you need to figure out what "color conversion" entails and whether or not it can safely be triggered by PlanarConfiguration == 1, or if it needs to be refactored (to separate "color component conversion" (e.g, from YBR to RGB, +/- chrominance channel upsampling) from "plane organization" (color-by-plane to color-by-pixel).

Who wrote the "color conversion" code? You could talk to them about their design for this.

@sedghi
Copy link
Member

sedghi commented Oct 28, 2024

@dclunie Thank you David for the excellent DICOM color crash course. It will help us improve our color handling stability.

@pedrokohler
Copy link
Collaborator

pedrokohler commented Oct 28, 2024

@fedorov It looks like this was really a cornerstone3D issue but it was already fixed in this PR https://github.com/cornerstonejs/cornerstone3D/pull/1457/files

I'm waiting for Bill Longabaugh to confirm that ViewersV3 does not have the master branch deployed, but rather an older branch that has a previous cs3d version being used.

If this is indeed the case, then we just need to deploy the master branch and the issue should be fixed.

@fedorov
Copy link
Member Author

fedorov commented Oct 28, 2024

I'm waiting for Bill Longabaugh to confirm that idc-slim does not have the master branch deployed, but rather an older branch that has a previous cs3d version being used.

You mean ViewersV3, not idc-slim, right?

@pedrokohler
Copy link
Collaborator

You mean ViewersV3, not idc-slim, right?

My bad. Yes, I mean ViewersV3.

@fedorov
Copy link
Member Author

fedorov commented Nov 5, 2024

Replicating from Slack and to keep @dclunie in the loop. @pedrokohler can you please take a look?

Per @wlongabaugh:

OK, in the past, it appears the > 32 MB slices got sent as roughly ~20 MB slices, and this appeared to be due to a change in the Accept: transfer-syntax=* request header: https://github.com/ImagingDataCommons/IDC-ProjectManagement/issues/1574#issuecomment-1924983645

1:13
OK, I admit I was wrong, and yes it used to work, by getting the > 32 MB slices to work by getting them to be ~20 MB on the wire.

I note the current V3 viewer is sending this:

accept:
multipart/related; type=application/octet-stream; transfer-syntax=*; transfer-syntax=*

1:12
Is the second transfer-syntax=* (and w/o a ";") messing things up?

@sedghi
Copy link
Member

sedghi commented Nov 5, 2024

Checkout extensions/cornerstone/src/initWADOImageLoader.js that is the location that decide on the transfer syntax

and

platform/core/src/utils/generateAcceptHeader.ts

@pedrokohler
Copy link
Collaborator

@fedorov I removed the duplicate and just tested it with transfer-syntax=* and with transfer-syntax=*;, meaning both with and without semicolon. Still same error occurs in both cases.

image

image

@pedrokohler
Copy link
Collaborator

@fedorov either way this error has nothing to do with this ticket. According to @wlongabaugh on Slack this same error is occurring in production right now.

@fedorov
Copy link
Member Author

fedorov commented Nov 6, 2024

Thank you for checking. Let's wait to see what @wlongabaugh says about this.

@sedghi
Copy link
Member

sedghi commented Nov 6, 2024

Sometimes the type should be in quotes. We used to have an option to omit it, as including it can trigger an options request flow on some servers, slowing things down. However, sometimes it requires the type to be in quotes, like multipart/related; type="application/octet-stream"; transfer-syntax=*.

@fedorov
Copy link
Member Author

fedorov commented Nov 6, 2024

@wlongabaugh @pedrokohler note that while accessing IDC DICOM store directly - bypassing the proxy - I am able to visualize both series.

See demo here: https://app.screencast.com/QEePMcwuQOuWb.

Also note the type is indeed in quotes:

image

No server error

image

@pedrokohler
Copy link
Collaborator

After that, can you please proceed with testing against the IDC proxy?

Hmm... not sure what you mean here. Can you elaborate? I mean, I was testing against a proxy all along. You mean another proxy?

@fedorov
Copy link
Member Author

fedorov commented Nov 8, 2024

Sorry I missed that you tested against the proxy. We need to know the behavior with both direct access to GHC and via the proxy. While using proxy, you cannot choose API version, you can only do it while accessing directly, substituting v1 with v1beta1.

Are you able to download that study and make a Google DICOM store for direct testing, or you need help with that?

@pedrokohler
Copy link
Collaborator

Are you able to download that study and make a Google DICOM store for direct testing, or you need help with that?

@fedorov I've never created a google DICOM store and I don't have access to IDC's GCP account, which means I'd need help indeed.

note that while accessing IDC DICOM store directly - bypassing the proxy - I am able to visualize both series.

I tried using the IDC fork's config file and the GCP server you sent in the message from the quote above, but I'm getting this error when I try to login

image

(also I'm not sure that by logging in with the default oidc config I'd be able to access this specific server anyway)

@wlongabaugh
Copy link

I was wondering how @pedrokohler would be able to access GHC directly, as he does indeed not have permissions.

The version of the API we are using is in the proxy runtime configuration file that lives in each tier. E.g. for test it in in gs://webapp-deployment-files-idc-test/proxy/proxy_runtime_config.txt.

The dev and test proxies are running v1beta1. Production proxy is still running v1.

@wlongabaugh
Copy link

@fedorov OK, I have reread my comment you highlighted above. #4382 (comment) Yes, I originally thought it was the other way around, that we needed the transfer-syntax key/value to make it work. But yeah, we needed to drop the key/value to make it work. This all makes sense now. That was my first impression of this when it came up, that there was some change in compression that made it work, back at the start of 2024. So drop the transfer-syntax in V3.

Copy link
Member

sedghi commented Nov 8, 2024

transfer-syntax=* means allow ANY transfer syntax on the server, omitting transfer syntax means LEI

@fedorov
Copy link
Member Author

fedorov commented Nov 8, 2024

I do not know what LEI is. But can you give us background why in the past (it appears that) transfer-syntax=* was not used, and later it was added? What triggered that change?

@sedghi
Copy link
Member

sedghi commented Nov 8, 2024

https://www.dicomlibrary.com/dicom/transfer-syntax/

Explicit VR Little Endian

I believe it's a sensible default for OHIF to follow. It tells the server not to bother with conversion and to send whatever is available, as OHIF can support it. This reduces the server's response time. If there's an issue with using *, it indicates a bug in OHIF.

@wlongabaugh
Copy link

The issue is not a bug in OHIF, but the fact that by providing it in the request, Google will not try to compress the 35MB+ slices it sends out.

@fedorov
Copy link
Member Author

fedorov commented Nov 8, 2024

This reduces the server's response time. If there's an issue with using *, it indicates a bug in OHIF.

@sedghi The issue in this case is not with OHIF per se, but with the specific combination of OHIF/data/user requirements. To summarize the long thread, we need to have the server compress the data before sending. We have good reasons for that. With the current transfer-syntax=* we cannot satisfy our needs.

Can this be somehow addressed at the OHIF level without us making customizations specific to our fork/use case?

I do not think this use case is limited to IDC.

@fedorov
Copy link
Member Author

fedorov commented Nov 8, 2024

@pedrokohler I created a DICOM store that has Visible Female XC study: projects/idc-sandbox-000/locations/us-central1/datasets/fedorov-dev-healthcare/dicomStores/visible-female. I gave you all the necessary permissions. You can use this store to test against DICOM store bypassing proxy, and compare behavior between v1 and v1beta1 GHC API.

@sedghi
Copy link
Member

sedghi commented Nov 8, 2024

@fedorov if you make sure the server already has compressed data stored for that study then the * should work, no? Can't you call an api in google and ask for change in stored data format?

@fedorov
Copy link
Member Author

fedorov commented Nov 8, 2024

First, based on the explanation from @dclunie yesterday, compression in this scenario is done at the HTTP level - it does not change the transfer syntax.

Second, one of our principles in IDC is not to modify the original encoding of the images in the files that are ingested into the DICOM server.

@sedghi
Copy link
Member

sedghi commented Nov 8, 2024

BTW I added all transfer syntax test data we had here cornerstonejs/cornerstone3D#1568

Seems like we only are not supporting

DeflatedExplicitVRLittleEndianTransferSyntax

@wlongabaugh
Copy link

I will reiterate that one way to deal with this is to have the proxy look at the URLs being requested, and if it includes ones of these problematic studies, it can modify the requests headers (ditch transfer-syntax=*) for those studies to kick off the compression.

@fedorov
Copy link
Member Author

fedorov commented Nov 8, 2024

I will reiterate that one way to deal with this is to have the proxy look at the URLs being requested

@wlongabaugh in my list of preferred options, increasing complexity of a proxy that is specific to IDC is not at the top. IDC proxy based solution will never benefit other users of OHIF Viewer, and I think there is a good chance that this situation is not unique to IDC. In our case, we need this because of the proxy limitations on the buffer size. In other cases users may want to have smaller buffer size due to limited network bandwidth.

@sedghi would it be possible to add an option for the client/application to choose whether transfer-syntax=* should be included or not?

I hope the answer is yes, and then the client could perhaps estimate the size of PixelData based on frame size and bit depth, and decide based on that whether it should be included or not and request server compression? This could also be controllable via a config option. Just one idea. Maybe there is a better way to know what is the size of the object in the original encoding.

@sedghi
Copy link
Member

sedghi commented Nov 10, 2024

Yes, setting it is optional, but if you don't, every compressed study will be decompressed, as far as I know. So, perhaps we should have a per-study configuration or something similar.

@dclunie
Copy link

dclunie commented Nov 12, 2024

@sedghi, you wrote:

Yes, setting it (transfer-syntax=*) is optional, but if you don't, every compressed study will be decompressed, as far as I know.

Short version - I don't think that is a compliant way of receiving compressed frames, even if Google does support it, so you should not be asking for transfer-syntax=*

TL;DNR:

The details here are a digression from the topic at hand (getting back uncompressed XC images, and getting large ones through the proxy by deflate applied at the Content-Encoding rather than Transfer Syntax level), ...

... but it is relevant since it has apparently been observed that requesting multipart/related; type="application/octet-stream"; transfer-syntax=* has an undesirable (and relevant) side effect of blocking Content-Encoding ( e.g., as deflate) as distinct from the simpler (and correct)multipart/related; type="application/octet-stream".

If you request application/octet-stream (whether single part or multipart), as I understand it. the standard says you will get back uncompressed bytes, and the Google documentation implies you will get back uncompressed bytes because the default TS is uncompressed (and it will perform decompression as necessary if it can).

However, I don't think it is legal for a DICOMweb server to return in a frame request a compressed bitstream in an application/octet-stream media type. That media type is only to be used for uncompressed data.

Rather, the server is required to return image/jpeg for JPEG transfer syntaxes, etc. Otherwise how would you know (without inspecting the compressed byte stream) what scheme is used to compress the byte stream in the response, if it was not signaled in the media type in the returned Content-Type header field? See DICOM PS3.18 8.7.3.3.2 Compressed Bulkdata Media Types.

I know that the Google server documentation for retrieval of frames says it supports multipart/related; type="application/octet-stream"; transfer-syntax=* and further that

"For application/octet-stream the bulk data will be returned in whatever format it appears in the uploaded DICOM file"

but I think this is fundamentally wrong; returning compressed byte streams as application/octet-stream is not what the standard specifies; I don't think you should be requesting that, even if it (sometimes) works.

Perhaps you should be requesting multipart/related; type="*/*" or similar, if you don't care how it is compressed or if it is uncompressed, and the Content-Type in the response should tell you what it actually is. Perhaps even multipart/related; type="*/*"; transfer-syntax=*

However there is a warning in the Google documentation about image/jpeg defaulting to a .70 TS that is not supported, so if the images are lossless JPEG such as TS .50, then perhaps type="*/*"; transfer-syntax=* will freak it out.

How is OHIF figuring out what compression scheme is used when requesting multipart/related; type="application/octet-stream"; transfer-syntax=* (perhaps there is a transfer-syntax parameter on the Content-Type of the response that you are cuing off)?

I hope you are not using any TransferSyntaxUID value that might be present in the metadata (if they return Group 0x0002 data elements in the metadata, which they shouldn't) as opposed to (trying to) use QIDO AvailableTransferSyntaxUID in CP 1901. I don't think that Google supports AvailableTransferSyntaxUID though. To state the obvious, if Group 0x0002 TransferSyntaxUID is returned in the metadata, its value will not necessarily match whatever is in the returned pixel data if any transcoding has occurred.

I will do some more experiments wrt. what the Google server does or does not support, but the bottom line is that I think a more robust approach by OHIF to getting back (lossless) compressed pixel data if that is the form it is available in may be required if you have felt the need to use multipart/related; type="application/octet-stream"; transfer-syntax=* for that reason and don't have an alternative pathway already.

I would be very interested to hear if you have experience with DICOMweb servers other than Google's that respond to multipart/related; type="application/octet-stream"; transfer-syntax=* with anything other than uncompressed pixel data.

David

PS. I don't think much has changed in this respect since CP 1509 cleaned up the media types, or even the original description in Sup 161; i.e., I don't think we can blame the PS3.18 reorganization for changing anything in this respect wrt. media types for uncompressed versus compressed data. I will go through the development history of WADO-RS and review the discussions of the media types to see if there is anything I have missed.

@sedghi
Copy link
Member

sedghi commented Nov 12, 2024

CC @wayfarer3130

@wayfarer3130
Copy link
Contributor

wayfarer3130 commented Nov 13, 2024

@sedghi, you wrote:

Yes, setting it (transfer-syntax=*) is optional, but if you don't, every compressed study will be decompressed, as far as I know.

Short version - I don't think that is a compliant way of receiving compressed frames, even if Google does support it, so you should not be asking for transfer-syntax=*

TL;DNR:

The details here are a digression from the topic at hand (getting back uncompressed XC images, and getting large ones through the proxy by deflate applied at the Content-Encoding rather than Transfer Syntax level), ...

The deflate CAN be applied at hte content encoding level, but it means to apply default to the multipart/related body as well as to the overall body - that is what static dicomweb does for encoding such images.

... but it is relevant since it has apparently been observed that requesting multipart/related; type="application/octet-stream"; transfer-syntax=* has an undesirable (and relevant) side effect of blocking Content-Encoding ( e.g., as deflate) as distinct from the simpler (and correct)multipart/related; type="application/octet-stream".

If you request application/octet-stream (whether single part or multipart), as I understand it. the standard says you will get back uncompressed bytes, and the Google documentation implies you will get back uncompressed bytes because the default TS is uncompressed (and it will perform decompression as necessary if it can).

However, I don't think it is legal for a DICOMweb server to return in a frame request a compressed bitstream in an application/octet-stream media type. That media type is only to be used for uncompressed data.

Yes, that is actually correct - it should have been
multipart/related; type="image/*"
with a secondary accept of
multipart/related; type="application/octet-stream"; q=0.9

The application/octet-stream is only valid for DICOM encoded responses, and I had conflated that with using it on the frame level retrieve.

Preferring images over application/octet stream, all of them encoded in multipart/related.
It would also be acceptable to do the following set:
"image/*, application/octet-stream; q=0.9, multipart/related; type="image/*;q=0.8, multipart/related; type="application/octet-stream"; q=0.7
Or, if known that the server supports single part, just the first two.

Rather, the server is required to return image/jpeg for JPEG transfer syntaxes, etc. Otherwise how would you know (without inspecting the compressed byte stream) what scheme is used to compress the byte stream in the response, if it was not signaled in the media type in the returned Content-Type header field? See DICOM PS3.18 8.7.3.3.2 Compressed Bulkdata Media Types.

I know that the Google server documentation for retrieval of frames says it supports multipart/related; type="application/octet-stream"; transfer-syntax=* and further that

"For application/octet-stream the bulk data will be returned in whatever format it appears in the uploaded DICOM file"

but I think this is fundamentally wrong; returning compressed byte streams as application/octet-stream is not what the standard specifies; I don't think you should be requesting that, even if it (sometimes) works.

Perhaps you should be requesting multipart/related; type="*/*" or similar, if you don't care how it is compressed or if it is uncompressed, and the Content-Type in the response should tell you what it actually is. Perhaps even multipart/related; type="*/*"; transfer-syntax=*

Yes, but added several request types with q parameters to indicate preference. I don't know that it will work for image/* but technically it is supposed to work AFAIK.

However there is a warning in the Google documentation about image/jpeg defaulting to a .70 TS that is not supported, so if the images are lossless JPEG such as TS .50, then perhaps type="*/*"; transfer-syntax=* will freak it out.

How is OHIF figuring out what compression scheme is used when requesting multipart/related; type="application/octet-stream"; transfer-syntax=* (perhaps there is a transfer-syntax parameter on the Content-Type of the response that you are cuing off)?

There is the transfer syntax in the response data - which is actually required to distinguish between image/jpeg for .50 and .70

I hope you are not using any TransferSyntaxUID value that might be present in the metadata (if they return Group 0x0002 data elements in the metadata, which they shouldn't) as opposed to (trying to) use QIDO AvailableTransferSyntaxUID in CP 1901. I don't think that Google supports AvailableTransferSyntaxUID though. To state the obvious, if Group 0x0002 TransferSyntaxUID is returned in the metadata, its value will not necessarily match whatever is in the returned pixel data if any transcoding has occurred.

No, that isn't used there. However, the AvailableTransferSyntaxUIDs is used to sub-select the accept header to the comrpessed format if it is provided in the original metadata.

I will do some more experiments wrt. what the Google server does or does not support, but the bottom line is that I think a more robust approach by OHIF to getting back (lossless) compressed pixel data if that is the form it is available in may be required if you have felt the need to use multipart/related; type="application/octet-stream"; transfer-syntax=* for that reason and don't have an alternative pathway already.

I would be very interested to hear if you have experience with DICOMweb servers other than Google's that respond to multipart/related; type="application/octet-stream"; transfer-syntax=* with anything other than uncompressed pixel data.

It is supported fairly generally - I can think of at least 4 PACS systems supporting it.

Bill

PS @dclunie - thanks for responding to that, I had assumed incorrectly that the transfer syntax parameter was supported for this accept header as well as the others, and apparently others have made the same assumption.

@wayfarer3130
Copy link
Contributor

According to 8.7.3.5.1 Multipart Media Types, it is actually legal to request
multipart/related; type="application/octet-stream"; transfer-syntax="*"
I agree the interpretation of that is a bit iffy, but the request itself is legal according to the syntax.

@sedghi
Copy link
Member

sedghi commented Nov 13, 2024

Also CCing @amazy the lead for orthanc team, if he can help us with how ohif should send requests

@dclunie
Copy link

dclunie commented Nov 13, 2024

:) "iffy" is one way to put it - "wrong" is another!

I will draft a CP to add a note to PS3.18 to indicate that returning compressed streams in application/octet-stream is not correct, and that combined with transfer-syntax="*", it should only match uncompressed transfer syntaxes (perhaps including the recent encapsulated uncompressed to get around size limits). Then WG 27 can discus the matter.

@dclunie
Copy link

dclunie commented Nov 13, 2024

BTW. AvailableTransferSyntaxUIDs should not be in the metadata - it is an instance level search transaction attribute.

@fedorov
Copy link
Member Author

fedorov commented Nov 14, 2024

Following this exchange - are there any practical and actionable implications regarding how to proceed with achieving the behavior needed for our use case - requesting compression applied to Content-Encoding?

@amazy
Copy link

amazy commented Nov 18, 2024

Also CCing @amazy the lead for orthanc team, if he can help us with how ohif should send requests

Since I'm not a standard expert at all, I'll not comment on the standard itself or on how it should be understood.

I must admit that Orthanc's behavior wrt to multipart/related; type="application/octet-stream"; transfer-syntax="*" was inspired by the Google implementation and it was not incompatible with our understanding of the standard.

That being said, here is how Orthanc works:

  • in the /metadata route, Orthanc always returns the PhotometricInterpretation is it is stored in the DICOM file stored in Orthanc whatever the Accept headers received.
  • if called on ../frames/1with transfer-syntax="*", Orthanc will return the frame without any transcoding -> the PhotometricInterpretation returned in the /metadata route will match the one from the pixels
  • if called with .e.g transfer-syntax="1.2.840.10008.1.2.1", Orthanc will transcode the frame and will potentially return pixels with a different PhotometricInterpretation than the one returned in /metadata

For this reason, we concluded, together with @sedghi, that using transfer-syntax="*" was the best (and only) option for OHIF to receive pixels from Orthanc with the PhotometricInterpretation matching the one returned in an earlier call to /metadata.

We are, of course, open to making Orthanc more standard compliant if it needs to be. In such a case, that would be great to have access to a few well documented test scenarios or scripts for everyone to validate their implementations.

@wayfarer3130
Copy link
Contributor

I will work with @dclunie on making sure that the CP that he has started has examples of how to ask for the image in a compressed format which is standards compliant, as that is a very common question and given how many people get it wrong, it is clear we need some clarification as to what is and is not allowed.

@fedorov
Copy link
Member Author

fedorov commented Nov 18, 2024

@wayfarer3130 can you let me know how we should do it with the current implementation?

Unless it is already clear to @igoroctaviano based on the discussion above.

@dclunie
Copy link

dclunie commented Nov 18, 2024

@amazy the PhotometricInterpretation question (and whether the value in the metadata response is applicable to whatever compressed or uncompressed pixel data is retrieved) should be somewhat orthogonal to the question of whether compressed pixel data byte streams are obtained in the correct manner using image/* or incorrectly using application/octet-stream.

@dclunie
Copy link

dclunie commented Nov 18, 2024

Wrt. drafting examples, many years ago (2018/10/23) I posted the following message to the WG 27 mailing list and got zero responses. It is out of date now (since we have both updated the standard, extended the use of single part responses, and added specific solutions such as AvailableTransferSyntax, which may or may not have been implemented), but I think we should be able to improve it and generalize it and come up with an exhaustive list of requests and appropriate responses:

Hi WG27 folks

I have some general questions related to compressed images that
arise particularly when using anything other than baseline JPEG
images sent to an origin server.

Suppose I send a multi-frame lossless JPEG 2000 (.90) image
to an origin server, and for the sake of argument, assume it
stores it in the supplied Transfer Syntax.

Note that it is multi-frame image, not single frame.

As a QIDO-RS and WADO-RS user agent:

  1. How do I tell in what Transfer Syntax the origin server has
    stored an image (i.e., that I sent it as .90)?

1a. in QIDO-RS, is (0002,0010) permitted in the request/response?
1b. in RetrieveMetadata, is (0002,0010) permitted in the response?
1c. in the dcm-parameters of the RetrieveMetadata Content-Type,
even though there is no inline binary data returned?

  1. What comes back in the response if I accept in a RetrieveInstance
    request only:

2a. '*/*' ?
2b. 'multipart/related; type="application/dicom"' ?
2c. 'multipart/related; type="application/dicom"; transferSyntax=...90' ?
2d. 'multipart/related; type="application/dicom"; transferSyntax=...91' ?
2e. 'multipart/related; type="application/dicom"; transferSyntax=...70' ?
2f. 'multipart/related; type="application/dicom"; transferSyntax=...50' ?
2g. 'multipart/related; type="application/dicom"; transferSyntax=...1' ?
2h. 'multipart/related; type="image/jp2"' ?
2i. 'multipart/related; type="image/jpeg"' ?
2j. 'multipart/related; type="application/octet-stream"' ?

  1. What comes back in the response if I accept in a RetrieveInstance
    request all of:

3a. 'multipart/related; type="application/dicom"; transferSyntax=...90,
multipart/related; type="application/dicom"; transferSyntax=...91,
multipart/related; type="application/dicom"; transferSyntax=...70,
multipart/related; type="application/dicom"; transferSyntax=...50,
multipart/related; type="application/dicom"; transferSyntax=...1' ?

3b. 'multipart/related; type="image/jp2",
multipart/related; type="image/jpeg",
multipart/related; type="application/octet-stream"' ?

  1. What comes back in the response if I accept in a RetrieveFrame
    request only:

4a. '*/*' ?
4b. 'multipart/related; type="image/jp2"' ?
4c. 'multipart/related; type="image/jpeg"' ?

  1. What comes back in the response if I accept in a RetrieveRendered
    Frame request only:

5a. '*/*' ?
5b. 'multipart/related; type="image/jp2"' ?
5c. 'multipart/related; type="image/jpeg"' ?

  1. What is the difference, if any, if the image is sent in (.91) in
    the first place, rather than (.90), remembering that .91 can be
    lossless or lossy (reversible or irreversible)?

  2. If the response contains a DICOM PS3.10 file, or its metadata,
    and any Transfer Syntax conversion performed involved further
    lossy compression:

    7a. is the SOP Instance UID changed?
    7b. are any mandatory Lossy Compression attributes set?

For each answer, where do you think it is currently specified
in the standard, and is the behavior required or optional?

David

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Awaiting Reproduction Can we reproduce the reported bug? IDC:priority Items that the Imaging Data Commons wants to help sponsor
Projects
None yet
Development

No branches or pull requests

8 participants