-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update CompressedImage #237
base: rolling
Are you sure you want to change the base?
Conversation
Signed-off-by: Alejandro Hernández Cordero <[email protected]>
uint32 height # image height, that is, number of rows | ||
uint32 width # image width, that is, number of columns | ||
|
||
string pixel_format # Specifies the format of the data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a needless difference. I understand that this is slightly more descriptive, but I also don't think it is worth the headache for downstream users to change this. So I'll suggest putting this back to format
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that encoding
is the current spelling for this in sensor_msgs/Image
, which is just wrong in almost every case. Everything else (OpenCV, commercial machine vision libraries, etc) refers to this as "pixel format". Unfortunately, "encoding" is even more confusing when dealing with compressed data, so if there is a chance to fix this properly we should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For compatibility we shouldn't rename this field but just update it'd documentation to state that it's expecting the pixel_format string as per OpenCV). It's still clear what the content is and this will avoid breaking any existing code.
uint32 height # image height, that is, number of rows | ||
uint32 width # image width, that is, number of columns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prior to this PR, how are users of this message determining this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this was encoded into the payload before, we need some more semantics about when it should be used and when not. It's not great to have data that can potentially disagree.
|
||
uint64 sequence_number # sequence number | ||
uint64 flags # flags (for example: KEYFRAME) | ||
uint8 is_bigendian # is this data bigendian? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like it would be a bool.
@@ -38,4 +41,10 @@ string format # Specifies the format of the data | |||
# need for successful decoding of the image. Refer to | |||
# documentation of the other transports for details. | |||
|
|||
string compression_type # Compression type used (jpeg, png, theora, etc) | |||
|
|||
uint64 sequence_number # sequence number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I agree that this should have a sequence number. A compressed image doesn't logically have one of these, and we don't have this in Image.msg, for instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does seem to be breaking the encapsulation of this message standing on it's own.
string compression_type # Compression type used (jpeg, png, theora, etc) | ||
|
||
uint64 sequence_number # sequence number | ||
uint64 flags # flags (for example: KEYFRAME) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I don't think I agree that this should be here. A compressed image doesn't represent anything other than data, and I don't think we should be adding "external" metadata like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, this argument is about whether there should be any metadata fields in the message, or if the compression plugin should just shove its metadata into the binary blob in some bespoke fashion. To the extent that a common set of metadata exists, I think it is preferable to put that metadata in the message so that message introspection tooling can see it, rather than putting this all in binary blob form and practically inaccessible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, this argument is about whether there should be any metadata fields in the message
I disagree. I absolutely think we should have metadata in this message that is intrinsic to the CompressedImage
. Things like width
, height
, format
, compression_type
, and is_bigendian
all belong in here.
But metadata about the larger context in which this data is delivered doesn't seem to belong, in my opinion. The exception to this is of course std_msgs/Header
, which is already here and is heavily used with the rest of our tooling.
I definitely understand the need for things like a sequence_number, a keyframe flag, and the like. But those seem like things that should be in a custom message, composed of a CompressedImage
plus the metadata, like:
# CompressedKeyFrame
sensor_msgs/CompressedImage compressed_image
int sequence_number
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But those seem like things that should be in a custom message, composed of a CompressedImage plus the metadata...
Frankly, I really think that if the outcome isn't an improved-for-everyone message, the right answer is for image_transport
to just define its own internal message type that it can change on its own. I don't think that's better for the community, but if we have to introduce a new type for image_transport
uses, having it inherit all the inertia that accompanies being a part of core message types isn't an improvement either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that Chris is right that the higher level information such as keyframes and sequence numbers deserve to be in a higher level concept than just the compressed image. Things like Key Frames and sequences are only valid if they're associated with a specific stream. Where a standalone compressed image may or may not be used in those applications.
In particular if someone muxes two compressed Image topics together, the sequence numbers become meaningless unless you're associating them with a name stream or the like.
There's the approach above of using a hierarchy of messages/wrapper messages. The other alternative would be to send a parallel stream of messages in the same way that we do with camera_info
. There's a parallel stream of camera info which can be associated with the image by timestamp + frame_id and with that this extra metadata can be applied in the image_transport process, but can be ignored by those who just want to access the raw images. And those images should standalone, in log files etc. Thus there would be CompressedImages and then there would be a CompressedImageStream protocol/standard for how to send along a series of compressed images with keyframes and potentially other metadata in a parallel CompressedImageStreamInfo
or the like.
uint32 height # image height, that is, number of rows | ||
uint32 width # image width, that is, number of columns | ||
|
||
string pixel_format # Specifies the format of the data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For compatibility we shouldn't rename this field but just update it'd documentation to state that it's expecting the pixel_format string as per OpenCV). It's still clear what the content is and this will avoid breaking any existing code.
@@ -38,4 +41,10 @@ string format # Specifies the format of the data | |||
# need for successful decoding of the image. Refer to | |||
# documentation of the other transports for details. | |||
|
|||
string compression_type # Compression type used (jpeg, png, theora, etc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a backwards compatible default expectation that should be documented and implemented as a fallback if unset?
@@ -38,4 +41,10 @@ string format # Specifies the format of the data | |||
# need for successful decoding of the image. Refer to | |||
# documentation of the other transports for details. | |||
|
|||
string compression_type # Compression type used (jpeg, png, theora, etc) | |||
|
|||
uint64 sequence_number # sequence number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does seem to be breaking the encapsulation of this message standing on it's own.
string compression_type # Compression type used (jpeg, png, theora, etc) | ||
|
||
uint64 sequence_number # sequence number | ||
uint64 flags # flags (for example: KEYFRAME) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that Chris is right that the higher level information such as keyframes and sequence numbers deserve to be in a higher level concept than just the compressed image. Things like Key Frames and sequences are only valid if they're associated with a specific stream. Where a standalone compressed image may or may not be used in those applications.
In particular if someone muxes two compressed Image topics together, the sequence numbers become meaningless unless you're associating them with a name stream or the like.
There's the approach above of using a hierarchy of messages/wrapper messages. The other alternative would be to send a parallel stream of messages in the same way that we do with camera_info
. There's a parallel stream of camera info which can be associated with the image by timestamp + frame_id and with that this extra metadata can be applied in the image_transport process, but can be ignored by those who just want to access the raw images. And those images should standalone, in log files etc. Thus there would be CompressedImages and then there would be a CompressedImageStream protocol/standard for how to send along a series of compressed images with keyframes and potentially other metadata in a parallel CompressedImageStreamInfo
or the like.
uint32 height # image height, that is, number of rows | ||
uint32 width # image width, that is, number of columns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this was encoded into the payload before, we need some more semantics about when it should be used and when not. It's not great to have data that can potentially disagree.
Can I suggest that the I found the use of "formats" and "encodings" in the images also confusing. Eventually, all image data, raw or compressed, is just an unstructured binary blob that has to be interpreted according to the |
The compressedImage message hasn’t been updated since 2009 . This message was probably designed for mainly for
jpeg
andpng
according to the history and comments in the message definition.During this time some other compressedImage types have appeared, for example:
The idea of this PR is to try to refresh this message type allowing new compression algorithms to set all the required information.
As a reference, I was working on a SVTAV1 image transport plugin and I had to encode the metadata inside the payload which is ugly.
This new message type will allow to use the same message type in all the
image_transport
plugins (at least released in rolling)FYI @calderpg-tri @IanTheEngineer