Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update CompressedImage #237

Open
wants to merge 1 commit into
base: rolling
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 11 additions & 2 deletions sensor_msgs/msg/CompressedImage.msg
Original file line number Diff line number Diff line change
@@ -1,13 +1,16 @@
# This message contains a compressed image.

std_msgs/Header header # Header timestamp should be acquisition time of image
std_msgs/Header header # Header timestamp should be acquisition time of image
# Header frame_id should be optical frame of camera
# origin of frame should be optical center of cameara
# +x should point to the right in the image
# +y should point down in the image
# +z should point into to plane of the image

string format # Specifies the format of the data
uint32 height # image height, that is, number of rows
uint32 width # image width, that is, number of columns
Comment on lines +10 to +11
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prior to this PR, how are users of this message determining this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this was encoded into the payload before, we need some more semantics about when it should be used and when not. It's not great to have data that can potentially disagree.


string pixel_format # Specifies the format of the data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a needless difference. I understand that this is slightly more descriptive, but I also don't think it is worth the headache for downstream users to change this. So I'll suggest putting this back to format.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that encoding is the current spelling for this in sensor_msgs/Image, which is just wrong in almost every case. Everything else (OpenCV, commercial machine vision libraries, etc) refers to this as "pixel format". Unfortunately, "encoding" is even more confusing when dealing with compressed data, so if there is a chance to fix this properly we should.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For compatibility we shouldn't rename this field but just update it'd documentation to state that it's expecting the pixel_format string as per OpenCV). It's still clear what the content is and this will avoid breaking any existing code.

# Acceptable values differ by the image transport used:
# - compressed_image_transport:
# ORIG_PIXFMT; CODEC compressed [COMPRESSED_PIXFMT]
Expand Down Expand Up @@ -38,4 +41,10 @@ string format # Specifies the format of the data
# need for successful decoding of the image. Refer to
# documentation of the other transports for details.

string compression_type # Compression type used (jpeg, png, theora, etc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a backwards compatible default expectation that should be documented and implemented as a fallback if unset?


uint64 sequence_number # sequence number
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I agree that this should have a sequence number. A compressed image doesn't logically have one of these, and we don't have this in Image.msg, for instance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does seem to be breaking the encapsulation of this message standing on it's own.

uint64 flags # flags (for example: KEYFRAME)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I don't think I agree that this should be here. A compressed image doesn't represent anything other than data, and I don't think we should be adding "external" metadata like this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, this argument is about whether there should be any metadata fields in the message, or if the compression plugin should just shove its metadata into the binary blob in some bespoke fashion. To the extent that a common set of metadata exists, I think it is preferable to put that metadata in the message so that message introspection tooling can see it, rather than putting this all in binary blob form and practically inaccessible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, this argument is about whether there should be any metadata fields in the message

I disagree. I absolutely think we should have metadata in this message that is intrinsic to the CompressedImage. Things like width, height, format, compression_type, and is_bigendian all belong in here.

But metadata about the larger context in which this data is delivered doesn't seem to belong, in my opinion. The exception to this is of course std_msgs/Header, which is already here and is heavily used with the rest of our tooling.

I definitely understand the need for things like a sequence_number, a keyframe flag, and the like. But those seem like things that should be in a custom message, composed of a CompressedImage plus the metadata, like:

# CompressedKeyFrame
sensor_msgs/CompressedImage compressed_image
int sequence_number

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But those seem like things that should be in a custom message, composed of a CompressedImage plus the metadata...

Frankly, I really think that if the outcome isn't an improved-for-everyone message, the right answer is for image_transport to just define its own internal message type that it can change on its own. I don't think that's better for the community, but if we have to introduce a new type for image_transport uses, having it inherit all the inertia that accompanies being a part of core message types isn't an improvement either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that Chris is right that the higher level information such as keyframes and sequence numbers deserve to be in a higher level concept than just the compressed image. Things like Key Frames and sequences are only valid if they're associated with a specific stream. Where a standalone compressed image may or may not be used in those applications.

In particular if someone muxes two compressed Image topics together, the sequence numbers become meaningless unless you're associating them with a name stream or the like.

There's the approach above of using a hierarchy of messages/wrapper messages. The other alternative would be to send a parallel stream of messages in the same way that we do with camera_info. There's a parallel stream of camera info which can be associated with the image by timestamp + frame_id and with that this extra metadata can be applied in the image_transport process, but can be ignored by those who just want to access the raw images. And those images should standalone, in log files etc. Thus there would be CompressedImages and then there would be a CompressedImageStream protocol/standard for how to send along a series of compressed images with keyframes and potentially other metadata in a parallel CompressedImageStreamInfo or the like.

uint8 is_bigendian # is this data bigendian?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it would be a bool.


uint8[] data # Compressed image buffer