Clarify contentDigest meaning for docker/OCI images #287
Comments
Great question! I think image id is the better idea, since it's registry-independent... @jlegrone
I think we discussed this way back in 2018 in the early days and this was a common view. I think we ended up going with the assumption it was the repo digest; I’ll see if I can find an old issue in the Duffle repo! I think the image id is attractive since it has no registry requirement.
Some of this history was on Docker’s Slack, I think, so it's not all going to be recoverable (unless Docker can get it). Some related comments and discussion follow.
A note I had from someone at Docker (dcmg):
“The manifest digest refers to compressed layers, so Docker doesn't know that identifier until after push since it calculates it on push. After we replace the image backend in Docker, that will work a little differently: we will be able to keep the compressed image hashes that were pulled or built.
Related to multiple identifiers, it is always possible to create images that are the "same" but only differ by metadata, compression, encryption, or anything else. However, we are trying to move to a world where that original content is always used, so changes to the identifier actually represent a change to the image, rather than a side effect of pulling and pushing an image from a different Docker version.
The image ID does not have the compressed hash, which tends to be what is needed to fetch the image from a repository or the byte size of the fetch-able artifacts.”
I don't think you can reasonably call a hashed file "a root of a merkle tree". That assumes an intent that is clearly not there in VM images (namely, that they are tree-structured). I am not understanding, though, what particular change you are requesting in the spec. Is it a clarification of which SHA Docker considers to be the correct SHA? Or are you proposing an alternative?
This issue is scoped to docker/OCI images.
If I consume a bundle containing a docker/OCI image with contentDigest specified, I need to know whether that's the repo digest or the image id in order to verify it. I'm not asking which one is correct from Docker's perspective: they both have valid uses. It's merely a choice that the CNAB spec has to make. Let's take a simple example to make this crystal clear. CNAB runtime A could assume the contentDigest of a docker/OCI image is its repo digest, while CNAB runtime B could assume it's the image id. If a bundle created by runtime A was consumed by runtime B, then runtime B could say the contentDigest was invalid because it wasn't what runtime B was expecting.
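The runtime A / runtime B mismatch can be sketched in a few lines of Python. This is only an illustration: the `DigestKind` enum, the `validate` helper, and the placeholder digest values are hypothetical and are not part of the CNAB spec or of any existing runtime.

```python
from enum import Enum


class DigestKind(Enum):
    """Which digest a (hypothetical) runtime assumes contentDigest holds."""
    REPO_DIGEST = "repo digest"
    IMAGE_ID = "image id"


def validate(declared: str, repo_digest: str, image_id: str,
             assumed: DigestKind) -> bool:
    """Compare the declared contentDigest with the digest the runtime expects."""
    expected = repo_digest if assumed is DigestKind.REPO_DIGEST else image_id
    return declared == expected


# Placeholder values, not real digests.
REPO_DIGEST = "sha256:" + "a" * 64
IMAGE_ID = "sha256:" + "b" * 64

# Runtime A writes the repo digest into the bundle.
content_digest = REPO_DIGEST

# Runtime A accepts its own bundle; runtime B rejects the same bundle.
runtime_a_ok = validate(content_digest, REPO_DIGEST, IMAGE_ID, DigestKind.REPO_DIGEST)
runtime_b_ok = validate(content_digest, REPO_DIGEST, IMAGE_ID, DigestKind.IMAGE_ID)
print(runtime_a_ok, runtime_b_ok)
```

Nothing about the image differs between the two checks; only the interpretation of the field does, which is why the spec has to pick one.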
With #384 merged, I believe we can consider this issue closed.
LGTM, thanks.
Let's start with some definitions (based on the "OCI Image Format Specification").
Docker and OCI images have two types of digest: a repo digest and an image id. A repo digest is the SHA-256 digest of the image manifest, which references the compressed layers. Since the compressed layers are only produced when the image is pushed, the repo digest doesn't logically exist until the image has been pushed. An image id, on the other hand, is the SHA-256 digest of the image configuration, which references the uncompressed layers and is therefore independent of any registry.
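As a rough illustration of how the two digests relate, the sketch below hashes a toy image configuration and a toy manifest with Python's standard `hashlib`. The JSON documents are heavily trimmed, hypothetical stand-ins rather than real image metadata, and `oci_digest` is a helper name invented for this example. In practice the digest is taken over the exact bytes as stored, not over re-serialized JSON.

```python
import hashlib
import json


def oci_digest(blob: bytes) -> str:
    """Return a digest string of the form 'sha256:<hex>' for the given bytes."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


# Toy image configuration (hypothetical, trimmed to a few fields).
config_bytes = json.dumps(
    {"architecture": "amd64", "os": "linux",
     "rootfs": {"type": "layers", "diff_ids": []}},
    sort_keys=True,
).encode("utf-8")

# The image id is the digest of the configuration bytes.
image_id = oci_digest(config_bytes)

# Toy manifest (hypothetical); it points at the configuration by digest.
manifest_bytes = json.dumps(
    {"schemaVersion": 2,
     "config": {"mediaType": "application/vnd.oci.image.config.v1+json",
                "digest": image_id,
                "size": len(config_bytes)},
     "layers": []},
    sort_keys=True,
).encode("utf-8")

# The repo digest is the digest of the manifest bytes.
repo_digest = oci_digest(manifest_bytes)

print("image id:   ", image_id)
print("repo digest:", repo_digest)
```

The two values differ even though they describe the same image, so a consumer has to know which one a contentDigest field is meant to hold.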
Both these digests are content addresses of an image in the sense that each uniquely identifies the content (modulo SHA-256 collisions). Note that the docker registry spec refers to the repo digest as a "content digest".
The CNAB spec defines the contentDigest fields in bundle.json as follows, firstly for invocation images: and then for images other than invocation images:
Since both repo digests and image ids are roots of Merkle trees, the CNAB spec doesn't actually prescribe whether repo digest or image id (or indeed some other Merkle tree root digest!) should be used for contentDigest fields of docker/OCI images. This needs clarifying so that CNAB runtimes know how to validate these fields.