oci_image inherits data field from config in base image #756

Closed
chenhunter opened this issue Dec 19, 2024 · 2 comments · Fixed by #774

@chenhunter

The latest debian image now adds the data field under config in the image manifest:

docker manifest inspect debian@sha256:ec54b6327d5099ab29b38d70f7290e42d8769ef676fc262b34a18b688104f61b
{
        "config": {
                "data": "eyJhcmNoaXRlY3R1cmUiOiJhbWQ2NCIsImNvbmZpZyI6eyJDbWQiOlsiYmFzaCJdLCJFbnRyeXBvaW50IjpbXSwiRW52IjpbIlBBVEg9L3Vzci9sb2NhbC9zYmluOi91c3IvbG9jYWwvYmluOi91c3Ivc2JpbjovdXNyL2Jpbjovc2JpbjovYmluIl19LCJjcmVhdGVkIjoiMjAyNC0xMi0wMlQwMDowMDowMFoiLCJoaXN0b3J5IjpbeyJjb21tZW50IjoiZGVidWVycmVvdHlwZSAwLjE1IiwiY3JlYXRlZCI6IjIwMjQtMTItMDJUMDA6MDA6MDBaIiwiY3JlYXRlZF9ieSI6IiMgZGViaWFuLnNoIC0tYXJjaCAnYW1kNjQnIG91dC8gJ2Jvb2t3b3JtJyAnQDE3MzMwOTc2MDAnIn1dLCJvcyI6ImxpbnV4Iiwicm9vdGZzIjp7ImRpZmZfaWRzIjpbInNoYTI1NjozMDFjMWJiNDJjYzBiYzY2MThmY2FmMDM2ZTg3MTFmMmFhZDY2Zjc2Njk3ZjU0MWUyMDE0YTY5ZTFmNDU2YWE0Il0sInR5cGUiOiJsYXllcnMifX0K",
                "digest": "sha256:ff869c3288a47c9625a60473a3d5108ec45bd095a00e23568a82ee8b95d12954",
                "mediaType": "application/vnd.oci.image.config.v1+json",
                "size": 453
        },
        "layers": [
                {
                        "digest": "sha256:fdf894e782a221820acf469d425b802be26aedb5e5d26ea80a650ff6a974d488",
                        "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
                        "size": 48497210
                }
        ],
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "schemaVersion": 2
}

In the oci_image rule, the built image inherits the data field from the base image's config descriptor unchanged, even though the built image has a different config. The embedded data then no longer matches the config digest, which causes verification errors. This can break many libraries and tools (such as https://github.com/google/go-containerregistry), because they try to verify the manifest and fail.
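
To make the mismatch concrete, here is a minimal Python sketch of the kind of check a consumer might run on a config descriptor. It is not taken from go-containerregistry or rules_oci; the function name `embedded_data_matches` is hypothetical, and the sketch assumes a sha256 digest.

```python
import base64
import hashlib


def embedded_data_matches(descriptor: dict) -> bool:
    """Check an OCI descriptor's optional `data` field against `digest` and `size`.

    Hypothetical helper for illustration only. Returns True when there is no
    embedded data, or when the decoded data agrees with both fields.
    """
    data_b64 = descriptor.get("data")
    if data_b64 is None:
        return True  # nothing embedded, so nothing to verify

    decoded = base64.b64decode(data_b64, validate=True)

    # Digests look like "<algorithm>:<hex>"; this sketch only handles sha256.
    algorithm, _, expected_hex = descriptor["digest"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm}")

    actual_hex = hashlib.sha256(decoded).hexdigest()
    return actual_hex == expected_hex and len(decoded) == descriptor["size"]
```

A descriptor whose digest and size were updated for a new config, but whose data was copied from the base image, would fail this check.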

@chenhunter
Author

I made a PR to fix this in #757

@evankyle

#757 fixes pushing images based on recent debian images to Amazon ECR, which otherwise fails with:

Error: PUT https://...dkr.ecr.us-east-1.amazonaws.com...: 
UNSUPPORTED: Invalid parameter at 'ImageManifest' failed to satisfy constraint: 'Invalid JSON syntax'

Workaround until the fix lands:

bazel_dep(name = "rules_oci", version = "2.2.0")
git_override(
    module_name = "rules_oci",
    remote = "https://github.com/chenhunter/rules_oci.git",
    commit = "57e0e29501410b371d76013780c2039a17b412ef",
)

plobsing added a commit to plobsing/rules_oci that referenced this issue Feb 11, 2025
The [OCI spec](https://github.com/opencontainers/image-spec/blob/main/descriptor.md#properties)
provides a `data` field in order to embed a copy of an image's config
directly into that image's manifest:

> data string
>
> This OPTIONAL property contains an embedded representation of the referenced content. Values MUST conform to the Base 64 encoding, as defined in RFC 4648. The decoded data MUST be identical to the referenced content and SHOULD be verified against the digest and size fields by content consumers. See Embedded Content for when this is appropriate.

This is [intended as an optimization](https://github.com/opencontainers/image-spec/blob/main/descriptor.md#embedded-content),
in order to reduce the number of network round-trips incurred to pull a
container from a remote repository.

However, as highlighted in the release announcement for the new
[`data` field](https://opencontainers.org/posts/blog/2024-03-13-image-and-distribution-1-1/#data-field),
the contexts in which an embedded config is beneficial can be a
subtle determination. Further, in the same announcement, the spec
clarified that registry implementations usually enforce a
[maximum size](https://opencontainers.org/posts/blog/2024-03-13-image-and-distribution-1-1/#manifest-maximum-size)
on manifests, an obscure limit which incautious use of the `data` field
can trigger.

Given the trade-offs, dropping any embedded data (e.g. data that might have
been propagated from a base image) seems the preferable option. The
alternative, updating embedded data to match the config on every
change, as proposed by bazel-contrib#757, is
less universally correct and only sometimes more performant.
Either option would satisfy bazel-contrib#756.
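
As a rough illustration of the "drop the embedded data" option, the following Python sketch removes the `data` field from a manifest's config descriptor and leaves everything else untouched. It is not the code from the rules_oci commit; the function name `drop_embedded_config` is hypothetical, and the manifest is assumed to be already parsed into a dict.

```python
import copy


def drop_embedded_config(manifest: dict) -> dict:
    """Return a copy of `manifest` without `config.data`.

    Hypothetical helper for illustration only. `digest`, `size`, and
    `mediaType` on the config descriptor are left as they are, so consumers
    fetch the config blob by digest instead of trusting an embedded copy.
    """
    stripped = copy.deepcopy(manifest)  # do not mutate the caller's manifest
    stripped.get("config", {}).pop("data", None)
    return stripped
```

Because `data` is an OPTIONAL property, removing it leaves a valid descriptor, and the manifest no longer carries a copy of the config that can drift out of sync with the digest.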