How to implement reproducible deployments if images are getting rebuilt? #12277

Closed
iamFIREcracker opened this issue Apr 21, 2022 · 2 comments

@iamFIREcracker

You can read more about the context of this request on neo4j/docker-neo4j#342, but in a nutshell:

  • our team first pulled neo4j:4.2.8 during summer 2021 and cached it in an internal registry
    • Note: even though 4.2.8 is not the latest release, it's still supported by the vendor
    • everything worked fine in our CI/CD and deployment environments
  • we pulled the same image again today, and to our surprise the "new" image no longer works in our deployment environments
    • its base image had switched to a different major version of the OS, which turned out to be incompatible with the specific version of Docker we run in all of our environments
  • how can users of these images implement "reproducible deployments", if the actual image a tag refers to changes over time?
    • users can probably link to layer digests, though I am not sure if those will be kept around after the same image is re-built
      • and even if digests were kept around, forever, how could users know that a given image got re-built, and what changes it included?
        • I can easily subscribe and stay up to date with the changes of the service (neo4j in this case)
        • but how can I know what changed, in a docker image, after it got first published?

At first I thought the problem was specific to the neo4j image, and to its library file in particular, which lists every version supported by the vendor so that they all get rebuilt. My reaction was: "It's OK for rolling tags like 4 or 4.4 to be rebuilt automatically, but point versions? That's a bit unexpected."

But then I took a look around inside library/ and realized that listing point versions for rebuilds seems to be common practice. Not every point version is listed there, usually just the most recent one, but my concern still stands: pulling an image today and pulling it again in 3 months might yield two different images.

So, going back to the title of this issue: how can users implement reproducible deployments if the images they depend on keep getting rebuilt? Are there any best practices around this topic that this repository is trying to enforce, or is it up to the upstream/vendor to decide the update policy for each of their tags?

Thanks in advance,
M.

@tianon
Member

tianon commented Apr 21, 2022

Yeah, neo4j is maintained a little bit differently than most images here (with a much longer list of "supported" tags than other repositories). However, one of the main points of the official images program in general is that images do get rebuilt, especially to pick up changes in the base image, which are more often than not security related (see https://github.com/docker-library/official-images#library-definition-files and https://github.com/docker-library/faq#why-does-my-security-scanner-show-that-an-image-has-cves).

We maintain the https://github.com/docker-library/repo-info repository which is an attempt to track the history of tag changes over time (from which you can find the digest of older versions of tags). For the tag you referenced (neo4j:4.2.8), you can see the current digest (and other image details) at https://github.com/docker-library/repo-info/blob/master/repos/neo4j/remote/4.2.8.md, and can see a history of tag updates via https://github.com/docker-library/repo-info/commits/master/repos/neo4j/remote/4.2.8.md. For a specific example near the timeframe you mentioned (Jul 21, 2021), see docker-library/repo-info@baae92f#diff-783afd88b2eb370f931b1e25128bae7f7f073c492c7ef099903b215b39dc3389:

$ docker pull neo4j@sha256:9c2f5f60c09442be8b51fceab0df9704bbee50ff534a939fe9d369270ff32b5e
$ docker pull neo4j@sha256:f41acf08483b7e59cf76d7faabd90d7cae01bd6d606a2b35b46183295957a3f5

In Docker (and container registries in general), the image digest is "content addressable" -- that is to say, it's designed for the use case you describe. If I pull bash:latest@sha256:b3abe4255706618c550e8db5ec0875328333a14dbf663e6f1e2b6875f45521e5 today and five months from now, I will get exactly the same bits that correspond to that sha256 hash.
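The content-addressing idea can be illustrated with a minimal Python sketch. It assumes the digest is the SHA-256 of the manifest bytes; the two manifest byte strings and the helper names `digest_of` and `verify` are hypothetical stand-ins for illustration, not real neo4j manifests or any registry API.

```python
import hashlib


def digest_of(content: bytes) -> str:
    """Return a registry-style digest string ("sha256:<hex>") for the given bytes."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def verify(content: bytes, expected_digest: str) -> bool:
    """Check that the bytes received match the digest that was pinned."""
    return digest_of(content) == expected_digest


# Hypothetical manifests: same application version, different base OS layer.
manifest_before_rebuild = b'{"layers": ["base-os-old", "app-4.2.8"]}'
manifest_after_rebuild = b'{"layers": ["base-os-new", "app-4.2.8"]}'

# Pin the digest of the image that is known to work.
pinned = digest_of(manifest_before_rebuild)

print(verify(manifest_before_rebuild, pinned))  # True
print(verify(manifest_after_rebuild, pinned))   # False
```

Any change to the content, such as a new base layer, produces a different digest, so a reference by digest cannot silently start pointing at a rebuilt image the way a tag can.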

There are also tools designed to help maintain those digests in your Dockerfiles/deployment YAML files (and automatically update them over time); however, I am not familiar with them, so I'll leave finding them as an exercise for the reader. 😅
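As a rough idea of what the pinning half of such a tool might do, here is a minimal Python sketch; it is an illustration, not the behavior of any particular tool. The function name `pin_dockerfile` and the sample Dockerfile are hypothetical. The digest in the mapping is one of the two quoted earlier in this comment, used only as an example value; a real tool would look digests up from a registry instead.

```python
import re
from typing import Mapping

# Matches "FROM image[:tag]" with an optional "AS name" suffix.
# References that already contain "@" (already pinned) do not match.
# Limitation: FROM lines with flags such as --platform are left untouched.
FROM_RE = re.compile(
    r"^(FROM\s+)(?P<ref>[^\s@]+)(?P<rest>(\s+AS\s+\S+)?\s*)$",
    re.IGNORECASE,
)


def pin_dockerfile(text: str, digests: Mapping[str, str]) -> str:
    """Append "@<digest>" to every FROM reference found in `digests`."""
    pinned_lines = []
    for line in text.splitlines():
        match = FROM_RE.match(line)
        if match and match.group("ref") in digests:
            ref = match.group("ref")
            line = f"{match.group(1)}{ref}@{digests[ref]}{match.group('rest')}"
        pinned_lines.append(line)
    return "\n".join(pinned_lines)


# Example mapping (assumption: supplied by hand rather than fetched).
known_digests = {
    "neo4j:4.2.8": "sha256:9c2f5f60c09442be8b51fceab0df9704bbee50ff534a939fe9d369270ff32b5e",
}

dockerfile = "FROM neo4j:4.2.8 AS db\nEXPOSE 7474"
print(pin_dockerfile(dockerfile, known_digests))
```

Keeping the tag next to the digest leaves the file readable for humans, while the digest is what determines which image is actually pulled.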

tianon pinned this issue Apr 22, 2022
@tianon
Member

tianon commented May 10, 2022

I'm going to close this (as it's resolved), but I do plan to leave it pinned since I think it's useful clarification. 🙂
