Using --reproducible loads entire image into memory #862
Comments
I should probably mention that this means we cannot use --reproducible for our builds: the containers running the build are being killed by k8s on our dev cluster, and increasing the node RAM capacity is not a cost-effective solution.
We are having this issue as well. We need to use reproducible builds so that our daily builds do not get added to our container registry if there is no "real" change to the docker image.
This change adds a new flag to zero timestamps in layer tarballs without making a fully reproducible image.

My use case for this is maintaining a large image with build tooling. I have a multi-stage Dockerfile that generates an image containing several toolchains for cross-compilation, with each toolchain being prepared in a separate stage before being COPY'd into the final image. This is a very large image, and while it's incredibly convenient for development, making a change as simple as adding one new tool tends to invalidate caches and force the devs to download another 10+ GB image. If timestamps were removed from each layer, these images would be mostly unchanged with each minor update, greatly reducing the disk space needed for keeping old versions around and the time spent downloading updated images.

I wanted to use Kaniko's --reproducible flag to help with this, but ran into issues with memory consumption (GoogleContainerTools#862) and build time (GoogleContainerTools#1960). Additionally, I didn't really care about reproducibility - I mainly cared about the layers having identical contents so Docker could skip pulling and storing redundant layers from a registry.

This solution works around these problems by stripping out timestamps as the layer tarballs are built. It removes the need for a separate postprocessing step, and preserves image metadata so we can still see when the image itself was built. An alternative solution would be to use mutate.Time much like Kaniko currently uses mutate.Canonical to implement --reproducible, but that would not be a satisfactory solution for me until [issue 1168](google/go-containerregistry#1168) is addressed by go-containerregistry. Given my lack of Go experience, I don't feel comfortable tackling that myself, and this seems like a simple and useful workaround in the meantime.

As a bonus, I believe that this change also fixes GoogleContainerTools#2005 (though that should really be addressed in go-containerregistry itself).
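For reference, the two go-containerregistry entry points mentioned above are used roughly as in the minimal sketch below. This is illustrative only, not kaniko's actual code; the stripTimestamps wrapper is a hypothetical helper.

```go
package reproducible

import (
	"time"

	v1 "github.com/google/go-containerregistry/pkg/v1"
	"github.com/google/go-containerregistry/pkg/v1/mutate"
)

// stripTimestamps (hypothetical helper) shows the two entry points
// discussed above. Both rewrite every layer of the image, which is
// where the memory and build-time cost in #862 / #1960 comes from.
func stripTimestamps(img v1.Image, fullyReproducible bool) (v1.Image, error) {
	if fullyReproducible {
		// Roughly what --reproducible does: strip timestamps and other
		// sources of nondeterminism from the layers and the config.
		return mutate.Canonical(img)
	}
	// The alternative mentioned above: only reset times to the epoch,
	// leaving the rest of the image metadata intact.
	return mutate.Time(img, time.Time{})
}
```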
@zx96 do you plan to create a PR for the linked commit? It would save many people a lot of frustration if our image builds stopped crashing at the end of a build due to being killed.
I certainly can. I was planning on merging it shortly after I made the branch, but ran into another bug (that also broke my use case) and got frustrated. I'll rebase my branch tomorrow and see if the other issue is still around - I can't recall what issue number it was.
With kaniko-project/executor:v1.19.2-debug, building the same image:
Activating profiling (https://github.com/GoogleContainerTools/kaniko#kaniko-builds---profiling), I see a lot of inflate/deflate traces.
Still causes problems.
Enables caching from the qemu-project repository. Uses a dedicated "$NAME-cache" tag for caching, to address limitations. See issue "when using --cache=true, kaniko fail to push cache layer [...]": GoogleContainerTools/kaniko#1459

Does not specify a context since no Dockerfile is using COPY or ADD instructions.

Does not enable reproducible builds as that results in builds failing with an out of memory error. See issue "Using --reproducible loads entire image into memory": GoogleContainerTools/kaniko#862

Previous attempts, for the record:
- Alex Bennée: https://lore.kernel.org/qemu-devel/[email protected]/
- Camilla Conte (me): https://lore.kernel.org/qemu-devel/[email protected]/

Signed-off-by: Camilla Conte <[email protected]>
Message-Id: <[email protected]>
To make the code causing the problem a bit clearer:
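A sketch of the shape of go-containerregistry's layerTime that this refers to, as described in the fix commit further down - illustrative, not the verbatim upstream code:

```go
package reproducible

import (
	"archive/tar"
	"bytes"
	"io"
	"time"

	v1 "github.com/google/go-containerregistry/pkg/v1"
	"github.com/google/go-containerregistry/pkg/v1/tarball"
)

// layerTimeSketch mirrors the problematic pattern: the rewritten layer
// is backed entirely by an in-memory buffer, so stripping timestamps
// from an N GB layer needs roughly N GB of RAM (plus the cost of
// compressing that buffer again when the layer is read back).
func layerTimeSketch(layer v1.Layer, t time.Time) (v1.Layer, error) {
	rc, err := layer.Uncompressed()
	if err != nil {
		return nil, err
	}
	defer rc.Close()

	var buf bytes.Buffer // the entire rewritten layer is held here
	tw := tar.NewWriter(&buf)
	tr := tar.NewReader(rc)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		hdr.ModTime = t // reset the timestamp on every entry
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := io.Copy(tw, tr); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}

	// Back the new layer with the in-memory bytes; compression happens
	// later, on read, producing yet another large in-memory pass.
	b := buf.Bytes()
	return tarball.LayerFromOpener(func() (io.ReadCloser, error) {
		return io.NopCloser(bytes.NewReader(b)), nil
	})
}
```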
There are several options:
I'm having some trouble running kaniko in my dev setup, so it is hard to see the gains of options 2-4, but until metadata is stripped out during the build I think implementing them could save a lot of memory (at the cost of disk usage) and result in a small performance gain.
Fixes GoogleContainerTools#862, may mitigate GoogleContainerTools#1960

The layerTime function returns a layer backed by an in-memory buffer, which means that during the stripping of timestamps the entire image is loaded into memory. Additionally, it runs gzip after the layer is created, resulting in an even larger in-memory blob.

This commit changes the method to use a temporary file instead of an in-memory buffer, and to apply gzip compression while writing to this layer file, instead of compressing during read.

Signed-off-by: bh <[email protected]>
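A rough sketch of the approach the commit describes - a temporary file plus gzip applied while writing - assuming the resulting file is handed to go-containerregistry as an already-compressed tarball; the function name is illustrative, not the exact patch:

```go
package reproducible

import (
	"archive/tar"
	"compress/gzip"
	"io"
	"os"
	"time"

	v1 "github.com/google/go-containerregistry/pkg/v1"
	"github.com/google/go-containerregistry/pkg/v1/tarball"
)

// layerTimeToFile streams the rewritten layer into a gzip-compressed
// temporary file instead of an in-memory buffer. Memory use stays
// roughly constant regardless of layer size; the cost moves to
// temporary disk space, and no second compression pass is needed on
// read because the file is already gzipped.
func layerTimeToFile(layer v1.Layer, t time.Time) (v1.Layer, error) {
	rc, err := layer.Uncompressed()
	if err != nil {
		return nil, err
	}
	defer rc.Close()

	f, err := os.CreateTemp("", "layer-*.tar.gz")
	if err != nil {
		return nil, err
	}
	defer f.Close()

	gz := gzip.NewWriter(f)
	tw := tar.NewWriter(gz)
	tr := tar.NewReader(rc)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		hdr.ModTime = t // reset the timestamp on every entry
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := io.Copy(tw, tr); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	if err := gz.Close(); err != nil {
		return nil, err
	}

	// LayerFromFile detects the gzip-compressed tarball on disk; the
	// caller is responsible for removing the temp file after the push.
	return tarball.LayerFromFile(f.Name())
}
```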
I've found a 5th option: submitting a PR to go-containerregistry making the layerTime function a lazy transformation. Since the problem lies in a dependency, that may be the best way to achieve this result.
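One way to sketch such a lazy transformation (an assumption about the shape it could take, not an actual go-containerregistry API) is an opener that rewrites the tar stream on the fly through a pipe, so the full layer is never materialized in memory:

```go
package reproducible

import (
	"archive/tar"
	"io"
	"time"

	v1 "github.com/google/go-containerregistry/pkg/v1"
	"github.com/google/go-containerregistry/pkg/v1/tarball"
)

// lazyLayerTime wraps a layer with an opener that rewrites timestamps
// while the tar stream is being read, via an io.Pipe. Each time the
// layer is opened, the rewrite streams through a small pipe buffer, so
// no full copy of the layer is held in memory. (Whether digests are
// still computed eagerly depends on how the tarball package is
// configured; this only addresses the buffering.)
func lazyLayerTime(layer v1.Layer, t time.Time) (v1.Layer, error) {
	opener := func() (io.ReadCloser, error) {
		rc, err := layer.Uncompressed()
		if err != nil {
			return nil, err
		}
		pr, pw := io.Pipe()
		go func() {
			defer rc.Close()
			tr := tar.NewReader(rc)
			tw := tar.NewWriter(pw)
			for {
				hdr, err := tr.Next()
				if err == io.EOF {
					break
				}
				if err != nil {
					pw.CloseWithError(err)
					return
				}
				hdr.ModTime = t // reset the timestamp on every entry
				if err := tw.WriteHeader(hdr); err != nil {
					pw.CloseWithError(err)
					return
				}
				if _, err := io.Copy(tw, tr); err != nil {
					pw.CloseWithError(err)
					return
				}
			}
			// Propagate any error from finalizing the tar stream;
			// a nil error simply closes the pipe normally.
			pw.CloseWithError(tw.Close())
		}()
		return pr, nil
	}
	return tarball.LayerFromOpener(opener)
}
```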
Actual behavior
When running with --reproducible=true, the entire image is loaded into memory.
Expected behavior
The memory profile should remain stable regardless of the size of the image being built.
To Reproduce
Steps to reproduce the behavior:
Additional Information
Triage Notes for the Maintainers
--cache flag