
--reproducible flag massively increases build time #1960

Open
roussidis opened this issue Mar 2, 2022 · 6 comments
Labels
area/performance · feature/reproducible-digest · kind/enhancement · priority/p2

Comments


roussidis commented Mar 2, 2022

I am building one large docker image and I am experiencing a somewhat weird behavior that I would like to address.

Host info:

  • Docker version: 20.10.12
  • kaniko version: 1.7.0

So we are talking about a large-ish image (~2.5GB), and when I build it with cache it finishes in ~1 min.
So far so good, but the problem I am facing is that each cached build still creates a new artifact with a new sha256.

I found out that the --reproducible flag exists for exactly this purpose. When I build the image with --reproducible it no longer creates a new artifact with a new sha256, but the build time increases from ~1 min to ~7 mins. That's a huge overhead imho and I would like to figure out why.
I captured logs (with trace verbosity) with the --reproducible flag enabled and disabled. The main difference found in the logs:

  • --reproducible enabled
time="2022-03-02T12:36:05Z" level=debug msg="mapping stage idx 0 to digest sha256: ..."
time="2022-03-02T12:36:05Z" level=debug msg="mapping digest sha256 to cachekey ..."
time="2022-03-02T12:40:47Z" level=info msg="Pushing image to ..."
  • --reproducible disabled
 time="2022-03-02T12:29:12Z" level=debug msg="mapping stage idx 0 to digest sha256: ..."
 time="2022-03-02T12:29:12Z" level=debug msg="mapping digest sha256 to cachekey ..."
 time="2022-03-02T12:29:12Z" level=info msg="Pushing image to ..."

If you look closely at the timestamps, with --reproducible enabled the gap between the stage/digest mapping and the push is ~5 minutes longer than with --reproducible disabled.

Why does this happen, and why does it add so much overhead to the build process?
Is there any other way to use the cache to build an image without creating a new artifact, the way docker does? --reproducible completely strips the timestamps, which makes the output of docker images unhelpful.

e.g. In the block below, when using --reproducible the CREATED column shows N/A, which is not the desired behavior.

REPOSITORY                           TAG               IMAGE ID        CREATED              SIZE
random-image                       latest             abcdefghijk         N/A                 2.68GB

Another issue: when --reproducible is used, running docker history on the resulting image produces output that is not useful at all

IMAGE           CREATED          CREATED BY        SIZE
<missing>      292 years ago                              364B
<missing>      292 years ago                              18.4kB
<missing>      292 years ago                              5.69MB
<missing>      292 years ago                              4.93kB
<missing>      292 years ago                              12.5kB
<missing>      292 years ago                              83.9kB
<missing>      292 years ago                              0B
<missing>      292 years ago                              222MB
<missing>      292 years ago                              780B
<missing>      292 years ago                              0B
<missing>      292 years ago                              5.85MB
<missing>      292 years ago                              5.1MB
<missing>      292 years ago                              58.5MB
<missing>      292 years ago                              127MB
<missing>      292 years ago                              198MB                                                 

As you can see, with the --reproducible flag all useful information is lost from docker history.
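
Side note, my own guess rather than anything confirmed in this thread: the odd "292 years ago" is probably docker saturating Go's time.Duration, whose int64 nanosecond range tops out at roughly 292 years, once the created timestamp has been stripped. A tiny check of that arithmetic:

package main

import (
	"fmt"
	"math"
	"time"
)

func main() {
	// The largest duration Go can represent: int64 nanoseconds.
	maxDur := time.Duration(math.MaxInt64)
	fmt.Printf("max time.Duration ≈ %.0f years\n", maxDur.Hours()/24/365) // prints ≈ 292 years
}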

tejal29 (Contributor) commented Mar 10, 2022

For your second question, see here: using the flag, all timestamps are stripped and no history is available.

Will have to dig deeper into the performance issue. Can you clarify your use case a little more?
What do you mean by "but the problem that I am facing is that this cached image is always creating a new artifact with a new sha256"?

roussidis (Author) commented

Thank you for your answer @tejal29 !

So the behavior of completely stripping the timestamps and the history is expected.
What I wanted was to build an image with kaniko using the cache and, if the cache was used, not create a new artifact in my registry. As it stands, even when the cache is used to create the image (so nothing has changed), kaniko creates a "new" image in the registry. That causes problems with subsequent caches and fills the registry with copies of the same image. Docker, for example, does not create a new artifact with a new sha256 when it builds an image from cache, but kaniko does.

The --reproducible flag is not preferred because of the build-time overhead and the stripped timestamps. We value those timestamps and the history of the image.


crisbal commented Jun 21, 2022

Hey, just to add some voice to the issue: I am also seeing super slow builds with --reproducible, up to 7-8 times slower.

I suspect (but have no proof) that it is because stripping the timestamp metadata after the image is built takes time: the image/layers need to be extracted, changed, and then repacked.

To be fair, I noticed I am using a very old version of Kaniko (v1.3.0); I will update it and see if it works better.
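
For reference, that suspicion matches how go-containerregistry's mutate.Canonical works (a later commit message in this thread notes kaniko uses it to implement --reproducible): every layer is decompressed, each tar entry is rewritten with zeroed timestamps, and the layer is recompressed. A minimal sketch of the relevant calls, not kaniko's actual code:

package repro

import (
	"time"

	v1 "github.com/google/go-containerregistry/pkg/v1"
	"github.com/google/go-containerregistry/pkg/v1/mutate"
)

// canonicalize strips all layer and config timestamps. This forces a full
// rewrite (decompress, retar, recompress) of every layer in the image, which
// is where the extra build time goes.
func canonicalize(img v1.Image) (v1.Image, error) {
	return mutate.Canonical(img)
}

// withFixedTime does the same per-layer rewriting but lets the caller choose
// the timestamp instead of zeroing it.
func withFixedTime(img v1.Image, t time.Time) (v1.Image, error) {
	return mutate.Time(img, t)
}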


crisbal commented Jun 21, 2022

Ok, I have tested it and it is still very slow with Kaniko v1.8.1.

Let me show an example:

# > cat Dockerfile
FROM quay.io/pypa/manylinux2014_x86_64
ENV FOO=BAR

Here with --reproducible (it takes 1 minute and 10 seconds)

# > /kaniko/executor --no-push --dockerfile Dockerfile --log-timestamp --destination tmp --tarPath ./tmp.tar --reproducible
INFO[2022-06-21T09:47:29Z] Retrieving image manifest quay.io/pypa/manylinux2014_x86_64 
INFO[2022-06-21T09:47:29Z] Retrieving image quay.io/pypa/manylinux2014_x86_64 from registry quay.io 
INFO[2022-06-21T09:47:30Z] Built cross stage deps: map[]                
INFO[2022-06-21T09:47:30Z] Retrieving image manifest quay.io/pypa/manylinux2014_x86_64 
INFO[2022-06-21T09:47:30Z] Returning cached image manifest              
INFO[2022-06-21T09:47:30Z] Executing 0 build triggers                   
INFO[2022-06-21T09:47:30Z] Skipping unpacking as no commands require it. 
INFO[2022-06-21T09:47:30Z] ENV FOO=BAR 
INFO[2022-06-21T09:48:39Z] Skipping push to container registry due to --no-push flag 

Here without, it takes 8 seconds

/kaniko/executor --no-push --dockerfile Dockerfile --log-timestamp --destination tmp --tarPath ./tmp.tar
INFO[2022-06-21T09:49:06Z] Retrieving image manifest quay.io/pypa/manylinux2014_x86_64 
INFO[2022-06-21T09:49:06Z] Retrieving image quay.io/pypa/manylinux2014_x86_64 from registry quay.io 
INFO[2022-06-21T09:49:07Z] Built cross stage deps: map[]                
INFO[2022-06-21T09:49:07Z] Retrieving image manifest quay.io/pypa/manylinux2014_x86_64 
INFO[2022-06-21T09:49:07Z] Returning cached image manifest              
INFO[2022-06-21T09:49:07Z] Executing 0 build triggers                   
INFO[2022-06-21T09:49:07Z] Skipping unpacking as no commands require it. 
INFO[2022-06-21T09:49:07Z] ENV FOO=BAR                                  
INFO[2022-06-21T09:49:14Z] Skipping push to container registry due to --no-push flag 

zx96 added a commit to zx96/kaniko that referenced this issue Jan 2, 2023
This change adds a new flag to zero timestamps in layer tarballs without
making a fully reproducible image.

My use case for this is maintaining a large image with build tooling.
I have a multi-stage Dockerfile that generates an image containing
several toolchains for cross-compilation, with each toolchain being
prepared in a separate stage before being COPY'd into the final image.
This is a very large image, and while it's incredibly convenient for
development, making a change as simple as adding one new tool tends to
invalidate caches and force the devs to download another 10+ GB image.

If timestamps were removed from each layer, these images would be mostly
unchanged with each minor update, greatly reducing disk space needed for
keeping old versions around and time spent downloading updated images.

I wanted to use Kaniko's --reproducible flag to help with this, but ran
into issues with memory consumption (GoogleContainerTools#862) and build time (GoogleContainerTools#1960).
Additionally, I didn't really care about reproducibility - I mainly
cared about the layers having identical contents so Docker could skip
pulling and storing redundant layers from a registry.

This solution works around these problems by stripping out timestamps as
the layer tarballs are built. It removes the need for a separate
postprocessing step, and preserves image metadata so we can still see
when the image itself was built.

An alternative solution would be to use mutate.Time much like Kaniko
currently uses mutate.Canonical to implement --reproducible, but that
would not be a satisfactory solution for me until
[issue 1168](google/go-containerregistry#1168)
is addressed by go-containerregistry. Given my lack of Go experience, I
don't feel comfortable tackling that myself, and this seems like a
simple and useful workaround in the meantime.

As a bonus, I believe that this change also fixes GoogleContainerTools#2005 (though that
should really be addressed in go-containerregistry itself).
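
The core of the approach described above, zeroing timestamps while the layer tarball is being written so no separate post-processing pass over finished layers is needed, might look roughly like this (an illustrative sketch with a hypothetical helper name, not the actual patch):

package layer

import (
	"archive/tar"
	"io"
	"time"
)

// writeEntryWithZeroTime (hypothetical helper) copies one file into the layer
// tarball with its timestamps forced to the Unix epoch, so the layer digest no
// longer changes just because files were rebuilt at a different time.
func writeEntryWithZeroTime(tw *tar.Writer, hdr *tar.Header, contents io.Reader) error {
	hdr.ModTime = time.Unix(0, 0)
	hdr.AccessTime = time.Time{}
	hdr.ChangeTime = time.Time{}
	if err := tw.WriteHeader(hdr); err != nil {
		return err
	}
	if hdr.Typeflag == tar.TypeReg {
		if _, err := io.Copy(tw, contents); err != nil {
			return err
		}
	}
	return nil
}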
@aaron-prindle added the feature/reproducible-digest, priority/p2, area/performance, and kind/enhancement labels on May 30, 2023

philippe-granet commented Dec 23, 2023

With kaniko-project/executor:v1.19.2-debug, building the same image:

  • with the --reproducible flag, the build took 5m36s and used 7690 MB
  • without the --reproducible flag, the build took 1m38s and used 350 MB

Activating profiling (https://github.com/GoogleContainerTools/kaniko#kaniko-builds---profiling), I see a lot of inflate/deflate traces with the --reproducible flag:
kaniko.zip

@cniessigma

Can confirm that I have this same issue

bh-tt added a commit to bh-tt/kaniko that referenced this issue Oct 22, 2024
Fixes GoogleContainerTools#862, may mitigate GoogleContainerTools#1960

The layerTime function returns a layer backed by an in-memory buffer, which means that during
the stripping of timestamps the entire image is loaded into memory. Additionally, it runs
gzip after the layer is created, resulting in an even larger in-memory blob.

This commit changes the method to use a temporary file instead of an in-memory buffer,
and to apply gzip compression while writing the layer file, instead of compressing during read.

Signed-off-by: bh <[email protected]>
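
A rough sketch of what the commit describes (assumed names, not the actual patch): stream the rewritten layer into a gzip-compressed temporary file as it is produced, rather than holding the whole uncompressed layer in memory and compressing it on read.

package layer

import (
	"compress/gzip"
	"io"
	"os"
)

// writeLayerToTempFile (hypothetical helper) copies the already
// timestamp-stripped layer content to a temporary file, compressing while
// writing, and returns the file path. Peak memory stays bounded by the copy
// buffer instead of the full layer size.
func writeLayerToTempFile(uncompressed io.Reader) (string, error) {
	f, err := os.CreateTemp("", "kaniko-layer-*.tar.gz")
	if err != nil {
		return "", err
	}
	defer f.Close()

	zw := gzip.NewWriter(f)
	if _, err := io.Copy(zw, uncompressed); err != nil {
		return "", err
	}
	if err := zw.Close(); err != nil {
		return "", err
	}
	return f.Name(), nil
}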