Add content-based instruction cache #58

tonistiigi · 2017-07-06T05:30:05Z

I'll tackle this next. Writing down some thoughts.

One problem with Docker's instruction cache is that it only defines a cache between two steps. You have to solve the definition up to a certain point before you can see whether the next step might be cached.

BuildKit should attempt to find all the cache keys as early as possible. You don't have to solve the whole graph or all branches to find out that a vertex's data has been cached. For example, when two branches are merged, you don't need the data for the original branches to verify that you have the cache for the merged part, as long as you can verify that the sources and the graph definition have not changed.
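A minimal sketch of the idea, assuming a simplified graph model: the `Vertex` type and `DefinitionKey` function below are hypothetical illustrations, not BuildKit's actual types. The key of a merged vertex is derived only from the definitions and the source identifiers beneath it, so no branch has to be executed to compute it.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Vertex is a hypothetical build-graph node; it is not BuildKit's actual type.
type Vertex struct {
	Meta   string    // operation definition, or a source identifier such as a commit SHA
	Inputs []*Vertex // vertexes whose outputs this vertex consumes
}

// DefinitionKey derives a cache key from the vertex definition and the keys of
// its inputs. Nothing is executed: only the graph shape and the source
// identifiers are hashed. Shared inputs are rehashed on every visit; a real
// implementation would memoize.
func DefinitionKey(v *Vertex) string {
	h := sha256.New()
	h.Write([]byte(v.Meta))
	for _, in := range v.Inputs {
		h.Write([]byte{0}) // separator so adjacent fields cannot run together
		h.Write([]byte(DefinitionKey(in)))
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	git := &Vertex{Meta: "git:commit-sha-1"}
	img := &Vertex{Meta: "image:chainid-1"}
	build := &Vertex{Meta: "exec:make", Inputs: []*Vertex{git, img}}
	// The merged vertex combines two branches; its key needs no branch data.
	merged := &Vertex{Meta: "copy:/out", Inputs: []*Vertex{build, img}}
	fmt.Println(DefinitionKey(merged))
}
```

If any source identifier or definition in the graph changes, the key of every vertex that depends on it changes as well, which is what lets a cache hit on the merged vertex skip both branches.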

Another difference is that vertexes should have multiple cache keys. For example, COPY should be cached by both definition and source content. In docker build, COPY is keyed only on content, while other commands use only the meta definition. This is because docker build has no unique cache key for the root of the context source. Also, cache keys by content should never need to be recalculated, even with the --no-cache option.

Some definitions for the cache keys:

- Image source: ChainID
- Git source: commit-sha
- Local file source: session-id
- Exec: meta + cachekey of inputs, possibly meta + cachekey of input contents
- Copy: meta + cachekey of inputs, meta + cachekey of input contents
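To illustrate why a copy step benefits from two keys, here is a rough sketch under assumed names: `Cache`, `CopyKeys`, and `digest` are hypothetical and not part of BuildKit. A local source keyed by session ID gets a new definition key on every build, but the content key still matches when the files are unchanged.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digest hashes its parts with a separator; a stand-in for a real key scheme.
func digest(parts ...string) string {
	h := sha256.New()
	for _, p := range parts {
		h.Write([]byte(p))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

// Cache maps any of a vertex's keys to a stored result (hypothetical type).
type Cache struct {
	results map[string]string
}

func NewCache() *Cache { return &Cache{results: map[string]string{}} }

// Store records the result under every key the vertex has.
func (c *Cache) Store(result string, keys ...string) {
	for _, k := range keys {
		c.results[k] = result
	}
}

// Lookup returns the first result found under any of the given keys.
func (c *Cache) Lookup(keys ...string) (string, bool) {
	for _, k := range keys {
		if r, ok := c.results[k]; ok {
			return r, true
		}
	}
	return "", false
}

// CopyKeys returns the two keys for a copy step: one from the input's cache
// key (known before solving) and one from the input's content checksum
// (known only after the input has been solved).
func CopyKeys(meta, inputKey, inputContent string) (defKey, contentKey string) {
	return digest("def", meta, inputKey), digest("content", meta, inputContent)
}

func main() {
	cache := NewCache()
	// First build: local source identified by session "s1".
	d1, c1 := CopyKeys("COPY . /app", "session:s1", "sha256:abc")
	cache.Store("layer-1", d1, c1)

	// Second build: new session, same file contents.
	d2, c2 := CopyKeys("COPY . /app", "session:s2", "sha256:abc")
	_, defHit := cache.Lookup(d2)
	result, anyHit := cache.Lookup(d2, c2)
	fmt.Println(defHit, anyHit, result)
}
```

In the second build the definition key misses because the session changed, while the content key hits and returns the stored result.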

A complication is that keys based on contents can't be found until the input has been solved. For sources, the cache key can usually be found without fully downloading the source data. The source interface would need to be updated with an extra method for that.
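One possible shape for such an interface, as a sketch only: the `Source` interface, its `CacheKey` and `Fetch` methods, and the `gitSource` type below are assumptions made for illustration and do not reflect the interface BuildKit ended up with.

```go
package main

import (
	"errors"
	"fmt"
)

// Source is a hypothetical interface, not BuildKit's actual API.
type Source interface {
	// CacheKey resolves an identifier for the source without downloading its data.
	CacheKey() (string, error)
	// Fetch downloads the source data; potentially expensive.
	Fetch() ([]byte, error)
}

// gitSource stands in for a Git source whose ref has been resolved to a commit.
type gitSource struct {
	remote  string
	commit  string
	fetched int // counts Fetch calls, for demonstration only
}

func (g *gitSource) CacheKey() (string, error) {
	if g.commit == "" {
		return "", errors.New("ref not resolved to a commit")
	}
	return "git:" + g.remote + "@" + g.commit, nil
}

func (g *gitSource) Fetch() ([]byte, error) {
	g.fetched++
	return []byte("contents of " + g.commit), nil
}

// Resolve returns cached data when the source's key is already known and
// fetches the source only on a miss.
func Resolve(s Source, cache map[string][]byte) ([]byte, error) {
	key, err := s.CacheKey()
	if err != nil {
		return nil, err
	}
	if data, ok := cache[key]; ok {
		return data, nil
	}
	data, err := s.Fetch()
	if err != nil {
		return nil, err
	}
	cache[key] = data
	return data, nil
}

func main() {
	cache := map[string][]byte{}
	src := &gitSource{remote: "example.com/repo.git", commit: "0123abcd"}
	for i := 0; i < 2; i++ {
		if _, err := Resolve(src, cache); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println("fetches:", src.fetched)
}
```

Separating the key lookup from the fetch means the solver can check the cache for a source, and for everything downstream of it, before paying for a download.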

@AkihiroSuda

tonistiigi · 2017-07-07T17:31:04Z

Most of this was added in #60 but I'll leave this open to track the content-based cache keys.

tonistiigi self-assigned this Jul 6, 2017

AkihiroSuda mentioned this issue Jul 7, 2017

RFC: Distributed BuildKit (Swarm/Kubernetes/Mesos..) #62

Closed

tonistiigi removed their assignment Jul 7, 2017

tonistiigi changed the title ~~Add instruction cache~~ Add content-based instruction cache Jul 7, 2017

tonistiigi mentioned this issue Aug 3, 2017

solver: implement content based cache support #91

Merged

tonistiigi closed this as completed in #91 Aug 4, 2017

tonistiigi mentioned this issue Aug 10, 2017

Proposal: Use graph search to find valid chains of relevant docker build cache hits moby/moby#20304

Closed

alexcb added a commit to alexcb/buildkit that referenced this issue Nov 15, 2021

Merge pull request moby#58 from earthly/acb/logrus-stderr

9027211

send logrus logs to stderr

goller added a commit to goller/buildkit that referenced this issue Oct 20, 2023

Merge pull request moby#58 from depot/feat/sboms

af04b49

feat(sbom): return SBOMs if present
