Add content-based instruction cache #58
Comments
Most of this was added in #60 but I'll leave this open to track the content-based cache keys.
alexcb added a commit to alexcb/buildkit that referenced this issue on Nov 15, 2021: send logrus logs to stderr
goller added a commit to goller/buildkit that referenced this issue on Oct 20, 2023: feat(sbom): return SBOMs if present
I'll tackle this next. Writing down some thoughts.
One of the problems with docker's instruction cache is that it only defines a cache between two steps. You have to solve the definition up to a certain point before you can see whether the next step may be cached.
Buildkit should attempt to find all the cache keys as early as possible. You don't have to solve the whole graph or all branches to find out that a vertex's data has been cached. For example, when two branches are merged together, you don't need the data for the original branches to verify that you have the cache for the merged part, as long as you can verify that the sources and the graph definition have not changed.
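A minimal sketch of that idea, assuming a hypothetical `Vertex` type and `DefinitionKey` function (neither is BuildKit's actual API): the key for a merged vertex is derived from the graph definition and the source keys alone, so a cache lookup can happen before any branch is executed.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Vertex is a hypothetical build-graph node used only for illustration.
type Vertex struct {
	Meta      string    // the operation's definition, e.g. a command line
	SourceKey string    // set only for sources: a key known without solving
	Inputs    []*Vertex // upstream vertices
}

// DefinitionKey hashes the vertex definition together with the keys of its
// inputs. Nothing is executed or downloaded to compute it.
func DefinitionKey(v *Vertex) string {
	h := sha256.New()
	h.Write([]byte(v.Meta))
	h.Write([]byte{0})
	h.Write([]byte(v.SourceKey))
	for _, in := range v.Inputs {
		h.Write([]byte{0})
		h.Write([]byte(DefinitionKey(in)))
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	image := &Vertex{Meta: "image", SourceKey: "chainid-1"}
	repo := &Vertex{Meta: "git", SourceKey: "commit-1"}
	left := &Vertex{Meta: "exec: make deps", Inputs: []*Vertex{image}}
	right := &Vertex{Meta: "exec: make gen", Inputs: []*Vertex{repo}}
	merged := &Vertex{Meta: "exec: make all", Inputs: []*Vertex{left, right}}

	// Pretend an earlier build stored a result under this key.
	cache := map[string]bool{DefinitionKey(merged): true}

	_, hit := cache[DefinitionKey(merged)]
	fmt.Println("cache hit:", hit) // cache hit: true

	// A new commit changes the key of everything downstream of the source.
	repo.SourceKey = "commit-2"
	_, hit = cache[DefinitionKey(merged)]
	fmt.Println("cache hit:", hit) // cache hit: false
}
```

The lookup for `merged` never touches the data of `left` or `right`; it only needs the source keys and the definitions to be unchanged.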
Another difference is that vertexes should have multiple cache keys. For example, `COPY` should be cached by both definition and source content. In `docker build`, `COPY` is keyed only by content while other commands use only the meta definition. This is because in `docker build` there is no unique cache key for the root of the context source. Also, cache keys by content should never need to be recalculated, even with the `--no-cache` option.
Some definitions for the cache keys:
Image source: ChainID
Git source: commit-sha
Local file source: session-id
Exec: meta+cachekey of inputs, possibly meta+cachekey of input contents
Copy: meta+cachekey of inputs, meta+cachekey of input contents
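To illustrate the "Copy" entry in the list above, here is a rough sketch of a vertex that is stored under two keys, one by definition and one by content. The `Cache` type, `copyKeys`, and the key layout are assumptions made for this example, not BuildKit's implementation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digest hashes its parts with a separator so they cannot run together.
func digest(parts ...string) string {
	h := sha256.New()
	for _, p := range parts {
		h.Write([]byte(p))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

// Cache is a hypothetical store that maps any number of keys to one result.
type Cache struct{ results map[string]string }

func NewCache() *Cache { return &Cache{results: map[string]string{}} }

// Store records the same result under every key given.
func (c *Cache) Store(result string, keys ...string) {
	for _, k := range keys {
		c.results[k] = result
	}
}

// Lookup returns the first result found under any of the keys.
func (c *Cache) Lookup(keys ...string) (string, bool) {
	for _, k := range keys {
		if r, ok := c.results[k]; ok {
			return r, true
		}
	}
	return "", false
}

// copyKeys builds both keys for a copy step: meta + cache key of the input,
// and meta + digest of the input contents.
func copyKeys(meta, inputKey, inputContent string) (defKey, contentKey string) {
	defKey = digest("def", meta, inputKey)
	contentKey = digest("content", meta, digest(inputContent))
	return defKey, contentKey
}

func main() {
	cache := NewCache()
	meta := "COPY . /src"
	files := "main.go:package main"

	// First build: the local source is keyed by its session ID.
	def1, content1 := copyKeys(meta, "session-1", files)
	cache.Store("layer-A", def1, content1)

	// Second build: new session ID, identical files.
	def2, content2 := copyKeys(meta, "session-2", files)
	_, defHit := cache.Lookup(def2)
	result, anyHit := cache.Lookup(def2, content2)
	fmt.Println(defHit, result, anyHit) // false layer-A true
}
```

The definition key misses because the session ID changed, but the content key still matches, so the earlier result can be reused.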
A complication is that keys based on contents can't be found until the input has been solved. In the case of sources, the cache key can usually be found without fully downloading the source data. The source interface would need to be updated to add an extra method for that.
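One possible shape for such a source interface, as a sketch only: the `Source` interface, its `CacheKey` and `Fetch` methods, and the `resolve` helper are hypothetical names chosen for this example and do not describe an existing API.

```go
package main

import "fmt"

// Source is a hypothetical interface with a cheap key lookup next to the
// expensive download.
type Source interface {
	// CacheKey returns a key without fetching the full source data.
	CacheKey() (string, error)
	// Fetch downloads the source data.
	Fetch() (string, error)
}

// gitSource keys itself by commit SHA, which is known before any checkout.
type gitSource struct {
	commit  string
	fetches int // counts how often Fetch actually ran
}

func (g *gitSource) CacheKey() (string, error) {
	return "git:" + g.commit, nil
}

func (g *gitSource) Fetch() (string, error) {
	g.fetches++
	return "checkout of " + g.commit, nil
}

// resolve asks for the key first and only fetches on a cache miss.
func resolve(cache map[string]string, s Source) (string, error) {
	key, err := s.CacheKey()
	if err != nil {
		return "", err
	}
	if data, ok := cache[key]; ok {
		return data, nil
	}
	data, err := s.Fetch()
	if err != nil {
		return "", err
	}
	cache[key] = data
	return data, nil
}

func main() {
	src := &gitSource{commit: "abc123"}
	cache := map[string]string{}

	for i := 0; i < 2; i++ {
		if _, err := resolve(cache, src); err != nil {
			panic(err)
		}
	}
	fmt.Println("fetches:", src.fetches) // fetches: 1
}
```

The second `resolve` call is answered from the cache using only the key, so the source data is downloaded once.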
@AkihiroSuda