-
Notifications
You must be signed in to change notification settings - Fork 624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure all layers are here when tagging, committing, saving, or converting #3435
Conversation
8c55dbd
to
82daf62
Compare
pkg/cmd/container/commit.go
Outdated
@@ -57,7 +57,7 @@ func Commit(ctx context.Context, client *containerd.Client, rawRef string, req s | |||
if found.MatchCount > 1 { | |||
return fmt.Errorf("multiple IDs found with provided prefix: %s", found.Req) | |||
} | |||
imageID, err := commit.Commit(ctx, client, found.Container, opts) | |||
imageID, err := commit.Commit(ctx, client, found.Container, opts, options.GOptions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be better to add GOptions to the commit opts struct, but this is good enough short term.
pkg/imgutil/imgutil.go
Outdated
} | ||
refDomain := distributionref.Domain(named) | ||
|
||
// Get a resolver |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should put that away somewhere with the login code refactor. Resolver is repetitive and error prone...
58a9dd3
to
6aa644c
Compare
ccdb533
to
d2905da
Compare
@AkihiroSuda can you poke the CI? |
pkg/cmd/image/tag.go
Outdated
err = imgutil.EnsureAllContent(ctx, client, srcName, img.Target, "", options.GOptions) | ||
if err != nil { | ||
return err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When tagging, do we need to return errors for failed pulls? This might not meet expectations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is a valid concern.
Then if at a later point we try to push it, it will fail…
Not sure how to deal with that…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not like any of the code involved in there.
The "single-platform tmp-reduced-platform image" gymnastic / convert intermediary step on push just smells fishy.
Then again, this is containerd design issues on display here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could introduce a new flag called ensure-image
for both the tag
and commit
commands.
When this flag is set, we would attempt to pull any missing layers. If any errors occur during this process, they would be returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing that is bugging me with the new flag approach is that it would leave the default behavior broken, which I don't think is right.
So, what about this: instead of hard error when the fetch fail, we just print a loud warning but we continue?
wdyt @lingdie ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest push implements this.
We no longer hard error, but warn the user.
PTAL at your convenience @lingdie
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
look good to me.
297233b
to
c64c494
Compare
d5e41e8
to
764a2aa
Compare
CI is green (save for the windows regression #3437). @AkihiroSuda This is how far I will go with this. I already hate this PR, and there are still glaring issues to fix in all that stuff, especially the whole Anyhow, I still believe this PR here will help a lot, as the issue being fixed has been one of our most pervasive and egregious. Credit for this fix should go to @lingdie who did the research and correctly identified the underlying containerd issue, then pointed me in the right direction. PTAL at your convenience. |
@@ -55,26 +52,45 @@ func TestKubeCommitPush(t *testing.T) { | |||
cmd := testutil.KubectlHelper(base, "get", "pods", tID, "-o", "jsonpath={ .status.containerStatuses[0].containerID }") | |||
cmd.Run() | |||
containerID = strings.TrimPrefix(cmd.Out(), "containerd://") | |||
|
|||
// This below is missing configuration to allow for plain http communication |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was not able / interested enough to finish setting up a cluster registry.
This here is a start, and can be completed later on, either by me or someone else.
} | ||
|
||
tearDown := func() { | ||
testutil.KubectlHelper(base, "delete", "pod", "-f", tID).Run() | ||
testutil.KubectlHelper(base, "delete", "pod", "--all").Run() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason that escapes me, kubectl was not cleaning up with the pod name
Err: "failed to create a tmp single-platform image", | ||
}) | ||
base.Cmd("commit", containerID, "testcommitsave").AssertOK() | ||
base.Cmd("save", "testcommitsave").AssertOK() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This correctly reproduce issue #827 I believe.
) | ||
|
||
func TestImageInspectContainsSomeStuff(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This turned out flaky as I was debugging this new code.
Rewriting against the new toolkit to fix that.
} | ||
|
||
func TestImageInspectDifferentValidReferencesForTheSameImage(t *testing.T) { | ||
testutil.DockerIncompatible(t) | ||
nerdtest.Setup() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as the other test.
@@ -125,7 +125,8 @@ func TestImagesFilterDangling(t *testing.T) { | |||
testutil.RequiresBuild(t) | |||
testutil.RegisterBuildCacheCleanup(t) | |||
base := testutil.NewBase(t) | |||
base.Cmd("images", "prune", "--all").AssertOK() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (obviously?) was doing nothing and leftovers from other tests were having an impact here.
Because we depend on build here, we cannot use a private namespace, so, making sure we clean house.
@@ -0,0 +1,133 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the underlying issue is impacting a lot of different functions, it did not feel right to scatter it over in different commands.
Suggesting instead to have a new /issues/ folder for these type of stuff (eg: regression testing for complex / multi-command issues)
@@ -23,6 +23,7 @@ readonly root | |||
|
|||
GO_VERSION=1.23 | |||
KIND_VERSION=v0.24.0 | |||
CNI_PLUGINS_VERSION=v1.5.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somehow when I came up with the Kube tooling, I did not test networking.
We (obviously?) do need cni here for nerdctl to work outside of kube itself.
@@ -31,7 +32,7 @@ import ( | |||
func Tag(ctx context.Context, client *containerd.Client, options types.ImageTagOptions) error { | |||
imageService := client.ImageService() | |||
var srcName string | |||
imagewalker := &imagewalker.ImageWalker{ | |||
walker := &imagewalker.ImageWalker{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not shadow the package name.
// Ensure all the layers are here: https://github.com/containerd/nerdctl/issues/3425 | ||
err = EnsureAllContent(ctx, client, srcName, options.GOptions) | ||
if err != nil { | ||
log.G(ctx).Warn("Unable to fetch missing layers before committing. " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On tag, we do not hard error if we fail to retrieve the missing layers.
This is to preserve the previous behavior, in case we are offline.
Warn the user instead, and point it to a special purpose ticket that provides explanations.
@@ -126,6 +127,13 @@ func Commit(ctx context.Context, client *containerd.Client, container containerd | |||
return emptyDigest, err | |||
} | |||
|
|||
// Ensure all the layers are here: https://github.com/containerd/nerdctl/issues/3425 | |||
err = image.EnsureAllContent(ctx, client, baseImg.Name(), globalOptions) | |||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as tag, we do not hard error on commit.
// Ensure all the layers are here: https://github.com/containerd/nerdctl/issues/3425 | ||
err = EnsureAllContent(ctx, client, srcRawRef, options.GOptions) | ||
if err != nil { | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Convert and Save DO hard error, since the operation cannot complete if there is missing content.
Added some inline comments to ease review. |
@AkihiroSuda with containerd rc5 just out - it would be nice to have this one in for the next nerdctl rc so that we can get it some soak time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
@apostasie we seem to run into this issue when using the GKE image streaming snapshotter. When we commit an image locally (e.g. with docker) and don't push it, the converter tries to pull the image from remote. Is this expected? Example
|
@ChrisBr Though, in this case, maybe we can relax to a soft error if the pull fails. Do you mind opening a new issue with these details? |
Thanks for looking into it, really appreciated. |
See #3425 and linked issues for context.
content digest sha256:xxx: not found
in CI #2327 (convert)invalid checksum digest format
) #2506 (tag)Unconfirmed (no test), but cannot reproduce anymore with the patch:
Unconfirmed, but probably fixed as well:
Reviewers, most of this is test wrangling: