-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Storage can remove ErrNotAContainer as well #11787
Storage can remove ErrNotAContainer as well #11787
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: rhatdan The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@edsantiago Could you see if this fixes your race condition? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
55843f9
to
c23f516
Compare
Fixes: containers#11775 [NO TESTS NEEDED] No easy way to cause this problem in CI. Signed-off-by: Daniel J Walsh <[email protected]>
c23f516
to
c9ea2ca
Compare
Restarted my reproducer because I noticed you pushed new changes. FWIW, first attempt ran for 2-3 minutes without failure (prior to this, it would fail in a few seconds). Second reproducer (using c9ea2ca) running now, one minute with no failures. Looks good so far. |
15 minutes in, still no failures (other than CI, Docker-py and Test Bindings, which are so weird that I've just re-run on the assumption that they're both flakes). I guess this "fixes" the race, in the sense of sweeping under the rug. I will leave it to more knowledgeable devs to determine how safe this is and how likely it is to mask a real problem. Thank you for taking this on so quickly! |
@@ -184,8 +184,12 @@ func (r *storageService) DeleteContainer(idOrName string) error { | |||
} | |||
err = r.store.DeleteContainer(container.ID) | |||
if err != nil { | |||
logrus.Debugf("Failed to delete container %q: %v", container.ID, err) | |||
return err | |||
if errors.Cause(err) == storage.ErrNotAContainer || errors.Cause(err) == storage.ErrContainerUnknown { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does ErrNotAContainer mean? Is the ID present in c/storage, but not as a container? That points to a potential bug to me - why is that exact 256-bit random ID still in use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think at times, we attempt to remove a layer as opposed to a container, and call DeleteContainer with the id.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mheon Looking at the code https://github.com/containers/storage/blob/65e6ab3cdc1ea69c8a55039dc668ee3844255a92/store.go#L2449-L2532 it means that the container id does not exists, it does not mean that the id exists as image or layer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough.
LGTM |
/lgtm |
Fixes: #11775
[NO TESTS NEEDED] No easy way to cause this problem in CI.
Signed-off-by: Daniel J Walsh [email protected]
What this PR does / why we need it:
How to verify it
Which issue(s) this PR fixes:
Special notes for your reviewer: