Drop container does not exist on removal to debugf #10427
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: rhatdan
Fixes: #10423
Hold on please, I just saw:
$ while :;do ./bin/podman-remote run --rm alpine true;done
ERRO[0000] Error removing container 63170103bcc92cde7a8c3454f463def82e39196b3e588f75e183725f28433cf0: refusing to remove "63170103bcc92cde7a8c3454f463def82e39196b3e588f75e183725f28433cf0" as it exists in libpod as container 63170103bcc92cde7a8c3454f463def82e39196b3e588f75e183725f28433cf0: container already exists
Looks like podman/libpod/runtime_cstorage.go Line 97 in e48aa8c
@edsantiago Try again.
libpod/runtime_ctr.go
Outdated
@@ -645,10 +645,14 @@ func (r *Runtime) removeContainer(ctx context.Context, c *Container, force, remo
}
} else {
if err := r.state.RemoveContainer(c); err != nil {
if cleanupErr == nil {
cleanupErr = err
if err == define.ErrNoSuchCtr {
I think this is still worth an error. If we got this far, we did it holding the lock - there is 0 chance the container was removed from under us. If this happens something very strange and very bad is going on - someone is messing around in the DB without obeying our locks.
Ok some of this was just guessing. I will remove this check.
Actually, @mheon, @edsantiago reported in the original issue that he was getting:
WARN[0000] Container 6b4d7060b9466bd7c0e806758b3c2366227ff415784541882ea324c0bfb1b9e2 does not exist: container 6b4d7060b9466bd7c0e806758b3c2366227ff415784541882ea324c0bfb1b9e2 does not exist in database: no such container
Which looks like exactly this is happening.
Then we have a serious bug, because that should never happen.
Ok I reverted, let's see if this error is reproducible, with the other fixes.
I've pulled your latest change (6ca721c), rebuilt, and am running tests. No failures in the last two minutes, which is good. I need to be AFK periodically this morning but will check in as time allows.
Still running with no errors (20+ minutes). I call that good. Gotta head out now, will check back in in 90m or so.
Still no failures. I call this good.
We have race conditions where a container can be removed by two different processes when running podman --remove rm. It can be cleaned up in the API or by conmon executing podman container cleanup.
When we fail to remove a container that does not exist, we should not print errors or warnings; we should just log the fact at debug level.
[NO TESTS NEEDED] Since this is a race condition, it is difficult to test.
Signed-off-by: Daniel J Walsh <[email protected]>
LGTM
The other flake also looks like #7139
/lgtm
/hold cancel