Fix deadlock between 'podman ps' and 'container inspect' commands #16327

Merged


tyler92
Contributor

@tyler92 tyler92 commented Oct 27, 2022

Fixes: #16326

[NO NEW TESTS NEEDED]

Signed-off-by: Mikhail Khachayants [email protected]

Does this PR introduce a user-facing change?

Fix deadlock between 'podman ps' and 'container inspect' commands

@@ -581,20 +581,22 @@ func (p *Pod) Status() (map[string]define.ContainerStatus, error) {
}

func containerStatusFromContainers(allCtrs []*Container) (map[string]define.ContainerStatus, error) {
	// We need to lock all the containers
Contributor Author

I did see this comment; my assumption is that it is no longer correct. Please let me know if I am wrong.

@baude
Member

baude commented Oct 27, 2022

/approve

I'm looking deeper into this right now. I don't know about that assumption, but there is an easy enough way to find out.

@openshift-ci
Contributor

openshift-ci bot commented Oct 27, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: baude, tyler92

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Oct 27, 2022
@baude
Member

baude commented Oct 27, 2022

@mheon I don't see evidence that this is no longer true, unless I missed some change to container and pod locking. Would it make more sense to iterate over the containers and, for each one, grab its lock, get the container information, and release the lock? One loop vs. two?
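
The single-loop idea raised here can be sketched roughly as follows. This is a minimal illustration, not the libpod code: the `container` type, its fields, and `statusOneAtATime` are hypothetical stand-ins, and a plain `sync.Mutex` substitutes for whatever lock type libpod actually uses.

```go
package main

import (
	"fmt"
	"sync"
)

// container is a hypothetical stand-in for a libpod container.
// It is NOT the real libpod type; it only carries what the sketch needs.
type container struct {
	id    string
	lock  sync.Mutex
	state string
}

// statusOneAtATime locks each container only for as long as it takes to
// read its state, so at most one container lock is held at any moment.
func statusOneAtATime(ctrs []*container) map[string]string {
	status := make(map[string]string, len(ctrs))
	for _, c := range ctrs {
		c.lock.Lock()
		status[c.id] = c.state
		c.lock.Unlock()
	}
	return status
}

func main() {
	ctrs := []*container{
		{id: "a", state: "running"},
		{id: "b", state: "exited"},
	}
	fmt.Println(statusOneAtATime(ctrs))
}
```

The trade-off in this sketch is that the result is no longer a snapshot taken while every container is locked at once: each state is read at a slightly different moment. In exchange, the function never holds two container locks simultaneously.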

@mheon
Member

mheon commented Oct 28, 2022

Code looks fine as written, though personally I would have avoided the anonymous function. The core issue is similar to the reason we had to rewrite pod removal: holding more than one container lock simultaneously, without a specific ordering, can cause deadlocks with other commands that want to lock, say, a container and its dependencies. We got rid of one instance of this (pod removal); this looks like another.
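
The hazard described here can be reproduced in a few lines. The sketch below is an illustration under assumptions: the two "commands" and which lock each takes first are hypothetical, and `sync.Mutex.TryLock` (Go 1.18+) is used instead of `Lock` so the program reports the conflict rather than hanging.

```go
package main

import (
	"fmt"
	"sync"
)

// lockOrderConflict replays the hazardous interleaving deterministically:
// one command holds lock a and wants b, while another holds b and wants a.
// With a blocking Lock in place of TryLock, both would wait forever.
func lockOrderConflict() (firstBlocked, secondBlocked bool) {
	var a, b sync.Mutex

	a.Lock() // command 1 (hypothetical: a pod-wide status query) locks container A
	b.Lock() // command 2 (hypothetical: one that locks a container and its dependencies) locks container B

	// Each command now needs the lock that the other one is holding.
	firstBlocked = !b.TryLock()  // command 1 wants B
	secondBlocked = !a.TryLock() // command 2 wants A

	a.Unlock()
	b.Unlock()
	return firstBlocked, secondBlocked
}

func main() {
	first, second := lockOrderConflict()
	fmt.Println("command 1 blocked:", first)
	fmt.Println("command 2 blocked:", second)
}
```

Both attempts fail because neither holder can make progress without the other releasing first. Either holding only one lock at a time or always acquiring locks in one agreed order removes this interleaving.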

@mheon
Member

mheon commented Oct 28, 2022

LGTM, for reference

@tyler92
Contributor Author

tyler92 commented Oct 28, 2022

> personally would have avoided the anonymous function

I can rewrite it with explicit Unlock calls and without an anonymous function.
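
For readers comparing the two styles under discussion, here is a rough side-by-side sketch. It is not the code from this PR: `ctr`, `readState`, and both function names are hypothetical, and `sync.Mutex` stands in for the real lock type.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ctr is a hypothetical stand-in for a container, not the libpod type.
type ctr struct {
	id    string
	lock  sync.Mutex
	state string
}

// readState is a hypothetical state lookup that can fail.
func (c *ctr) readState() (string, error) {
	if c.state == "" {
		return "", errors.New("container " + c.id + ": state unknown")
	}
	return c.state, nil
}

// statusWithClosure wraps the critical section in an anonymous function so
// that defer releases the lock at the end of each loop iteration.
func statusWithClosure(ctrs []*ctr) (map[string]string, error) {
	status := make(map[string]string, len(ctrs))
	for _, c := range ctrs {
		state, err := func() (string, error) {
			c.lock.Lock()
			defer c.lock.Unlock()
			return c.readState()
		}()
		if err != nil {
			return nil, err
		}
		status[c.id] = state
	}
	return status, nil
}

// statusExplicitUnlock releases the lock by hand, before the error check,
// so a single Unlock call covers both the success and the failure path.
func statusExplicitUnlock(ctrs []*ctr) (map[string]string, error) {
	status := make(map[string]string, len(ctrs))
	for _, c := range ctrs {
		c.lock.Lock()
		state, err := c.readState()
		c.lock.Unlock()
		if err != nil {
			return nil, err
		}
		status[c.id] = state
	}
	return status, nil
}

func main() {
	ctrs := []*ctr{{id: "a", state: "running"}, {id: "b", state: "exited"}}
	byClosure, errClosure := statusWithClosure(ctrs)
	byExplicit, errExplicit := statusExplicitUnlock(ctrs)
	fmt.Println(byClosure, errClosure)
	fmt.Println(byExplicit, errExplicit)
}
```

The closure form keeps the unlock next to the lock and stays correct if more return paths are added later; the explicit form is flatter but relies on the unlock being placed before any early return.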

libpod/pod_api.go (outdated review thread, resolved)
Fixes: containers#16326

[NO NEW TESTS NEEDED]

Signed-off-by: Mikhail Khachayants <[email protected]>
@tyler92 tyler92 force-pushed the fix-deadlock-pod-ps-inspect branch from 4d332a5 to f355900 on October 28, 2022 07:13
@tyler92
Contributor Author

tyler92 commented Oct 28, 2022

I've published my small pet-project tool for debugging such deadlocks (in case it's impossible to use an IDE or GDB): https://github.com/tyler92/podman-locks. It would be great if someday it helps someone other than me.

(I hope it's allowed to attach such links here)

Member

@vrothberg vrothberg left a comment

LGTM, thanks!

@vrothberg
Member

> I've published my small pet-project tool for debugging such deadlocks (in case it's impossible to use IDE or GDB): https://github.com/tyler92/podman-locks. It would be great if someday it will help someone other than me.

Thanks for sharing! I usually use the BPF tools from bcc-tools, for instance, /usr/share/bcc/tools/deadlock.

@vrothberg
Member

/lgtm
/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold label Oct 28, 2022
@openshift-ci openshift-ci bot added the lgtm label Oct 28, 2022
@vrothberg
Member

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold label Oct 28, 2022
@tyler92
Contributor Author

tyler92 commented Oct 28, 2022

> I usually use the BPF tools from bcc-tools,

Thanks, I will take it into account! The main difference is that my tool can be launched after the deadlock has already occurred, against a podman process that was started as usual without any external tools.

@openshift-merge-robot openshift-merge-robot merged commit 40073ab into containers:main Oct 28, 2022
@vrothberg
Member

/cherry-pick v4.3.0

@openshift-cherrypick-robot
Collaborator

@vrothberg: new pull request could not be created: failed to create pull request against containers/podman#v4.3.0 from head openshift-cherrypick-robot:cherry-pick-16327-to-v4.3.0: status code 422 not one of [201], body: {"message":"Validation Failed","errors":[{"resource":"PullRequest","field":"base","code":"invalid"}],"documentation_url":"https://docs.github.com/rest/reference/pulls#create-a-pull-request"}

In response to this:

/cherry-pick v4.3.0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ashley-cui
Member

/cherry-pick v4.3

@openshift-cherrypick-robot
Collaborator

@ashley-cui: new pull request created: #16448

In response to this:

/cherry-pick v4.3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vrothberg
Member

Thanks, @ashley-cui :)

@github-actions github-actions bot added the locked - please file new issue/PR label Sep 20, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023
Labels: approved, lgtm, locked - please file new issue/PR, release-note
Development

Successfully merging this pull request may close these issues.

Deadlock between 'pod ps' and 'container inspect' commands
7 participants