Always spawn a cleanup process with exec #10405

Merged
openshift-merge-robot merged 2 commits into containers:master from always_cleanup_exec
Jun 11, 2021

Conversation

mheon
Member

@mheon mheon commented May 19, 2021

We were previously only doing this for detached exec. I don't know why we did that, but I don't see any reason not to extend it to all exec sessions - it guarantees that we will always clean up exec sessions, even if the original podman exec process died.

[NO TESTS NEEDED] because I don't really know how to test this one.

@openshift-ci
Contributor

openshift-ci bot commented May 19, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mheon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label May 19, 2021
@rhatdan
Member

rhatdan commented May 19, 2021

LGTM

Member

@vrothberg vrothberg left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm label May 20, 2021
@vrothberg
Member

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold label May 20, 2021
@giuseppe
Member

/retest

@mheon
Member Author

mheon commented May 20, 2021

I think the errors are legit - I need to handle the already-removed race condition in the exec code.

@rhatdan rhatdan removed the lgtm label May 27, 2021
@mheon mheon force-pushed the always_cleanup_exec branch from 0480b6d to 476c852 June 3, 2021 18:05
@mheon
Member Author

mheon commented Jun 3, 2021

Alright, now I realize why I didn't do this originally. We now have a race where podman exec needs to pick up the exit code of the exec session from the container, but the cleanup process is wiping the exec session (and said exit code) at the same time.

We can potentially work around this by having a special event for exec session exit (Docker uses exec_die for this) and using logic similar to what podman run uses when the container is removed before we can read its exit code. I'll get this coded up tomorrow.

We were previously only doing this for detached exec. I don't
know why we did that, but I don't see any reason not to extend it
to all exec sessions - it guarantees that we will always clean up
exec sessions, even if the original `podman exec` process died.

[NO TESTS NEEDED] because I don't really know how to test this
one.

Signed-off-by: Matthew Heon <[email protected]>
@mheon mheon force-pushed the always_cleanup_exec branch from 476c852 to 28e866c June 10, 2021 18:16
When making Exec Cleanup processes mandatory, I introduced a race
wherein attached exec sessions could be cleaned up and removed by
the cleanup process before the frontend had a chance to get their
exit code. Fortunately, we've dealt with this issue before in
containers, and the same solution can be applied here. I added an
event for an exec session's process exiting, `exec_died` (Docker
has an identical event, so this actually improves our
compatibility there) that includes the exit code of the exec
session. If the race happens and the exec session no longer
exists when we go to remove it, pick up exit code from the event
and exit cleanly.

Signed-off-by: Matthew Heon <[email protected]>
@mheon mheon force-pushed the always_cleanup_exec branch from 28e866c to 62f4b0a June 10, 2021 18:17
@mheon
Member Author

mheon commented Jun 10, 2021

Re-pushed with a fix. Added an exec_died event, which we now use for retrieving exit codes on exec sessions being removed before we can retrieve them.

@vrothberg
Member

Restarted the two jobs. Looked like flakes.

@mheon
Member Author

mheon commented Jun 11, 2021

I think they are, but they seem extremely consistent.

@mheon
Member Author

mheon commented Jun 11, 2021

/hold cancel

Oh dear, it went green. @containers/podman-maintainers Anyone want to drop an LGTM?

@openshift-ci openshift-ci bot removed the do-not-merge/hold label Jun 11, 2021
@rhatdan
Member

rhatdan commented Jun 11, 2021

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Jun 11, 2021
@openshift-merge-robot openshift-merge-robot merged commit 45dc3d6 into containers:master Jun 11, 2021
@github-actions github-actions bot added the locked - please file new issue/PR label Sep 23, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023
Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.
locked - please file new issue/PR: Assist humans wanting to comment on an old issue or PR with locked comments.
5 participants