Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

system tests: safer container-stop signaling #18323

Merged

Conversation

edsantiago
Copy link
Member

Having a container spin-wait on a /stop file, then exit, is
unsafe: 'podman exec $ctr touch /stop' can get sucked into
container cleanup before the exec terminates, resulting in
the podman-exec failing and hence the test failing.

Most existing instances of this pattern are unnecessary.
Replace those with just 'podman rm -f'.

When necessary, use a variety of safer alternatives.

Re-Closes: #10825 (already closed; this addresses remaining cases)

Signed-off-by: Ed Santiago [email protected]

None

@edsantiago edsantiago requested a review from vrothberg April 24, 2023 15:50
@openshift-ci openshift-ci bot added release-note-none approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Apr 24, 2023
@edsantiago edsantiago force-pushed the container_exit_signaling branch from 98bcb99 to 911e55f Compare April 24, 2023 16:41
Having a container spin-wait on a /stop file, then exit, is
unsafe: 'podman exec $ctr touch /stop' can get sucked into
container cleanup before the exec terminates, resulting in
the podman-exec failing and hence the test failing.

Most existing instances of this pattern are unnecessary.
Replace those with just 'podman rm -f'.

When necessary, use a variety of safer alternatives.

Re-Closes: containers#10825 (already closed; this addresses remaining cases)

Signed-off-by: Ed Santiago <[email protected]>
@edsantiago edsantiago force-pushed the container_exit_signaling branch from 911e55f to faeed14 Compare April 24, 2023 17:36
Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
I am curious. Do some of the fixed tests show up in your flake data, @edsantiago ?

@Luap99 PTAL

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 25, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: edsantiago, vrothberg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [edsantiago,vrothberg]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Member

@Luap99 Luap99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 25, 2023
@openshift-merge-robot openshift-merge-robot merged commit 0a8113b into containers:main Apr 25, 2023
@edsantiago edsantiago deleted the container_exit_signaling branch April 25, 2023 11:39
@edsantiago
Copy link
Member Author

Do some of the fixed tests show up in your flake data

This one is too infrequent: most logs have cycled away, and my reports from 2021-22 are worthless: they do not include excerpts nor test names. (I have since improved my reporting habits).

Since I still have text logs archived on my laptop I could conceivably run a multi-line grep -B10 "rc=137.*xpected" ... |grep "podman.*exec.*touch" and cross-reference that to tests... but I'm not feeling strongly motivated to do so. I believe the pattern described above ("spin-wait on file, run exec to touch it") is dangerous, and I have fixed all such instances, and I don't have to worry about this kind of flake any more. Do you see a reason to pursue this further? (Serious question. If you do, please let me know, and I'll invest that effort).

@vrothberg
Copy link
Member

Thanks, @edsantiago! Please don't waste time pursuing this further. I was mostly wondering if there are some ripple effects of this PR that would allow closing a number of flakes at once.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Aug 26, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. release-note-none
Projects
None yet
Development

Successfully merging this pull request may close these issues.

podman exec sometimes exits 137
4 participants