-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sdnotify tests: try real hard to kill socat processes #9697
sdnotify tests: try real hard to kill socat processes #9697
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: edsantiago The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Sure, LGTM |
LGTM |
podman gating tests are hanging in the new Fedora CI setup; long and tedious investigation suggests that 'socat' processes are being left unkilled, which then causes BATS to hang when it (presumably) runs a final 'wait' in its end cleanup. The two principal changes are to exec socat in a subshell with fd3 closed, and to pkill its child processes before killing the process itself. I don't know if both are needed. The pkill definitely is; the exec may just be superstition. Since I've wasted more than a day of PTO time on this, I'm okay with a little superstition. What I do know is that with these two changes, my reproducer fails to reproduce in over one hour of trying (normally it fails within 5 minutes). AND, update: only rawhide (f35) leaves stray socat processes behind. f33 and ubuntu do not, so 'pkill -P' fails. I really have no idea what's going on. Signed-off-by: Ed Santiago <[email protected]>
2e94cff
to
660a729
Compare
Update: changed
Anyhow, on f33: socat-1.7.4.1-1.fc33 5.10.19-200.fc33 |
LGTM |
/lgtm |
/hold cancel |
podman gating tests are hanging in the new Fedora CI setup;
long and tedious investigation suggests that 'socat' processes
are being left unkilled, which then causes BATS to hang when
it (presumably) runs a final 'wait' in its end cleanup.
The two principal changes are to exec socat in a subshell
with fd3 closed, and to pkill its child processes before
killing the process itself. I don't know if both are needed.
The pkill definitely is; the exec may just be superstition.
Since I've wasted more than a day of PTO time on this, I'm
okay with a little superstition. What I do know is that with
these two changes, my reproducer fails to reproduce in over
one hour of trying (normally it fails within 5 minutes).
Signed-off-by: Ed Santiago [email protected]