[v4.0-rhel] Do not error on signalling a just-stopped container #14727


openshift-cherrypick-robot
Collaborator

This is an automated cherry-pick of #14533

/assign lsm5

Fixed a bug where Podman could print error messages when signals were forwarded to a container via `--sig-proxy` as the container process exited.

Previous PR containers#12394 tried to address this, but made a mistake:
containers that have just exited do not move to the Exited state
but rather the Stopped state - as such, the code would never have
run (there is no way we start `podman kill`, and the container
transitions to Exited while we are doing it - that requires
holding the container lock, which Kill already does).

Fix the code to check Stopped as well (we could omit Exited entirely
but it's a cheap check and our state logic could change in the
future). Also, return an error instead of exiting cleanly - the
Kill failed, after all. ErrCtrStateInvalid is already handled by
the sig-proxy logic so there won't be issues.

[NO NEW TESTS NEEDED] This fixes a race that I cannot reproduce
myself, and I have no idea how we'd repro in CI.

Signed-off-by: Matthew Heon <[email protected]>
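
The state check described in the commit message can be sketched roughly as follows. This is a simplified illustration, not the libpod code: `containerState`, `killResult`, `forwardSignal`, and `errCtrStateInvalid` are hypothetical stand-ins, and the only behaviour taken from the description above is that both Stopped and Exited are checked, that the kill path returns an error rather than exiting cleanly, and that the sig-proxy side already tolerates that error.

```go
package main

import (
	"errors"
	"fmt"
)

// containerState is a hypothetical stand-in for a container status enum.
type containerState int

const (
	stateRunning containerState = iota
	stateStopped                // the process has just exited
	stateExited                 // the container has been fully cleaned up
)

// errCtrStateInvalid is a hypothetical sentinel playing the role that
// ErrCtrStateInvalid plays in the commit message.
var errCtrStateInvalid = errors.New("container state improper")

// killResult decides what a failed kill should report. If the container
// has already stopped (or exited), the failure is reported as a state
// error so callers can recognise it; otherwise the original error stands.
func killResult(state containerState, killErr error) error {
	if killErr == nil {
		return nil
	}
	switch state {
	case stateStopped, stateExited:
		return fmt.Errorf("container already stopped, cannot signal it: %w", errCtrStateInvalid)
	default:
		return killErr
	}
}

// forwardSignal sketches a sig-proxy style caller that treats the state
// error as "nothing left to signal" and surfaces every other failure.
func forwardSignal(state containerState, killErr error) error {
	err := killResult(state, killErr)
	if errors.Is(err, errCtrStateInvalid) {
		return nil
	}
	return err
}

func main() {
	runtimeErr := errors.New("no such process")
	fmt.Println("kill on stopped container:", killResult(stateStopped, runtimeErr))
	fmt.Println("proxy on stopped container:", forwardSignal(stateStopped, runtimeErr))
	fmt.Println("proxy on running container:", forwardSignal(stateRunning, runtimeErr))
}
```

Wrapping the sentinel with `%w` keeps the kill failure visible to direct callers while still letting a proxying caller match it with `errors.Is` and stay quiet.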
@rhatdan
Member

rhatdan commented Jun 24, 2022

LGTM

@rhatdan
Member

rhatdan commented Jun 24, 2022

/approve

@openshift-ci
Contributor

openshift-ci bot commented Jun 24, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: openshift-cherrypick-robot, rhatdan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Jun 24, 2022
@mheon
Member

mheon commented Jun 24, 2022

Lot of netavark test failures, looking...

@mheon
Member

mheon commented Jun 24, 2022


Downloading latest netavark from upstream branch 'v1.0.1-rhel'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100    22  100    22    0     0     44      0 --:--:-- --:--:-- --:--:--    44
100    22  100    22    0     0     44      0 --:--:-- --:--:-- --:--:--    44
/usr/local/libexec/podman
Archive:  /tmp/netavark.zip
warning [/tmp/netavark.zip]:  zipfile is empty

Potentially a network issue (on the GitHub or Cirrus end)? Or maybe that release just disappeared...

@mheon
Member

mheon commented Jun 24, 2022

Restarting one of the failures to see if it's a flake

/lgtm
/hold

In case it is

@openshift-ci openshift-ci bot added the do-not-merge/hold and lgtm labels Jun 24, 2022
@vrothberg
Member

Seems to fail consistently

@vrothberg
Member

@lsm5 what's up with this PR? Do we need it? If it's for RHEL please link a BZ.

@lsm5
Member

lsm5 commented Jul 11, 2022

@vrothberg
Member

@mheon do you know what's up with CI?

@mheon
Member

mheon commented Jul 12, 2022

Potentially trying to grab a bad version of Netavark for this branch? @cevich PTAL

@cevich
Member

cevich commented Jul 12, 2022

Looking...

Downloading latest netavark from upstream branch 'v1.0.1-rhel'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100    22  100    22    0     0     68      0 --:--:-- --:--:-- --:--:--    68
/usr/local/libexec/podman
Archive:  /tmp/netavark.zip
warning [/tmp/netavark.zip]:  zipfile is empty

@cevich
Member

cevich commented Jul 12, 2022

...sigh, the netavark CI VM images got pruned 😭 The cirrus-cron jobs have been failing since June and nobody noticed because monitoring didn't exist. The monitoring part is fixed by containers/netavark#325 but it may be too late for me to recover the VM images, still looking...

@cevich
Member

cevich commented Jul 12, 2022

...well, if I can't be smart all the time, at least I can be lucky: containers/netavark#337 <-- should fix this after it merges and CI runs on that branch.

@rhatdan
Member

rhatdan commented Jul 16, 2022

@cevich Does this need a rebase?

@cevich
Member

cevich commented Jul 19, 2022

Looking.

@cevich
Member

cevich commented Jul 19, 2022

warning [/tmp/netavark.zip]: zipfile is empty

This is referring to https://api.cirrus-ci.com/v1/artifact/github/containers/netavark/success/binary.zip?branch=v1.0.1-rhel. I suspect something changed on that branch which is causing the netavark binary not to be included for some reason. Once that's fixed, re-running the jobs here should pick it up. Investigating...
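
A check for this failure mode could look something like the sketch below, which tests whether a downloaded artifact actually contains any entries before it is unpacked. It is only an illustration under assumptions: `archiveHasEntries` and `buildZip` are hypothetical helpers, not part of the Podman or netavark CI scripts, and the archives are built in memory rather than fetched from the URL above. For what it's worth, an archive with no entries consists of just the 22-byte end-of-central-directory record, which is consistent with the 22 bytes curl reported in the log.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
)

// archiveHasEntries reports whether data parses as a ZIP archive that
// contains at least one file entry.
func archiveHasEntries(data []byte) (bool, error) {
	r, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return false, fmt.Errorf("not a readable zip archive: %w", err)
	}
	return len(r.File) > 0, nil
}

// buildZip creates an in-memory archive holding one placeholder file per
// name. With no names it produces a valid but empty archive.
func buildZip(names ...string) ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, name := range names {
		f, err := w.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := f.Write([]byte("placeholder")); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	empty, err := buildZip()
	if err != nil {
		panic(err)
	}
	ok, err := archiveHasEntries(empty)
	fmt.Println("empty archive:", len(empty), "bytes, has entries:", ok, "err:", err)

	full, err := buildZip("netavark")
	if err != nil {
		panic(err)
	}
	ok, err = archiveHasEntries(full)
	fmt.Println("archive with one file, has entries:", ok, "err:", err)
}
```

Failing fast on an empty archive would surface the missing binary at download time instead of as a batch of unrelated-looking test failures later.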

@cevich
Member

cevich commented Jul 19, 2022

...curious, it appears a recent job run by cirrus-cron on the v1.0.1-rhel branch successfully built and stored all the proper artifacts. Though there was a problem early last week, everything looks good now. Let me try re-running a task here...

@cevich
Member

cevich commented Jul 19, 2022

...looks like it's working now. Just needed to re-run the jobs (I just did).

@vrothberg
Member

https://bugzilla.redhat.com/show_bug.cgi?id=2097049 targets v4.1.1 where it's fixed. Please reopen if I am mistaken.

@vrothberg vrothberg closed this Jul 28, 2022
@github-actions github-actions bot added the locked - please file new issue/PR label Sep 20, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023
Labels: approved, do-not-merge/hold, lgtm, locked - please file new issue/PR, release-note