Add a backoff and retries to retrieving exited event #11681
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: mheon
Force-pushed from 3b7246c to 567faed
There's a potential race around extremely short-running containers and events with journald. Events may not be written for some time (small, but appreciable) after they are received, so we can fail to retrieve an event if there is a sufficiently short time between us writing it and trying to read it.

Work around this by just retrying, with a 0.25-second delay between retries, up to 4 times.

[NO TESTS NEEDED] because I have no idea how to reproduce this race in CI.

Fixes containers#11633

Signed-off-by: Matthew Heon <[email protected]>
Force-pushed from 567faed to 4ecbc7c
LGTM
LGTM
[+0194s] Error: initializing source docker://registry.fedoraproject.org/f32/fedora-toolbox:latest: pinging container registry registry.fedoraproject.org: Get "https://registry.fedoraproject.org/v2/": dial tcp 38.145.60.20:443: i/o timeout
Looks like a registry died
/lgtm
/hold
/hold cancel