Fix flake on failed podman-remote build #10030

Merged: 1 commit into containers:master, Apr 14, 2021

@rhatdan
Member

rhatdan commented Apr 14, 2021

We have a race condition where podman build can fail
but still return an exit code of 0. This PR ensures
that as soon as the build fails, the failed flag is set,
eliminating the race.

Fixes: #10029

[NO TESTS NEEDED] Tests of failed builds are already in place, and
the elimination of the race should be enough.

Signed-off-by: Daniel J Walsh [email protected]

@openshift-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rhatdan

openshift-ci-robot added the approved label Apr 14, 2021
@rhatdan
Member Author

rhatdan commented Apr 14, 2021

@edsantiago I ran your tests for a few minutes and could not get this PR to fail.

@edsantiago
Member

@rhatdan I'm sorry... no-go. I'm still seeing it fail, anywhere from 7 seconds to 208 seconds. New reproducer shows how long it takes:

$ t0=$SECONDS; while :; do echo $((SECONDS - t0)); ../bin/podman-remote build -t build_test --pull-never . && break; done

@edsantiago
Member

@Luap99 we got a compose flake! Rootless, and curl shows one ECONNRESET followed by consistent ECONNREFUSED. To me that implies that the container has died. The timestamps are consistent with that. I don't think any amount of retrying is going to help here; we need a different way to debug.

(Sorry to hijack the thread, Dan)

@mheon
Member

mheon commented Apr 14, 2021

This LGTM, even if it's not a complete fix for the problem - the change on its own does seem to make things better.

@edsantiago
Member

/lgtm
/hold
The best is the enemy of the good

openshift-ci-robot added the do-not-merge/hold label Apr 14, 2021
openshift-ci-robot added the lgtm label Apr 14, 2021
@jwhonce
Member

jwhonce commented Apr 14, 2021

/hold cancel

openshift-ci-robot removed the do-not-merge/hold label Apr 14, 2021
openshift-merge-robot merged commit 9f36efd into containers:master Apr 14, 2021
edsantiago added a commit to edsantiago/libpod that referenced this pull request Apr 15, 2021
This test continues to flake on podman-remote (especially Ubuntu)
even after containers#10030 and containers#10034. I give up. Stop checking the error
message in podman-remote tests.

Signed-off-by: Ed Santiago <[email protected]>
jmguzik pushed a commit to jmguzik/podman that referenced this pull request Apr 26, 2021
github-actions bot added the locked - please file new issue/PR label Sep 23, 2023
github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023
Labels: approved, lgtm, locked - please file new issue/PR
Successfully merging this pull request may close these issues.

podman-remote build: can fail but exit 0 (success)
6 participants