-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compose test: ongoing efforts to diagnose flakes #10031
compose test: ongoing efforts to diagnose flakes #10031
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: edsantiago The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
The flaking test run, for posterity. I tested the new code by hardcoding I suppose we just merge this and see what happens on a test failure. @Luap99, @baude, any other suggestions for what we can run in addition to / instead of |
_show_ok 0 "$testname - curl failed with status $curl_rc" | ||
_show_ok 0 "$testname - curl (port $port) failed with status $curl_rc" | ||
# FIXME: is this useful? What else can we do to diagnose? | ||
docker-compose logs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docker-compose logs should be good but can you add podman ps -a
. It would be good to know if the container is still active. Also ss -tulpn
should show us if the port is binded by the rootlesskit port forwarder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thank you!
2a041cd
to
7e901ae
Compare
Yay, we got a failure with the new code (containers#10017). It shows one ECONNRESET followed by a lot of ECONNREFUSED over an 8-second period (actually 15s because of the second curl retry). My hunch: the container itself is dying. No amount of retrying will get anything to work. So, instead of the curl retry, if curl fails, run 'docker-compose logs', 'podman ps', and 'ss -tulpn' and hope that one/more of those tells us something useful when the test flakes again. Also: DUH! Bitten by one of the most common bash pitfalls. Checking exit status after 'local' will always be zero. Split the declaration and the action into separate lines. Also: if curl fails, return immediately. There's no point in running the string output comparison. Also: in _show_ok(), don't emit "actual/expect" messages if both strings are empty. Signed-off-by: Ed Santiago <[email protected]>
7e901ae
to
1cf2b3e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/lgtm |
Yay, we got a failure with the new code (#10017). It shows
one ECONNRESET followed by a lot of ECONNREFUSED over an 8-second
period (actually 15s because of the second curl retry).
My hunch: the container itself is dying. No amount of retrying
will get anything to work. So, instead of the curl retry, if
curl fails, run 'docker-compose logs' and hope that it tells us
something useful when the test flakes again.
Also: DUH! Bitten by one of the most common bash pitfalls.
Checking exit status after 'local' will always be zero.
Split the declaration and the action into separate lines.
Also: if curl fails, return immediately. There's no point in
running the string output comparison.
Also: in _show_ok(), don't emit "actual/expect" messages
if both strings are empty.
Signed-off-by: Ed Santiago [email protected]