Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connectivity: Retry on inconclusive results #1543

Merged
merged 1 commit into from
Apr 28, 2023

Conversation

pchaigno
Copy link
Member

When running the connectivity tests in AKS, we sometimes get interrupted commands that don't have any output [1]. Unfortunately, those commands then exit without any error and are therefore considered successful. We think this is caused by connectivity blips between Kubernetes components.

This commit adds a check for those inconclusive results. If we see a seemingly successful command with no output, we retry it until we get something conclusive. This works because all our test commands (curl, ping, nslookup) dump something to stdout.

1 - cilium/cilium#22162

@pchaigno pchaigno temporarily deployed to ci April 26, 2023 22:06 — with GitHub Actions Inactive
@pchaigno pchaigno force-pushed the pr/pchaigno/retry-on-inconclusive-results branch from e5e7733 to ede4490 Compare April 26, 2023 22:12
@pchaigno pchaigno temporarily deployed to ci April 26, 2023 22:13 — with GitHub Actions Inactive
@pchaigno pchaigno marked this pull request as ready for review April 27, 2023 07:29
@pchaigno pchaigno requested a review from a team as a code owner April 27, 2023 07:29
@pchaigno pchaigno requested a review from joamaki April 27, 2023 07:29
Copy link
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit adds a check for those inconclusive results. If we see a seemingly successful command with no output, we retry it until we get something conclusive. This works because all our test commands (curl, ping, nslookup) dump something to stdout.

I'm a bit worried that at one point we start using commands that don't output to stdout and that will then fail due to this logic. But I don't know a better way to work around the issue we're seeing in cilium/cilium#22162. So let's give this a try and worry about this eventuality once we actually have such commands. Might be worth to add the above as a comment to the retry loop in (*Action).ExecInPod.

connectivity/check/action.go Outdated Show resolved Hide resolved
connectivity/check/action.go Outdated Show resolved Hide resolved
@tklauser tklauser removed the request for review from joamaki April 28, 2023 10:04
When running the connectivity tests in AKS, we sometimes get interrupted
commands that don't have any output [1]. Unfortunately, those commands
then exit without any error and are therefore considered successful. We
think this is caused by connectivity blips between Kubernetes
components.

This commit adds a check for those inconclusive results. If we see a
seemingly successful command with no output, we retry it until we get
something conclusive. This works because all our test commands (curl,
ping, nslookup) dump something to stdout.

1 - cilium/cilium#22162
Signed-off-by: Paul Chaignon <[email protected]>
@pchaigno pchaigno force-pushed the pr/pchaigno/retry-on-inconclusive-results branch from ede4490 to 6a4bdd4 Compare April 28, 2023 10:38
@pchaigno pchaigno temporarily deployed to ci April 28, 2023 10:38 — with GitHub Actions Inactive
@pchaigno
Copy link
Member Author

I'm a bit worried that at one point we start using commands that don't output to stdout and that will then fail due to this logic. But I don't know a better way to work around the issue we're seeing in cilium/cilium#22162. So let's give this a try and worry about this eventuality once we actually have such commands. Might be worth to add the above as a comment to the retry loop in (*Action).ExecInPod.

I have the same concern. If that assumption doesn't hold anymore in the future, we should at least see a clear failure. It's also why I added a separate error message for the "inconclusive results": I'm hoping that makes the failure cause clearer if the assumption doesn't hold anymore.

I've added a comment above the retry loop.

@pchaigno pchaigno requested a review from tklauser April 28, 2023 11:00
Copy link
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Apr 28, 2023
@tklauser
Copy link
Member

multicluster (helm) workflow is failing due to #1554: https://github.com/cilium/cilium-cli/actions/runs/4829721713/jobs/8605084154?pr=1543

All other workflows passed. Merging.

@tklauser tklauser merged commit 927d788 into main Apr 28, 2023
@tklauser tklauser deleted the pr/pchaigno/retry-on-inconclusive-results branch April 28, 2023 11:50
mhofstetter added a commit to mhofstetter/cilium-cli that referenced this pull request Jul 21, 2023
When running the connectivity tests in AKS, we sometimes get interrupted
commands that don't have any output [1]. Therefore, successful command
terminations with an empty output are treated as inconclusive that
result in a retry [2].

Unfortunately, the ping command might get stuck by only outputting its
header line to the output.

`PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.`

This commit adds an additional check for the ping commands.
If we see an output only containing the header, we treat it as
inconclusive result.

1 - cilium/cilium#22162
2 - cilium#1543

Signed-off-by: Marco Hofstetter <[email protected]>
mhofstetter added a commit to mhofstetter/cilium-cli that referenced this pull request Jul 21, 2023
When running the connectivity tests in AKS, we sometimes get interrupted
commands that don't have any output [1]. Therefore, successful command
terminations with an empty output are treated as inconclusive that
result in a retry [2].

Unfortunately, the ping command might get stuck by only outputting its
header line to the output.

`PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.`

This commit adds an additional check for the ping commands.
If we see an output only containing the header, we treat it as
inconclusive result.

1 - cilium/cilium#22162
2 - cilium#1543

Signed-off-by: Marco Hofstetter <[email protected]>
mhofstetter added a commit to mhofstetter/cilium-cli that referenced this pull request Jul 21, 2023
When running the connectivity tests in AKS, we sometimes get interrupted
commands that don't have any output [1]. Therefore, successful command
terminations with an empty output are treated as inconclusive that
result in a retry [2].

Unfortunately, the ping command might get stuck by only outputting its
header line to the output.

`PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.`

This commit adds an additional check for the ping commands.
If we see an output only containing the header, we treat it as
inconclusive result.

1 - cilium/cilium#22162
2 - cilium#1543

Signed-off-by: Marco Hofstetter <[email protected]>
michi-covalent pushed a commit that referenced this pull request Jul 25, 2023
When running the connectivity tests in AKS, we sometimes get interrupted
commands that don't have any output [1]. Therefore, successful command
terminations with an empty output are treated as inconclusive that
result in a retry [2].

Unfortunately, the ping command might get stuck by only outputting its
header line to the output.

`PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.`

This commit adds an additional check for the ping commands.
If we see an output only containing the header, we treat it as
inconclusive result.

1 - cilium/cilium#22162
2 - #1543

Signed-off-by: Marco Hofstetter <[email protected]>
liyihuang pushed a commit to liyihuang/cilium-cli that referenced this pull request Aug 1, 2023
When running the connectivity tests in AKS, we sometimes get interrupted
commands that don't have any output [1]. Therefore, successful command
terminations with an empty output are treated as inconclusive that
result in a retry [2].

Unfortunately, the ping command might get stuck by only outputting its
header line to the output.

`PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.`

This commit adds an additional check for the ping commands.
If we see an output only containing the header, we treat it as
inconclusive result.

1 - cilium/cilium#22162
2 - cilium#1543

Signed-off-by: Marco Hofstetter <[email protected]>
michi-covalent pushed a commit to cilium/cilium that referenced this pull request Jul 16, 2024
When running the connectivity tests in AKS, we sometimes get interrupted
commands that don't have any output [1]. Therefore, successful command
terminations with an empty output are treated as inconclusive that
result in a retry [2].

Unfortunately, the ping command might get stuck by only outputting its
header line to the output.

`PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.`

This commit adds an additional check for the ping commands.
If we see an output only containing the header, we treat it as
inconclusive result.

1 - #22162
2 - cilium/cilium-cli#1543

Signed-off-by: Marco Hofstetter <[email protected]>
michi-covalent pushed a commit to cilium/cilium that referenced this pull request Aug 5, 2024
[ cherry-picked from cilium/cilium-cli repository ]

When running the connectivity tests in AKS, we sometimes get interrupted
commands that don't have any output [1]. Therefore, successful command
terminations with an empty output are treated as inconclusive that
result in a retry [2].

Unfortunately, the ping command might get stuck by only outputting its
header line to the output.

`PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.`

This commit adds an additional check for the ping commands.
If we see an output only containing the header, we treat it as
inconclusive result.

1 - #22162
2 - cilium/cilium-cli#1543

Signed-off-by: Marco Hofstetter <[email protected]>
github-merge-queue bot pushed a commit to cilium/cilium that referenced this pull request Aug 16, 2024
[ cherry-picked from cilium/cilium-cli repository ]

When running the connectivity tests in AKS, we sometimes get interrupted
commands that don't have any output [1]. Therefore, successful command
terminations with an empty output are treated as inconclusive that
result in a retry [2].

Unfortunately, the ping command might get stuck by only outputting its
header line to the output.

`PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.`

This commit adds an additional check for the ping commands.
If we see an output only containing the header, we treat it as
inconclusive result.

1 - #22162
2 - cilium/cilium-cli#1543

Signed-off-by: Marco Hofstetter <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants