Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: quarantine services nodeport w/ L7 policy test. #25236

Merged

Conversation

tommyp1ckles
Copy link
Contributor

@tommyp1ckles tommyp1ckles commented May 2, 2023

Changes in c1a0dba may have introduced flakes for this set of tests when attempting to reach NodePort from outside the cluster.

A replacement test is being added to cilium cli connectivity tests:

cilium/cilium-cli#1547

Quarantine this test for now until we can remove it.

Addresses: #25119

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 2, 2023
@tommyp1ckles tommyp1ckles marked this pull request as ready for review May 2, 2023 20:54
@tommyp1ckles tommyp1ckles requested review from a team as code owners May 2, 2023 20:54
@tommyp1ckles tommyp1ckles requested review from jibi and nebril May 2, 2023 20:54
Changes in c1a0dba may have introduced flakes for this set of tests
when attempting to reach NodePort from outside the cluster.

A replacement test is being added to cilium cli connectivity tests:

cilium/cilium-cli#1547

Quarantine this test for now until we can remove it.

Addresses: cilium#25119

Signed-off-by: Tom Hadlaw <[email protected]>
@tommyp1ckles tommyp1ckles force-pushed the pr/tp/quarantine-ext-l7-lb-test branch from ec28dbe to f1ed7d7 Compare May 2, 2023 20:55
@tommyp1ckles tommyp1ckles added the area/CI Continuous Integration testing issue or flake label May 2, 2023
@tommyp1ckles
Copy link
Contributor Author

tommyp1ckles commented May 2, 2023

/test

Job 'Cilium-PR-K8s-1.24-kernel-5.4' failed:

Click to show.

Test Name

K8sDatapathConfig Host firewall With native routing and endpoint routes

Failure Output

FAIL: Failed to reach 192.168.56.11:80 from testclient-host-2gs77

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.24-kernel-5.4/1945/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.24-kernel-5.4 so I can create one.

Then please upload the Jenkins artifacts to that issue.

@tommyp1ckles
Copy link
Contributor Author

/test-runtime

@tommyp1ckles
Copy link
Contributor Author

/test-1.24-5.4

@tommyp1ckles tommyp1ckles added the release-note/ci This PR makes changes to the CI. label May 3, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 3, 2023
@tommyp1ckles
Copy link
Contributor Author

/test-runtime

Copy link
Member

@jibi jibi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps worth reassigning this to sig-policy in CODEOWNERS?

@tommyp1ckles tommyp1ckles merged commit 90c5d1b into cilium:main May 4, 2023
@julianwiedmann
Copy link
Member

julianwiedmann commented May 8, 2023

I don't follow :/.

(it's not helping that #25119 turned into a catch-all for similar fails in main, for both IPv4 and IPv6 😞)

@joestringer
Copy link
Member

@tommyp1ckles @jschwinger233 any comments on @julianwiedmann 's questions above?

It looks to me like c1a0dba could have broken the test, potentially representing a real regression. The PR here doesn't explain why. Should we revert c1a0dba instead?

@jschwinger233
Copy link
Member

jschwinger233 commented May 25, 2023

@joestringer The quarantined ginkgo test in this PR is replaced by connectivity test in cli. As connectivity test passes, the ginkgo test failure could be a flake, instead of a regression, I think?

Should we revert c1a0dba instead?

The quarantined ginkgo test actually was introduced to cover c1a0dba, so reverting that commit won't help to resolve the issue.

Since we have connectivity test to cover the scenario, we can remove this ginkgo test case if that quarantine is bothering.

@joestringer
Copy link
Member

joestringer commented May 25, 2023

👍 OK, then if we added the test to cover that commit and the test is buggy and we have the same coverage in another CLI test, we should just remove the buggy test (on both main and v1.13 branches, we can propose the change to delete the test on main branch and then label it for backport to v1.13). Would you mind following up on that @jschwinger233 ?

sidenote: I would strongly discourage describing something as "could be a flake". All failures are the result of either bugs in production code, or bugs in the testing code, or (rarely) infrastructure issues. When we as developers call something a "flake", in my experience this is often a shorthand that sounds like a reasonable motivation to ignore something and do nothing about it. Over time, ignored bugs in testing infrastructure stack up and make the overall system unreliable, decreasing developer trust in the system, and eventually leading to decreased development velocity because we can't reliably validate the functionality. We need to be vigilant about enforcing a high level quality on the testing code, and if it's buggy, fix it or delete it. It's not good enough to merge a buggy test and impose the cost of those failures on the rest of the development community.

jschwinger233 added a commit to jschwinger233/cilium that referenced this pull request May 26, 2023
The test case was introduced to cover issue cilium#21954, but it turned out
the test is buggy and caused a number of CI flakes (cilium#25119).
Consequently, PR cilium#25236 put the test case under quarantine.

This commit removes that problematic test, as the target scenario has
been covered by connectivity test in cilium-cli
(cilium/cilium-cli#1547).

Signed-off-by: Zhichuan Liang <[email protected]>
brb pushed a commit that referenced this pull request May 26, 2023
The test case was introduced to cover issue #21954, but it turned out
the test is buggy and caused a number of CI flakes (#25119).
Consequently, PR #25236 put the test case under quarantine.

This commit removes that problematic test, as the target scenario has
been covered by connectivity test in cilium-cli
(cilium/cilium-cli#1547).

Signed-off-by: Zhichuan Liang <[email protected]>
sayboras pushed a commit to sayboras/cilium that referenced this pull request May 29, 2023
[ upstream commit eb5bf06 ]

The test case was introduced to cover issue cilium#21954, but it turned out
the test is buggy and caused a number of CI flakes (cilium#25119).
Consequently, PR cilium#25236 put the test case under quarantine.

This commit removes that problematic test, as the target scenario has
been covered by connectivity test in cilium-cli
(cilium/cilium-cli#1547).

Signed-off-by: Zhichuan Liang <[email protected]>
Signed-off-by: Tam Mach <[email protected]>
julianwiedmann pushed a commit that referenced this pull request Jun 2, 2023
[ upstream commit eb5bf06 ]

The test case was introduced to cover issue #21954, but it turned out
the test is buggy and caused a number of CI flakes (#25119).
Consequently, PR #25236 put the test case under quarantine.

This commit removes that problematic test, as the target scenario has
been covered by connectivity test in cilium-cli
(cilium/cilium-cli#1547).

Signed-off-by: Zhichuan Liang <[email protected]>
Signed-off-by: Tam Mach <[email protected]>
pchaigno pushed a commit that referenced this pull request Jan 8, 2024
[ upstream commit eb5bf06 ]

The test case was introduced to cover issue #21954, but it turned out
the test is buggy and caused a number of CI flakes (#25119).
Consequently, PR #25236 put the test case under quarantine.

This commit removes that problematic test, as the target scenario has
been covered by connectivity test in cilium-cli
(cilium/cilium-cli#1547).

Signed-off-by: Zhichuan Liang <[email protected]>
Signed-off-by: Tam Mach <[email protected]>
Signed-off-by: Michi Mutsuzaki <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/CI Continuous Integration testing issue or flake release-note/ci This PR makes changes to the CI.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants