Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CFP-25694: Move cilium/cilium-cli code into cilium/cilium repository #9
CFP-25694: Move cilium/cilium-cli code into cilium/cilium repository #9
Changes from 2 commits
bf48480
64b82c0
d7b9338
3b5b719
5bd1494
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cilium-CLI for PRs uses a helm chart and docker images to deploy Cilium CLI into the cluster, then runs it from there right? Looking at this.
Could we model the cilium/cilium CI like this instead, and autopublish CLI images for every PR or every main commit... then provide a way for cilium/cilium PRs to point to those CLI dev images?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taking a step back, I guess there's two cases, right? There's:
For (1), it's helpful to be able to point arbitrary cilium/cilium PRs towards arbitrary cilium/cilium-cli versions, so that you can co-develop the tests and functionality. Some background here, I would hope that 80%+ of the coverage is already provided by unit tests and module tests in cilium/cilium, so the test changes in cilium-CLI are for the remaining e2e & integration testing. Those tests should be able to autodetect whatever functionality they need in the cluster so that they can be skipped when running against older Cilium versions that don't have the requisite support. While we do need to integrate such tests into cilium/cilium CLI, I am not sure I necessarily see the need for this to be tightly coupled with the implementation that goes into cilium/cilium.
For (2), the existing process to release the new version and rely on renovate seems reasonable enough to me. If we think it needs to happen faster, then someone can post the PR directly instead of renovate, or otherwise we'll get to it when we get to it (and take advantage of the automation that removes some of the manual work there).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My recollection of the origins of Cilium CLI in general was that it's intended to focus on the user experience in order to simplify installing, observing, maintaining live Cilium clusters. Tools like
cilium connectivity test
were conceived as a sanity check for user environments so we can quickly see whether a prod environment is healthy or not.Over time, given that this framework provided mechanisms to develop new tests without the baggage of ginkgo and jenkins, I think it became an attractive place for Cilium developers to expand testing. Now we're at several minutes to assess the cluster sanity, and it's doing a lot of advanced feature testing. We're talking about reconfiguring the target environment to perform some of this testing, and starting to hide such "extra test" options behind special flags to avoid impacting prod environments.
I have a bit of concern that over time we're losing sight of the original UX-focused goals of Cilium-CLI and turning it into the Cilium developer CI tool instead. On the way on that path, we'll simplify some Cilium developer workflows (ideal) but also potentially raise the barrier to entry for cilium-cli contribution (not ideal).
Should the tests for advanced Cilium functionality exist in cilium/cilium or cilium/cilium-cli? Well, probably cilium/cilium. Does that mean that all of cilium-cli should then go into cilium/cilium? I'm not so sure. I'd hope that most testing can be done in unit or module / cell level tests directly in the cilium/cilium tree, rather than being e2e. Even if we do need e2e tests for some of the new functionality, is the user-facing tool the right place to implement a lot of that advanced testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we already do that, sort of. you can specify a commit SHA of cilium-cli here: https://github.com/cilium/cilium/blob/6d4b2f7092d634f67c6192c85143051c49846245/.github/workflows/conformance-aks.yaml#L74. you just need to edit these workflow files. i guess some people find this step more annoying than others.
i think this is the key question.
cilium connectivity test
unintentionally became "the" place to add e2e tests. maybe that's where things started going wrong 🔥 cc @brb.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gandro had another idea: use cilium-cli's
Hook
interface https://github.com/cilium/cilium-cli/blob/main/internal/cli/cmd/hooks.go. this might make more sense if we want to keep cilium-cli relatively lean, and at the same time let cilium developers use the samecilium connectivity test
framework to add advanced e2e tests.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this - cilium/cilium#25228. However, bunch of workflows
# pull_request: {}
need to be uncommented though to make changes effective.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We started adding more tests to the CLI connectivity suite, because otherwise we would have reinvented the CLI connectivity suite for e2e tests (as part of the Hyper Jump project). There was an attempt to introduce the K8s E2E testing fw, but we quickly realized that it's an tremendous effort, and the CLI already provides all the required constructs. Also, the CLI connectivity suite was executed by different GH workflows running in different environments. So adding a new test meant that it will be executed e.g., on EKS.
To address the long runtime concern - we will add a flag to prevent from inclusion more advanced tests which take time. However, I find it useful that users have means to do an extensive testing in their clusters.
Unfortunately, it's not possible for many features which involve interactions between many subsystems. I doubt that having Egress GW tests in a form of unit / control-plane would give us enough confidence.
In the end I am fine with any approach re cilium-cli (i.e., merge or having a
Hook
), as longs as test files live in cilium/cilium, and thus get executed for each PR.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, everything which gets tested after a merge usually involves a few folks spending their cycles instead of a single PR owner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of the backporting process?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are more details in "Impact: Backporting to stable branches" section. let me know if it makes sense 👀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I missed the
since
in the middle there. I think the goal in that case is "Continue to test older Cilium versions against new versions of the CLI". The extra backporting overhead is a logical consequence of the proposal.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which version of Cilium CLI will these tests use?
Note there may also be security considerations particularly for cloud credential cases. I believe we have mitigations in place for this threat model already in cilium-cli which we may need to import.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these tests build cilium-cli from the tree.