-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add flag to administratively enable APIs on bootstrap #25009
Add flag to administratively enable APIs on bootstrap #25009
Conversation
/test |
k8s-1.26-kernel-4.19 hit known flake #24453. k8s-1.27-kernel-net-next hit a provisioning error while waiting for the VM to boot. I'll re-run these tests. |
/test-1.26-4.19 |
/test-1.27-net-next |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for API
Signed-off-by: Joe Stringer <[email protected]>
This preparatory commit introduces a new Cell for each Swagger API Specification in order to allow reuse earlier on during the agent initialization / lifecycle. No functional changes intended. Signed-off-by: Joe Stringer <[email protected]>
Use the autogenerated API specifications in order to populate "allowed APIs" flags in the api specification objects for each API. This code will be hooked in for usage in subsequent commits. Signed-off-by: Joe Stringer <[email protected]>
Now that there's dedicated cells to provide the API specs for each API, each server can now directly depend on those rather than doing their own swagger spec instantiation. Signed-off-by: Joe Stringer <[email protected]>
Add new flag(s) to the daemon which restrict the ability for API clients to call certain API endpoints. The new option is an allowlist of Pascalized API endpoints that may be allowed, or optionally a Prefix followed by the '*' character in order to allow a wildcard of API endpoints, for example "*" for all API endpoints, or "Get*" for all GET endpoints. Set the default to allow all ("*") API modifications so that there is no change in behaviour for existing deployments. Signed-off-by: Joe Stringer <[email protected]>
The newly added API access flags are very flexible in the APIs that they allow to be administratively disabled, but there are some options that are really required for any standard Cilium operations. Add some basic sanity checking before applying the user configuration in these cases, as it could lead to unexpected results such as inability to keep the Cilium agent running or inability to deploy new endpoints. Signed-off-by: Joe Stringer <[email protected]>
pkg/api will be used in an upcoming commit, so rename the variable in this function to avoid that conflict. Signed-off-by: Joe Stringer <[email protected]>
Similar to the recent commit to administratively enable/disable support for agent API endpoints, extend this support to the cilium-health API exposed by nodes as well. Signed-off-by: Joe Stringer <[email protected]>
Signed-off-by: Joe Stringer <[email protected]>
For now this is not particularly important as the operator only has a /healthz API to get the health of the operator. However, if we ever extend this API in future then it could become useful to consistently apply API restrictions via the new flag. Signed-off-by: Joe Stringer <[email protected]>
b97e6b9
to
bdf566f
Compare
/test |
Hit #25146 on optional ci-aks run. I'll re-kick it. |
/ci-aks |
OK apparently I can't re-run the job since it's now disabled. It's optional anyway, so we can ignore it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small non-blocking feedback.
"GetConfig", | ||
"GetHealthz", | ||
"PutEndpointID", | ||
"DeleteEndpointID", | ||
"PostIPAM", | ||
"DeleteIPAMIP", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to either leave a comment in the code explaining why each one of these APIs are fundamental and / or leave the reasoning for it in the Warning log message so that users / admins know why it is important.
Also, let's say that the admin removed the liveness probe from the daemonset and therefore doesn't need the GetHealthz
and so it gets administratively disabled. Should we still print a Warning every time Cilium starts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 comments make sense, I can send a follow-up.
Warnings are not perfect, but this is exactly the sort of reason it's a warning and not an error or fatal message. This is a way to let the administrator know that they're running an unconventional setup, and if they know what they're doing and are happy with the consequences then they can ignore it. If you're that far off the beaten path, then I'm OK with having a warning until someone complains that it's a real issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on
#24967.Add a new (hidden) flag,
enable-cilium-api-server-access
, which accepts alist of APIs that the daemon should allow REST API clients to mutate.
This flag can be useful to restrict the sorts of changes that can be
made locally directly from the node or from users with access to the
Cilium container. Many of these have been added over time to assist
development and debugging purposes, and may not be necessary in
production environments.
Set the default to allow "all" API modifications (
*
) so that there is nochange in behaviour for existing deployments.
In order to successfully use this flag, it is likely that at least the
following options must be configured in order to allow cilium-cni to provision
pods, and to ensure that the health of the agent can be assessed by Kubernetes:
GetConfig
GetHealthz
PutEndpointID
DeleteEndpointID
PostIPAM
DeleteIPAMIP
Users may additionally see benefit in providing
Get*
as a wildcard in orderto allow any read-only APIs to be accessed, which can be very useful for
debugging.
Occasionally, additional "flush" (Delete) APIs can also be very useful in order
to mitigate issues at runtime, so this option should not be used unless you
have a pretty strong understanding of the APIs you are likely to rely upon.
For completeness, similar flags are also added for the cilium-health and
operator APIs, though the usefulness there is more limited.
Related docs: #24968