-
-
Notifications
You must be signed in to change notification settings - Fork 323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restore support for env-based API server discovery #1000
Comments
From recent AKS release notes https://github.com/Azure/AKS/releases/tag/2022-07-17:
AKS clusters with egress firewalls basically require the use of the environment variables for discovery :( |
Seems like the important section is https://docs.microsoft.com/en-us/azure/aks/limit-egress-traffic#required-outbound-network-rules-and-fqdns-for-aks-clusters. This seems like a pretty weird way to handle things on their end, but I can see other use cases for letting users override the address. And as you say, it's probably worth at least trying to stay aligned with client-go to minimize confusion. I'd be +1 on reverting #876, but waiting for @clux's input since he seemed to have pretty strong opinions on the topic. |
I opened an issue on AKS Azure/AKS#3183, but that's probably a longer conversation. I'd be willing to get a PR together if we can make it into the next kube release :) |
Well, the problem is that the old method also has problems, so not sure there's many great solutions here for us.
And as long as client-go has this TODO, it's not super tempting to switch our default way back to what is also broken. What about some way to opt-into to old behaviour like some |
The AWS thing looks like an error on our side, Rust's The rustls situation is sadder, but outside of our control (and an opt-in feature). |
@clux Thanks. This is helpful context.
Something like this is probably a good middle-ground. In Linkerd, we could support an install flag that enables this. Alternatively, we could do something more clever (but maybe too clever): only honor fn kube_dns() -> String {
if let Ok(h) = std::env::var("KUBERNETES_SERVICE_HOST") {
if !h.is_empty() && h.parse::<std::net::IpAddr>().is_err() {
return h
}
}
"kubernetes.default.svc".to_string()
} |
This feels waaaay too weird and magical IMO. Maybe if we emit a loud warning when ignoring the address, but that would give 99% of users a warning for a non-issue that they have no control over. |
I also opened kubernetes/kubernetes#112263 to discuss changing client-go's behavior. |
Yeah, same. Given this is legacy behaviour, only affects a smaller number of users, and can be reasonably opted in by Linkerd, it is preferable for us to have it be an explicit opt-in in our codebase about what this is meant to do - and that it is not intended to stick around (unless client-go does a 180 at some point). Logging a message for using non-standard Kubernetes behaviour when activating this code path is probably a good idea as well, but it can be less loud when it's opt-in. (The IPv6 issue could also be fixed, but when we tried that fix it, we ran into cert validation issues, so making it opt-in avoids us spending time on that.) |
Those upstream issues are great. Thanks a lot! |
The `Config::from_cluster_env` constructor is misleadingly named: it doesn't use the environment, it uses the default cluster configurations. This change deprecates the `Config::from_cluster_env` constructor in favor of `Config::load_in_cluster`. An additional constructor, `Config::load_in_cluster_from_legacy_env`, uses the `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` environment variables to match client-go's behavior. This changes does NOT alter the default inferred configuration in any way. It simply allows users to opt-in to using the old behavior. Related to kubernetes/kubernetes#112263 Closes kube-rs#1000 Signed-off-by: Oliver Gould <[email protected]>
The `Config::from_cluster_env` constructor is misleadingly named: it doesn't use the environment, it uses the default cluster configurations. This change deprecates the `Config::from_cluster_env` constructor in favor of `Config::load_in_cluster`. An additional constructor, `Config::load_in_cluster_from_legacy_env`, uses the `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` environment variables to match client-go's behavior. This changes does NOT alter the default inferred configuration in any way. It simply allows users to opt-in to using the old behavior. Related to kubernetes/kubernetes#112263 Closes kube-rs#1000 Signed-off-by: Oliver Gould <[email protected]>
#1001 adds a separate config constructor; but kubernetes/kubernetes#112263 (comment) suggests that client-go's behavior may be "correct." If that's the case, #1001 at least fixes host/port formatting for IPv6; and we should probably try to fix the https://github.com/rustls/webpki/ is a recent fork of briansmith/webpki that will be maintained by the |
* client: Expose a Config constructor to support legacy configurations The `Config::from_cluster_env` constructor is misleadingly named: it doesn't use the environment, it uses the default cluster configurations. This change deprecates the `Config::from_cluster_env` constructor in favor of `Config::load_in_cluster`. An additional constructor, `Config::load_in_cluster_from_legacy_env`, uses the `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` environment variables to match client-go's behavior. This changes does NOT alter the default inferred configuration in any way. It simply allows users to opt-in to using the old behavior. Related to kubernetes/kubernetes#112263 Closes #1000 Signed-off-by: Oliver Gould <[email protected]> * constify "https" scheme to make accidental "http" harder Signed-off-by: Oliver Gould <[email protected]> * Restore `Config::from_cluster_env` naming Add `Config::from_cluster_dns` to support the current behavior. Signed-off-by: Oliver Gould <[email protected]> * Disable the in-cluster rustls test Signed-off-by: Oliver Gould <[email protected]> * fix typo Signed-off-by: Oliver Gould <[email protected]> * client: Make discovery conditional on the TLS impl When `rustls-tls` is enabled, the `kubernetes.default.svc` DNS name is used. Otherwise, the `KUBERNETES_SERVICE_{HOST,PORT}` environment variables are used. Signed-off-by: Oliver Gould <[email protected]> * Review feedback * Make `Config::incluster_env` and `Config::incluster_dns` public regardless of what features are enabled. * Restrict visibility for `pub` helpers that are not actually publicly exported. Signed-off-by: Oliver Gould <[email protected]> * Add URI-formatting tests Signed-off-by: Oliver Gould <[email protected]> * fmt Signed-off-by: Oliver Gould <[email protected]> Signed-off-by: Oliver Gould <[email protected]>
Current and expected behavior
In Linkerd, we have a report that dd0b258 breaks discovery for some configurations of Azure clusters. I'm not 100% clear on the details, but it sounds like they can't rely on the default
kubernetes
service.Given that client-go still honors these environment variables, this manifests as only the Rust controllers being unable to communicate with the API server.
Possible solution
In the interest of compatibility (i.e., with client-go's behavior), I think it would be best to restore support for the environment variables.
If we really don't want to continue to support the environment variables, we should probably engage client-go about removing support for the environment variables so that there's really no expectation that they can be used.
Additional context
No response
Environment
...
Configuration and features
...
Affected crates
kube-client
Would you like to work on fixing this bug?
maybe
The text was updated successfully, but these errors were encountered: