feat(dns): Expand SRV resolver to account for IPv6 #2864
Conversation
Given the limitations of the dns resolver lib, to retrieve a service IP we currently take the SRV response target, which has the form `<host|IP>.svc.ns.svc.cluster-domain.`, and extract the first segment, replacing dashes with dots. This fails for IPv6 addresses, so in this change we attempt parsing for IPv4 and, if that fails, fall back to IPv6 by instead replacing dots with colons. The `trust-dns-resolver` has been renamed to `hickory-resolver` upstream, so we follow suit and also upgrade the version.
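To make the dash/dot/colon rewriting concrete, here is a minimal sketch of the parsing described above. The helper name `ip_from_srv_target` and the example targets are hypothetical illustrations, not the proxy's actual code:

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

// Take the first DNS label of the SRV target, replace dashes with dots and
// try IPv4; if that fails, replace the dots with colons and try IPv6.
fn ip_from_srv_target(target: &str) -> Option<IpAddr> {
    let first_label = target.split('.').next()?;
    let dotted = first_label.replace('-', ".");
    if let Ok(v4) = dotted.parse::<Ipv4Addr>() {
        return Some(IpAddr::V4(v4));
    }
    dotted
        .replace('.', ":")
        .parse::<Ipv6Addr>()
        .ok()
        .map(IpAddr::V6)
}

fn main() {
    // Example SRV targets (hypothetical, modeled on the form above).
    assert_eq!(
        ip_from_srv_target("10-244-0-7.linkerd-dst-headless.linkerd.svc.cluster.local."),
        "10.244.0.7".parse().ok()
    );
    assert_eq!(
        ip_from_srv_target("fd00-10-244--7.linkerd-dst-headless.linkerd.svc.cluster.local."),
        "fd00:10:244::7".parse().ok()
    );
}
```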
I'm pretty uncomfortable with parsing IPs out of DNS names. I know it's the existing approach that we hacked together years ago. Do we have any idea how hard it is to use DNS to discover the IPs? cc @zaharidichev Also, can you split out the hickory update into a distinct PR (e.g. so that this could be reverted independently)?
So after spending some time looking at that, here are my thoughts:
use hickory_resolver::Resolver;

fn main() {
    let hosts = vec![
        "linkerd-identity-headless.linkerd.svc.cluster.local",
        "linkerd-dst-headless.linkerd.svc.cluster.local",
        "linkerd-policy.linkerd.svc.cluster.local",
    ];
    let resolver = Resolver::from_system_conf().unwrap();
    for host in hosts {
        println!("looking up {}", host);
        let results = resolver.lookup_ip(host).unwrap();
        dbg!(results);
    }
}

zaharidichev@zahari-dev-2:~$ kubectl get pods -n linkerd -owide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
linkerd-destination-59756c78b-jpj2g 4/4 Running 0 45h 10.244.0.7 kind-control-plane <none> <none>
linkerd-destination-59756c78b-zrrmz 4/4 Running 0 20m 10.244.0.10 kind-control-plane <none> <none>
linkerd-identity-b8cf46c8d-687n8 2/2 Running 0 45h 10.244.0.5 kind-control-plane <none> <none>
linkerd-identity-b8cf46c8d-v9lfh 2/2 Running 0 20m 10.244.0.11 kind-control-plane <none> <none>
linkerd-proxy-injector-854f76f9bd-dp46g 2/2 Running 0 45h 10.244.0.6 kind-control-plane <none> <none>
zaharidichev@zahari-dev-2:~$

root@dns-test:/# ./dnstest
looking up linkerd-identity-headless.linkerd.svc.cluster.local
[dnstest/src/main.rs:14:9] results = LookupIp(
Lookup {
query: Query {
name: Name("linkerd-identity-headless.linkerd.svc.cluster.local"),
query_type: A,
query_class: IN,
},
records: [
Record {
name_labels: Name("linkerd-identity-headless.linkerd.svc.cluster.local."),
rr_type: A,
dns_class: IN,
ttl: 30,
rdata: Some(
A(
A(
10.244.0.11,
),
),
),
},
Record {
name_labels: Name("linkerd-identity-headless.linkerd.svc.cluster.local."),
rr_type: A,
dns_class: IN,
ttl: 30,
rdata: Some(
A(
A(
10.244.0.5,
),
),
),
},
],
valid_until: Instant {
tv_sec: 244851,
tv_nsec: 59158692,
},
},
)
looking up linkerd-dst-headless.linkerd.svc.cluster.local
[dnstest/src/main.rs:14:9] results = LookupIp(
Lookup {
query: Query {
name: Name("linkerd-dst-headless.linkerd.svc.cluster.local"),
query_type: A,
query_class: IN,
},
records: [
Record {
name_labels: Name("linkerd-dst-headless.linkerd.svc.cluster.local."),
rr_type: A,
dns_class: IN,
ttl: 30,
rdata: Some(
A(
A(
10.244.0.10,
),
),
),
},
Record {
name_labels: Name("linkerd-dst-headless.linkerd.svc.cluster.local."),
rr_type: A,
dns_class: IN,
ttl: 30,
rdata: Some(
A(
A(
10.244.0.7,
),
),
),
},
],
valid_until: Instant {
tv_sec: 244851,
tv_nsec: 78318836,
},
},
)
looking up linkerd-policy.linkerd.svc.cluster.local
[dnstest/src/main.rs:14:9] results = LookupIp(
Lookup {
query: Query {
name: Name("linkerd-policy.linkerd.svc.cluster.local"),
query_type: A,
query_class: IN,
},
records: [
Record {
name_labels: Name("linkerd-policy.linkerd.svc.cluster.local."),
rr_type: A,
dns_class: IN,
ttl: 30,
rdata: Some(
A(
A(
10.244.0.10,
),
),
),
},
Record {
name_labels: Name("linkerd-policy.linkerd.svc.cluster.local."),
rr_type: A,
dns_class: IN,
ttl: 30,
rdata: Some(
A(
A(
10.244.0.7,
),
),
),
},
],
valid_until: Instant {
tv_sec: 244851,
tv_nsec: 104219797,
},
},
)

So my question is: can't we just get rid of the SRV lookup logic? I mean, sure, we get a port out of these. But do we really need it?
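For reference, a hedged sketch of the corresponding SRV lookup with hickory-resolver, showing that each SRV record carries a port in addition to a target name. The service host is borrowed from the experiment above; this is not the proxy's code:

```rust
use hickory_resolver::Resolver;

fn main() {
    // Assumes the same in-cluster DNS environment as the lookup_ip experiment above.
    let resolver = Resolver::from_system_conf().unwrap();
    let srv = resolver
        .srv_lookup("linkerd-dst-headless.linkerd.svc.cluster.local")
        .unwrap();
    for record in srv.iter() {
        // Each SRV answer names a target (the backing pod's DNS name) and a port;
        // the port is the piece of information a plain lookup_ip does not return.
        println!("target={} port={}", record.target(), record.port());
    }
}
```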
Thanks for the detailed response, @zaharidichev.
My opinion is that it's indeed a bug: I tested using the ip_iter method on the SRV lookup result and it returns only the first IP, contrary to what their docs claim.
I also wonder that, so I'm looking forward to @olix0r's comments. But to clarify my understanding: digging into #594, it seems the reasoning for using SRV was that it gives us the actual pod port, even though for all the control plane services the service port is the same as the pod's, right?
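A minimal sketch of the kind of ip_iter check being described here (host name assumed from the setup above). Per the docs it should yield the IP of every backing pod; the observation above is that only one comes back:

```rust
use hickory_resolver::Resolver;

fn main() {
    // Assumes a headless service backed by two pods, as in the kubectl output above.
    let resolver = Resolver::from_system_conf().unwrap();
    let srv = resolver
        .srv_lookup("linkerd-dst-headless.linkerd.svc.cluster.local")
        .unwrap();
    // ip_iter() collects the A/AAAA data returned alongside the SRV answers.
    let ips: Vec<_> = srv.ip_iter().collect();
    println!("ip_iter returned {} address(es): {:?}", ips.len(), ips);
}
```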
Can we find any docs on Headless services that describe the behavior we can rely on across k8s DNS providers?
So there is this: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#srv-records, which elaborates on SRV records for Headless services. Not sure whether this is something we can rely on, but it is in line with what we are seeing, right? More specifically:
In my last commit I've split out the library change and created #2872 instead.
To summarize the conversation we had last week:
Given the limitations of the dns resolver lib, to retrieve a service IP we currently take the SRV response target, which has the form `<host|IP>.svc.ns.svc.cluster-domain.`, and extract the first segment, replacing dashes with dots. This fails for IPv6 addresses, so in this change we attempt parsing for IPv4 and, if that fails, fall back to IPv6 by instead replacing dots with colons. (The rename of `trust-dns-resolver` to `hickory-resolver` upstream, and the accompanying version upgrade, are dealt with separately in #2872.)