Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support on-demand hostname for workloads #608

Closed
kdorosh opened this issue Jul 13, 2023 · 7 comments · Fixed by #612
Closed

support on-demand hostname for workloads #608

kdorosh opened this issue Jul 13, 2023 · 7 comments · Fixed by #612
Assignees

Comments

@kdorosh
Copy link
Contributor

kdorosh commented Jul 13, 2023

edit: the idea for the implementation here is inspired by envoyproxy/envoy#20562


old, slightly confusing original context:

it's not completely implemented, but what we do have is wrong. what we have is per-pod hostnames for statefulsets:

ztunnel/src/dns/server.rs

Lines 324 to 331 in 9583efc

// Didn't find a service, try a workload.
if let Some(wl) = state.workloads.find_hostname(&search_name_str) {
return Some(ServerMatch {
server: Address::Workload(Box::new(wl)),
name: search_name,
alias,
});
}

we still need workload entry hostname support, which functions a bit differently.

the original intent behind the workload hostname field was to have a place to represent the WorkloadEntry.address field, for example:

apiVersion: networking.istio.io/v1alpha3
kind: WorkloadEntry
metadata:
  name: details-svc
spec:
  # use of the service account indicates that the workload has a
  # sidecar proxy bootstrapped with this service account. Pods with
  # sidecars will automatically communicate with the workload using
  # istio mutual TLS.
  serviceAccount: details-legacy
  address: vm1.vpc01.corp.net # this would be the hostname on the workload we resolve via async polling
  labels:
    app: details-legacy
    instance-id: vm1

The new proposal after headless services support in ztunnel is to add a new async_hostname field for the WE hostname behavior.

The address field in the WorkloadEntry above translates to an envoy DNS cluster, and the workload hostname field in ztunnel was originally intended for this. Note that this means the DNS "cluster" in ztunnel here should use the system forwarder (any address field that's only in istio's DNS cache should return nxdomain to avoid potential looping DNS queries).

it's important we have some way for ztunnel to do DNS resolution as a client for workloads for geographic DNS resolution, e.g. https://github.com/istio/istio/blob/7bf52db38c91c89a4d56d6a94f402fd9ddaf6465/pilot/pkg/serviceregistry/kube/conversion.go#L151-L155

also note that the ztunnel DNS "cluster" implementation should likely mirror envoy's implementation as closely as possible to keep semantics the same for users. this also means we don't just want this lookup available for DNS proxy, but also need this to work during the request path (e.g. a service entry se.istio.io resolves to a workload with hostname aws.elb.com which resolves to final dest) by pre-resolving the provided hostnames asynchronously

cc @nmittler

kdorosh referenced this issue Jul 13, 2023
Additionally, this adds support for DNS resolution for workload hostnames.
@hzxuzhonghu
Copy link
Member

Why donot let the upstream dns resolve the aws.elb.com and ztunnel dns no need handle it at all

@kdorosh
Copy link
Contributor Author

kdorosh commented Jul 14, 2023

Please correct me if I'm understanding your suggestion wrong, I will answer to the best of my understanding.

I think you're asking: "why don't you just forward a regular request upstream and do DNS on the request path as usual?"

My answer:

  1. I want the implementation and semantics to match sidecar as much as possible. To this end, async resolution of DNS of the request path is a closer to original behavior, and better prepares us to support the same APIs in sidecar+ambient in the future. For example, we also may need to eventually support more complex DNS flows like logical vs strict DNS depending on what kind of workflows are behind a WorkloadEntry
  2. The async DNS resolution should improve request time to these workloads rather than doing a normal flow.

@nmittler
Copy link
Contributor

@kdorosh

FYI, here's the tracking issue for workload hostnames: istio/istio#45895

the intent behind the workload hostname field was to have a place to represent the WorkloadEntry.address field

The goal of workload hostname task I posted above is to specifically address the case of statefulsets and to allow a client to access an endpoint for a headless service directly. Any other hostnames will just go to the upstream DNS resolver.

I'm open to the idea of also supporting WE. But, perhaps we should address it separately?

The async DNS resolution should improve request time to these workloads rather than doing a normal flow.

Sorry, what does async DNS resolution have to do with any of this?

FYI @howardjohn

@kdorosh
Copy link
Contributor Author

kdorosh commented Jul 14, 2023

ah, now I see what you mean. I think we're conflating per-pod hostnames for statefulsets vs workload entry hostnames in the ambient workload xds design.

I understand what you're doing here for statefulsets. We could also represent the statefulset per-pod hostname as actual services that just select one workload.. but then we have a service per workload when we could have optimized like you did.

The way you did this, however, hijacked the field that was originally meant for WorkloadEntry addresses that are hostnames per the original design doc. So I do see this as a separate issue; we can leave hostname as it is but then we will need a new field on the Workload, say async_hostname for async DNS-resolved workloads that were derived from WorkloadEntry.

Sorry, what does async DNS resolution have to do with any of this?

see #608 (comment). in particular:

I want the implementation and semantics to match sidecar as much as possible. To this end, async resolution of DNS of the request path is a closer to original behavior, and better prepares us to support the same APIs in sidecar+ambient in the future. For example, we also may need to eventually support more complex DNS flows like logical vs strict DNS depending on what kind of workflows are behind a WorkloadEntry

There are good reasons to support different connection semantics for ztunnel as a DNS client, roughly enumerated in https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/service_discovery

I'm happy to rename this issue to: "support hostname for workload entry" and update the issue to reflect we need a new hostname field on the Workload.

also happy to take this conversation to slack if you think that would help clear things up

@kdorosh kdorosh self-assigned this Jul 14, 2023
@nmittler
Copy link
Contributor

I'm happy to rename this issue to: "support hostname for workload entry" and update the issue to reflect we need a new hostname field on the Workload.

Yeah I think that makes sense.

@nmittler
Copy link
Contributor

nmittler commented Jul 14, 2023

@kdorosh I've also renamed my issue to clarify that it's addressing pod hostnames (as opposed to WE).

@kdorosh kdorosh changed the title workload hostname is "implemented" wrong support hostname for workload entry Jul 14, 2023
@kdorosh kdorosh changed the title support hostname for workload entry support on-demand hostname for workloads Jul 19, 2023
@kdorosh
Copy link
Contributor Author

kdorosh commented Jul 19, 2023

the idea for the implementation here is inspired by envoyproxy/envoy#20562

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants