This repository has been archived by the owner on Oct 23, 2024. It is now read-only.

proposal: executors, that publish DiscoveryInfo, should get A and SRV records #233

Open
jdef opened this issue Aug 11, 2015 · 5 comments

jdef commented Aug 11, 2015

this proposal establishes a new namespace for executor-provided services in mesos-dns:

  • {framework}.exec.{domain}

for example, in kubernetes-mesos the executor runs a "kubelet" and a "kube-proxy" process, both of which are shared across all tasks on the slave. each process exposes useful ports. it would be nice to be able to address the services on these ports.

it's important to remember that in kubernetes-world, every task is a pod and so each task is already assigned to its own netns (and gets its own IP address, etc). the scheduler will likely also be dynamically allocating ports for each task on the slave host, and these task ports have nothing to do with dynamically allocated executor ports.

examples of services exposed by kubelet and kube-proxy:

  • api & read-only api
  • health check
  • cadvisor

perhaps one way to generate records for these executor-provided services would be:

  • A :: {framework}.exec.mesos
    • resolves to (multiple) IP addresses, one for each running executor of the framework
  • A :: {ename}.{framework}.exec.mesos
    • resolves to (multiple) IP addresses of executor containers named {ename}
  • A :: eid-{eid-hash}.{framework}.exec.mesos
    • resolves to (unique) IP address of executor container given a specific executor.id
  • SRV :: _{port-name}._{proto}.{framework}.exec.mesos
    • resolves to (multiple) eid-{eid-hash}.{framework}.exec.mesos:{di-port-number}
    • only generated if the requisite DiscoveryInfo is available

where (the following are required):

  • {eid-hash} --> hash-of(ExecutorInfo.id + framework-id + slave-id + other salt); for uniqueness
  • {ename} --> ExecutorInfo.discovery.name, or else ExecutorInfo.name
  • {port-name} --> ExecutorInfo.discovery.ports.ports[x].name
  • {di-port-number} --> ExecutorInfo.discovery.ports.ports[x].number
  • {proto} --> ExecutorInfo.discovery.ports.ports[x].protocol, or else both _tcp and _udp
  • {framework} --> name of the framework instance running on the cluster
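
The naming rules above can be sketched in Go. This is only an illustration of the proposal, not mesos-dns code: the `Executor`, `Port`, and `Record` types and the `records` function are hypothetical, the hash is passed in precomputed, and DNS label sanitization is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Port holds the DiscoveryInfo port fields the proposal relies on (hypothetical type).
type Port struct {
	Name     string
	Number   int
	Protocol string // empty means the protocol was not specified
}

// Executor is a hypothetical, flattened view of ExecutorInfo plus its resolved IP.
type Executor struct {
	Framework     string
	Name          string // ExecutorInfo.name
	DiscoveryName string // ExecutorInfo.discovery.name, may be empty
	EIDHash       string // precomputed {eid-hash}
	IP            string
	Ports         []Port
}

// Record is a simplified DNS record: type, owner name, and target.
type Record struct {
	Type, Name, Target string
}

// records builds the A and SRV records proposed for one executor.
func records(e Executor, domain string) []Record {
	suffix := e.Framework + ".exec." + domain

	// {ename}: prefer the discovery name, or else fall back to ExecutorInfo.name.
	ename := e.DiscoveryName
	if ename == "" {
		ename = e.Name
	}

	host := "eid-" + e.EIDHash + "." + suffix
	recs := []Record{
		{"A", suffix, e.IP},
		{"A", ename + "." + suffix, e.IP},
		{"A", host, e.IP},
	}

	// SRV records exist only when DiscoveryInfo ports are present.
	for _, p := range e.Ports {
		protos := []string{p.Protocol}
		if p.Protocol == "" {
			protos = []string{"tcp", "udp"} // "or else both _tcp and _udp"
		}
		for _, proto := range protos {
			name := fmt.Sprintf("_%s._%s.%s", p.Name, strings.ToLower(proto), suffix)
			recs = append(recs, Record{"SRV", name, fmt.Sprintf("%s:%d", host, p.Number)})
		}
	}
	return recs
}

func main() {
	// All values below are made up for illustration.
	e := Executor{
		Framework: "kubernetes",
		Name:      "kubelet-executor",
		EIDHash:   "abc123",
		IP:        "10.0.0.5",
		Ports:     []Port{{Name: "api", Number: 10250, Protocol: "tcp"}},
	}
	for _, r := range records(e, "mesos") {
		fmt.Printf("%-3s %s -> %s\n", r.Type, r.Name, r.Target)
	}
}
```

An executor without DiscoveryInfo ports yields only the three A records, which matches the "only generated if the requisite DiscoveryInfo is available" rule.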

the above allows us to identify the location of all kubelet API services in the cluster via:

  • _api._tcp.kubernetes.exec.mesos

furthermore, if the hash-of(.. + salt) algorithm is known to the framework, the k8s framework can refer to specific executor instances (aka k8s "nodes") via the executor-id hashed name:

  • eid-{eid-hash}.{framework}.exec.mesos
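
For a framework to compute these names itself, the hash has to be deterministic and agreed upon by both sides. The sketch below is one possible scheme, not something the proposal specifies: SHA-1, hex encoding, and the NUL separator between fields are all assumptions, and `eidHash` and `executorHost` are hypothetical helper names.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"strings"
)

// eidHash derives a DNS-safe label from the identifiers named in the proposal.
// The fields are joined with a NUL byte so that ("ab", "c") and ("a", "bc")
// cannot produce the same input. SHA-1 is an assumed choice of hash function.
func eidHash(executorID, frameworkID, slaveID, salt string) string {
	joined := strings.Join([]string{executorID, frameworkID, slaveID, salt}, "\x00")
	sum := sha1.Sum([]byte(joined))
	return hex.EncodeToString(sum[:]) // 40 hex chars, within the 63-char DNS label limit
}

// executorHost builds the eid-{eid-hash}.{framework}.exec.{domain} name.
func executorHost(hash, framework, domain string) string {
	return "eid-" + hash + "." + framework + ".exec." + domain
}

func main() {
	// The identifiers below are made up for illustration.
	h := eidHash("kubelet-executor", "framework-0001", "slave-S1", "")
	fmt.Println(executorHost(h, "kubernetes", "mesos"))
}
```

Because the same inputs always give the same label, a framework that knows its executor id, framework id, and slave id could derive the host name without querying mesos-dns first.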

(+) this approach is compatible with multiple instances of the same framework in the same cluster, provided that they are registered with different framework names
(+) this approach is compatible with multiple framework executors (for a single framework) running on a single slave, provided that they have unique names
(+) this approach is compatible with the A and SRV record generation semantics recently introduced for the {framework}.slave.mesos namespace, established by #226

/cc @kozyraki @tsenart @sttts

jdef added the PTAL label Aug 11, 2015

jdef commented Aug 11, 2015

xref kubernetes/kubernetes#11224


tsenart commented Aug 12, 2015

This all sounds perfectly reasonable overall. A few questions and remarks:

  1. Is there a technical reason for prefixing the A records with eid? I'd rather have {eid-hash}.{framework}.exec.mesos.
  2. Why do we need an extra salt in {eid-hash}?
  3. I'd expect A::{framework}.exec.mesos to return all executor IPs for a given framework.
  4. Are SRV records skipped when no DI is available?
  5. This question has been itching me: Consul generates SRV records that have the same name as A records, as well as the ones with underscores. Why can't we do the same? https://www.consul.io/docs/agent/dns.html


jdef commented Aug 14, 2015

This all sounds perfectly reasonable overall. A few questions and remarks:

  1. Is there a technical reason for prefixing the A records with eid? I'd rather have {eid-hash}.{framework}.exec.mesos.

initial thinking was that it would help to distinguish between executor names and hashed executor ids. it doesn't really reduce the opportunity for collisions, unless people recognize that they may not want to use the eid- prefix for their executor names.

  2. Why do we need an extra salt in {eid-hash}?

may need salt to get a reasonable bit distribution in the hash function. salt isn't a hard requirement for me.

  3. I'd expect A::{framework}.exec.mesos to return all executor IPs for a given framework.

sounds good to me, i'll update the proposal.

  4. Are SRV records skipped when no DI is available?

yes, i'll make that clear in the proposal.

  5. This question has been itching me: Consul generates SRV records that have the same name as A records, as well as the ones with underscores. Why can't we do the same? https://www.consul.io/docs/agent/dns.html

we probably could. that sounds like another proposal :)


tsenart commented Aug 14, 2015

Thanks for the answers. Regarding the eid- prefix: Is there really a need to distinguish between executor names and executor ids?


jdef commented Aug 14, 2015

Probably doesn't matter to mesos-dns; might matter to humans.

