Add API spec for FQDN selector #200
Conversation
Gave this a quick pass and I'll give it another one this week. Thanks @rahulkjoshi!
/ok-to-test
PR needs rebase.
Force-pushed from 658a1bb to 33b7833.
LGTM
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: Dyanngg, rahulkjoshi.
Just a few minor questions, otherwise great job @rahulkjoshi!
record](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#services)
because the generated DNS records contain a list of all the Pod IPs that
back the service.
* ❌ Not Supported:
Just to add another gotcha: we had a user who used DNS load balancing to load balance to a set of external IPs, each in a separate k8s cluster. They then tried to use DNS policy with that; it would work when a pod connected to another cluster but not the local cluster because kube-proxy was short-circuiting the load balancer and handling the external IP in-cluster.
User thought they were using an external load balancer but that load balancer used external IPs under the covers. Very confusing!
I love these stories :D
So did the problem happen because DNS matching happened after service load balancing? In other words, would matching network policy before service resolution solve this problem?
Perhaps, but then it'd be very hard to interleave normal and DNS policy in anything that resembles iptables. The DNS policy would happen "first" in order to see the pre-DNAT IPs but what if a higher priority normal policy should have dropped that traffic? BPF dataplanes that do policy and NAT can keep the pre-DNAT information around to match on at the policy stage.
This is similar to a discussion we had about the egress CIDRs @tssurya
I forget if we documented the decision, but I think we concluded that trying to select traffic heading to service VIPs or service LoadBalancer IPs is not guaranteed to work. I think that same restriction can be applied here.
// "*" matches 0 or more DNS valid characters (except for "."), and may | ||
// occur anywhere in the pattern. | ||
// Examples: | ||
// - `*.kubernetes.io` matches subdomains of kubernetes.io at that level. |
I asked my colleagues on our solutions team for examples of matching multiple domain segments; we do have customers that use `*` to match multiple parts of a prefix. Some examples they were able to share:

- `<team>.<platform>.example.com` matched with `*.example.com`
- `<sessionID>.<session>.<random>.<region>.cache.amazonaws.com` matched with `*.cache.amazonaws.com`

I think the AWS one would be `*.*.*.*.cache.amazonaws.com` in the proposed spec, which feels a bit messy.
Very nice examples, thank you for gathering them!

I am getting more convinced every day that `*` should match any number of labels, unless someone can point out security/performance/etc. concerns with doing that.
I'm ok with that -- I can see CDNs getting pretty deeply nested, so having `*` match multiple labels should be ok.

I don't imagine this should have any performance implications. Security-wise, you sacrifice some explicitness. But on the other hand, when you allow `*.domain.com`, you've already indicated you trust the owner of `domain.com`.
Perhaps this can be useful, if only for some additional perspectives.

In cilium/cilium#22081, I've been discussing this topic with maintainers of Cilium (with regard to `CiliumNetworkPolicy`). Though still not decided or implemented, we've been leaning towards `*` for single-level subdomain matching and `**` for multi-level subdomain matching.

Personally I don't see any problem with `*` matching multiple subdomains, as you have proposed here. It's more intuitive 👍
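To make the two interpretations concrete, here is a small Go sketch (an editor's illustration, not part of the proposal) that compiles a wildcard pattern into a regular expression either way: `*` matching exactly one label versus `*` matching one or more labels.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// compilePattern turns a domain pattern such as "*.cache.amazonaws.com" into
// a regexp. If multiLabel is true, "*" matches one or more labels; otherwise
// it matches exactly one label. Illustrative only, not the proposed semantics.
func compilePattern(pattern string, multiLabel bool) *regexp.Regexp {
	label := `[a-z0-9]([-a-z0-9]*[a-z0-9])?`
	wildcard := label
	if multiLabel {
		wildcard = label + `(\.` + label + `)*`
	}
	parts := strings.Split(pattern, ".")
	for i, p := range parts {
		if p == "*" {
			parts[i] = wildcard
		} else {
			parts[i] = regexp.QuoteMeta(p)
		}
	}
	return regexp.MustCompile(`^` + strings.Join(parts, `\.`) + `$`)
}

func main() {
	name := "sess123.abc.xyz.us-east-1.cache.amazonaws.com"
	single := compilePattern("*.cache.amazonaws.com", false)
	multi := compilePattern("*.cache.amazonaws.com", true)
	fmt.Println(single.MatchString(name)) // false: "*" covers only one label
	fmt.Println(multi.MatchString(name))  // true: "*" spans multiple labels
}
```

With the multi-label reading, `*.cache.amazonaws.com` covers the deeply nested AWS example above without resorting to `*.*.*.*.cache.amazonaws.com`.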
Force-pushed from f28f400 to 54f2331.
// +optional
// +listType=set
// +kubebuilder:validation:MinItems=1
Domains []Domain `json:"domains,omitempty"`
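For context, here is a minimal sketch of how the quoted field might sit in a surrounding peer type. The `FQDNPeer` struct name, the `Domain` type definition, and the validation pattern below are illustrative assumptions, not the proposed spec.

```go
// Domain is a DNS name or wildcard pattern. The validation regex here is an
// assumed placeholder, not the one from the proposal.
// +kubebuilder:validation:Pattern=`^(\*|[a-z0-9]([-a-z0-9]*[a-z0-9])?)(\.(\*|[a-z0-9]([-a-z0-9]*[a-z0-9])?))+$`
type Domain string

// FQDNPeer (hypothetical name) selects egress traffic by the domain names
// the destination IPs were resolved from.
type FQDNPeer struct {
	// Domains is the list of DNS names or wildcard patterns to match.
	// +optional
	// +listType=set
	// +kubebuilder:validation:MinItems=1
	Domains []Domain `json:"domains,omitempty"`
}
```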
In the meeting @npinaeva proposed `DomainName` here; regardless, Antonio suggested we always reference an RFC for decisions like this.
https://datatracker.ietf.org/doc/html/rfc1034 S3.1 is the relevant one:

> A domain is identified by a domain name, and consists of that part of
> the domain name space that is at or below the domain name which
> specifies the domain.

So a Domain is a subtree of the domain name space, whereas a Domain Name is the textual name of one particular node in that tree. Not sure that gives a clear answer to our naming dilemma though! `*` patterns represent subtrees, while FQDNs represent single nodes in the tree. Perhaps `DomainNamePatterns` conveys exactly what our field represents (but I think `Domains` and `DomainNames` are both defensible/attackable depending on how closely you read the RFC 😆)?

The "." suffix is also explained in that section. A "." at the end of a domain name means it is absolute, whereas no "." means it is relative to the local domain and the DNS resolver should add suffixes as needed.
I think a domain name with `*` is still considered a domain name. I read the RFC as "a domain name is a string; a domain is an internal structure of the domain name space".

If we read closer to the wildcards definition in S4.3.3, it says:

> In the previous algorithm, special treatment was given to RRs with owner
> names starting with the label "*".

and S3.6 defines:

> When we talk about a specific RR, we assume it has the following:
> owner - which is the domain name where the RR is found.

So the owner name is the domain name, and the owner name may start with `*`, which makes it a valid domain name?

Reading RFCs is fun :D
They do say that the owner is a domain (!) but they also say:

> If any match, copy them into the answer
> section, but set the owner of the RR to be QNAME, and
> not the node with the "*" label.

i.e. don't send `*` on the wire, and:

> The owner name of the wildcard RRs is of
> the form "*.<anydomain>", where <anydomain> is any domain name.
> <anydomain> should not contain other * labels

which implicitly says that a wildcard is not a domain, so it's a bit contradictory!

I think the wildcard comes from implementation concerns: they're really talking about the configuration for a DNS server (and probably a particular one at that) and saying "you'll need wildcards in your implementation to handle mail, and they should work like this to avoid these problems".
1. Each implementation will provide guidance on which DNS name-server is
   considered authoritative for resolving domain names. This could be the
   `kube-dns` Service or potentially some other DNS provider specified in the
   implementation's configuration.
Does it mean that if implementation X considers kube-dns authoritative, but the Pod uses a different nameserver, the policy will not apply?
As long as the two nameservers return the same IPs for domains, it's not a problem. But if kube-dns says the IP is `1.2.3.4`, the implementation only allows `1.2.3.4`. If the pod's nameserver then says `5.6.7.8`, then traffic will be denied.
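A minimal sketch of what that implies for an implementation, assuming purely for illustration that it resolves through a fixed kube-dns address (the ClusterIP below is a made-up example) and only allows the answers it sees:

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

func main() {
	// Resolve only through the nameserver the implementation treats as
	// authoritative. The kube-dns ClusterIP below is a made-up example.
	kubeDNS := "10.96.0.10:53"
	r := &net.Resolver{
		PreferGo: true,
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			return d.DialContext(ctx, network, kubeDNS)
		},
	}

	ips, err := r.LookupHost(context.Background(), "kubernetes.io")
	if err != nil {
		panic(err)
	}

	// Only these answers would be programmed into the dataplane. If the pod's
	// own resolver hands back different addresses, that traffic stays denied.
	allowed := map[string]bool{}
	for _, ip := range ips {
		allowed[ip] = true
	}
	fmt.Println(allowed)
}
```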
1. FQDN policies should not affect the ability of workloads to resolve domains,
   only their ability to communicate with the IP backing them.
   * For example, if a policy allows traffic to `kubernetes.io`, any selected
     Pods can still resolve `wikipedia.org` or
     `my-services.default.svc.cluster.local`, but can not send traffic to them
     unless allowed by a different rule.
I kind of infer that this is based on some implementation details, but it's hard to get from the text why this will happen... Does "ability of workloads to resolve domains" mean filtering DNS traffic or DNS packets? Why should the policy filter things it is not supposed to? How is this different from any other traffic?
I'm trying to clarify that this spec is explicitly not doing DNS filtering. So even though the policy says "allow traffic to kubernetes.io", you can still resolve any domains you like.
I can add a clarification explicitly saying that FQDN policies will not perform DNS Filtering.
* Pods are expected to make a DNS query for a domain before sending traffic
  to it. If the Pod fails to send a DNS request and instead just sends
  traffic to the IP (either because of caching or a static config), traffic
  is not guaranteed to flow.
Is this implementable? This means the implementation has to cache all possible IPs of all the possible domains, no?
* Pods should respect the TTL of DNS records they receive. Trying to
  establish new connection using DNS records that are expired is not
  guaranteed to work.
  * When the TTL for a DNS record expires, the implementor should stop
What TTL is this referring to here? The TTL of the DNS response that the Pod has received?

Does it imply that the implementation always has to sniff the traffic or access the nameserver logs?
OK, I see the table below. Good work with that table, really intuitive.
Yes, we've pretty much agreed that the only way to correctly implement this is via a proxy that listens to DNS responses.
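A rough sketch of the bookkeeping such a proxy might do once it has parsed an answer out of a snooped DNS response; the `record` struct and all values here are stand-ins, not types from any real DNS library.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// record is a stand-in for one A-record answer parsed out of a snooped DNS
// response; it is not a type from any real DNS library.
type record struct {
	Name string
	IP   string
	TTL  time.Duration
}

// allowSet tracks which IPs are currently allowed for a policy's domains,
// expiring each entry when the TTL observed in the DNS response runs out.
type allowSet struct {
	mu      sync.Mutex
	entries map[string]time.Time // IP -> expiry
}

func newAllowSet() *allowSet {
	return &allowSet{entries: map[string]time.Time{}}
}

// observe is called for every answer the proxy sees that matches a policy
// domain; it (re)arms the expiry for that IP.
func (a *allowSet) observe(r record) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.entries[r.IP] = time.Now().Add(r.TTL)
}

// allowed reports whether traffic to ip should currently be permitted.
func (a *allowSet) allowed(ip string) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	expiry, ok := a.entries[ip]
	return ok && time.Now().Before(expiry)
}

func main() {
	s := newAllowSet()
	s.observe(record{Name: "kubernetes.io", IP: "1.2.3.4", TTL: 30 * time.Second})
	fmt.Println(s.allowed("1.2.3.4")) // true while the observed TTL is live
	fmt.Println(s.allowed("5.6.7.8")) // false: never seen in a DNS response
}
```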
The implementer can still deny traffic to `1.2.3.4` because no single
response contained the full chain required to resolve the domain.

## Alternatives
Another alternative is to implement DNS filtering, as Cisco Umbrella (https://docs.umbrella.com/deployment-umbrella/docs/point-your-dns-to-cisco) or DNSFilter do...
/lgtm
Issue #133
Provides the proposed API spec as well as some expected behaviors for both application Pods and ANP implementors to follow.