
Add API spec for FQDN selector #200

Merged (1 commit) on May 27, 2024
Conversation

rahulkjoshi (Contributor):

Issue #133

Provides the proposed API spec as well as some expected behaviors for both application Pods and ANP implementors to follow.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 22, 2024
@k8s-ci-robot (Contributor):

Hi @rahulkjoshi. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot requested a review from Dyanngg February 22, 2024 17:54
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 22, 2024
netlify bot commented Feb 22, 2024:

Deploy Preview for kubernetes-sigs-network-policy-api ready!

| Name | Link |
|------|------|
| 🔨 Latest commit | 0a56a07 |
| 🔍 Latest deploy log | https://app.netlify.com/sites/kubernetes-sigs-network-policy-api/deploys/664cc5531954c60008ea6bb8 |
| 😎 Deploy Preview | https://deploy-preview-200--kubernetes-sigs-network-policy-api.netlify.app |

@astoycos (Member) left a comment:


Gave this a quick pass and I'll give it another one this week. Thanks @rahulkjoshi!

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 29, 2024
@tssurya (Contributor) commented Mar 5, 2024:

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 5, 2024
@k8s-ci-robot (Contributor):

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 6, 2024
@rahulkjoshi rahulkjoshi force-pushed the main branch 2 times, most recently from 658a1bb to 33b7833 Compare March 6, 2024 18:32
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 7, 2024
@Dyanngg (Contributor) left a comment:


LGTM

@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Dyanngg, rahulkjoshi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 26, 2024
@astoycos (Member):

Just a few minor questions, otherwise great job @rahulkjoshi!

```
record](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#services)
because the generated DNS records contain a list of all the Pod IPs that
back the service.
* ❌ Not Supported:
```
A reviewer commented:

Just to add another gotcha: we had a user who used DNS load balancing to load balance to a set of external IPs, each in a separate k8s cluster. They then tried to use DNS policy with that; it would work when a pod connected to another cluster but not the local cluster because kube-proxy was short-circuiting the load balancer and handling the external IP in-cluster.

User thought they were using an external load balancer but that load balancer used external IPs under the covers. Very confusing!

A member replied:

I love these stories :D
So did the problem happen because DNS matching happened after service load balancing? IOW, would matching network policy before service resolution solve this problem?

The reviewer replied:

Perhaps, but then it'd be very hard to interleave normal and DNS policy in anything that resembles iptables. The DNS policy would happen "first" in order to see the pre-DNAT IPs but what if a higher priority normal policy should have dropped that traffic? BPF dataplanes that do policy and NAT can keep the pre-DNAT information around to match on at the policy stage.

The PR author replied:

This is similar to a discussion we had about the egress CIDRs @tssurya

I forget if we documented the decision, but I think we concluded that trying to select traffic heading to service VIPs or service LoadBalancer IPs is not guaranteed to work. I think that same restriction can be applied here.

// "*" matches 0 or more DNS valid characters (except for "."), and may
// occur anywhere in the pattern.
// Examples:
// - `*.kubernetes.io` matches subdomains of kubernetes.io at that level.
A reviewer commented:

I asked my colleagues on our solutions team for examples of matching multiple domain segments; we do have customers that use * to match multiple parts of a prefix. Some examples they were able to share:

  • <team>.<platform>.example.com matched with *.example.com
  • <sessionID>.<session>.<random>.<region>.cache.amazonaws.com matched with *.cache.amazonaws.com

I think the AWS one would be *.*.*.*.cache.amazonaws.com in the proposed spec, which feels a bit messy.

A member replied:

Very nice examples, thank you for gathering them!
I am getting more convinced every day that * should match any number of labels, unless someone can point out security/performance/etc. concerns about doing that.

The PR author replied:

I'm OK with that -- I can see CDNs getting pretty deeply nested, so having * match multiple labels should be fine.

I don't imagine this should have any performance implications. Security-wise, you sacrifice some explicitness, but on the other hand, when you allow *.domain.com, you've already indicated that you trust the owner of domain.com.
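To make the two interpretations in this thread concrete, here is a minimal Go sketch, illustrative only and not part of the spec (the multi-label variant only handles a leading `*.`):

```go
package main

import (
	"fmt"
	"strings"
)

// matchesSingleLabel treats "*" as exactly one DNS label, so
// "*.example.com" matches "a.example.com" but not "a.b.example.com".
func matchesSingleLabel(pattern, name string) bool {
	p := strings.Split(pattern, ".")
	n := strings.Split(name, ".")
	if len(p) != len(n) {
		return false
	}
	for i := range p {
		if p[i] != "*" && p[i] != n[i] {
			return false
		}
	}
	return true
}

// matchesMultiLabel treats a leading "*." as one or more labels, so
// "*.example.com" also matches "a.b.example.com" (but not the apex
// "example.com" itself).
func matchesMultiLabel(pattern, name string) bool {
	suffix, ok := strings.CutPrefix(pattern, "*.")
	if !ok {
		return pattern == name
	}
	return strings.HasSuffix(name, "."+suffix)
}

func main() {
	q := "a.b.c.d.cache.amazonaws.com"
	fmt.Println(matchesSingleLabel("*.cache.amazonaws.com", q)) // false
	fmt.Println(matchesMultiLabel("*.cache.amazonaws.com", q))  // true
}
```

Under the single-label reading, the AWS example above needs *.*.*.*.cache.amazonaws.com; under the multi-label reading, *.cache.amazonaws.com suffices.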

@illrill commented Apr 29, 2024:

Perhaps this can be useful, if only for some additional perspective.

In cilium/cilium#22081, I've been discussing this topic with maintainers of Cilium (with regards to CiliumNetworkPolicy). Though still not decided or implemented, we've been leaning towards * for single-level subdomain matching and ** for multi-level subdomain matching.

Personally I don't see any problem with * matching multiple subdomains, as you have proposed here. It's more intuitive 👍

@rahulkjoshi rahulkjoshi force-pushed the main branch 2 times, most recently from f28f400 to 54f2331 Compare April 18, 2024 15:51
```go
// +optional
// +listType=set
// +kubebuilder:validation:MinItems=1
Domains []Domain `json:"domains,omitempty"`
```
A member commented:

In the meeting, @npinaeva proposed DomainName here; regardless, Antonio suggested we always reference an RFC for decisions like this.

A reviewer replied:

https://datatracker.ietf.org/doc/html/rfc1034 S3.1 is the relevant one:

> A domain is identified by a domain name, and consists of that part of
> the domain name space that is at or below the domain name which
> specifies the domain.

So a Domain is a subtree of domain space whereas a Domain Name is the textual name of one particular node in that tree. Not sure that gives a clear answer to our naming dilemma though! * patterns represent subtrees, FQDNs represent single nodes in the tree! Perhaps DomainNamePatterns conveys exactly what our field represents (but I think Domains and DomainNames are both defensible/attackable depending on how closely you read the RFC 😆 )?

The "." suffix is also explained in that section. A "." at the end of the domain name means it is absolute, whereas no "." means that it is relative to the local domain and the DNS resolver should add suffixes as needed.

A member replied:

I think a domain name with * is still considered a domain name. I read the RFC as "a domain name is a string; a domain is an internal structure of the domain name space".
If we read closer to the wildcard definition in S4.3.3, it says:

> In the previous algorithm, special treatment was given to RRs with owner
> names starting with the label "*".

S3.6 defines:

> When we talk about a specific RR, we assume it has the following:
> owner - which is the domain name where the RR is found.

So the owner name is the domain name, and the owner name may start with *, which makes it a valid domain name?
Reading RFCs is fun :D

The reviewer replied:

They do say that the owner is a domain(!) but they also say:

> If any match, copy them into the answer
> section, but set the owner of the RR to be QNAME, and
> not the node with the "*" label.

i.e. don't send * on the wire, and:

> The owner name of the wildcard RRs is of
> the form "*.<anydomain>", where <anydomain> is any domain name.
> <anydomain> should not contain other * labels

which implicitly says that a wildcard is not a domain, so it's a bit contradictory!

I think the wildcard comes from implementation concerns; they're really talking about the configuration for a DNS server (and probably a particular one at that) and saying "you'll need wildcards in your implementation to handle mail and they should work like this to avoid these problems".
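Whichever name wins (Domain, DomainName, or DomainNamePattern), a minimal sketch of how the element type behind the domains field above could be declared follows; the type name and validation regex are illustrative assumptions, not the merged API:

```go
// Domain is an FQDN or a wildcard pattern such as "*.kubernetes.io".
// An optional trailing "." marks the name as absolute, per RFC 1034 S3.1.
// (Illustrative sketch only; the merged NPEP defines the authoritative
// type name and validation.)
//
// +kubebuilder:validation:Pattern=`^(\*\.)?([a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])?\.)*[a-zA-Z]([-a-zA-Z0-9]*[a-zA-Z0-9])?\.?$`
type Domain string
```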

Comment on lines +255 to +265
```
1. Each implementation will provide guidance on which DNS name-server is
   considered authoritative for resolving domain names. This could be the
   `kube-dns` Service or potentially some other DNS provider specified in the
   implementation's configuration.
```
A contributor commented:

Does it mean that if implementation X considers kube-dns authoritative but the Pod uses a different nameserver, the policy will not apply?

The PR author replied:

As long as the two nameservers return the same IPs for domains, it's not a problem. But if kube-dns says the IP is 1.2.3.4, the implementation only allows 1.2.3.4. If the Pod's nameserver instead says 5.6.7.8, traffic will be denied.
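A contrived sketch of that failure mode (hypothetical values throughout): the dataplane admits only IPs it has observed in answers from the resolver it treats as authoritative, so an answer from any other resolver never enters the allow-set.

```go
package main

import "fmt"

func main() {
	// Allow-set built only from answers the implementation observed
	// from the resolver it treats as authoritative (e.g. kube-dns).
	allowed := map[string]bool{}

	// kube-dns answered "example.com -> 1.2.3.4"; that IP is admitted.
	allowed["1.2.3.4"] = true

	// The Pod's private resolver answered "example.com -> 5.6.7.8",
	// but the implementation never saw that response:
	fmt.Println(allowed["1.2.3.4"]) // true:  traffic flows
	fmt.Println(allowed["5.6.7.8"]) // false: traffic is denied
}
```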

Comment on lines 249 to 261
```
1. FQDN policies should not affect the ability of workloads to resolve domains,
   only their ability to communicate with the IP backing them.
   * For example, if a policy allows traffic to `kubernetes.io`, any selected
     Pods can still resolve `wikipedia.org` or
     `my-services.default.svc.cluster.local`, but can not send traffic to them
     unless allowed by a different rule.
```
A contributor commented:

I kind of infer that this is based on some implementation details, but it is hard to get from the text why this will happen... Does "ability of workloads to resolve domains" mean filtering DNS traffic or DNS packets? Why should the policy filter things it is not supposed to? How is this different from any other traffic?

The PR author replied:

I'm trying to clarify that this spec is explicitly not doing DNS filtering. So even though the policy says "allow traffic to kubernetes.io", you can still resolve any domain you like.

I can add a clarification explicitly saying that FQDN policies will not perform DNS filtering.

Comment on lines +260 to +270
```
* Pods are expected to make a DNS query for a domain before sending traffic
  to it. If the Pod fails to send a DNS request and instead just sends
  traffic to the IP (either because of caching or a static config), traffic
  is not guaranteed to flow.
```
A contributor commented:

Is this implementable? This means the implementation has to cache all possible IPs of all the possible domains, no?

```
* Pods should respect the TTL of DNS records they receive. Trying to
  establish new connection using DNS records that are expired is not
  guaranteed to work.
* When the TTL for a DNS record expires, the implementor should stop
```
@aojea (Contributor) commented Apr 26, 2024:

What TTL is this referring to here? The TTL of the DNS response that the Pod received?
Does it imply that the implementation always has to sniff the traffic or access the nameserver logs?

The contributor added:

OK, I see the table below. Good work with that table, really intuitive.

The PR author replied:

Yes, we've pretty much agreed that the only way to correctly implement this is via a proxy that listens to DNS responses.
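As a rough sketch of that approach (all type and function names below are hypothetical, not from the NPEP): for every answer in a snooped DNS response whose name matches an FQDN rule, admit the resolved IP until the record's TTL lapses, matching the expiry behavior quoted above.

```go
package main

import (
	"fmt"
	"net/netip"
	"sync"
	"time"
)

// answer models one A record pulled from a snooped DNS response whose
// name already matched an FQDN policy rule (matching omitted here; see
// the wildcard sketch earlier in the thread).
type answer struct {
	Name string
	IP   netip.Addr
	TTL  time.Duration
}

// allowCache maps IP -> expiry time for IPs learned from DNS answers.
type allowCache struct {
	mu      sync.Mutex
	expires map[netip.Addr]time.Time
}

func newAllowCache() *allowCache {
	return &allowCache{expires: map[netip.Addr]time.Time{}}
}

// observe admits the answer's IP until its TTL lapses, per the
// "stop allowing when the TTL expires" behavior quoted above.
func (c *allowCache) observe(a answer) {
	c.mu.Lock()
	defer c.mu.Unlock()
	exp := time.Now().Add(a.TTL)
	if exp.After(c.expires[a.IP]) {
		c.expires[a.IP] = exp
	}
}

// allowed reports whether traffic to ip is currently admitted.
func (c *allowCache) allowed(ip netip.Addr) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return time.Now().Before(c.expires[ip])
}

func main() {
	cache := newAllowCache()
	cache.observe(answer{
		Name: "kubernetes.io",
		IP:   netip.MustParseAddr("1.2.3.4"),
		TTL:  30 * time.Second,
	})
	fmt.Println(cache.allowed(netip.MustParseAddr("1.2.3.4"))) // true until TTL expires
}
```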

```
The implementer can still deny traffic to `1.2.3.4` because no single
response contained the full chain required to resolve the domain.

## Alternatives
```
A contributor commented:

Another alternative is to implement DNS filtering, as Cisco Umbrella (https://docs.umbrella.com/deployment-umbrella/docs/point-your-dns-to-cisco) or DNSFilter do...

@npinaeva (Member):

/lgtm
thanks @rahulkjoshi!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 27, 2024
@k8s-ci-robot k8s-ci-robot merged commit 6bafaf0 into kubernetes-sigs:main May 27, 2024
8 checks passed
Labels:
- `approved`: Indicates a PR has been approved by an approver from all required OWNERS files.
- `cncf-cla: yes`: Indicates the PR's author has signed the CNCF CLA.
- `lgtm`: "Looks good to me", indicates that a PR is ready to be merged.
- `ok-to-test`: Indicates a non-member PR verified by an org member that is safe to test.
- `size/XL`: Denotes a PR that changes 500-999 lines, ignoring generated files.