Initial FQDN Selector NPEP with User stories

kubernetes-sigs · Sep 25, 2023 · 12eceea
1 parent bf98cec
commit 12eceea
Showing 1 changed file with 121 additions and 0 deletions.
diff --git a/npep/npep-133.md b/npep/npep-133.md
@@ -0,0 +1,121 @@
+# NPEP-133: FQDN Selector for Egress Traffic
+
+* Issue:
+  [#133](https://github.com/kubernetes-sigs/network-policy-api/issues/133)
+* Status: Provisional
+
+## TLDR
+
+This enhancement proposes adding a new optional selector to specify egress peers
+using Fully Qualified Domain Names (FQDNs).
+
+## Goals
+
+* Provide a selector to specify egress peers using a Fully Qualified Domain Name
+  (for example `kubernetes.io`).
+* Support a restricted set of regex matching capabilities when specifying FQDNs.
+* Limit the scope of this proposal to AdminNetworkPolicy for now.
+  * Since Kubernetes NetworkPolicy does not have a FQDN selector, adding this
+    capability to BaselineAdminNetworkPolicy can result in unintended behavior.
+    For example, if BANP allows traffic to `example.io`, but the namespace admin
+    installs a Kubernetes Network Policy, the namespace admin has no way to
+    replicate the `example.io` selector using just Kubernetes Network Policies.
+
+## Non-Goals
+
+* This enhancement does not include a FQDN selector for allowing ingress
+  traffic.
+* This enhancement does not include any L7 matching or filtering capabilities,
+  like matching HTTP traffic or URL paths.
+  * This selector should not control which DNS records are resolvable from a
+    particular workload.
+* This enhancement does not provide a mechanism for selecting in-cluster
+  endpoints using FQDNs. This is explicitly disallowed by the spec.
+  * To select Pods, Nodes, or the API server, AdminNetworkPolicy already has
+    first-party selectors with better UX.
+* This enhancement does not specify the details of how traffic is routed to the
+  specified destination. For example, it does not prescribe details around NAT
+  or egress gateways.
+* This enhancement does not require any mechanism for securing DNS resolution
+  (e.g. DNSSEC or DNS-over-TLS). Unsecured DNS requests are expected to be
+  sufficient for looking up FQDNs.
+
+## Introduction
+
+FQDN-based egress controls are a common enterprise security practice.
+Administrators often prefer to write security policies using DNS names such as
+`www.kubernetes.io` instead of enumerating all the IP addresses the DNS name
+might resolve to. Keeping up with changing IP addresses is a maintenance burden,
+and long IP lists hamper the readability of the network policies.
+
+## User Stories
+
+* As a cluster admin, I want to allow all Pods in the cluster to send traffic to
+  an external service specified by a well-known domain name. For example, all
+  Pods must be able to talk to `my-service.com`.
+
+* As a cluster admin, I want to allow Pods in the "monitoring" namespace to
+  send traffic to a logs-sink hosted at `logs-storage.com`.
+
+* As a cluster admin, I want to allow all Pods in the cluster to send traffic to
+  any of the managed services provided by my cloud provider. Since the cloud
+  provider has a well-known parent domain, I want to allow Pods to send traffic
+  to all sub-domains using a wildcard selector such as
+  `*.my-cloud-provider.com`.
+
+### Future User Stories
+
+These are some user stories we want to keep in mind but cannot implement
+currently because of limitations of the existing Network Policy API. The design
+goal here is to ensure we do not make them unimplementable down the line.
+
+* As a cluster admin, I want to block all cluster egress traffic by default, and
+  require namespace admins to create NetworkPolicies explicitly allowing egress
+  to the domains they need to talk to.
+
+  The cluster admin would use a `BaselineAdminNetworkPolicy` object to switch
+  the default disposition of the cluster. Namespace admins would then use
+  a FQDN selector in their Kubernetes `NetworkPolicy` objects to allow
+  `my-service.com`.
+
+## API
+
+TODO
+
+## Alternatives
+
+### IP Block Selector
+
+IP blocks are an important tool for specifying Network Policies. However, they
+do not address all user needs and have a few shortcomings when compared to FQDN
+selectors:
+
+* IP-based selectors can become verbose if a single logical service has numerous
+  IPs backing it.
+* IP-based selectors pose an ongoing maintenance burden for administrators, who
+  need to be aware of changing IPs.
+* IP-based selectors can result in policies that are difficult to read and
+  audit.
+
+### L7 Policy
+
+Another alternative is to provide a true L7 selector, similar to the policies
+provided by Service Mesh providers. While L7 selectors can offer more
+expressiveness, they often come with trade-offs that are not suitable for all
+users:
+
+* L7 selectors necessarily support only a limited set of protocols. Customers
+  may be using a custom protocol for application-level communication, but still
+  want the ability to specify endpoints using DNS.
+* L7 selectors often require proxies to perform deep packet inspection and
+  enforce the policies. These proxies can introduce undesirable latency in
+  the datapath of applications.
+
+## References
+
+* [NPEP #126](https://github.com/kubernetes-sigs/network-policy-api/issues/126):
+  Egress Control in ANP
+
+### Implementations
+
+* [Calico](https://docs.tigera.io/calico-enterprise/latest/network-policy/domain-based-policy)
+* [Cilium](https://docs.cilium.io/en/latest/security/policy/language/#dns-based)
+* [OpenShift](https://docs.openshift.com/container-platform/latest/networking/openshift_sdn/configuring-egress-firewall.html)
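
Note on the wildcard user story (`*.my-cloud-provider.com`): the NPEP leaves the API section as TODO and does not define matching semantics, so the following Go sketch is only one possible interpretation. It assumes a leading `*.` matches one or more DNS labels and never the parent domain itself, and that comparison is case-insensitive with an optional trailing dot ignored. The function names `normalize` and `matchFQDN` are hypothetical and not part of any proposed API.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize lower-cases a domain name and strips one trailing dot so that
// "Example.COM." and "example.com" compare equal.
func normalize(name string) string {
	return strings.TrimSuffix(strings.ToLower(name), ".")
}

// matchFQDN reports whether name is selected by pattern.
//
// Assumption (not defined by the NPEP): a leading "*." matches one or more
// labels, so "*.example.com" selects "a.example.com" and "a.b.example.com"
// but not "example.com" itself. Any other pattern must match exactly.
func matchFQDN(pattern, name string) bool {
	pattern, name = normalize(pattern), normalize(name)
	if name == "" {
		return false
	}
	if strings.HasPrefix(pattern, "*.") {
		// Keep the dot so that "evil-example.com" cannot match "*.example.com".
		suffix := pattern[1:]
		return len(name) > len(suffix) && strings.HasSuffix(name, suffix)
	}
	return pattern == name
}

func main() {
	names := []string{
		"storage.my-cloud-provider.com",
		"my-cloud-provider.com",
		"evil-my-cloud-provider.com",
	}
	for _, name := range names {
		fmt.Println(name, matchFQDN("*.my-cloud-provider.com", name))
	}
}
```

Under these assumptions only `storage.my-cloud-provider.com` is selected. Keeping the leading dot in the suffix comparison is what prevents a look-alike domain such as `evil-my-cloud-provider.com` from matching; whether the parent domain should also match is a design decision the API section would need to settle.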