From babca39a99bf1d39586362980b8e3fbce0407702 Mon Sep 17 00:00:00 2001 From: Hemanth Malla Date: Mon, 19 Aug 2024 17:29:36 -0400 Subject: [PATCH 1/5] Adding v2 of CFP for DNS proxy HA Signed-off-by: Hemanth Malla --- cilium/CFP-30984-dns-proxy-ha-v2.md | 109 ++++++++++++++++++++++++++++ 1 file changed, 109 insertions(+) create mode 100644 cilium/CFP-30984-dns-proxy-ha-v2.md diff --git a/cilium/CFP-30984-dns-proxy-ha-v2.md b/cilium/CFP-30984-dns-proxy-ha-v2.md new file mode 100644 index 0000000..cfaf953 --- /dev/null +++ b/cilium/CFP-30984-dns-proxy-ha-v2.md @@ -0,0 +1,109 @@ +# CFP-30984: toFQDN DNS proxy HA + +**SIG: SIG-POLICY** + +**Begin Design Discussion:** 2024-08-19 + +**Cilium Release:** 1.17 + +**Authors:** Hemanth Malla , Vipul Singh + +## Summary + +Cilium agent uses a proxy to intercept all DNS queries and obtain necessary information to enforce FQDN network policies. However, the lifecycle of this proxy is coupled with the cilium agent. When an endpoint has a toFQDN network policy in place, cilium installs a redirect to capture all DNS traffic. So, when the agent is unavailable, all DNS requests time out, including when DNS name to IP address mappings are already in place for this name. DNS policy unload on shutdown can be enabled on the agent, but it works only when the L7 policy is set to * and the agent is shutdown gracefully. + +This CFP introduces a standalone DNS proxy that can run alongside the Cilium agent, which should eliminate hard dependency for names that already have policy map entries in place. + +## Motivation + +Users rely on toFQDN policies to enforce network policies against traffic to destinations outside the cluster, typically to blob storage / other services on the internet. Rolling out the Cilium agent should not result in packet drops. Introducing a high availability (HA) mode will allow for increased adoption of toFQDN network policies in critical environments. 
+ +## Goals + +* Introduce a streaming gRPC API for exchanging FQDN policy related information. +* Introduce a standalone DNS proxy (SDP) that binds on the same port as built-in proxy with SO_REUSEPORT. +* Enforce L7 DNS policy via SDP. + +## Non-Goals + +* Updating new DNS <> IP mappings when the agent is down. + +## Proposal + +### Overview + +There are two parts to enforcing toFQDN network policy. L4 policy enforcement against IP addresses resolved from an FQDN and policy enforcement on DNS requests (L7 DNS policy). To enforce L4 policy, per endpoint policy bpf maps need to be updated. We'd like to avoid multiple processes writing entries to policy maps, so the standalone DNS proxy (SDP) needs a mechanism to notify agent of newly resolved FQDN <> IP address mappings. This CFP proposes exposing a new gRPC streaming API from the cilium agent. Since the connection is bi-directional, the cilium agent can reuse the same connection to notify the SDP of L7 DNS policy changes. + +Additionally, SDP needs to translate the IP address to cilium identity to enforce the policy. Our proposal involves retrieving the identity mapping from the cilium_ipcache BPF map. Currently L7 proxy (envoy) relies on accessing ipcache directly as well. We aren't aware of any efforts to introduce an abstraction to avoid reading bpf maps owned by the cilium agent beyond the agent process. If / when such abstraction is introduced, SDP can also be updated to implement a similar mechanism. We brainstormed a few options on how the API might look like if we exchange IP to identity mappings via the API as well, but it brings in a lot of additional complexity to keep the mappings in sync as endpoints churn. This CFP will focus on the contract between SDP and Cilium agent to exchange minimum information for implementing the high availability mode. 
+ +In addition to existing unix domain socket (UDS) opened by the agent to host HTTP APIs, we'll need a new UDS for the gRPC streaming service with similar permissions. + +### RPC Methods + +Method : UpdateMappings (Invoked from SDP to agent) + +_rpc UpdatesMappings(steam FQDNMapping) returns (Result){}_ +Request : +``` +message FQDNMapping { + string FQDN = 1; // DNS Name of the request made by the client + repeated bytes IPS = 2; // Resolved IP addresses + uint32 TTL = 3; + uint64 source_identity = 4; // Identity of the client making the DNS request + int dns_response_code = 5; +} +``` +Response : +``` +message Result { + bool success = 1; +} +``` + +Method : UpdatesDNSRules ( Invoked from agent to SDP via bi-directional stream ) + +_rpc UpdatesDNSRules(stream DNSPolicies) returns (Result){}_ +Request : +``` +message DNSPolicy { + uint64 source_identity = 1; // Identity of the workload this L7 DNS policy should apply to + repeated string dns_pattern = 2; // Allowed DNS pattern this identity is allowed to resolve + uint64 dns_server_identity = 3; // Identity of destination DNS server + uint16 dns_server_port = 4; + uint8 dns_server_proto = 5; +} + + +message DNSPolicies { + repeated DNSPolicy l7_dns_policy = 1; +} + +``` + +Response : +``` +message Result { + bool success = 1; +} +``` + +### Load balancing + +SDP and agent's DNS proxy will run on the same port using SO_REUSEPORT. By default, kernel will use round robin algorithm to distribute load evenly between all sockets in the reuseport group. If cilium agent's DNS proxy goes down, kernel will automatically switch all traffic to SDP and vice versa. In the future, we can consider using a custom bpf program to make SDP only act as a hot standby. See (PoC)[https://github.com/hemanthmalla/reuseport_ebpf/blob/main/bpf/reuseport_select.c] / eBPF summit 2023 talk for more details. + + +### High Level Information Flow + +* Agent starts up with gRPC streaming service. +* SDP starts up. 
+* Connects to gRPC service, retrying periodically until success. +* Agent sends current snapshot for L7 DNS Policy enforcement via UpdatesDNSRules to SDP. +* On policy recomputation, agent invokes UpdatesDNSRules. +* On DNS request from the client, DNS request redirects to DNS proxy port. +* Kernel round robin load balances between SDP and built in proxy. +* Assuming SDP gets the request, SDP enforces L7 DNS policy. + * Lookup identity based on IP address via bpf map. + * Check against policy snapshot if this identity is allowed to resolve the current DNS name and is allowed to talk to DNS server target identity (also needs lookup). +* Make upstream DNS request from SDP. +* On response, SDP invokes UpdatesMappings() to notify agent of new mappings. +* Release DNS response after success from UpdatesMappings() / timeout. \ No newline at end of file From 1980cad72fb1eb2df6e95f19e22b440e00df90c1 Mon Sep 17 00:00:00 2001 From: Hemanth Malla Date: Fri, 6 Sep 2024 12:19:44 -0400 Subject: [PATCH 2/5] Addressing feedback - part 1 Signed-off-by: Hemanth Malla --- cilium/CFP-30984-dns-proxy-ha-v2.md | 36 ++++++++++++++++++++--------- 1 file changed, 25 insertions(+), 11 deletions(-) diff --git a/cilium/CFP-30984-dns-proxy-ha-v2.md b/cilium/CFP-30984-dns-proxy-ha-v2.md index cfaf953..96c8269 100644 --- a/cilium/CFP-30984-dns-proxy-ha-v2.md +++ b/cilium/CFP-30984-dns-proxy-ha-v2.md @@ -6,7 +6,7 @@ **Cilium Release:** 1.17 -**Authors:** Hemanth Malla , Vipul Singh +**Authors:** Hemanth Malla , Vipul Singh , Tamilmani Manoharan ## Summary @@ -32,7 +32,7 @@ Users rely on toFQDN policies to enforce network policies against traffic to des ### Overview -There are two parts to enforcing toFQDN network policy. L4 policy enforcement against IP addresses resolved from an FQDN and policy enforcement on DNS requests (L7 DNS policy). To enforce L4 policy, per endpoint policy bpf maps need to be updated. 
We'd like to avoid multiple processes writing entries to policy maps, so the standalone DNS proxy (SDP) needs a mechanism to notify agent of newly resolved FQDN <> IP address mappings. This CFP proposes exposing a new gRPC streaming API from the cilium agent. Since the connection is bi-directional, the cilium agent can reuse the same connection to notify the SDP of L7 DNS policy changes. +There are two parts to enforcing toFQDN network policy. L3/L4 policy enforcement against IP addresses resolved from an FQDN and policy enforcement on DNS requests (L7 DNS policy). To enforce L3/L4 policy, per endpoint policy bpf maps need to be updated. We'd like to avoid multiple processes writing entries to policy maps, so the standalone DNS proxy (SDP) needs a mechanism to notify agent of newly resolved FQDN <> IP address mappings. This CFP proposes exposing a new gRPC streaming API from the cilium agent. Since the connection is bi-directional, the cilium agent can reuse the same connection to notify the SDP of L7 DNS policy changes. Additionally, SDP needs to translate the IP address to cilium identity to enforce the policy. Our proposal involves retrieving the identity mapping from the cilium_ipcache BPF map. Currently L7 proxy (envoy) relies on accessing ipcache directly as well. We aren't aware of any efforts to introduce an abstraction to avoid reading bpf maps owned by the cilium agent beyond the agent process. If / when such abstraction is introduced, SDP can also be updated to implement a similar mechanism. We brainstormed a few options on how the API might look like if we exchange IP to identity mappings via the API as well, but it brings in a lot of additional complexity to keep the mappings in sync as endpoints churn. This CFP will focus on the contract between SDP and Cilium agent to exchange minimum information for implementing the high availability mode. 
@@ -50,7 +50,8 @@ message FQDNMapping { repeated bytes IPS = 2; // Resolved IP addresses uint32 TTL = 3; uint64 source_identity = 4; // Identity of the client making the DNS request - int dns_response_code = 5; + bytes source_ip = 5; // IP address of the client making the DNS request + int dns_response_code = 6; } ``` Response : @@ -65,21 +66,26 @@ Method : UpdatesDNSRules ( Invoked from agent to SDP via bi-directional stream ) _rpc UpdatesDNSRules(stream DNSPolicies) returns (Result){}_ Request : ``` +message DNSServer { + uint64 dns_server_identity = 1; // Identity of destination DNS server + uint32 dns_server_port = 2; + uint32 dns_server_proto = 3; +} + message DNSPolicy { uint64 source_identity = 1; // Identity of the workload this L7 DNS policy should apply to - repeated string dns_pattern = 2; // Allowed DNS pattern this identity is allowed to resolve - uint64 dns_server_identity = 3; // Identity of destination DNS server - uint16 dns_server_port = 4; - uint8 dns_server_proto = 5; + repeated string dns_pattern = 2; // Allowed DNS pattern this identity is allowed to resolve. + repeated DNSServer dns_servers = 3; } - message DNSPolicies { - repeated DNSPolicy l7_dns_policy = 1; + repeated DNSPolicy egress_l7_dns_policy = 1; } ``` +*Note: `dns_pattern` follows the same convention used in CNPs. See https://docs.cilium.io/en/stable/security/policy/language/#dns-based for more details* + Response : ``` message Result { @@ -94,7 +100,7 @@ SDP and agent's DNS proxy will run on the same port using SO_REUSEPORT. By defau ### High Level Information Flow -* Agent starts up with gRPC streaming service. +* Agent starts up with gRPC streaming service (only after resources are synced from k8s and ipcache bpf map is populated) * SDP starts up. * Connects to gRPC service, retrying periodically until success. * Agent sends current snapshot for L7 DNS Policy enforcement via UpdatesDNSRules to SDP. 
@@ -106,4 +112,12 @@ SDP and agent's DNS proxy will run on the same port using SO_REUSEPORT. By defau * Check against policy snapshot if this identity is allowed to resolve the current DNS name and is allowed to talk to DNS server target identity (also needs lookup). * Make upstream DNS request from SDP. * On response, SDP invokes UpdatesMappings() to notify agent of new mappings. -* Release DNS response after success from UpdatesMappings() / timeout. \ No newline at end of file +* Release DNS response after success from UpdatesMappings() / timeout. + +### Handling SDP <> Agent re-connections + +* When the agent is unavailable, SDP will periodically attempt to re-connect to the streaming service. Any FQDN<>IP mappings resolved when the agent is down will be cached in SDP and `UpdatesMappings` will be retried after establishing the connection. + * A new bpf map for ipcache is populated on agent startup, so SDP needs to re-open the ipcache bpf map when the connection is re-established. See https://github.com/cilium/cilium/pull/32864 for similar handling in envoy. + * On a new connection from SDP, the agent will invoke `UpdatesDNSRules` to notify SDP of all L7 DNS policy rules. + +* SDP will not listen on the DNS proxy port until a connection is established with cilium agent and initial L7 DNS policy rules are received. Meanwhile, built-in DNS proxy will continue to serve requests. SDP relies on cilium agent for initial bootstrap. In future, we could make SDP retrieve initial policy information from other sources, but this is not in scope for this CFP. 
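The re-connection behaviour added in this patch (cache FQDN<>IP mappings while the agent is down, then retry `UpdatesMappings` once the stream is back) could be sketched roughly as below. This is only an illustration under stated assumptions: `Mapping`, `Sender`, `PendingQueue` and `ErrAgentUnavailable` are hypothetical names that do not come from the CFP, the `Sender` function stands in for the real gRPC call, and TTL expiry of cached entries is ignored.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrAgentUnavailable is a hypothetical error used to simulate a broken stream.
var ErrAgentUnavailable = errors.New("agent unavailable")

// Mapping mirrors a subset of the FQDNMapping message from the CFP.
type Mapping struct {
	FQDN           string
	IPs            []string
	TTL            uint32
	SourceIdentity uint64
}

// Sender stands in for the UpdatesMappings RPC (assumption, not the real client API).
type Sender func(Mapping) error

// PendingQueue buffers mappings that could not be delivered to the agent.
type PendingQueue struct {
	mu      sync.Mutex
	pending []Mapping
}

// Notify tries to deliver m right away and queues it if the send fails.
func (q *PendingQueue) Notify(send Sender, m Mapping) error {
	if err := send(m); err != nil {
		q.mu.Lock()
		q.pending = append(q.pending, m)
		q.mu.Unlock()
		return err
	}
	return nil
}

// Flush replays queued mappings in order after a re-connect. It stops at the
// first failure and keeps the undelivered remainder for the next attempt.
func (q *PendingQueue) Flush(send Sender) (int, error) {
	q.mu.Lock()
	defer q.mu.Unlock()
	sent := 0
	for len(q.pending) > 0 {
		if err := send(q.pending[0]); err != nil {
			return sent, err
		}
		q.pending = q.pending[1:]
		sent++
	}
	return sent, nil
}

// Len reports how many mappings are still waiting for delivery.
func (q *PendingQueue) Len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.pending)
}

func main() {
	agentUp := false
	send := func(m Mapping) error {
		if !agentUp {
			return ErrAgentUnavailable
		}
		return nil
	}

	q := &PendingQueue{}
	_ = q.Notify(send, Mapping{FQDN: "example.com.", IPs: []string{"192.0.2.10"}, TTL: 30, SourceIdentity: 1001})
	fmt.Println("queued while agent is down:", q.Len())

	agentUp = true // stream re-established
	n, err := q.Flush(send)
	fmt.Println("replayed after re-connect:", n, "error:", err)
}
```

Replaying in arrival order keeps the sketch simple; a real implementation would probably also drop entries whose TTL expired while the agent was away.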
From 272872bb295600d3a4f8847d853b2877673df973 Mon Sep 17 00:00:00 2001 From: Hemanth Malla Date: Wed, 25 Sep 2024 11:59:33 -0400 Subject: [PATCH 3/5] Addressing feedback - part 2 Signed-off-by: Hemanth Malla --- cilium/CFP-30984-dns-proxy-ha-v2.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/cilium/CFP-30984-dns-proxy-ha-v2.md b/cilium/CFP-30984-dns-proxy-ha-v2.md index 96c8269..812e385 100644 --- a/cilium/CFP-30984-dns-proxy-ha-v2.md +++ b/cilium/CFP-30984-dns-proxy-ha-v2.md @@ -34,7 +34,7 @@ Users rely on toFQDN policies to enforce network policies against traffic to des There are two parts to enforcing toFQDN network policy. L3/L4 policy enforcement against IP addresses resolved from an FQDN and policy enforcement on DNS requests (L7 DNS policy). To enforce L3/L4 policy, per endpoint policy bpf maps need to be updated. We'd like to avoid multiple processes writing entries to policy maps, so the standalone DNS proxy (SDP) needs a mechanism to notify agent of newly resolved FQDN <> IP address mappings. This CFP proposes exposing a new gRPC streaming API from the cilium agent. Since the connection is bi-directional, the cilium agent can reuse the same connection to notify the SDP of L7 DNS policy changes. -Additionally, SDP needs to translate the IP address to cilium identity to enforce the policy. Our proposal involves retrieving the identity mapping from the cilium_ipcache BPF map. Currently L7 proxy (envoy) relies on accessing ipcache directly as well. We aren't aware of any efforts to introduce an abstraction to avoid reading bpf maps owned by the cilium agent beyond the agent process. If / when such abstraction is introduced, SDP can also be updated to implement a similar mechanism. We brainstormed a few options on how the API might look like if we exchange IP to identity mappings via the API as well, but it brings in a lot of additional complexity to keep the mappings in sync as endpoints churn. 
This CFP will focus on the contract between SDP and Cilium agent to exchange minimum information for implementing the high availability mode. +Additionally, SDP needs to translate the IP address to cilium identity to enforce the policy. This CFP proposes to retrieve the identity mapping from the cilium_ipcache BPF map. If / when another abstraction is introduced for getting IP<->Identity mappings (for example, in L7 proxy), this implementation can use that abstraction. This CFP will focus on the contract between SDP and Cilium agent to exchange minimum information for implementing the high availability mode. In addition to existing unix domain socket (UDS) opened by the agent to host HTTP APIs, we'll need a new UDS for the gRPC streaming service with similar permissions. @@ -121,3 +121,7 @@ SDP and agent's DNS proxy will run on the same port using SO_REUSEPORT. By defau * On a new connection from SDP, the agent will invoke `UpdatesDNSRules` to notify SDP of all L7 DNS policy rules. * SDP will not listen on the DNS proxy port until a connection is established with cilium agent and initial L7 DNS policy rules are received. Meanwhile, built-in DNS proxy will continue to serve requests. SDP relies on cilium agent for initial bootstrap. In future, we could make SDP retrieve initial policy information from other sources, but this is not in scope for this CFP. + +### Handling Upgrades + +Other than the streaming API from the agent, this CFP introduces a dependency on the ipcache bpf map which isn't a stable API exposed to components beyond the agent. Sufficient tests will be added to catch such datapath changes impacting SDP. In order to support a safe upgrade path, SDP would need to support reading from the current and future formats of the map (including possibly reading from an entirely new map). 
From dadacaef0429d04823bff2003c832b0d99ee1d40 Mon Sep 17 00:00:00 2001 From: Hemanth Malla Date: Wed, 23 Oct 2024 16:51:13 -0400 Subject: [PATCH 4/5] Updating RPC method for DNSPolicies Signed-off-by: Hemanth Malla --- cilium/CFP-30984-dns-proxy-ha-v2.md | 21 +++++++++++++-------- 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/cilium/CFP-30984-dns-proxy-ha-v2.md b/cilium/CFP-30984-dns-proxy-ha-v2.md index 812e385..ae9fb2e 100644 --- a/cilium/CFP-30984-dns-proxy-ha-v2.md +++ b/cilium/CFP-30984-dns-proxy-ha-v2.md @@ -42,7 +42,7 @@ In addition to existing unix domain socket (UDS) opened by the agent to host HTT Method : UpdateMappings (Invoked from SDP to agent) -_rpc UpdatesMappings(steam FQDNMapping) returns (Result){}_ +_rpc UpdatesMappings(FQDNMapping) returns (UpdatesMappingsResult){}_ Request : ``` message FQDNMapping { @@ -56,14 +56,14 @@ message FQDNMapping { ``` Response : ``` -message Result { +message UpdatesMappingsResult { bool success = 1; } ``` -Method : UpdatesDNSRules ( Invoked from agent to SDP via bi-directional stream ) +Method : SubscribeToDNSPolicies ( Invoked from agent to SDP via bi-directional stream ) -_rpc UpdatesDNSRules(stream DNSPolicies) returns (Result){}_ +_rpc SubscribeToDNSPolicies(stream DNSPoliciesResult) returns (stream DNSPolicies){}_ Request : ``` message DNSServer { @@ -80,18 +80,23 @@ message DNSPolicy { message DNSPolicies { repeated DNSPolicy egress_l7_dns_policy = 1; + string request_id = 2; // Random UUID based identifier which will be referenced in ACKs } ``` *Note: `dns_pattern` follows the same convention used in CNPs. See https://docs.cilium.io/en/stable/security/policy/language/#dns-based for more details* -Response : +*Note: `DNSPolicies` is a snapshot of the latest known policy information for all endpoints on the host. 
Sending a snapshot allows for dealing with deletions automatically* + +SDP to CA message format : ``` -message Result { - bool success = 1; +message DNSPoliciesResult { + bool success = 1; + string request_id = 2; } ``` +``` ### Load balancing @@ -124,4 +129,4 @@ SDP and agent's DNS proxy will run on the same port using SO_REUSEPORT. By defau ### Handling Upgrades -Other than the streaming API from the agent, this CFP introduces a dependency on the ipcache bpf map which isn't a stable API exposed to components beyond the agent. Sufficent tests will be added to catch such datapath changes impacting SDP. In order to support a safe upgrade path, SDP would need to support reading from the current and future formats of the map (including possibly reading from an entire new map). +Other than the streaming API from the agent, this CFP introduces a dependency on the ipcache bpf map which isn't a stable API exposed to components beyond the agent. An e2e test for the toFQDN HA feature in CI will be added to catch such datapath changes impacting SDP. In order to support a safe upgrade path, SDP would need to support reading from the current and future formats of the map (including possibly reading from an entire new map). From 8d7459caffb557f6fd3766aebddbe92fbc27c336 Mon Sep 17 00:00:00 2001 From: Hemanth Malla Date: Wed, 6 Nov 2024 15:00:34 -0500 Subject: [PATCH 5/5] Adding more details to goals and dealing with upgrades sections Signed-off-by: Hemanth Malla --- cilium/CFP-30984-dns-proxy-ha-v2.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/cilium/CFP-30984-dns-proxy-ha-v2.md b/cilium/CFP-30984-dns-proxy-ha-v2.md index ae9fb2e..27c8247 100644 --- a/cilium/CFP-30984-dns-proxy-ha-v2.md +++ b/cilium/CFP-30984-dns-proxy-ha-v2.md @@ -23,10 +23,11 @@ Users rely on toFQDN policies to enforce network policies against traffic to des * Introduce a streaming gRPC API for exchanging FQDN policy related information. 
* Introduce a standalone DNS proxy (SDP) that binds on the same port as built-in proxy with SO_REUSEPORT. * Enforce L7 DNS policy via SDP. +* When an endpoint's DNS traffic is selected by an L7 policy, DNS requests and responses will be forwarded to their destinations via SDP even if cilium-agent is not running. So, clients re-resolving DNS to establish new connections will not be blocked anymore if the IP addresses from the new resolution are unchanged. Note that the L3/L4 policy for the resolved names should have already been plumbed when the agent was running. ## Non-Goals -* Updating new DNS <> IP mappings when the agent is down. +* Updating new DNS <> IP mappings when the agent is down is not in scope for this CFP. Solving for this scenario would likely be more involved and may require creation of a dedicated set of bpf maps. This will likely come with a performance penalty of having to perform multiple lookups. We will explore solutions for this in a future CFP. ## Proposal @@ -96,7 +97,6 @@ message DNSPoliciesResult { string request_id = 2; } ``` -``` ### Load balancing @@ -129,4 +129,6 @@ SDP and agent's DNS proxy will run on the same port using SO_REUSEPORT. By defau ### Handling Upgrades -Other than the streaming API from the agent, this CFP introduces a dependency on the ipcache bpf map which isn't a stable API exposed to components beyond the agent. An e2e test for the toFQDN HA feature in CI will be added to catch such datapath changes impacting SDP. In order to support a safe upgrade path, SDP would need to support reading from the current and future formats of the map (including possibly reading from an entire new map). +Other than the streaming API from the agent, this CFP introduces a dependency on the ipcache bpf map which isn't a stable API exposed to components beyond the agent. An e2e test for the toFQDN HA feature in CI will be added to catch such datapath changes impacting SDP, giving us the opportunity to make necessary changes. 
+ +When a new change is introduced to the ipcache bpf map, the userspace code to access data from ipcache needs to be refactored out into a library. This library would then be used in both the cilium agent and SDP. The library needs to support reading data from both old and new formats (we'll likely need to carry this for a couple of stable releases). This would mean that when upgrading to a new cilium version with updates to ipcache bpf map, SDP should always be upgraded first. This would allow SDP to seamlessly switch over to reading from the new ipcache map when the actual map migration is performed by cilium agent after it is upgraded. The actual heuristics to determine when to switch will depend on the nature of changes to the bpf map. Upgrading the cilium agent first would prevent SDP from reading ipcache data. So, the only supported upgrade order would be SDP first and then cilium agent.
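
For reference, the L7 check from the information flow (look up the client identity, then verify that it may resolve the name and may talk to the target DNS server) together with the snapshot semantics of `DNSPolicies` could look roughly like the following. This is a minimal sketch, not the proposed implementation: `PolicyStore`, `Replace`, `Allowed` and `compilePattern` are hypothetical names, the pattern matcher is a simplified stand-in for the CNP `matchPattern` rules (a `*` matches DNS label characters only, a lone `*` matches everything), and identities without a policy are denied by assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"sync"
)

// DNSServer and DNSPolicy mirror the messages proposed in the CFP.
type DNSServer struct {
	Identity uint64
	Port     uint32
	Proto    uint32
}

type DNSPolicy struct {
	SourceIdentity uint64
	Patterns       []string
	Servers        []DNSServer
}

type compiledPolicy struct {
	patterns []*regexp.Regexp
	servers  map[DNSServer]struct{}
}

// PolicyStore holds the latest snapshot received from the agent.
type PolicyStore struct {
	mu       sync.RWMutex
	policies map[uint64][]compiledPolicy
}

func NewPolicyStore() *PolicyStore {
	return &PolicyStore{policies: map[uint64][]compiledPolicy{}}
}

// normalize lowercases a name and drops the trailing dot of an FQDN.
func normalize(name string) string {
	return strings.ToLower(strings.TrimSuffix(strings.TrimSpace(name), "."))
}

// compilePattern converts a pattern to a regexp (simplified assumption).
func compilePattern(p string) (*regexp.Regexp, error) {
	p = normalize(p)
	if p == "*" {
		return regexp.Compile(`^.*$`)
	}
	quoted := strings.ReplaceAll(regexp.QuoteMeta(p), `\*`, `[-a-z0-9_]*`)
	return regexp.Compile("^" + quoted + "$")
}

// Replace swaps in a complete snapshot. Identities missing from the snapshot
// lose their rules, which is how deletions are handled without extra messages.
func (s *PolicyStore) Replace(snapshot []DNSPolicy) error {
	next := make(map[uint64][]compiledPolicy, len(snapshot))
	for _, p := range snapshot {
		cp := compiledPolicy{servers: map[DNSServer]struct{}{}}
		for _, pat := range p.Patterns {
			re, err := compilePattern(pat)
			if err != nil {
				return err
			}
			cp.patterns = append(cp.patterns, re)
		}
		for _, srv := range p.Servers {
			cp.servers[srv] = struct{}{}
		}
		next[p.SourceIdentity] = append(next[p.SourceIdentity], cp)
	}
	s.mu.Lock()
	s.policies = next
	s.mu.Unlock()
	return nil
}

// Allowed reports whether src may resolve name via the given DNS server.
func (s *PolicyStore) Allowed(src uint64, name string, server DNSServer) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	name = normalize(name)
	for _, cp := range s.policies[src] {
		if _, ok := cp.servers[server]; !ok {
			continue
		}
		for _, re := range cp.patterns {
			if re.MatchString(name) {
				return true
			}
		}
	}
	return false
}

func main() {
	store := NewPolicyStore()
	kubeDNS := DNSServer{Identity: 2002, Port: 53, Proto: 17}
	_ = store.Replace([]DNSPolicy{{SourceIdentity: 1001, Patterns: []string{"*.example.com"}, Servers: []DNSServer{kubeDNS}}})
	fmt.Println("allowed:", store.Allowed(1001, "api.example.com.", kubeDNS))
	_ = store.Replace(nil) // an empty snapshot removes every rule
	fmt.Println("allowed after empty snapshot:", store.Allowed(1001, "api.example.com.", kubeDNS))
}
```

Building the new map first and swapping it under the lock means a lookup never observes a half-applied snapshot, which fits the ACK-per-`request_id` model in the proto definitions.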