# CFP-30984: toFQDN DNS proxy HA

**SIG: SIG-POLICY**

**Begin Design Discussion:** 2024-08-19

**Cilium Release:** 1.17

**Authors:** Hemanth Malla <[email protected]>, Vipul Singh <[email protected]>

**Member:**

Administrative point:

If we think we agree on the basics here, but would like more of the design to be fleshed out:

Suggested change: `Status: Provisional`

(If provisional, please highlight anything that still needs to be expanded with the text "UNRESOLVED: description of unresolved design point".)

Or alternatively, if we think the design is satisfactorily complete:

Suggested change: `Status: Implementable`

See the status definitions in the readme for more details: https://github.com/cilium/design-cfps?tab=readme-ov-file#status

**Member:**

This was the point I made at the end of the discussion with @cilium/sig-policy today. Whatever we agree upon, we should be able to merge in. If there are points left to iterate on, we can highlight those points. If we've resolved the points of difference, we can mark it implementable. The goal is to make iteration easier, so we don't have to overthink an entire design that is subject to change during implementation, but can instead incrementally change the design in-tree as we discuss, debate, implement and learn.

**Member (Author):**

I'll add a status field once we close out the open discussion items.

**Member (Author):**

Also, it looks like we haven't documented Provisional in the readme yet?

**Member:**

Ah right, provisional is just under discussion in #48 but not documented yet.

**Member:**

Let's mark this as implementable.

## Summary

The Cilium agent uses a proxy to intercept all DNS queries and obtain the information necessary to enforce FQDN network policies. However, the lifecycle of this proxy is coupled with the Cilium agent. When an endpoint has a toFQDN network policy in place, Cilium installs a redirect to capture all DNS traffic, so when the agent is unavailable all DNS requests time out, even for names whose DNS-name-to-IP-address mappings are already in place. DNS policy unload on shutdown can be enabled on the agent, but it only works when the L7 policy is set to `*` and the agent is shut down gracefully.

This CFP introduces a standalone DNS proxy that can run alongside the Cilium agent, eliminating the hard dependency on the agent for names that already have policy map entries in place.

## Motivation

Users rely on toFQDN policies to enforce network policies against traffic to destinations outside the cluster, typically to blob storage / other services on the internet. Rolling out the Cilium agent should not result in packet drops. Introducing a high availability (HA) mode will allow for increased adoption of toFQDN network policies in critical environments.

## Goals

* Introduce a streaming gRPC API for exchanging FQDN policy related information.
* Introduce a standalone DNS proxy (SDP) that binds to the same port as the built-in proxy with SO_REUSEPORT.
* Enforce L7 DNS policy via SDP.

## Non-Goals

* Updating DNS <> IP mappings while the agent is down (new names cannot be learned without the agent, since the agent owns the policy maps).

## Proposal

### Overview

There are two parts to enforcing toFQDN network policy: L4 policy enforcement against IP addresses resolved from an FQDN, and policy enforcement on the DNS requests themselves (L7 DNS policy). To enforce L4 policy, per-endpoint policy BPF maps need to be updated. We'd like to avoid multiple processes writing entries to policy maps, so the standalone DNS proxy (SDP) needs a mechanism to notify the agent of newly resolved FQDN <> IP address mappings. This CFP proposes exposing a new gRPC streaming API from the Cilium agent. Since the connection is bi-directional, the Cilium agent can reuse the same connection to notify the SDP of L7 DNS policy changes.

Additionally, the SDP needs to translate IP addresses to Cilium identities to enforce policy. Our proposal involves retrieving the identity mapping directly from the cilium_ipcache BPF map. The L7 proxy (Envoy) already relies on accessing ipcache directly as well. We aren't aware of any efforts to introduce an abstraction that would avoid reading BPF maps owned by the Cilium agent from outside the agent process; if and when such an abstraction is introduced, the SDP can be updated to use a similar mechanism. We brainstormed a few options on what the API might look like if we also exchanged IP to identity mappings via the API, but that brings in a lot of additional complexity to keep the mappings in sync as endpoints churn. This CFP therefore focuses on the contract between the SDP and the Cilium agent for exchanging the minimum information needed to implement the high availability mode.
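
For illustration, a direct lookup against the pinned ipcache map from Go could look roughly like the sketch below. The pin path, the key/value struct layouts and the prefix-length accounting are all assumptions of this sketch; the authoritative definitions live in Cilium's datapath headers.

```go
// Illustrative sketch (not the authoritative layout): resolving an IP address
// to a Cilium security identity by reading the pinned cilium_ipcache LPM-trie
// map. Pin path, struct layouts and prefix accounting are assumptions.
package main

import (
	"fmt"
	"net"

	"github.com/cilium/ebpf"
)

// ipcacheKey approximates the IPv4 variant of the ipcache key: an LPM prefix
// length followed by metadata bytes and the address itself.
type ipcacheKey struct {
	PrefixLen uint32  // significant bits: metadata fields + address bits
	Meta      [4]byte // cluster id / padding / address family (assumed layout)
	IP        [4]byte // IPv4 address in network byte order
}

// remoteEndpointInfo approximates the ipcache value; only the identity is
// used here, the remaining fields are elided.
type remoteEndpointInfo struct {
	SecurityIdentity uint32
	Rest             [8]byte // tunnel endpoint, flags, ... (size assumed)
}

func lookupIdentity(ip net.IP) (uint32, error) {
	m, err := ebpf.LoadPinnedMap("/sys/fs/bpf/tc/globals/cilium_ipcache", nil)
	if err != nil {
		return 0, err
	}
	defer m.Close()

	key := ipcacheKey{PrefixLen: 32 + 32} // 32 metadata bits + /32 host route (assumed)
	copy(key.IP[:], ip.To4())

	var val remoteEndpointInfo
	if err := m.Lookup(&key, &val); err != nil {
		return 0, fmt.Errorf("no ipcache entry for %s: %w", ip, err)
	}
	return val.SecurityIdentity, nil
}
```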

In addition to the existing Unix domain socket (UDS) opened by the agent to host its HTTP APIs, we'll need a new UDS with similar permissions for the gRPC streaming service.
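
A minimal sketch of hosting the service on such a socket follows; the socket path, the 0660 mode and the registration call are assumed (hypothetical) names chosen to mirror the agent's existing API socket, not existing code.

```go
// Minimal sketch: expose the gRPC streaming service over a new Unix domain
// socket with permissions mirroring the agent's existing HTTP API socket.
package main

import (
	"log"
	"net"
	"os"

	"google.golang.org/grpc"
)

func main() {
	const sockPath = "/var/run/cilium/fqdn-proxy.sock" // hypothetical path
	_ = os.Remove(sockPath)                            // clear a stale socket from a previous run

	lis, err := net.Listen("unix", sockPath)
	if err != nil {
		log.Fatalf("listen on %s: %v", sockPath, err)
	}
	// Restrict access like the agent's existing API socket.
	if err := os.Chmod(sockPath, 0o660); err != nil {
		log.Fatalf("chmod %s: %v", sockPath, err)
	}

	srv := grpc.NewServer()
	// pb.RegisterFQDNProxyServer(srv, newFQDNProxyService()) // hypothetical generated registration
	log.Fatal(srv.Serve(lis))
}
```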

### RPC Methods

Method: UpdatesMappings (invoked from SDP to agent)

_rpc UpdatesMappings(stream FQDNMapping) returns (Result){}_
Request:
```
message FQDNMapping {
    string fqdn = 1;             // DNS name requested by the client
    repeated bytes ips = 2;      // Resolved IP addresses
    uint32 ttl = 3;
    uint64 source_identity = 4;  // Identity of the client making the DNS request
    int32 dns_response_code = 5; // proto3 has no bare "int" type
}
```
Response:
```
message Result {
bool success = 1;
}
```
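
For illustration, the SDP side of this RPC could look like the following Go sketch; the pb package and every pb.* name are hypothetical generated bindings. Note that, as declared, the stream yields a single Result when the client closes it, while the information flow below implies a per-mapping acknowledgement, which would call for a bi-directional stream instead.

```go
// Sketch of the SDP streaming newly observed FQDN <> IP mappings to the agent
// so it can update per-endpoint policy maps and DNS caches. pb.* names are
// hypothetical generated bindings for the messages above.
package sdp

import (
	"context"
	"errors"

	pb "example.com/cilium/fqdnproxy" // hypothetical generated package
)

func streamMappings(ctx context.Context, client pb.FQDNProxyClient, mappings <-chan *pb.FQDNMapping) error {
	stream, err := client.UpdatesMappings(ctx) // long-lived client-side stream
	if err != nil {
		return err
	}
	for m := range mappings {
		if err := stream.Send(m); err != nil {
			return err // broken stream: caller reconnects with backoff
		}
	}
	res, err := stream.CloseAndRecv() // one Result for the whole stream
	if err != nil {
		return err
	}
	if !res.Success {
		return errors.New("agent rejected mappings; retry required")
	}
	return nil
}
```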

Method: UpdatesDNSRules (invoked from agent to SDP via the bi-directional stream)

_rpc UpdatesDNSRules(stream DNSPolicies) returns (Result){}_
Request:
```
message DNSPolicy {
    uint64 source_identity = 1;      // Identity of the workload this L7 DNS policy applies to
    repeated string dns_pattern = 2; // DNS patterns this identity is allowed to resolve
    uint64 dns_server_identity = 3;  // Identity of the destination DNS server
    uint32 dns_server_port = 4;      // proto3 has no uint16; only the low 16 bits are used
    uint32 dns_server_proto = 5;     // proto3 has no uint8; IP protocol number (6 = TCP, 17 = UDP)
}

message DNSPolicies {
    repeated DNSPolicy l7_dns_policy = 1;
}
```

Response:
```
message Result {
bool success = 1;
}
```
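
On the receiving end, the SDP could consume these snapshots roughly as follows. Because the agent hosts the socket, the SDP connects as the gRPC client and the agent pushes a full DNSPolicies snapshot on every policy recomputation. The subscription message and all pb.* names are hypothetical, and the exact streaming shape is an assumption of this sketch.

```go
// Sketch of the SDP consuming L7 DNS policy snapshots. Each DNSPolicies
// message is a full snapshot, so the SDP swaps its in-memory table atomically
// and the datapath never observes a partially applied update.
package sdp

import (
	"context"
	"sync/atomic"

	pb "example.com/cilium/fqdnproxy" // hypothetical generated package
)

// policyTable holds the latest snapshot; request handlers do lock-free loads.
var policyTable atomic.Pointer[pb.DNSPolicies]

func consumePolicies(ctx context.Context, client pb.FQDNProxyClient) error {
	stream, err := client.UpdatesDNSRules(ctx, &pb.SubscribeRequest{}) // hypothetical subscribe message
	if err != nil {
		return err
	}
	for {
		snapshot, err := stream.Recv() // blocks until the agent pushes an update
		if err != nil {
			return err // caller reconnects, retrying periodically
		}
		policyTable.Store(snapshot) // atomic full-snapshot replacement
	}
}
```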

### Load balancing

The SDP and the agent's DNS proxy will run on the same port using SO_REUSEPORT. By default, the kernel uses a round-robin algorithm to distribute load evenly between all sockets in the reuseport group. If the Cilium agent's DNS proxy goes down, the kernel automatically switches all traffic to the SDP, and vice versa. In the future, we can consider using a custom BPF program to make the SDP act only as a hot standby. See the [PoC](https://github.com/hemanthmalla/reuseport_ebpf/blob/main/bpf/reuseport_select.c) / eBPF Summit 2023 talk for more details.
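
Joining the reuseport group from Go is a standard idiom; a sketch, with the proxy port as a placeholder:

```go
// Sketch: bind a UDP socket with SO_REUSEPORT (Linux) so the SDP shares the
// DNS proxy port with the agent's built-in proxy. Port 10001 is a placeholder
// for the configured proxy port.
package main

import (
	"context"
	"log"
	"net"
	"syscall"

	"golang.org/x/sys/unix"
)

func listenUDPReusePort(addr string) (net.PacketConn, error) {
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			var serr error
			err := c.Control(func(fd uintptr) {
				// Joining the reuseport group lets the kernel round-robin
				// DNS queries between the SDP and the built-in proxy.
				serr = unix.SetsockoptInt(int(fd), unix.SOL_SOCKET, unix.SO_REUSEPORT, 1)
			})
			if err != nil {
				return err
			}
			return serr
		},
	}
	return lc.ListenPacket(context.Background(), "udp", addr)
}

func main() {
	conn, err := listenUDPReusePort("0.0.0.0:10001")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	log.Printf("DNS proxy listening on %s", conn.LocalAddr())
}
```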


### High Level Information Flow

* Agent starts up and exposes the gRPC streaming service.
* SDP starts up.
* Connects to gRPC service, retrying periodically until success.
* Agent sends the current L7 DNS policy snapshot to the SDP via UpdatesDNSRules.
* On policy recomputation, agent invokes UpdatesDNSRules.
* On a DNS request from a client, the datapath redirects the request to the DNS proxy port.
* Kernel round-robin load balances between the SDP and the built-in proxy.
* Assuming the SDP gets the request, the SDP enforces L7 DNS policy (see the sketch after this list).
* Lookup identity based on IP address via the ipcache BPF map.
* Check against the policy snapshot whether this identity is allowed to resolve the current DNS name and to talk to the target DNS server identity (which also needs a lookup).
* Make the upstream DNS request from the SDP.
* On response, the SDP invokes UpdatesMappings() to notify the agent of new mappings.
* Release the DNS response after success from UpdatesMappings(), or after a timeout.
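
Putting the per-request steps together, a hedged end-to-end sketch; every helper signature is a hypothetical wiring point corresponding to the sketches above, and dns.Msg is from github.com/miekg/dns.

```go
// End-to-end sketch of the per-request flow. The key property: the client
// only sees the DNS response after the agent has acknowledged the new
// mappings (or a timeout fired), so L4 policy entries exist before the
// client dials the resolved IPs.
package sdp

import (
	"context"
	"net"
	"time"

	"github.com/miekg/dns"
)

// Wiring points; each corresponds to a sketch earlier in this CFP and is a
// hypothetical helper, not existing code.
var (
	lookupIdentity  func(ip net.IP) (uint32, error)                         // ipcache lookup
	policyAllows    func(src, dst uint32, name string) bool                 // cached DNSPolicies snapshot
	forwardUpstream func(ctx context.Context, q *dns.Msg) (*dns.Msg, error) // upstream resolver
	notifyAgent     func(ctx context.Context, resp *dns.Msg) error          // UpdatesMappings
	refused         func(q *dns.Msg) *dns.Msg                               // REFUSED response
)

func handleDNSRequest(ctx context.Context, req *dns.Msg, clientIP, serverIP net.IP) (*dns.Msg, error) {
	srcID, err := lookupIdentity(clientIP) // identity of the requesting endpoint
	if err != nil {
		return nil, err
	}
	dstID, err := lookupIdentity(serverIP) // identity of the target DNS server
	if err != nil {
		return nil, err
	}

	// Enforce L7 DNS policy against the last snapshot pushed by the agent.
	if !policyAllows(srcID, dstID, req.Question[0].Name) {
		return refused(req), nil
	}

	resp, err := forwardUpstream(ctx, req)
	if err != nil {
		return nil, err
	}

	// Hold the response until the agent has ingested the new mappings, with
	// a timeout so a slow agent cannot stall DNS (value is a placeholder).
	ackCtx, cancel := context.WithTimeout(ctx, 500*time.Millisecond)
	defer cancel()
	_ = notifyAgent(ackCtx, resp)

	return resp, nil
}
```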