refactor: optimize TCP and UDP checksum handling with eBPF #220
Welcome @tzssangglass!
Looking good @tzssangglass. Thanks so much for doing this! Just some questions and comments; it also looks like you need to run `cargo fmt`.
Otherwise my general comments are:
- Try to keep the unsafe blocks as minimal as possible.
- Can we condense a lot of this code into a separate helper function with a good comment describing what it does? Something like `packet_redirect_cksum_update`?
dataplane/ebpf/src/ingress/tcp.rs
Outdated
    }
}

update_tcp_conns(tcp_hdr_ref, &client_key, &mut lb_mapping)?;
Should this be moved to after the cksum calculation code block as it was prior to this patch?
In fact, I found that I cannot move
if tcp_hdr_ref.rst() == 1 {
    unsafe {
        LB_CONNECTIONS.remove(&client_key)?;
    }
}
let mut lb_mapping = LoadBalancerMapping {
    backend,
    backend_key,
    tcp_state,
};
update_tcp_conns(tcp_hdr_ref, &client_key, &mut lb_mapping)?;
after
let backend_ip = backend.daddr.to_be();
let ret = set_ipv4_ip_dst(&ctx, TCP_CSUM_OFF, &original_daddr, backend_ip);
if ret != 0 {
    return Ok(TC_ACT_OK);
}
let backend_port = (backend.dport as u16).to_be();
let ret = set_ipv4_dest_port(&ctx, TCP_CSUM_OFF, &original_dport, backend_port);
if ret != 0 {
    return Ok(TC_ACT_OK);
}
Otherwise the eBPF program fails to load: the eBPF verifier reports an out-of-bounds access.
I think accessing `tcp_hdr` after `bpf_redirect_neigh` will cause an out-of-bounds error, but I don't know why the original code did not have this issue.
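One possible explanation, offered as a hedged note rather than a confirmed diagnosis: the kernel's helper documentation states that `bpf_skb_store_bytes`, `bpf_l3_csum_replace`, and `bpf_l4_csum_replace` may change the underlying packet buffer, so the verifier invalidates all earlier packet-pointer checks after such a call. Assuming `set_ipv4_ip_dst` and `set_ipv4_dest_port` wrap those helpers, any use of `tcp_hdr_ref` after them would need a fresh bounds check. The sketch below is a plain-Rust analogy (not eBPF code; all function names are hypothetical) of the resulting ordering rule: copy what you need out of the header by value, then mutate the packet.

```rust
// Offsets inside a TCP header (fixed by the TCP wire format).
const TCP_DPORT_OFF: usize = 2; // destination port, 2 bytes
const TCP_FLAGS_OFF: usize = 13; // flags byte
const TCP_FLAG_RST: u8 = 0x04;

/// Copy the RST flag out of the packet by value, with a bounds check.
fn read_rst(pkt: &[u8], tcp_off: usize) -> Option<bool> {
    let flags = *pkt.get(tcp_off + TCP_FLAGS_OFF)?;
    Some(flags & TCP_FLAG_RST != 0)
}

/// Hypothetical stand-in for a packet-mutating helper: it overwrites
/// the destination port in place (checksum handling is omitted here).
fn store_dport(pkt: &mut [u8], tcp_off: usize, port: u16) -> Option<()> {
    let start = tcp_off + TCP_DPORT_OFF;
    let field = pkt.get_mut(start..start + 2)?;
    field.copy_from_slice(&port.to_be_bytes());
    Some(())
}

/// Read everything needed from the header first, then rewrite.
/// No reference into the packet is kept across the mutation.
fn rewrite_dport_after_reading_rst(pkt: &mut [u8], tcp_off: usize, port: u16) -> Option<bool> {
    let rst = read_rst(pkt, tcp_off)?;
    store_dport(pkt, tcp_off, port)?;
    Some(rst)
}

fn main() {
    let mut pkt = [0u8; 20]; // a bare 20-byte TCP header
    pkt[TCP_FLAGS_OFF] = 0x14; // RST | ACK
    let rst = rewrite_dport_after_reading_rst(&mut pkt, 0, 8080);
    println!("rst = {:?}, dport bytes = {:?}", rst, &pkt[2..4]);
}
```

In the real program the equivalent choice is either to finish all `tcp_hdr_ref` reads before the rewrite helpers run, or to re-derive and re-check the header pointer afterwards.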
dataplane/ebpf/src/ingress/tcp.rs
Outdated
TCP_CSUM_OFF,
original_daddr as u64,
backend_ip as u64,
IS_PSEUDO | (mem::size_of_val(&backend_ip) as u64), |
What is the `IS_PSEUDO` const doing here?
As I understand it, passing `IS_PSEUDO | (mem::size_of_val(&backend_ip) as u64)` as the last argument of `bpf_l4_csum_replace()` sets the `IS_PSEUDO` flag and specifies the size of the field being updated in a single value. This lets the helper recalculate the transport-layer checksum correctly, accounting for the pseudo-header.
In more detail:
- The result of `IS_PSEUDO | (mem::size_of_val(&backend_ip) as u64)` is `00010000 | 00000100` -> `00010100`.
- The kernel dispatches on the field size:
  switch (flags & BPF_F_HDR_FIELD_MASK) {
  case 0:
      if (unlikely(from != 0))
          return -EINVAL;
      inet_proto_csum_replace_by_diff(ptr, skb, to, is_pseudo);
      break;
  case 2:
      inet_proto_csum_replace2(ptr, skb, from, to, is_pseudo);
      break;
  case 4:
      inet_proto_csum_replace4(ptr, skb, from, to, is_pseudo);
      break;
  Here `BPF_F_HDR_FIELD_MASK` is `enum { BPF_F_HDR_FIELD_MASK = 0xfULL, };`, so `flags & BPF_F_HDR_FIELD_MASK` is `00010100 & 00001111` -> `00000100`, which selects `case 4` to update the L4 checksum.
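The bit arithmetic above can be checked in plain Rust. This is a minimal sketch: the two `BPF_F_*` values come from the kernel's UAPI header, while `l4_csum_flags` and `field_width` are hypothetical helper names, and `IS_PSEUDO` is assumed to mirror `BPF_F_PSEUDO_HDR` as the bit pattern above suggests.

```rust
// Values from the Linux UAPI header (include/uapi/linux/bpf.h).
const BPF_F_HDR_FIELD_MASK: u64 = 0xf;
const BPF_F_PSEUDO_HDR: u64 = 1 << 4;

// Assumption: the PR's IS_PSEUDO constant mirrors BPF_F_PSEUDO_HDR.
const IS_PSEUDO: u64 = BPF_F_PSEUDO_HDR;

/// Build the `flags` argument for a header field of type `T`:
/// the low 4 bits carry the field size, bit 4 marks pseudo-header fields.
fn l4_csum_flags<T>(field: &T, pseudo: bool) -> u64 {
    let size = std::mem::size_of_val(field) as u64;
    if pseudo {
        IS_PSEUDO | size
    } else {
        size
    }
}

/// The field width the kernel's `switch` would see.
fn field_width(flags: u64) -> u64 {
    flags & BPF_F_HDR_FIELD_MASK
}

fn main() {
    let backend_ip: u32 = 0x0a00_0001;
    let backend_port: u16 = 8080;

    // The IP address is part of the TCP/UDP pseudo-header; the port is not.
    let ip_flags = l4_csum_flags(&backend_ip, true);
    let port_flags = l4_csum_flags(&backend_port, false);

    println!("ip flags   = {:#010b}, width = {}", ip_flags, field_width(ip_flags));
    println!("port flags = {:#010b}, width = {}", port_flags, field_width(port_flags));
}
```

The port update omits the pseudo-header flag because ports live in the L4 header itself, so only the field size is passed.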
ok, I will update based on the comments.
@tzssangglass this needs a rebase. @astoycos, when you're done with your review, if you'd like me to take a pass as well, please do ping me.
udp ingress:
1. use `bpf_l4_csum_replace` and `bpf_l3_csum_replace` to recalculate the checksums;
2. use `bpf_skb_store_bytes` to update dest ip and port;
Signed-off-by: tzssangglass <[email protected]>
…estination IP and port modifications for TCP and UDP eBPF programs, streamlining the code by removing duplicate checksum and byte storage operations. Signed-off-by: tzssangglass <[email protected]>
Signed-off-by: tzssangglass <[email protected]>
done
/lgtm
/approve
Huge improvement here, thanks, and sorry for the slowness @tzssangglass!
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: astoycos, tzssangglass
The full list of commands accepted by this bot can be found here. The pull request process is described here.
oops, it looks like k8s-ci-robot didn't squash the commit.
udp and tcp ingress:
1. use `bpf_l4_csum_replace` and `bpf_l3_csum_replace` to recalculate the checksums;
2. use `bpf_skb_store_bytes` to update dest ip and port;
Signed-off-by: tzssangglass [email protected]
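For readers unfamiliar with why the checksum helpers take both the old and the new value: they update the checksum incrementally instead of re-summing the packet. The sketch below shows that arithmetic in plain Rust, following RFC 1624 (HC' = ~(~HC + ~m + m')). It is an illustration only, not the kernel's code: the function names are hypothetical, the buffer is a toy stand-in for the checksummed bytes, and details such as checksum offload state and the UDP zero-checksum case are ignored.

```rust
/// Fold a 32-bit accumulator into a 16-bit one's-complement sum.
fn fold(mut sum: u32) -> u16 {
    while sum > 0xffff {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    sum as u16
}

/// Internet checksum: complement of the one's-complement sum of
/// big-endian 16-bit words (an odd trailing byte is zero-padded).
fn checksum(data: &[u8]) -> u16 {
    let mut sum: u32 = 0;
    for chunk in data.chunks(2) {
        let lo = if chunk.len() == 2 { chunk[1] } else { 0 };
        sum += u32::from(u16::from_be_bytes([chunk[0], lo]));
    }
    !fold(sum)
}

/// Incremental update for one 16-bit field (RFC 1624, eqn. 3).
fn replace2(check: u16, old: u16, new: u16) -> u16 {
    let sum = u32::from(!check) + u32::from(!old) + u32::from(new);
    !fold(sum)
}

/// Incremental update for a 32-bit field, handled as two 16-bit halves.
fn replace4(check: u16, old: u32, new: u32) -> u16 {
    let c = replace2(check, (old >> 16) as u16, (new >> 16) as u16);
    replace2(c, old as u16, new as u16)
}

fn main() {
    // Toy data: src 192.168.0.1, dst 10.0.0.2, sport 8080, dport 80.
    let mut pkt: [u8; 12] = [
        0xc0, 0xa8, 0x00, 0x01, 0x0a, 0x00, 0x00, 0x02, 0x1f, 0x90, 0x00, 0x50,
    ];
    let old_check = checksum(&pkt);
    let old_daddr = u32::from_be_bytes([pkt[4], pkt[5], pkt[6], pkt[7]]);
    let new_daddr: u32 = 0x0a00_0009; // 10.0.0.9

    // Rewrite the destination address, then fix the checksum incrementally.
    pkt[4..8].copy_from_slice(&new_daddr.to_be_bytes());
    let incremental = replace4(old_check, old_daddr, new_daddr);

    println!("incremental = {:#06x}, recomputed = {:#06x}", incremental, checksum(&pkt));
}
```

The incremental result should match a full recomputation over the modified bytes, which is what makes the 2-byte and 4-byte cases cheap enough to run per packet.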