Skip to content

Commit

Permalink
netdev-native-tnl: Fix use of uninitialized RSS hash.
Browse files Browse the repository at this point in the history
RSS hash calculation for a packet may be skipped in some cases.  One
of them is a simple match optimization.  Packet is not fully parsed
for the simple match, so there is no enough data to calculate the full
5-tuple hash.  However, when such a packet needs tunnel encapsulation,
we need RSS hash to calculate the source port for the outer UDP header.
And netdev_tnl_get_src_port() function doesn't check if the hash is
valid before using it.  So, such packets will likely end up with
different and unpredictable source ports potentially causing packet
reordering or other issues in the network:

 WARNING: MemorySanitizer: use-of-uninitialized-value
  0 0x10c129c in dp_packet_get_rss_hash lib/dp-packet.h:1029:5
  1 0x10b264c in netdev_tnl_get_src_port lib/netdev-native-tnl.h:131:12
  2 0x10b171a in netdev_tnl_push_udp_header lib/netdev-native-tnl.c:286:20
  3 0xb772fe in netdev_push_header lib/netdev.c:1037:13
  4 0x9673c4 in push_tnl_action lib/dpif-netdev.c:9067:11
  5 0x961abe in dp_execute_cb lib/dpif-netdev.c:9226:13
  6 0xbcb4b1 in odp_execute_actions lib/odp-execute.c:1008:17
  7 0x8e939f in dp_netdev_execute_actions lib/dpif-netdev.c:9524:5
  8 0x968f3f in dp_execute_userspace_action lib/dpif-netdev.c:9093:9
  9 0x962e54 in dp_execute_cb lib/dpif-netdev.c:9307:17
 10 0xbcb4b1 in odp_execute_actions lib/odp-execute.c:1008:17
 11 0x8e939f in dp_netdev_execute_actions lib/dpif-netdev.c:9524:5
 12 0x950fef in packet_batch_per_flow_execute lib/dpif-netdev.c:8271:5
 13 0x8ec8db in dp_netdev_input__ lib/dpif-netdev.c:8899:9
 14 0x8eb8ec in dp_netdev_input lib/dpif-netdev.c:8908:5
 15 0x92d5e8 in dp_netdev_process_rxq_port lib/dpif-netdev.c:5660:19
 16 0x8ee2c4 in dpif_netdev_run lib/dpif-netdev.c:6993:25
 17 0x9b442f in dpif_run lib/dpif.c:471:16
 18 0x5f8e3a in type_run ofproto/ofproto-dpif.c:367:9
 19 0x56c508 in ofproto_type_run ofproto/ofproto.c:1879:31
 20 0x4cb388 in bridge_run__ vswitchd/bridge.c:3281:9
 21 0x4c9b00 in bridge_run vswitchd/bridge.c:3346:5
 22 0x526043 in main vswitchd/ovs-vswitchd.c:130:9
 23 0x7f1192 in __libc_start_call_main
 24 0x7f1192 in __libc_start_main@GLIBC_2.2.5
 25 0x432b24 in _start (vswitchd/ovs-vswitchd+0x432b24)

The issue is caught by running the 'debug_slow' test under the memory
sanitizer.  Another way to reproduce is by sending two packets at once
through the datapath.  The first one will get the same memory chunk as
the upcalled packet with already calculated RSS, the second one will
get the brand new memory chunk without the calculated RSS, so these
two packets will have different source ports after encapsulation.
The test is updated to cover this case.

Fix the issue by checking if the hash is valid before using, re-parsing
and calculating if it is not.  The netdev_tnl_get_src_port() function
moved to the .c file, since there is no real reason for it to be in the
header.  Compiler can decide on inlining it.  The declaration kept in
the header, since all the other functions declared there, even if there
is no reason for that.

In the future we may want to consolidate all the places where we
re-calculate RSS hash into a single function, but it's a little tricky.
This is also a larger change that would be harder to backport.  So, not
touching that aspect for now.

Re-parsing the packet eliminates advantages of the simple match, but
it was designed primarily for very simple setups that do not involve
tunneling or any other complex processing, so it should not be a big
problem.  And simple match can still be used with tunneling when the
input port provides the RSS hash.

Also, checking if the hash is valid is a right thing to do anyways.

Next step might be to not use simple match when there is no RSS hash
and there is a tunnel push action, but it seems hard to implement,
especially since we don't know the actions until we lookup the flow.

Fixes: e7e9973 ("dpif-netdev: Forwarding optimization for flows with a simple match.")
Signed-off-by: Ilya Maximets <[email protected]>
  • Loading branch information
igsilya committed Nov 29, 2024
1 parent 54ba1e2 commit f8a741f
Show file tree
Hide file tree
Showing 3 changed files with 44 additions and 13 deletions.
36 changes: 33 additions & 3 deletions lib/netdev-native-tnl.c
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,27 @@ static struct vlog_rate_limit err_rl = VLOG_RATE_LIMIT_INIT(60, 5);
uint16_t tnl_udp_port_min = 32768;
uint16_t tnl_udp_port_max = 61000;

ovs_be16
netdev_tnl_get_src_port(struct dp_packet *packet)
{
uint32_t hash;

if (OVS_LIKELY(dp_packet_rss_valid(packet))) {
hash = dp_packet_get_rss_hash(packet);
} else {
struct flow flow;

flow_extract(packet, &flow);
hash = flow_hash_5tuple(&flow, 0);

dp_packet_set_rss_hash(packet, hash);
}

hash = ((uint64_t) hash * (tnl_udp_port_max - tnl_udp_port_min)) >> 32;

return htons(hash + tnl_udp_port_min);
}

void *
netdev_tnl_ip_extract_tnl_md(struct dp_packet *packet, struct flow_tnl *tnl,
unsigned int *hlen)
Expand Down Expand Up @@ -276,14 +297,18 @@ netdev_tnl_push_udp_header(const struct netdev *netdev OVS_UNUSED,
uint16_t l3_ofs = packet->l3_ofs;
uint16_t l4_ofs = packet->l4_ofs;
struct udp_header *udp;
ovs_be16 udp_src;
int ip_tot_size;

/* We may need to re-calculate the hash and this has to be done before
* modifying the packet. */
udp_src = netdev_tnl_get_src_port(packet);

dp_packet_tnl_ol_process(packet, data);
udp = netdev_tnl_push_ip_header(packet, data->header, data->header_len,
&ip_tot_size, 0);

/* set udp src port */
udp->udp_src = netdev_tnl_get_src_port(packet);
udp->udp_src = udp_src;
udp->udp_len = htons(ip_tot_size);

if (udp->udp_csum) {
Expand Down Expand Up @@ -831,13 +856,18 @@ netdev_gtpu_push_header(const struct netdev *netdev,
struct netdev_vport *dev = netdev_vport_cast(netdev);
struct udp_header *udp;
struct gtpuhdr *gtpuh;
ovs_be16 udp_src;
int ip_tot_size;
unsigned int payload_len;

/* We may need to re-calculate the hash and this has to be done before
* modifying the packet. */
udp_src = netdev_tnl_get_src_port(packet);

payload_len = dp_packet_size(packet);
udp = netdev_tnl_push_ip_header(packet, data->header, data->header_len,
&ip_tot_size, 0);
udp->udp_src = netdev_tnl_get_src_port(packet);
udp->udp_src = udp_src;
udp->udp_len = htons(ip_tot_size);
/* Postpone checksum to the egress netdev. */
dp_packet_hwol_set_csum_udp(packet);
Expand Down
11 changes: 1 addition & 10 deletions lib/netdev-native-tnl.h
Original file line number Diff line number Diff line change
Expand Up @@ -123,16 +123,7 @@ netdev_tnl_ip_build_header(struct ovs_action_push_tnl *data,
extern uint16_t tnl_udp_port_min;
extern uint16_t tnl_udp_port_max;

static inline ovs_be16
netdev_tnl_get_src_port(struct dp_packet *packet)
{
uint32_t hash;

hash = dp_packet_get_rss_hash(packet);

return htons((((uint64_t) hash * (tnl_udp_port_max - tnl_udp_port_min)) >> 32) +
tnl_udp_port_min);
}
ovs_be16 netdev_tnl_get_src_port(struct dp_packet *);

void *
netdev_tnl_ip_extract_tnl_md(struct dp_packet *packet, struct flow_tnl *tnl,
Expand Down
10 changes: 10 additions & 0 deletions tests/tunnel-push-pop.at
Original file line number Diff line number Diff line change
Expand Up @@ -774,6 +774,16 @@ dnl Sending again to exercise the non-miss upcall path.
AT_CHECK([ovs-appctl netdev-dummy/receive int-br "${packet}4"])
OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep -E "${encap}${packet}4" | wc -l` -ge 2])

dnl Send two more packets at the same time to make sure they are distinct
dnl memory buffers.
AT_CHECK([ovs-appctl netdev-dummy/receive int-br "${packet}4" "${packet}4"])
OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep -E "${encap}${packet}4" | wc -l` -ge 4])

dnl Make sure all the packets are the same, i.e. have the same source port.
AT_CHECK([ovs-pcap p0.pcap | sed 's/.$//' | sort | uniq \
| grep -E -c "${encap}${packet}"], [0], [1
])

dnl Output to tunnel from the controller.
AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out int-br CONTROLLER "debug_slow,output:2" "${packet}5"])
OVS_WAIT_UNTIL([test `ovs-pcap p0.pcap | grep -E "${encap}${packet}5" | wc -l` -ge 1])
Expand Down

0 comments on commit f8a741f

Please sign in to comment.