Skip to content

Commit

Permalink
ofproto-dpif-upcall: Pause revalidators when purging.
Browse files Browse the repository at this point in the history
This issue has been observed when running traffic tests with a dpdk
enabled userspace datapath (though those tests are added in a separate
series).
However, the described issue also affects the kernel datapath which is
why this patch is sent separately.

A main thread executing the 'revalidator/purge' command could race with
revalidator threads that can be dumping/sweeping the purged flows at the
same time.

This race can be reproduced (with dpif debug logs) by running the
conntrack - ICMP related unit tests with the userspace datapath:

2023-10-09T14:11:55.242Z|00177|unixctl|DBG|received request
	revalidator/purge[], id=0
2023-10-09T14:11:55.242Z|00044|dpif(revalidator17)|DBG|netdev@ovs-netdev:
	flow_dump ufid:68ff6817-fb3b-4b30-8412-9cf175318294 <empty>,
	packets:0, bytes:0, used:never
2023-10-09T14:11:55.242Z|00178|dpif|DBG|netdev@ovs-netdev: flow_del
	ufid:07046e91-30a6-4862-9048-1a76b5a88a5b
	recirc_id(0),dp_hash(0),skb_priority(0),in_port(2),skb_mark(0),
	ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),
	packet_type(ns=0,id=0),
	eth(src=a6:0a:bf:e2:f3:f2,dst=62:23:0f:f6:2c:75),
	eth_type(0x0800),ipv4(src=10.1.1.1,dst=10.1.1.2,proto=17,tos=0,
	ttl=64,frag=no),udp(src=37380,dst=10000), packets:0, bytes:0,
	used:never
...
2023-10-09T14:11:55.242Z|00049|dpif(revalidator17)|WARN|netdev@ovs-netdev:
	failed to flow_get (No such file or directory)
	ufid:07046e91-30a6-4862-9048-1a76b5a88a5b <empty>, packets:0,
	bytes:0, used:never
2023-10-09T14:11:55.242Z|00050|ofproto_dpif_upcall(revalidator17)|WARN|
	Failed to acquire udpif_key corresponding to unexpected flow
	(No such file or directory):
	ufid:07046e91-30a6-4862-9048-1a76b5a88a5b
...
2023-10-09T14:11:55.242Z|00183|unixctl|DBG|replying with success, id=0: ""

To avoid this race, a first part of the fix is to pause (if not already
paused) the revalidators while the main thread is purging the datapath
flows.

Then a second issue is observed by running the same unit test with the
kernel datapath. Its dpif implementation dumps flows via a netlink request
(see dpif_flow_dump_create(), dpif_netlink_flow_dump_create(),
nl_dump_start(), nl_sock_send__()) in the leader revalidator thread,
before pausing revalidators:

2023-10-09T14:44:28.742Z|00122|unixctl|DBG|received request
	revalidator/purge[], id=0
...
2023-10-09T14:44:28.742Z|00125|dpif|DBG|system@ovs-system: flow_del
	ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 recirc_id(0),dp_hash(0),
	skb_priority(0),in_port(2),skb_mark(0),ct_state(0),ct_zone(0),
	ct_mark(0),ct_label(0),eth(src=a6:0a:bf:e2:f3:f2,
	dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=10.1.1.1,
	tip=10.1.1.2,op=1,sha=a6:0a:bf:e2:f3:f2,tha=00:00:00:00:00:00),
	packets:0, bytes:0, used:never
...
2023-10-09T14:44:28.742Z|00129|unixctl|DBG|replying with success, id=0: ""
...
2023-10-09T14:44:28.742Z|00006|dpif(revalidator21)|DBG|system@ovs-system:
	flow_dump ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 <empty>,
	packets:0, bytes:0, used:never
...
2023-10-09T14:44:28.742Z|00012|dpif(revalidator21)|WARN|system@ovs-system:
	failed to flow_get (No such file or directory)
	ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 <empty>, packets:0,
	bytes:0, used:never
2023-10-09T14:44:28.742Z|00013|ofproto_dpif_upcall(revalidator21)|WARN|
	Failed to acquire udpif_key corresponding to unexpected flow
	(No such file or directory):
	ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9

To avoid evaluating already deleted flows, the second part of the fix is
to ensure that dumping from the leader revalidator thread is done out of
any pause request.

As a result of this patch, the unit test "offloads - delete ufid mapping
if device not exist - offloads enabled" does not need to waive the random
warning logs when purging dp flows.

Fixes: 98bb428 ("tests: Add command to purge revalidators of flows.")
Acked-by: Eelco Chaudron <[email protected]>
Acked-by: Simon Horman <[email protected]>
Signed-off-by: David Marchand <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
  • Loading branch information
david-marchand authored and igsilya committed Oct 20, 2023
1 parent d581473 commit bd86266
Show file tree
Hide file tree
Showing 2 changed files with 15 additions and 4 deletions.
17 changes: 15 additions & 2 deletions ofproto/ofproto-dpif-upcall.c
Original file line number Diff line number Diff line change
Expand Up @@ -990,7 +990,7 @@ udpif_revalidator(void *arg)
udpif->reval_exit = latch_is_set(&udpif->exit_latch);

start_time = time_msec();
if (!udpif->reval_exit) {
if (!udpif->reval_exit && !udpif->pause) {
bool terse_dump;

terse_dump = udpif_use_ufid(udpif);
Expand All @@ -1000,10 +1000,15 @@ udpif_revalidator(void *arg)
}
}

/* Wait for the leader to start the flow dump. */
/* Wait for the leader to reach this point. */
ovs_barrier_block(&udpif->reval_barrier);
if (udpif->pause) {
revalidator_pause(revalidator);
if (!udpif->reval_exit) {
/* The main thread resumed all validators, but the leader
* didn't start the dump, go to next iteration. */
continue;
}
}

if (udpif->reval_exit) {
Expand Down Expand Up @@ -3217,11 +3222,19 @@ upcall_unixctl_purge(struct unixctl_conn *conn, int argc OVS_UNUSED,
struct udpif *udpif;

LIST_FOR_EACH (udpif, list_node, &all_udpifs) {
bool wake_up = false;
int n;

if (!latch_is_set(&udpif->pause_latch)) {
udpif_pause_revalidators(udpif);
wake_up = true;
}
for (n = 0; n < udpif->n_revalidators; n++) {
revalidator_purge(&udpif->revalidators[n]);
}
if (wake_up) {
udpif_resume_revalidators(udpif);
}
}
unixctl_command_reply(conn, "");
}
Expand Down
2 changes: 0 additions & 2 deletions tests/system-offloads-traffic.at
Original file line number Diff line number Diff line change
Expand Up @@ -799,8 +799,6 @@ AT_CHECK([test $(ovs-appctl dpctl/dump-flows | grep -c "eth_type(0x0800)") -eq 0

OVS_TRAFFIC_VSWITCHD_STOP(["/could not open network device ovs-p0/d
/on nonexistent port/d
/failed to flow_get/d
/Failed to acquire udpif_key/d
/No such device/d
/failed to offload flow/d
"])
Expand Down

0 comments on commit bd86266

Please sign in to comment.