From 032ff0924e8e8ed3c4eb20e2bf12bf105410026e Mon Sep 17 00:00:00 2001
From: David Marchand
Date: Mon, 31 Jan 2022 18:35:17 +0100
Subject: [PATCH] dpdk: Disable initial PCI probe.
By default, DPDK probes all available resources (like PCI devices) and
takes them over.
This may not be desirable:
- for PCI devices bound to vfio-pci, the first application taking over
them "wins", meaning that OVS would prevent qemu from using some VF
devices,
- for mlx5 devices, the driver will maintain link status of all ports
even when OVS only uses a subset of them. Besides, the kernel netdevices
associated with those probed (yet unused) devices lose Rx capabilities.
Disable the initial PCI probing by passing a 0000:00:00.0 allow list.
This change breaks setups that were using the
class=eth,mac=XX:XX:XX:XX:XX:XX syntax, as OVS was relying on the fact
that all DPDK ports were probed.
This can be restored by passing a 'dpdk-probe-at-init=true' option.
Add a warning for users of this syntax, and update the documentation.
Signed-off-by: David Marchand
---
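Note for reviewers: the condition added in lib/dpdk.c can be read as a
small predicate. The sketch below is a standalone illustration only, not
the code in this patch. contains_arg() and needs_dummy_allow_list() are
hypothetical names, and contains_arg() is assumed to behave like an exact
string match over the EAL argument vector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the args_contains() helper referenced in
 * lib/dpdk.c: true if 'value' appears verbatim among the 'n' arguments. */
static bool
contains_arg(const char *const *args, size_t n, const char *value)
{
    assert(args != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (!strcmp(args[i], value)) {
            return true;
        }
    }
    return false;
}

/* Hypothetical predicate: true when the "-a 0000:00:00.0" allow list should
 * be appended, i.e. the user passed neither -a nor -b and did not set
 * dpdk-probe-at-init to true. */
static bool
needs_dummy_allow_list(const char *const *args, size_t n, bool probe_at_init)
{
    return !contains_arg(args, n, "-a")
           && !contains_arg(args, n, "-b")
           && !probe_at_init;
}
```

With this reading, an explicit -a or -b passed through dpdk-extra, or
dpdk-probe-at-init=true, each leave the EAL arguments untouched.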
Documentation/howto/dpdk.rst | 5 +++++
Documentation/intro/install/dpdk.rst | 7 +++++++
NEWS | 4 ++++
lib/dpdk.c | 8 ++++++++
lib/netdev-dpdk.c | 2 +-
vswitchd/vswitch.xml | 15 +++++++++++++++
6 files changed, 40 insertions(+), 1 deletion(-)
diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index 04609b20bd2..b0d9d63abc2 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -62,6 +62,11 @@ is suggested::
.. important::
+ Using this syntax requires that DPDK probes the PCI device owning those
+ multiple ports. This can be achieved by either setting an allowed list
+ of PCI devices in the ``dpdk-extra`` configuration, or by asking for
+ probing all PCI devices available (setting ``dpdk-probe-at-init`` to true).
+
Hotplugging physical interfaces is not supported using the above syntax.
This is expected to change with the release of DPDK v18.05. For information
on hotplugging physical interfaces, you should instead refer to
diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
index ebd29a45a96..190d73d4fa7 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -293,6 +293,13 @@ listed below. Defaults will be provided for all values not explicitly set.
sockets. If not specified, this option will not be set by default. DPDK
default will be used instead.
+``dpdk-probe-at-init``
+ Let DPDK EAL probe all available PCI devices at initialisation.
+ This consumes more resources as OVS may not use all probed devices and this
+ may have undesired side effects (like losing receiving capabilities for mlx5
+ VF kernel netdevs). However, this option must be enabled when using the
+ ``class=eth,mac=XX:XX:XX:XX:XX:XX`` syntax for DPDK ports.
+
``dpdk-hugepage-dir``
Directory where hugetlbfs is mounted
diff --git a/NEWS b/NEWS
index 6e3f56d731e..8fe2b18579f 100644
--- a/NEWS
+++ b/NEWS
@@ -25,6 +25,10 @@ Post-v3.4.0
formats.
- DPDK:
* OVS validated with DPDK 23.11.2.
+ * Probing of devices at DPDK init has been disabled to avoid wasting
+ resources on unused devices. This breaks DPDK netdev ports using
+ "class=eth,mac=" syntax (though it can be restored, see
+ Documentation/howto/dpdk.rst).
v3.4.0 - 15 Aug 2024
diff --git a/lib/dpdk.c b/lib/dpdk.c
index b7516257c5e..338298536d3 100644
--- a/lib/dpdk.c
+++ b/lib/dpdk.c
@@ -368,6 +368,14 @@ dpdk_init__(const struct smap *ovs_other_config)
svec_add_nocopy(&args, xasprintf("%d", cpu));
}
+ if (!args_contains(&args, "-a") && !args_contains(&args, "-b")
+ && !smap_get_bool(ovs_other_config, "dpdk-probe-at-init", false)) {
+ /* Prevent DPDK from probing all devices, unless some -a/-b is set in
+ * config. */
+ svec_add(&args, "-a");
+ svec_add(&args, "0000:00:00.0");
+ }
+
svec_terminate(&args);
optind = 1;
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 972e0dbb6d4..4de6eb6a3c1 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2058,7 +2058,7 @@ netdev_dpdk_get_port_by_mac(const char *mac_str, const char **extra_err)
}
}
- *extra_err = ", unknown mac";
+ *extra_err = ", unknown mac (dpdk-probe-at-init=true may be needed)";
return DPDK_ETH_PORT_ID_INVALID;
}
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 76255845911..11119295c89 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -425,6 +425,21 @@
+
+
+ Specifies whether DPDK should probe all devices available at the
+ time DPDK is initialised. This is required when declaring DPDK ports
+ using the "class=eth,mac=XX:XX:XX:XX:XX:XX" syntax but beware that
+ it implies more resource consumption and undesired side effects
+ with some devices (like mlx5).
+
+
+ If not specified, DPDK will probe no device at initialisation
+ which should be fine in most cases.
+
+
+