From 4727e2f9a60b1ddaa755d3f107a83d385a9d08e6 Mon Sep 17 00:00:00 2001 From: amaslennikov Date: Mon, 11 Mar 2024 19:30:29 +0300 Subject: [PATCH] Add 'IB VF GUID Configuration' design doc Signed-off-by: amaslennikov --- doc/design/ib-vf-configuration.md | 116 ++++++++++++++++++++++++++++++ 1 file changed, 116 insertions(+) create mode 100644 doc/design/ib-vf-configuration.md diff --git a/doc/design/ib-vf-configuration.md b/doc/design/ib-vf-configuration.md new file mode 100644 index 0000000000..8bacf558ee --- /dev/null +++ b/doc/design/ib-vf-configuration.md @@ -0,0 +1,116 @@ +--- +title: IB VF GUID Configuration +authors: + - almaslennikov +reviewers: + - SchSeba + - adrianchiris +creation-date: 11-03-2024 +last-updated: 13-03-2024 +--- + +# IB VF GUID Configuration + +## Summary +Allow SR-IOV Network Operator to use a static configuration file from the host filesystem to assign GUIDs to IB VFs + +## Motivation +We have customers using the SR-IOV operator to create IB VFs, and they need a way to automate GUID assignment, +so that IB VFs are automatically bound to the required PKeys and no additional manual configuration is needed. +In this use case, the GUID configuration is static and known in advance. +Now the GUIDs are assigned by the sriov-network-config-daemon randomly. + +### Use Cases + +### Goals + +* IB GUID configuration can be read from a static json file on the host +* IB GUID configuration is static and created in advance +* Static config file is available at the same path on every host + +### Non-Goals + +* Dynamic GUID allocation is out of scope of this proposal + +## Proposal + +Mount a configuration file as a part of `/host` hostPath in the config daemon and read it to retrieve the GUID configuration. + +### Workflow Description + +In the [`pkg/host/internal/sriov/sriov.go:configSriovVFDevices`](https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/82a6d6fdce71bd88a0d9368fb1750488e9a8e4e2/pkg/host/internal/sriov/sriov.go#L458) read from the static IB config file +and assign the GUIDs according to the configuration. Each PF is described by either PCI address +or the PF GUID and has either a list or a range of VF GUIDs. + +1. A script creating the config file is deployed to the host and creates a static GUID config file +2. SR-IOV network operator reads the file and assign GUIDs when IB VFs are created +3. Users employ [PKey selector](https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin/pull/517) in sriov-network-device-plugin to create PKey-specific resource pools +4. Users create an IB network CR +5. Users create pods and request resources from the specific pools + +### Implementation Details/Notes/Constraints + +There can be fewer VFs created than GUIDs. To persist the dynamic nature of the SR-IOV Network operator, +it’s proposed not to return an error in this case but assign as many GUIDs as possible. +To ensure that nothing breaks when users add/remove VFs, the GUID distribution order should always be the same for each individual host. + +If there are fewer GUIDs than VFs, then all the GUIDs should be assigned. + +### Config file + +Example of the config file: + +```json +[ + { + "pci_address": "", + "guids": [ + "02:00:00:00:00:00:00:00", + "02:00:00:00:00:00:00:01" + ] + }, + { + "pf_guid": "", + "rangeStart": "02:00:00:00:00:aa:00:02", + "rangeEnd": "02:00:00:00:00:aa:00:0a" + } +] +``` + +Requirements for the config file: + +* `pci_adress` and `pf_guid` cannot be set at the same time for a single device - should return an error +* if the list contains multiple entries for the same device, the first one shall be taken +* `rangeStart` and `rangeEnd` are both included in the range + +### Test Plan + +* Unit tests will be implemented for new logic. + +## Alternative solution + +The alternative solution is also based on the GUID configuration file being deployed on the host. +The difference here is that GUID assignment is done on the cni level when a VF is allocated to a pod. +ib-sriov-cni manages a host-local per-PF pool of allocated/free GUIDs and dynamically allocates the next free GUID to an allocated VF. + +### Workflow: + +1. A script is deployed to the host and creates a static GUID config file +2. SR-IOV network operator creates IB VFs with random GUIDs (as done now) +3. ib-sriov-cni is deployed, reads the config file and creates a cached GUID pool +4. Users create a single resource pool / per PF resource pools for IB VFs +5. Users create an IB network CR (and provide a PKey here) +6. Users create a pod requesting an IB VF +7. ib-sriov-cni allocates a next free GUID from the pool depending on the PKey and assigns it to the allocated VF + +## Comparison between the two alternatives + +The SR-IOV Network Operator approach: +* Easier to implement and less error-prone +* Manages the whole lifecycle of the VF (GUID is assigned at creation and never changes throughout the lifecycle) +* Operator has better visibility into the amount of configured VFs + +The IB-SRIOV-CNI approach: +* Offers more flexibility (Only when a VF is requested for an IB network will it be assigned a GUID) +* Easier to maintain complex use cases + * 2 PFs on the node evenly split between 2 PKeys. The CNI approach will require 2 per-PF resource pools and 4 network attachments. The operator approach will require 4 resource pools and 4 network attachments, one for each PKey-PF pair. \ No newline at end of file