SriovNetworkNodeConfigPolicy priority inconsistent #194

Closed
mskrocki opened this issue Oct 26, 2021 · 8 comments · Fixed by #196

Comments

@mskrocki
Contributor

I have looked at the code and how it handles policies for the same root PF. I understand that the policy priorities are supposed to resolve competing configurations, and to merge them if the priorities are the same.
The thing is, I am unable to determine what counts as the "higher" priority in this project.
With policy1 at priority 30 and policy2 at priority 40, which policy should overwrite which?

For context, I will note that the policies are sorted by priority in decreasing numeric order (100, 40, 30), and this is how they are handled through the rest of the life-cycle.

Having established that, the issue is either in https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/master/api/v1/helper.go#L264, where we select the last policy, i.e. the lower priority value (30 over 40) overwrites the final number of VFs,

or in https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/master/pkg/utils/utils.go#L263, where we first apply the config of the first VfGroup to the VFs (here it is 40 over 30).

@zshi-redhat
Collaborator

zshi-redhat commented Oct 26, 2021

@mskrocki Regarding the priority number: a lower number means a higher priority. For example, priority 30 > priority 40.

There are two major steps for a policy to be applied to a device:

  1. The sriov operator aggregates all policies, generates the per-node device configuration, and then writes the generated configuration to sriovnetworknodestate.spec.
  2. The sriov config daemonset takes sriovnetworknodestate.spec as input and applies the configuration to the actual device.

The priority number takes effect only in the first step.
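
As a rough illustration of that ordering rule (this is not the operator's actual code; the Policy struct and field names below are simplified stand-ins), step 1 can be thought of as applying policies from lowest to highest priority, so that the highest-priority policy (lowest number) is applied last and wins:

package main

import (
	"fmt"
	"sort"
)

// Policy is a simplified stand-in for SriovNetworkNodePolicy; only the
// fields needed for this illustration are included.
type Policy struct {
	Name     string
	Priority int
	NumVfs   int
}

func main() {
	policies := []Policy{
		{Name: "policy-1", Priority: 70, NumVfs: 4},
		{Name: "policy-2", Priority: 50, NumVfs: 2},
	}

	// Sort by priority in decreasing numeric order (e.g. 100, 70, 50),
	// so the highest-priority policy (lowest number) is applied last and
	// overwrites fields such as numVfs set by lower-priority policies.
	sort.Slice(policies, func(i, j int) bool {
		return policies[i].Priority > policies[j].Priority
	})

	numVfs := 0
	for _, p := range policies {
		numVfs = p.NumVfs
	}
	fmt.Println("effective numVfs:", numVfs) // prints 2, from the priority-50 policy
}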

@mskrocki
Contributor Author

mskrocki commented Oct 26, 2021

@zshi-redhat thanks for the quick reply.
What is the use case (or requirement) for such behavior?

As it stands, if I have:

kind: SriovNetworkNodePolicy
metadata:
  name: policy-1
spec:
  deviceType: vfio-pci
  isRdma: false
  linkType: eth
  nicSelector:
    deviceID: "1572"
    rootDevices:
    - "0000:41:00.0"
    vendor: "8086"
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 4
  priority: 70
  resourceName: intelvfio

and

kind: SriovNetworkNodePolicy
metadata:
  name: policy-2
spec:
  deviceType: netdevice
  isRdma: false
  linkType: eth
  nicSelector:
    deviceID: "1572"
    rootDevices:
    - "0000:41:00.0"
    vendor: "8086"
  nodeSelector:
    feature.node.kubernetes.io/network-sriov.capable: "true"
  numVfs: 2
  priority: 50
  resourceName: intel

I end up with 2 VFs bound to vfio-pci, instead of the netdevice driver defined in policy-2.

@zshi-redhat
Collaborator

What is the generated sriovnetworknodestate.spec in the above example? If it's vfio-pci, then it is a bug.

@mskrocki
Contributor Author

So for the above example, the resulting node state is:

  spec:
    dpConfigVersion: "5594931"
    interfaces:
    - linkType: eth
      name: enp65s0f0
      numVfs: 2
      pciAddress: "0000:41:00.0"
      vfGroups:
      - deviceType: vfio-pci
        policyName: policy-intl-vfio
        resourceName: intelvfio
        vfRange: 0-3
      - deviceType: netdevice
        policyName: policy-intl
        resourceName: intel
        vfRange: 0-1
  status:
    interfaces:
    - Vfs:
      - deviceID: 154c
        driver: vfio-pci
        pciAddress: "0000:41:02.0"
        vendor: "8086"
        vfID: 0
      - deviceID: 154c
        driver: vfio-pci
        pciAddress: "0000:41:02.1"
        vendor: "8086"
        vfID: 1
      deviceID: "1572"
      driver: i40e
      linkSpeed: 10000 Mb/s
      linkType: ETH
      mac: f8:f2:1e:a7:ac:a0
      mtu: 1500
      name: enp65s0f0
      numVfs: 2
      pciAddress: "0000:41:00.0"
      totalvfs: 64
      vendor: "8086"
    - deviceID: "1572"
      driver: i40e
      linkSpeed: 10000 Mb/s
      linkType: ETH
      mac: f8:f2:1e:a7:ac:a1
      mtu: 1500
      name: enp65s0f1
      pciAddress: "0000:41:00.1"
      totalvfs: 64
      vendor: "8086"
    syncStatus: Succeeded

@zshi-redhat
Collaborator

I see, this is a bug; it should generate something like:

    interfaces:
    - linkType: eth
      name: enp65s0f0
      numVfs: 2
      pciAddress: "0000:41:00.0"
      vfGroups:
      - deviceType: netdevice
        policyName: policy-intl
        resourceName: intel
        vfRange: 0-1

@mskrocki would you mind proposing a fix for the issue?

@mskrocki
Contributor Author

So vfGroups is the main indicator, and that list should be ordered from highest to lowest priority (30, 40, 100)?

@zshi-redhat
Collaborator

zshi-redhat commented Oct 26, 2021

So vfGroups is the main indicator, and that list should be ordered from highest to lowest priority (30, 40, 100)?

In the above case, the vfio-pci vfGroup with range 0-3 should not be listed in the sriovnetworknodestate at all: VFs 0-1 belong to the higher-priority group and therefore should not appear in the lower-priority group, and VFs 2-3 should not exist, given that numVfs is 2 in the higher-priority policy.
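
To make the expected filtering concrete, here is a minimal sketch of the assumed logic (not the actual operator code), in which a lower-priority VfGroup is dropped when its VF range overlaps one already claimed by a higher-priority group:

package main

import "fmt"

// VfGroup is a simplified stand-in for the operator's VfGroup type; only the
// fields needed for this illustration are kept, and the VF range is expressed
// as inclusive first/last indexes.
type VfGroup struct {
	PolicyName  string
	DeviceType  string
	First, Last int
}

func overlaps(a, b VfGroup) bool {
	return a.First <= b.Last && b.First <= a.Last
}

func main() {
	// Candidate groups, ordered from highest to lowest priority
	// (policy-intl has priority 50, policy-intl-vfio has priority 70).
	candidates := []VfGroup{
		{PolicyName: "policy-intl", DeviceType: "netdevice", First: 0, Last: 1},
		{PolicyName: "policy-intl-vfio", DeviceType: "vfio-pci", First: 0, Last: 3},
	}

	var accepted []VfGroup
	for _, g := range candidates {
		conflict := false
		for _, a := range accepted {
			if overlaps(a, g) {
				conflict = true // a higher-priority group already owns these VFs
				break
			}
		}
		if !conflict {
			accepted = append(accepted, g)
		}
	}
	// Only the netdevice group 0-1 survives, matching the expected spec shown above.
	fmt.Printf("%+v\n", accepted)
}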

@mskrocki
Contributor Author

Looking into it. To confirm, regarding https://github.com/k8snetworkplumbingwg/sriov-network-operator/blob/master/api/v1/helper.go#L272: we merge only when a partition is configured and the policies have the same priority?
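
If that reading is correct, the condition amounts to something like the following (a hypothetical restatement, not the code at the linked line):

package policy

// shouldMergeVfGroups is a hypothetical restatement of the condition being
// asked about: merge only when an explicit VF partition is configured and
// both policies carry the same priority value.
func shouldMergeVfGroups(partitionConfigured bool, prevPriority, curPriority int) bool {
	return partitionConfigured && prevPriority == curPriority
}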
