Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix priority handling for same-pf VFgroups. #196

Merged
merged 1 commit into from
Feb 17, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 16 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -121,9 +121,9 @@ metadata:
spec:
interfaces:
- deviceType: vfio-pci
mtu: 1500
numVfs: 4
pciAddress: 0000:86:00.0
mtu: 1500
numVfs: 4
pciAddress: 0000:86:00.0
status:
interfaces:
- deviceID: "1583"
Expand Down Expand Up @@ -198,6 +198,19 @@ In a virtual deployment:
- The numVfs parameter has no effect as there is always 1 VF
- The deviceType field depends upon whether the underlying device/driver is [native-bifurcating or non-bifurcating](https://doc.dpdk.org/guides/howto/flow_bifurcation.html) For example, the supported Mellanox devices support native-bifurcating drivers and therefore deviceType should be netdevice (default). The support Intel devices are non-bifurcating and should be set to vfio-pci.

#### Multiple policies

When multiple SriovNetworkNodeConfigPolicy CRs are present, the `priority` field
(0 is the highest priority) is used to resolve any conflicts. Conflicts occur
only when same PF is referenced by multiple policies. The final desired
configuration is saved in `SriovNetworkNodeState.spec.interfaces`.

Policies processing order is based on priority (lowest first), followed by `name`
field (starting from `a`). Policies with same **priority** or **non-overlapping
VF groups** (when #-notation is used in pfName field) are merged, otherwise only
the highest priority policy is applied. In case of same-priority policies and
overlapping VF groups, only the last processed policy is applied.

## Components and design

This operator is split into 2 components:
Expand Down
78 changes: 50 additions & 28 deletions api/v1/helper.go
Original file line number Diff line number Diff line change
Expand Up @@ -252,12 +252,12 @@ func UniqueAppend(inSlice []string, strings ...string) []string {
}

// Apply policy to SriovNetworkNodeState CR
func (p *SriovNetworkNodePolicy) Apply(state *SriovNetworkNodeState, merge bool) {
func (p *SriovNetworkNodePolicy) Apply(state *SriovNetworkNodeState, equalPriority bool) error {
s := p.Spec.NicSelector
if s.Vendor == "" && s.DeviceID == "" && len(s.RootDevices) == 0 && len(s.PfNames) == 0 &&
len(s.NetFilter) == 0 {
// Empty NicSelector match none
return
return nil
}
for _, iface := range state.Status.Interfaces {
if s.Selected(&iface) {
Expand All @@ -268,55 +268,76 @@ func (p *SriovNetworkNodePolicy) Apply(state *SriovNetworkNodeState, merge bool)
Name: iface.Name,
LinkType: p.Spec.LinkType,
EswitchMode: p.Spec.EswitchMode,
NumVfs: p.Spec.NumVfs,
}
var group *VfGroup
if p.Spec.NumVfs > 0 {
result.NumVfs = p.Spec.NumVfs
group, _ = p.generateVfGroup(&iface)
group, err := p.generateVfGroup(&iface)
if err != nil {
return err
}
result.VfGroups = []VfGroup{*group}
found := false
for i := range state.Spec.Interfaces {
if state.Spec.Interfaces[i].PciAddress == result.PciAddress {
found = true
// merge PF configurations when:
// 1. SR-IOV partition is configured
// 2. SR-IOV partition policies have the same priority
result = state.Spec.Interfaces[i].mergePfConfigs(result, merge)
result.VfGroups = state.Spec.Interfaces[i].mergeVfGroups(group)
state.Spec.Interfaces[i].mergeConfigs(&result, equalPriority)
state.Spec.Interfaces[i] = result
break
}
}
if !found {
result.VfGroups = []VfGroup{*group}
state.Spec.Interfaces = append(state.Spec.Interfaces, result)
}
}
}
}
return nil
}

func (iface Interface) mergePfConfigs(input Interface, merge bool) Interface {
if merge {
if input.Mtu < iface.Mtu {
input.Mtu = iface.Mtu
}
if input.NumVfs < iface.NumVfs {
input.NumVfs = iface.NumVfs
// mergeConfigs merges configs from multiple polices where the last one has the
// highest priority. This merge is dependent on: 1. SR-IOV partition is
// configured with the #-notation in pfName, 2. The VF groups are
// non-overlapping or SR-IOV policies have the same priority.
func (iface Interface) mergeConfigs(input *Interface, equalPriority bool) {
m := false
// merge VF groups (input.VfGroups already contains the highest priority):
// - skip group with same ResourceName,
// - skip overlapping groups (use only highest priority)
for _, gr := range iface.VfGroups {
if gr.ResourceName == input.VfGroups[0].ResourceName || gr.isVFRangeOverlapping(input.VfGroups[0]) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't merge when a higher priority policy(input.vfGroup[0]) with same resourceName or overlapping VF group, but it should overwrite, right?

if gr.ResourceName == input.VfGroups[0].ResourceName ... {

    if !equalPriority {
        input.VfGroups[0] = gr
    }

}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind what is used outside mergeConfigs, i.e. *input is the final object we use, not iface. So, input.VfGroups[0] already contains the highest priority group, and here we need to verify the other groups do not overlap or have same resourceName. If they dont we add them to the input.VfGroups list (line 311).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

input.VfGroups[0] already contains the highest priority group - right.
when the highest priority group overlaps or has the same resourceName as lower priority one (iface), should it override (because highest priority wins)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, because this is a conflicting situation: either we have duplicate resourceName, or there is VF range overlap. In such cases we will use the high priority one (lower value).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add that "override" in this "for" is done by just skipping given group and not adding it to the final "input" object.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, because this is a conflicting situation: either we have duplicate resourceName, or there is VF range overlap. In such cases we will use the high priority one (lower value).

Yes, this is a conflicting situation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any other doubts or questions about this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I‘d expect line 314 will also be removed based on the above change, will it?
I don't have other comments beside this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, line 314 is required as per my reply below. We cannot update mtu and numvfs blindly for every call of mergeConfigs i.e. for case where we did NOT merge (enter "continue", line 308, every time), but just override we need to preserve the original values.

continue
}
m = true
input.VfGroups = append(input.VfGroups, gr)
}

if !equalPriority && !m {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not needed as we would always want to update mtu and numvfs to highest number.
For example:

if equalPriority {
      // update mtu and numVfs 
      // because higher value from same priority policies is applied
}
if !equalPriority {
      // update mtu and numVfs 
      // because value from higher priority policy gets applied
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is, we do not want to merge mtu and numvfs when we just overwrite i.e. m=false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add that as mentioned above, *input is the final object we use, not iface and it already contains the highest priority policy values for mtu and NumVfs.

return
}

// mtu configuration we take the highest value
if input.Mtu < iface.Mtu {
input.Mtu = iface.Mtu
}
if input.NumVfs < iface.NumVfs {
input.NumVfs = iface.NumVfs
}
return input
}

func (iface Interface) mergeVfGroups(input *VfGroup) []VfGroup {
groups := iface.VfGroups
for i := range groups {
if groups[i].ResourceName == input.ResourceName {
groups[i] = *input
return groups
}
func (gr VfGroup) isVFRangeOverlapping(group VfGroup) bool {
rngSt, rngEnd, err := parseRange(gr.VfRange)
if err != nil {
return false
}
rngSt2, rngEnd2, err := parseRange(group.VfRange)
if err != nil {
return false
}
// compare minimal range has overlap
if rngSt < rngSt2 {
return IndexInRange(rngSt2, gr.VfRange) || IndexInRange(rngEnd2, gr.VfRange)
}
groups = append(groups, *input)
return groups
return IndexInRange(rngSt, group.VfRange) || IndexInRange(rngEnd, group.VfRange)
}

func (p *SriovNetworkNodePolicy) generateVfGroup(iface *InterfaceExt) (*VfGroup, error) {
Expand All @@ -327,6 +348,7 @@ func (p *SriovNetworkNodePolicy) generateVfGroup(iface *InterfaceExt) (*VfGroup,
for _, selector := range p.Spec.NicSelector.PfNames {
pfName, rngStart, rngEnd, err = ParsePFName(selector)
if err != nil {
log.Error(err, "Unable to parse PF Name.")
return nil, err
}
if pfName == iface.Name {
Expand Down
Loading