Fix priority handling for same-pf VFgroups. #196
api/v1/helper.go
Outdated
// configured with the #-notation in pfName and 2. SR-IOV policies have the same
// priority.
func (iface Interface) mergeConfigs(input *Interface, equalPriority bool) {
	if !equalPriority {
I think we would still want to allow VF partition policies with different priority numbers, for example:
Policy 1:
spec:
  deviceType: netdevice
  resourceName: kernel-device
  numVfs: 8
  priority: 50
  nicSelector:
    pfNames:
    - "ens1f0#0-3"
Policy 2:
spec:
  deviceType: vfio-pci
  resourceName: dpdk-device
  numVfs: 8
  priority: 70
  nicSelector:
    pfNames:
    - "ens1f0#4-7"
With the current logic, policy-2 won't be applied, if I understand correctly. We should still allow the policy-2 vfGroup to be created, since it doesn't conflict with the policy-1 configuration.
This means the conditional check !equalPriority should not be applied to the vfGroup merging in line 308.
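To make the "no conflict" claim concrete, here is a minimal Go sketch that parses the #-notation from the two policies above and checks whether their VF ranges overlap. The names vfRange, parsePFName and overlaps are hypothetical helpers for illustration only; they are not the operator's API.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// vfRange is a hypothetical type: a PF name plus an inclusive VF index range.
type vfRange struct {
	PF          string
	First, Last int
}

// parsePFName splits a selector such as "ens1f0#0-3" into PF name and range.
func parsePFName(s string) (vfRange, error) {
	pf, rng, found := strings.Cut(s, "#")
	if !found {
		return vfRange{}, fmt.Errorf("%q has no #-notation", s)
	}
	lo, hi, found := strings.Cut(rng, "-")
	if !found {
		return vfRange{}, fmt.Errorf("%q has no first-last range", s)
	}
	first, err := strconv.Atoi(lo)
	if err != nil {
		return vfRange{}, err
	}
	last, err := strconv.Atoi(hi)
	if err != nil {
		return vfRange{}, err
	}
	if first > last {
		return vfRange{}, fmt.Errorf("%q has an inverted range", s)
	}
	return vfRange{PF: pf, First: first, Last: last}, nil
}

// overlaps reports whether two ranges on the same PF share at least one VF.
func (r vfRange) overlaps(o vfRange) bool {
	return r.PF == o.PF && r.First <= o.Last && o.First <= r.Last
}

func main() {
	a, _ := parsePFName("ens1f0#0-3")
	b, _ := parsePFName("ens1f0#4-7")
	fmt.Println(a.overlaps(b)) // prints "false": the two policies do not conflict
}
```

Under this check, the two example policies partition the PF cleanly, which is why one could argue they should be merged regardless of priority.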
From a logical point of view I agree that these policies could be merged, but on the other hand, won't that complicate the meaning of the "priority" field? Right now it is simple: merge only when the priority is the same.
Why can't those policies have the same priority? If the user explicitly defined the partitioning, the intention is to merge them, so in my opinion they should have the same priority.
My concern is that changing the current logic may break existing deployments. For example, if a user already has VF partitioning with different priority settings, upgrading the operator will produce an unexpected result.
I agree with you that the priority field may not be necessary in most cases.
Existing users can be instructed via documentation (or release notes); we are not changing any APIs. They would only have to adjust their priorities.
Additionally, this repo's docs do not specify exactly how the priority field works, which is probably something I can add to this PR.
> Existing users can be instructed via documentation (or release notes), we are not changing any APIs.
Users may not want to change their configuration in a production environment that is working; to me this is like a regression, which I'd like to avoid.
> They would only have to adjust their priorities.
In my opinion, priority should apply when there is a conflict between policies (apply only the higher-priority one over the other), but not when there is no conflict.
Discussed this in today's meeting.
In general, the priority field is not well documented (except for the one-liner in the sriovnetworknodepolicy CRD), and even less so the interaction between VF partitioning and priority when merging configurations, so we should definitely document this feature :)
AFAIU, today, if you use VF partitioning with multiple policies and there is an overlap in VF groups, you might end up with an unexpected configuration (the lower priority takes precedence).
I can think of additional edge cases where configuration is applied on a per-device basis rather than per VF group, like switchdev: e.g. two sriovnetworkpolicies, one with switchdev enabled and the other without, on the same PF with different VF groups (not sure whether it would fail or not).
IMHO, if one wants to partition a PF's VFs into different resources, they should do that at the same priority.
I have updated the code to be backwards compatible. Added README changes.
This change fixes how VFGroups are assigned to nodes based on policy priority: only the highest-priority policy (lowest value of priority) should be present in the SriovNetworkNodeState.spec. An exception is made for policies with non-overlapping VFGroups, which will be merged. For same-priority policies we discard overlapping VF ranges, so only the highest priority is present. Added a description of this behaviour to the README.
	// - skip group with same ResourceName,
	// - skip overlapping groups (use only highest priority)
	for _, gr := range iface.VfGroups {
		if gr.ResourceName == input.VfGroups[0].ResourceName || gr.isVFRangeOverlapping(input.VfGroups[0]) {
This won't merge when a higher-priority policy (input.VfGroups[0]) has the same resourceName or an overlapping VF group, but it should overwrite, right?
if gr.ResourceName == input.VfGroups[0].ResourceName ... {
	if !equalPriority {
		input.VfGroups[0] = gr
	}
}
Mind what is used outside mergeConfigs: *input is the final object we use, not iface. So input.VfGroups[0] already contains the highest-priority group, and here we need to verify that the other groups do not overlap with it or share its resourceName. If they don't, we add them to the input.VfGroups list (line 311).
> input.VfGroups[0] already contains the highest priority group

Right. When the highest-priority group overlaps or has the same resourceName as a lower-priority one (iface), should it override (because the highest priority wins)?
yes, because this is a conflicting situation: either we have duplicate resourceName, or there is VF range overlap. In such cases we will use the high priority one (lower value).
I will add that the "override" in this "for" loop is done by simply skipping the given group and not adding it to the final "input" object.
> yes, because this is a conflicting situation: either we have duplicate resourceName, or there is VF range overlap. In such cases we will use the high priority one (lower value).
Yes, this is a conflicting situation.
Do you have any other doubts or questions about this?
I'd expect line 314 to be removed as well, based on the above change. Will it?
I don't have other comments besides this.
No, line 314 is required, as per my reply below. We cannot blindly update mtu and numvfs on every call of mergeConfigs: in the case where we did NOT merge (hitting "continue" on line 308 every time) but only overrode, we need to preserve the original values.
		input.VfGroups = append(input.VfGroups, gr)
	}

	if !equalPriority && !m {
This may not be needed, as we would always want to update mtu and numvfs to the highest number.
For example:
if equalPriority {
	// update mtu and numVfs
	// because higher value from same priority policies is applied
}
if !equalPriority {
	// update mtu and numVfs
	// because value from higher priority policy gets applied
}
It is needed: we do not want to merge mtu and numvfs when we just overwrite, i.e. m=false.
I will add that, as mentioned above, *input is the final object we use, not iface, and it already contains the highest-priority policy's values for mtu and NumVfs.
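A minimal Go sketch of the mtu/numVfs handling argued for in this thread. The names ifaceConfig, resolveSettings and maxInt are hypothetical. The branch taken when groups were merged across different priorities is an assumption read off the `!equalPriority && !m` guard; the diff excerpt does not show the body of that block.

```go
package main

import "fmt"

// ifaceConfig is a hypothetical, reduced view of the per-interface settings.
type ifaceConfig struct {
	Mtu    int
	NumVfs int
}

// maxInt returns the larger of two ints.
func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// resolveSettings picks mtu and numVfs for the final object.
//   - Different priority and nothing merged (pure override): keep the
//     higher-priority policy's values untouched.
//   - Otherwise (assumed): take the higher value of each field.
func resolveSettings(high, low ifaceConfig, equalPriority, merged bool) ifaceConfig {
	if !equalPriority && !merged {
		return high
	}
	return ifaceConfig{
		Mtu:    maxInt(high.Mtu, low.Mtu),
		NumVfs: maxInt(high.NumVfs, low.NumVfs),
	}
}

func main() {
	high := ifaceConfig{Mtu: 1500, NumVfs: 8}
	low := ifaceConfig{Mtu: 9000, NumVfs: 4}
	fmt.Println(resolveSettings(high, low, false, false)) // prints "{1500 8}"
	fmt.Println(resolveSettings(high, low, true, false))  // prints "{9000 8}"
}
```

The first call shows why the guard matters: without it, a lower-priority policy that was skipped entirely could still raise the mtu of the winning policy.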
/lgtm
ping, could we merge this?
@mskrocki sorry for the late review.
I have a follow-up question in the above comment.
/ping
This change fixes VFGroups assigned to nodes based
on the policies priorities: highest priority (lowest
value of priority) should be the only one present in the
SriovNetworkNodeState.spec.
Additionally, for same priority policies we discard
overlapping VF ranges, only the highest priority is
present.
FIXES: #194