OVS HWOL: use generic plugin to write switchdev device configuration #106
Conversation
This pull request introduces 1 alert when merging 2356622 into f17625b - view on LGTM.com

new alerts:
/cc @Mmduh-483 @pliurh
2356622 to 2c662b7 (Compare)
Did you test the code on a k8s cluster?
@@ -212,3 +196,25 @@ func needDrainNode(desired sriovnetworkv1.Interfaces, current sriovnetworkv1.Int
	}
	return
}

func needRebootNode(state *sriovnetworkv1.SriovNetworkNodeState, loadVfioDriver *uint) (needReboot bool) {
Can you add logs in this function explaining why a reboot is needed?
Will add.
Added.
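As a reference for the request above, here is a minimal, self-contained sketch of the pattern: log the concrete reason at the point where the reboot flag is set. The function, conditions, and messages below are illustrative, not the operator's real checks.

```go
package main

import "log"

// needReboot mirrors the review suggestion for needRebootNode(): whenever the
// flag is set, log why, so the daemon logs explain the reboot decision.
// The two conditions below are placeholders, not the operator's real checks.
func needReboot(kernelArgsChanged, firmwareChanged bool) bool {
	need := false
	if kernelArgsChanged {
		log.Print("needReboot(): kernel arguments changed, node reboot required")
		need = true
	}
	if firmwareChanged {
		log.Print("needReboot(): NIC firmware configuration changed, node reboot required")
		need = true
	}
	return need
}

func main() {
	log.Printf("needReboot = %v", needReboot(true, false))
}
```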
@@ -127,24 +127,6 @@ func (p *K8sPlugin) OnNodeStateChange(old, new *sriovnetworkv1.SriovNetworkNodeS
		}
	}

	// Check switchdev config
The K8s plugin is only checking the services now; can you reformat the condition on line 121 to be
if !utils.IsSwitchdevModeSpec(new.Spec) {
	continue
}
Can you also remove the word "not" from the comment on line 120?
@Mmduh-483 I thought we still need the p.servicesStateUpdate() to set the right flags or content in the K8sPlugin struct, don't we? Otherwise, p.updateTarget.needUpdate() will always return false.
In the current implementation:
- p.servicesStateUpdate runs when the condition utils.IsSwitchdevModeSpec(new.Spec) is met; I suggest a simple refactoring of the if condition.
- At the start of OnNodeStateChange, p.updateTarget is reset to all false, so p.updateTarget.needUpdate() is always false if there is no switchdev spec.
So your suggestion is to change:
if utils.IsSwitchdevModeSpec(new.Spec) {
	// Check services
	err = p.servicesStateUpdate()
	if err != nil {
		glog.Errorf("k8s-plugin OnNodeStateChange(): failed : %v", err)
		return
	}
}
To:
if !utils.IsSwitchdevModeSpec(new.Spec) {
	continue
}
err = p.servicesStateUpdate()
if err != nil {
	glog.Errorf("k8s-plugin OnNodeStateChange(): failed : %v", err)
	return
}
Is it correct?
yes
Done.
pkg/daemon/daemon.go (Outdated)
	glog.V(0).Infof("nodeStateSyncHandler(): plugin %s: reqDrain %v, reqReboot %v", k, d, r)
	reqDrain = reqDrain || d
	reqReboot = reqReboot || r
}
glog.V(0).Infof("nodeStateSyncHandler(): reqDrain %v, reqReboot %v disableDrain %v", reqDrain, reqReboot, dn.disableDrain)

for k, p := range dn.LoadedPlugins {
-	if k != GenericPlugin && k != McoPlugin {
+	if k != GenericPlugin {
		if k == K8sPlugin && utils.ClusterType == utils.ClusterTypeOpenshift {
Is there a reason why this condition is needed for the k8s plugin and the OpenShift cluster type? On line 663 you ensure that the plugin is not loaded.
Will remove, there is no need for this conditional check.
Removed.
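For context on the loop shown in the hunk above, here is a simplified, self-contained sketch of how per-plugin drain/reboot requests are ORed together in the sync handler; the interface, plugin names, and values are illustrative, not the operator's actual types.

```go
package main

import "fmt"

// plugin is a stand-in for the daemon's loaded plugins: each one reports
// whether the pending node state change requires a drain and/or a reboot.
type plugin interface {
	OnNodeStateChange() (needDrain, needReboot bool)
}

type fakePlugin struct{ drain, reboot bool }

func (p fakePlugin) OnNodeStateChange() (bool, bool) { return p.drain, p.reboot }

func main() {
	loaded := map[string]plugin{
		"generic_plugin": fakePlugin{drain: true},
		"k8s_plugin":     fakePlugin{},
	}

	reqDrain, reqReboot := false, false
	for name, p := range loaded {
		d, r := p.OnNodeStateChange()
		fmt.Printf("plugin %s: reqDrain %v, reqReboot %v\n", name, d, r)
		// A single plugin asking for a drain or reboot is enough.
		reqDrain = reqDrain || d
		reqReboot = reqReboot || r
	}
	fmt.Printf("reqDrain %v, reqReboot %v\n", reqDrain, reqReboot)
}
```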
I didn't, I only tried on OpenShift.
2c662b7 to b030b6b (Compare)
@zshi-redhat lgtm for me, can you verify it is working on a k8s cluster?
@Mmduh-483 I'd like to, but I don't have an env that can be easily deployed with k8s.
// MachineConfigPool is selected when its NodeSelectors are included in MachineConfigPoolSelector
for _, pool := range dc.Spec.OvsHardwareOffload {
	if selector.Matches(labels.Set(pool.MachineConfigPoolSelector)) {
		mcpMap[mcp.GetName()] = true
Shall we store this information in the status, so that users can know which MCP is matched?
+1
Added.
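To illustrate the selection logic discussed above, here is a minimal sketch using the apimachinery label-selector API; the label keys, pool name, and the plain map used to record matched pools (the information that could be surfaced in the status) are examples, not the operator's actual code.

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/labels"
)

func main() {
	// Example node labels that an MCP defines in its Spec.NodeSelector.
	mcpNodeSelector := labels.Set{"node-role.kubernetes.io/worker": ""}

	// Example selector derived from one entry of the operator config
	// (e.g. one element of Spec.OvsHardwareOffload).
	selector := labels.SelectorFromSet(labels.Set{"node-role.kubernetes.io/worker": ""})

	// Pools that match are recorded; this map is what could be reported in
	// the status so users know which MCPs were selected.
	matched := map[string]bool{}
	if selector.Matches(mcpNodeSelector) {
		matched["worker"] = true
	}
	fmt.Println(matched)
}
```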
@@ -24,6 +24,15 @@ type SriovOperatorConfigSpec struct {
	DisableDrain bool `json:"disableDrain,omitempty"`
	// Flag to enable OVS hardware offload. Set to 'true' to provision switchdev-configuration.service and enable OpenvSwitch hw-offload on nodes.
	EnableOvsOffload bool `json:"enableOvsOffload,omitempty"`
	// OvsHardwareOffload describes the OVS HWOL configuration for selected MachineConfigPool
	OvsHardwareOffload []OvsHardwareOffloadConfig `json:"ovsHardwareOffload,omitempty"`
I haven't seen such an API design in the K8S API. Normally, if users need to select objects, they should use a selector that matches the labels of the objects. Personally, I think this implementation is hard to understand.
Shall we ask users to label the MCPs themselves, and we use a selector here?
> I haven't seen such an API design in the K8S API. Normally, if users need to select objects, they should use a selector that matches the labels of the objects. Personally, I think this implementation is hard to understand.

My previous thinking was that we may want to re-use the same API for the k8s case in the future. Currently OvsHardwareOffload.MachineConfigPoolSelector matches on node labels: for OpenShift, it matches the (node) labels defined in the MCP Spec.NodeSelector; for k8s, it could match the node labels directly.

> Shall we ask users to label the MCPs themselves, and we use a selector here?

One idea I had before was to label or annotate the MCP (with ovs-hw-offload) directly and not change the SR-IOV API; instead, we could have the SR-IOV Operator watch MCP updates for creating/deleting HWOL MCs. wdyt?
> I haven't seen such an API design in the K8S API. Normally, if users need to select objects, they should use a selector that matches the labels of the objects. Personally, I think this implementation is hard to understand.

There is a similar example in the tuned Operator, which was suggested by the MCO team: https://github.com/openshift/cluster-node-tuning-operator/blob/master/pkg/apis/tuned/v1/tuned_types.go#L79
It defines MachineConfigLabels in the Tuned API to match on MCPs.
> My previous thinking was that we may want to re-use the same API for the k8s case in the future.

If we plan to reuse it for k8s, the name MachineConfigPoolSelector is kind of misleading. How about just using NodeSelector instead?

> One idea I had before was to label or annotate the MCP (with ovs-hw-offload) directly and not change the SR-IOV API; instead, we could have the SR-IOV Operator watch MCP updates for creating/deleting HWOL MCs. wdyt?

I prefer that we have a common API for both OCP and K8S. Otherwise, we still need a way to tell when OVS hw offload shall be applied in the k8s case. Currently, it is tightly bound to the switchdev configuration.
Changed to NodeSelector
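A sketch of what the renamed field might look like; the exact shape below is an assumption based on this conversation, not the merged API.

```go
package main

import "fmt"

// OvsHardwareOffloadConfig sketches the discussed rename of
// MachineConfigPoolSelector to NodeSelector. Field names are assumptions.
type OvsHardwareOffloadConfig struct {
	// NodeSelector matches node labels. On OpenShift these are the labels an
	// MCP defines in Spec.NodeSelector; on plain Kubernetes they can be node
	// labels directly.
	NodeSelector map[string]string `json:"nodeSelector,omitempty"`
}

// sriovOperatorConfigSpecSketch shows only the field under discussion.
type sriovOperatorConfigSpecSketch struct {
	OvsHardwareOffload []OvsHardwareOffloadConfig `json:"ovsHardwareOffload,omitempty"`
}

func main() {
	spec := sriovOperatorConfigSpecSketch{
		OvsHardwareOffload: []OvsHardwareOffloadConfig{
			{NodeSelector: map[string]string{"node-role.kubernetes.io/worker": ""}},
		},
	}
	fmt.Printf("%+v\n", spec)
}
```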
} else {
	return fmt.Errorf("Failed to get MachineConfig: %v", err)
for mcpName, enable := range mcpMap {
	mcName := "00-" + mcpName + "-" + OVS_HWOL_MACHINE_CONFIG_NAME_SUFFIX
Do we have to create one MC for each selected pool? Have you considered if spec.machineConfigSelector in MCP can be used instead?
Yes, with the current API change, we probably can use a single MC matched by all MCPs.
I was thinking of advanced cases where we might have HWOL configurations that differ between MCPs in the future (e.g. the tc_policy config), in which case having one MC per MCP would be needed.
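To make the naming discussed above concrete, a small sketch of rendering one MachineConfig name per selected pool; the suffix value and pool names are examples, not the operator's constants.

```go
package main

import "fmt"

// The real operator uses a constant (OVS_HWOL_MACHINE_CONFIG_NAME_SUFFIX);
// this value is only an example.
const ovsHwolMachineConfigNameSuffix = "ovs-hw-offload"

func main() {
	mcpMap := map[string]bool{"worker": true, "worker-offload": true}
	for mcpName, enable := range mcpMap {
		mcName := "00-" + mcpName + "-" + ovsHwolMachineConfigNameSuffix
		fmt.Printf("MachineConfig %q for pool %q (enable=%v)\n", mcName, mcpName, enable)
	}
}
```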
	}
}

update, remove, err := utils.WriteSwitchdevConfFile(state)
Why do we need to distinguish 'remove' from 'update' here?
We don't, will update.
Done
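A minimal sketch of the simplification agreed above: a single "changed" result covers both writing and removing the config file, so callers do not need to distinguish the two cases. The path handling and the empty-content-means-remove convention are assumptions for illustration, not the operator's WriteSwitchdevConfFile implementation.

```go
package main

import (
	"fmt"
	"os"
)

// writeConfFile writes the config file when content is non-empty and removes
// it when content is empty, returning a single "changed" flag either way.
func writeConfFile(path, content string) (bool, error) {
	old, err := os.ReadFile(path)
	if err != nil && !os.IsNotExist(err) {
		return false, err
	}
	exists := err == nil

	if content == "" {
		// Empty content means the config should not exist on this node.
		if !exists {
			return false, nil
		}
		if err := os.Remove(path); err != nil {
			return false, err
		}
		return true, nil
	}

	if exists && string(old) == content {
		return false, nil // already up to date
	}
	if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	changed, err := writeConfFile("/tmp/switchdev.conf", "example switchdev config")
	fmt.Println(changed, err)
}
```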
foundMC := &mcfgv1.MachineConfig{}
foundMCP := &mcfgv1.MachineConfigPool{}
// Offload is not supported for master MachineConfigPool
delete(mcpMap, "master")
Instead of deletion, can we check the MCP name before adding it to mcpMap, and log an error if users select the master pool?
Done
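A self-contained sketch of the check agreed above: report and skip the master pool while building the map of selected pools, instead of deleting it afterwards. Function and variable names are illustrative.

```go
package main

import "fmt"

// selectPools skips the master pool (with an error message) while building
// the map of MCPs that should get the OVS HWOL MachineConfig.
func selectPools(matched []string) map[string]bool {
	mcpMap := map[string]bool{}
	for _, name := range matched {
		if name == "master" {
			fmt.Println("error: OVS hardware offload is not supported for the master MachineConfigPool")
			continue
		}
		mcpMap[name] = true
	}
	return mcpMap
}

func main() {
	fmt.Println(selectPools([]string{"worker", "master", "worker-offload"}))
}
```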
Signed-off-by: Zenghui Shi <[email protected]>
One node can only be added to one custom MCP pool by design of MCO. Since the current ovs-hw-offload MCs are applied to the hardcoded MCP ovs-hw-offload-worker, other existing custom MCPs cannot be enabled with the ovs hw offload configuration. This commit allows ovs-hw-offload MCs to be applied to multiple MCPs as specified in the default SriovOperatorConfig. Signed-off-by: Zenghui Shi <[email protected]>
Signed-off-by: Zenghui Shi <[email protected]>
Signed-off-by: Zenghui Shi <[email protected]>
b030b6b to 047f6b5 (Compare)
Added the SriovOperatorConfig.Spec.OvsHardwareOffload API field to define labels that match the MachineConfigPool NodeSelector. Nodes selected by these labels will have "bindata/manifest/switchdev-config" applied.
Removed the MCO plugin and moved the writing of the switchdev device config file into the generic plugin (shared by OpenShift and Kubernetes deployments). Removed the writing of the switchdev device config file from the K8s plugin.
For configuring OVS HWOL on OpenShift, the steps are changed to:
For configuring OVS HWOL on Kubernetes, the steps are not changed.