Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OVS HWOL: support multiple MCPs #125

Merged

Conversation

zshi-redhat
Copy link
Collaborator

@zshi-redhat zshi-redhat commented May 7, 2021

Introduced a new CRD SriovNetworkPoolConfig for pool (not device)
related configurations such as OVS HWOL. This may be extended
to support future features that require pool specific configuration.
For example, SmartNIC offload, Accelerated bridge offload etc.

Removed MCO plugins and moved the writing of switchdev device config
file into generic plugin (shared by OpenShift and Kubernetes deployment).
Removed the writing of switchdev device config file from K8s plugin.

For configuring OVS HWOL on OpenShift, the steps are changed to:

  • Create SriovNetworkPoolConfig to specify the name of MachineConfigPool
  • Configure SriovNetworkNodePolicy with eSwitchMode: switchdev (This writes switchdev.conf file and trigger drain/reboot where needed)

For configuring OVS HWOL on Kubernetes, the steps are not changed.

@zshi-redhat
Copy link
Collaborator Author

This PR replaces #106

@zshi-redhat zshi-redhat force-pushed the ovs-hw-offload-new-api branch from f5c463c to 224afd3 Compare May 7, 2021 04:03
@pliurh
Copy link
Collaborator

pliurh commented May 8, 2021

It's better to put the operator-sdk upgrade changes in separate PR.

@zshi-redhat
Copy link
Collaborator Author

It's better to put the operator-sdk upgrade changes in separate PR.

New PR for operator-sdk project update: #129

@zshi-redhat zshi-redhat force-pushed the ovs-hw-offload-new-api branch from 224afd3 to dc87455 Compare May 20, 2021 05:36
@zshi-redhat zshi-redhat force-pushed the ovs-hw-offload-new-api branch from dc87455 to 2ce71df Compare June 25, 2021 04:27
@zshi-redhat
Copy link
Collaborator Author

  • Resolved conflict
  • Updated SriovNetworkNodeConfig API to add Name match on MachineConfigPool

func (r *SriovNetworkNodeConfigReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
_ = context.Background()
// // Fetch SriovNetworkNodeConfigList
configList := &sriovnetworkv1.SriovNetworkNodeConfigList{}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have a one-to-one mapping, we don't need to fetch all the SriovNetworkNodeConfigList CRs here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// OVS HWOL configurations are generated automatically by Operator
// Labels in NodeSelector are ANDed when selecting Kubernetes Nodes
// On OpenShift:
// NodeSelector matches on Labels defined in MachineConfigPoolSpec.NodeSelector
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we just omit this flag for openshift? It would be more clear if we have a one-to-one mapping between the SriovNetworkNodeConfig CR and MCP CR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the NodeSelector field, and focus this PR on openshift support.
NodeSelector can be added later for k8s implementation.

}
}

update, err := utils.WriteSwitchdevConfFile(state)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shall only invoke WriteSwitchdevConfFile when the swichdev is enabled on one PF. So maybe add IsSwitchdevModeSpec check here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

err = r.Get(context.TODO(), types.NamespacedName{Name: mcName}, foundMC)
if err != nil {
if !errors.IsNotFound(err) {
err = r.Delete(context.TODO(), foundMC)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shall be invoked when the MC CR is found without error. It shall be moved to line 177.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

// NodeSelector matches on Labels defined in MachineConfigPoolSpec.NodeSelector
// OVS HWOL MachineConfigs are generated and applied to Nodes in MachineConfigPool
// Labels in NodeSelector are ANDed when matching on MachineConfigPoolSpec.NodeSelector
NodeSelector map[string]string `json:"nodeSelector,omitempty"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we also add the logic to check the NodeSelector in the k8s_plugin for the Kubernetes scenario?

@zshi-redhat zshi-redhat force-pushed the ovs-hw-offload-new-api branch from 2ce71df to b4f2945 Compare July 1, 2021 03:07
@zshi-redhat
Copy link
Collaborator Author

/hold

@github-actions github-actions bot added the hold label Jul 1, 2021
@zshi-redhat zshi-redhat force-pushed the ovs-hw-offload-new-api branch 2 times, most recently from bb82c48 to 4bd9cc9 Compare July 2, 2021 06:38
@zshi-redhat
Copy link
Collaborator Author

/unhold

mcpMap := make(map[string]bool)

mcpList := &mcfgv1.MachineConfigPoolList{}
err := r.List(context.TODO(), mcpList, &client.ListOptions{})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we fetch the whole list of MCPs, instead of getting the MCP by Spec.OvsHardwareOffloadConfig.Name? We have a one-to-one mapping between SriovNetworkPoolConfig, MCP CR here, don't we?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, changed to get MCP according to Spec.OvsHardwareOffloadConfig.Name.

@zshi-redhat zshi-redhat force-pushed the ovs-hw-offload-new-api branch from 24116da to d4786b7 Compare July 6, 2021 02:01
foundMC := &mcfgv1.MachineConfig{}
mcp := &mcfgv1.MachineConfigPool{}

if mcpName == "" {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we move the MCP name validation to the operator webhook?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add the webhook check in a separate PR.

return nil
}

err := r.Get(context.TODO(), types.NamespacedName{Name: mcpName}, mcp)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable mcp is not used after getting? Is that only for checking the existence of the MCP?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, only for checking the existence of MCP.

@zshi-redhat zshi-redhat force-pushed the ovs-hw-offload-new-api branch from d4786b7 to add7d5b Compare July 6, 2021 08:37
One node can only be added in one custom MCP pool by
design of MCO. Since current ovs-hw-offload MCs are
applied to hardcoded MCP ovs-hw-offload-worker, other
existing custome MCPs cannot be enabled with ovs hw
offload configuration.

This commits allows ovs-hw-offload MCs be applied
to multiple MCPs as specified in default SriovNetworkNodeConfig.
@zshi-redhat zshi-redhat force-pushed the ovs-hw-offload-new-api branch from add7d5b to 463afa9 Compare July 9, 2021 05:01
@zshi-redhat
Copy link
Collaborator Author

Rebased on master

@pliurh
Copy link
Collaborator

pliurh commented Jul 9, 2021

/lgtm

@github-actions github-actions bot added the lgtm label Jul 9, 2021
@pliurh pliurh merged commit 372cf5b into k8snetworkplumbingwg:master Jul 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants