Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support multiple network on pod using OVN k8s CNI #2775

Closed
wants to merge 8 commits into from

Conversation

cathy-zhou
Copy link
Contributor

- What this PR does and why is it needed

- Special notes for reviewers

- How to verify it

- Description for the changelog

@cathy-zhou cathy-zhou changed the title draft: support multiple network on pod using OVN k8s CNI [WIP] support multiple network on pod using OVN k8s CNI Jan 26, 2022
@coveralls
Copy link

coveralls commented Jan 27, 2022

Coverage Status

Coverage decreased (-1.3%) to 49.4% when pulling ada462b on cathy-zhou:multinetwork into 5efa78f on ovn-org:master.

@cathy-zhou cathy-zhou changed the title [WIP] support multiple network on pod using OVN k8s CNI support multiple network on pod using OVN k8s CNI Jan 27, 2022
@cathy-zhou cathy-zhou force-pushed the multinetwork branch 3 times, most recently from a752cf6 to f7bfedb Compare February 3, 2022 06:05
@cathy-zhou cathy-zhou force-pushed the multinetwork branch 5 times, most recently from ada462b to 9b96c2a Compare February 12, 2022 04:53
@cathy-zhou
Copy link
Contributor Author

/retest

MTU int `json:"mtu,omitempty"`
// captures net-attach-def name in the form of namespace/name
NadName string `json:"netAttachDefName,omitempty"`
// set true if it is not the default net-attach-def
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the default net-attach-def" is unclear to me. Is that 'default network interface', other than 'secondary using net-attach-def"? If so, NadName could distinguish between default network and secondary network (i.e. NadName == "" must be secondary).

To support multi OVN networks on a single Pod, we'd need to allow
setting/deleting Pod annoation per OVN network. Also, ovnkube-master
needs to update Pod annotation with RetryOnConflict(), because
multi-networks could update the same Pod annotation and introduce
conflict.

Signed-off-by: Yun Zhou <[email protected]>
To support multi OVN networks on a single Pod, we'd need to allow
setting/deleting node subnet annoation per OVN network. Also,
ovnkube-master needs to update node annotation with RetryOnConflict(),
because multi-networks could update the same node subnet annotation and
introduce conflict.

Signed-off-by: Yun Zhou <[email protected]>
to support multiple OVN networks on a single Pod, we'd need to
set/unset smart-nic annoation on the per network basis.

Signed-off-by: Yun Zhou <[email protected]>
allow OVN k8s CNI attaching multiple network interfaces to pods.

Signed-off-by: Yun Zhou <[email protected]>
Copy link
Contributor

@maiqueb maiqueb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of, thank you for your work in this feature. The PR is huge, but, given the sheer amount of new functionality you're adding, I would have expected it to be a even bigger. Very nice work.

I have left a bunch of opinionated suggestions, but I am also asking some important questions.

Most importantly, I have the following concerns:

  • scale. I am unsure your solution to indicate to which ovs bridge will be traffic be output to (when using the localnet topology) will scale. Our use cases imply tens (possibly hundreds) of net-attach-defs, meaning the length of the bindings string will be huge. I'm unsure it has a hard limit, but, it will impair performance if its length is too big.
  • the localnet topology requires IP addresses. We have a use case for IPAM less localnet topologies . We need to sync on this.
  • absence of multi-net policies without IPAM. I'm aware you don't consider this use case. We would also like to sync on this, since that use case is important for us at Red Hat. We could use your help / input to come up with a simple(st) way to define a way for this to happen .
  • we also have a use case for pure overlays. IIUC, if we don't use the localnet port, it'll behave this way. Can you confirm that if we don't setup the ovn-bridge-bindings on each node's it will communicate via tunnel ?

Comment on lines +111 to +115
namespace := pr.PodNamespace
podName := pr.PodName
if namespace == "" || podName == "" {
return nil, fmt.Errorf("required CNI variable missing")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this code duplicated below; would it make sense to export it to an helper function ?

func (pr *PodRequest) validate() error {
	if pr.PodNamespace == "" || pr.PodName == "" {
		return fmt.Errorf("required CNI variable missing")
	}
	return nil
}

Comment on lines +151 to +154
mtu := pr.CNIConf.MTU
if mtu == 0 {
mtu = config.Default.MTU
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also export this into an helper:

func (pr *PodRequest) mtu() int { // I'm unsure about the return type, but you get the idea ...
    if pr.CNIConf.MTU == 0 {
        return config.Default.MTU
    }
    return pr.CNIConf.MTU
}

This enables you to inline this is the call below:

...
podInterfaceInfo, err := PodAnnotation2PodInfo(annotations, useOVSExternalIDs, pr.PodUID, vfNetdevName,
		pr.effectiveNADName, pr.mtu(), netNameInfo)
...

Comment on lines +179 to +183
namespace := pr.PodNamespace
podName := pr.PodName
if namespace == "" || podName == "" {
return nil, fmt.Errorf("required CNI variable missing")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the helper func could be re-used here.

Comment on lines +201 to +206
condString := "external-ids:sandbox=" + pr.SandboxID
if pr.CNIConf.IsSecondary {
condString += " external_ids:network_name=" + pr.effectiveNADName
} else {
condString += " external_ids:network_name{=}[]"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use add yet another helper here, which allows you to simply:

ovsIfNames, err := ovsFind("Interface", "name", pr.condString())

I unfortunately could not come up with a decent name for it ...

Comment on lines +181 to +189
req.effectiveNetName = types.DefaultNetworkName
req.effectiveNADName = types.DefaultNetworkName
if conf.IsSecondary {
req.effectiveNetName = conf.Name
if conf.NadName == "" {
return nil, fmt.Errorf("OVN Netconf %q doesn't contain network-attachment-definition name", conf.Name)
}
req.effectiveNADName = conf.NadName
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to add (yet another ...) helper req receiver function here ?


// Find all the logical node switches for the non-default networks and delete the ones that belong to the
// obsolete networks
nodeSwitches, err := libovsdbops.FindSwitchesWithOtherConfig(mc.nbClient)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these only the "logical node switches" - i.e. the logical switches for routed topology ? I find the comment a bit misleading - I wouldn't expect to see the localnet topology switches listed here after reading that comment.

I may have grossly misinterpreted your comment though.

Comment on lines +452 to +455
err1 := oc.addLogicalPort4Nad(pod, nadName, nodeName, network)
if err1 != nil {
err = err1
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if more than one failed ? You'd swallow all errors except the last one. Is that OK ? Shouldn't we at least log the error ?

Comment on lines +444 to +445
if portInfo, err := oc.logicalPortCache.get(portName); err != nil {
klog.Errorf(err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you explain what would this scenario represent - i.e. a port missing from the cache ? I'm having trouble understanding this one.

@@ -258,17 +256,17 @@ func SetL3GatewayConfig(nodeAnnotator kube.Annotator, cfg *L3GatewayConfig) erro
func ParseNodeL3GatewayAnnotation(node *kapi.Node) (*L3GatewayConfig, error) {
l3GatewayAnnotation, ok := node.Annotations[ovnNodeL3GatewayConfig]
if !ok {
return nil, newAnnotationNotSetError("%s annotation not found for node %q", ovnNodeL3GatewayConfig, node.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: imho, this read better previously.

Comment on lines +141 to +159
type NetNameInfo struct {
// netconf's name
NetName string
Prefix string
IsSecondary bool
}

type NetAttachDefInfo struct {
NetNameInfo
// net-attach-defs shared the same CNI Conf, key is <Namespace>/<Name> of net-attach-def.
// Note that it means they share the same logical switch (subnet cidr/MTU etc), but they might
// have different resource requirement (requires or not require VF, or different VF resource set)
NetAttachDefs sync.Map
NetCidr string
MTU int
TopoType string
VlanId int
ExcludeIPs []net.IP
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these structs should be defined in a dedicated package, instead of a utils file with a bunch of unreleated code.

// Network MTU
MTU int `json:"mtu,omitempty"`
// set to localnet if it needs public interface
TopoType string `json:"topology,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very confusing to have this attribute set to localnet when you just want an overlay (and relying on the ovn bridge bindings to have the underlay configured).

I think TopologyType should allow 3 values: routed, overlay, and localnet, for instance.

@cathy-zhou cathy-zhou marked this pull request as draft June 29, 2022 20:16
@cathy-zhou cathy-zhou closed this Dec 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants