-
Notifications
You must be signed in to change notification settings - Fork 178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add selectors nicNames
#434
Add selectors nicNames
#434
Conversation
9526731
to
77dc299
Compare
Pull Request Test Coverage Report for Build 4316249828
💛 - Coveralls |
@ian-howell does PfName selector satisfy your use-case based on community discussion we had on jul 18th ? if so we can close this PR and related issue |
77dc299
to
e62de29
Compare
nicNames
and macAddresses
nicNames
Hey @adrianchiris, after some investigation, I've found that the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, add unit-tests for new device selector
@e0ne Thanks for the review! I think I've resolved all of your comments. Let me know if there's anything else I need to do. |
@ian-howell Is this for VF passthrough to VM or is it also applied to virtio interfaces? |
This is just for VF passthrough |
Gentle bump on this one |
I will review this PR sorry for the long wait |
@SchSeba Any update on this? |
42b0f79
to
70fa4ce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am really sorry for the late review..
this looks fine to me just a small comment for the documentation the code looks good
Thanks @SchSeba! I've addressed the documentation when you get another chance |
/lgtm |
@e0ne Could you give this a review when you get a moment? I think I just need one more approval for merge. Thanks in advance! |
Not trying to be pushy, but could I please get an update on this when someone finds the time? |
Hi Ian, appologies for the late response. looking at this PR you essentially add a netdevice selector right ? it will select devices based on their netdev name. this is problematic since sriov-network-device plugin is often used with CNI which means netdevice will be moved to the container network namespace. if at that point device plugin is restarted, resources that are already allocated to running pods will not be discovered. in VM, can device selection be done using PCI addresses ? |
Hey Adrian, thanks for looking over this. Yes, the goal is to create a netdevice selector. At the time of instantiation of the device-plugin ConfigMap, all that we have is the interface name. This is because we are creating templates of kubevirt machines, and the concrete VM definitions won't come until afterward. We could specify PCI addresses, but due to libvirt limitations, we can't guarantee that what we specified will be what appears in the VM. We did some testing to get a better understanding of the problem you've pointed out, and sure enough, the resources are lost when the device-plugin restarts. We will continue to look for a solution for this, but if you happen to have any ideas that might help us, I would be greatly appreciative. |
23909e2
to
c6b3e6d
Compare
c6b3e6d
to
5b78495
Compare
@adrianchiris I have a solution which appears to work. Basically, we iterate over the child namespaces, switching as we go, and search within for an interface whose alias matches the interface as it were named in the parent namespace. I've verified that this works on our hardware. When you get a second, I would appreciate it if you could take a second look at this and let me know what you think. Thanks! |
@adrianchiris Just a gentle bump on this one. Please let me know if there's anything wrong with the latest approach |
|
||
for _, fInfo := range fInfos { | ||
ifaceName := fInfo.Name() | ||
alias, _ := GetInterfaceAlias(ifaceName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would this contain the original interface name ?
if statement below indicates that this might not be set for netdev ?
i need to give this one a try on a system to be able to provide proper feedback.
with this, we could potentially determine the link type as well when creating the device in gen_net.go
we have a community meeting tommorow so i will add this PR to the agenda to get some attention to this PR and have others provide feedback as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some comments from todays community meeting:
Unsure about the approach - going over all network namesapces
-
Need to ensure it does not incur a performance penalty
Currently PR scans all namespaces per pci net device for which it did not find interface name.
One way to overcome this is to scan once, store results and use that as an additional source of information when creating pci net device obj. there can be ~100 ns on a node and between 1-8 NICs each with two PFs so between 2 and 16 pci functions. -
Host interface name might not match interface alias (e.g if one set by udev and the other by driver)
-
Can alias be changed by CNI or by container (given enough privileges) ?
} | ||
linkType = la.EncapType | ||
} else if len(netNamesFromNs) > 0 { | ||
ifName = netNamesFromNs[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ifname now may contain either:
- empty string
- the actual interface name
- interface name alias derived when searching for device in network namespaces
just a though (and am not sure we want to go this way) , perhaps have a separate attribute ifalias
in GenNetDevice which contains this info.
then nicNames selector (maybe it should be named ifAlias ?) will match on that field.
Hey @ian-howell , couple quick questions: When you say iterate over namespaces, are these pod namespaces or host namespaces? NDP should not be iterating over pod namespaces. What is the difference between pfName and ifaceName ? only that it is looking for an alias? |
I'm curious why not? I don't think what we're trying to achieve is possible without doing so. |
NDP scope should be limited to host network namespace for device discovery on the host. if NDP starts to snoop over other net namespaces, it is kind of a breach of trust. This could pose further issues if NDP starts messing with pod network namespaces Can you explain the use case a bit more? |
The use case is a bit more involved. We're creating a VM with kubevirt, and within that VM, we're spinning up an inner k8s cluster. So the "PF" in this case is technically a VF within the VM. The only information about that function we have to work with is its name as viewed from within the VM What I want to do is schedule a pod to the inner k8s cluster, and pass the kubevirt pod's VF into the inner pod. With our testing, we've found that |
Hi @ian-howell , we discussed this again in a meeting today. would you be able to join a community meeting on the 4th April - link to the meeting doc, there is a zoom link at the top. It would be easier to have a discussion over a call on this :) |
I will be present. |
Would it be possible/acceptable to implement this function: sriov-network-device-plugin/pkg/resources/pool_stub.go Lines 74 to 78 in abf6548
If I'm understanding correctly, this could be used to periodically refresh the device list, which would allow my original implementation (without the namespace-switching) to work. It also looks like this is functionality that may be useful to others as well. I believe a good implementation could partially satisfy #276 and provide boilerplate for #362 |
Hi @ian-howell , apologies, I gave the wrong time, it's 13:00 UTC, our doc must be outdated, due to timezone change. |
Hi @ian-howell , we discussed the new proposal you mentioned above to use the Probe func, to solve the use case. would you be able to join the Tuesday April 18th 13:00 UTC ? :) |
Hi @Eoghan1232, looks like I might have just missed the meeting. Thanks for keeping this on the agenda, and yes, I will plan on attending the meeting on the 18th. |
Hi @ian-howell I contact the kubevirt team. I wanted to ask you if you can try to add apci config to the VM definition
Then check `cat /sys/bus/pci/devices/<PCI>/acpi_index if this exist on your node we can implement a new acpi selector that will always work :) |
second option we can use today kubevirt already support to add tags for the VM
and they are using the same API as OpenStack so I think with a minimal change in the sriov-operator we can support also tags in the virtual plugin we already have https://kubevirt.io/user-guide/virtual_machines/startup_scripts/#device-role-tagging |
@ian-howell any updates ? were the alternatives working for you ? (acpi index or vm tags) ? |
Hi @SchSeba, @adrianchiris, sorry for the prolonged delay. I took a look into the acpi index and have found that it fits our use case, while also being dramatically simpler than the previous solutions. I've created a new PR at #488. I'll close this PR when we've finalized the new one |
as PR #488 has merged 🚀 🚀 this one can be closed. |
This addresses #433 by adding the
nicNames
fields as selectors(8/8/2022 Update): This no longer addresses
macAddresses