-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vhost-vdpa support #467
vhost-vdpa support #467
Conversation
Thanks for your PR,
To skip the vendors CIs use one of:
|
Pull Request Test Coverage Report for Build 5948839312
💛 - Coveralls |
Thanks for your PR,
To skip the vendors CIs use one of:
|
Thanks for your PR,
To skip the vendors CIs use one of:
|
@bn222 @SchSeba @adrianchiris @zshi-redhat Please review and approve if looks good |
@@ -69,7 +71,7 @@ func (p *GenericPlugin) OnNodeStateChange(new *sriovnetworkv1.SriovNetworkNodeSt | |||
p.DesireState = new | |||
|
|||
needDrain = needDrainNode(new.Spec.Interfaces, new.Status.Interfaces) | |||
needReboot = needRebootNode(new, &p.LoadVfioDriver, &p.LoadVirtioVdpaDriver) | |||
needReboot = needRebootNode(new, &p.LoadVfioDriver, &p.LoadVirtioVdpaDriver, &p.LoadVhostVdpaDriver) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This now starts to look cumbersome. Can we start this PR with a refactor of p.LoadVfioDriver and vdpa driver? What's happening here is that the needRebootNode actually returns a bunch of things through its arguments. That function is now doing much more than "just checking if we need to reboot".
We also have an enum and a small state machine to load drivers. This could be simplified much further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bn222 pushing the refactoring of driver loading logic
Thanks for your PR,
To skip the vendors CIs use one of:
|
Thanks for your PR,
To skip the vendors CIs use one of:
|
Thanks for your PR,
To skip the vendors CIs use one of:
|
return p.DriverStateMap | ||
} | ||
|
||
func (p *GenericPlugin) TestDriverLoading(state *sriovnetworkv1.SriovNetworkNodeState) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: LoadDriverForTests ?
if needVfioDriver(state) { | ||
*loadVfioDriver = loading | ||
update, err := tryEnableIommuInKernelArgs() | ||
func nodeToRebootIfVfio(state *sriovnetworkv1.SriovNetworkNodeState, driverMap DriverStateMapType) (updateNode bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nodeToRebootIfVfio ? maybe needRebootIfVfio ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also lets use "needReboot" as return val as thats what it means.
update, err := tryEnableIommuInKernelArgs() | ||
func nodeToRebootIfVfio(state *sriovnetworkv1.SriovNetworkNodeState, driverMap DriverStateMapType) (updateNode bool) { | ||
updateNode = false | ||
for driverID, driverState := range driverMap { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
driverMap is a .... map.
no need to iterate over it. instead just index it by Vfio
*loadVfioDriver = loading | ||
update, err := tryEnableIommuInKernelArgs() | ||
func nodeToRebootIfVfio(state *sriovnetworkv1.SriovNetworkNodeState, driverMap DriverStateMapType) (updateNode bool) { | ||
updateNode = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is by default false. so can be removed.
} | ||
} | ||
func needRebootNode(state *sriovnetworkv1.SriovNetworkNodeState, driverMap DriverStateMapType) (needReboot bool) { | ||
needReboot = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is by default false. so can be removed.
|
||
if needReboot { | ||
needDrain = true | ||
} | ||
return | ||
} | ||
|
||
// ////////////// for testing purposes only /////////////////////// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move those to the end of the file ?
Thanks for your PR,
To skip the vendors CIs use one of:
|
Thanks for your PR,
To skip the vendors CIs use one of:
|
if [ $vfid -le $maxId ] && [ $vfid -ge $minId ] && [ $vdpaType == "virtio" ]; then | ||
vdpa_cmd="vdpa dev add name vdpa:"${VfPciAddr}" mgmtdev pci/"${VfPciAddr} | ||
if [ $vfid -le $maxId ] && [ $vfid -ge $minId ] && [ $vdpaType != "" ]; then | ||
vdpa_cmd="vdpa dev add name vdpa:"${VfPciAddr}" mgmtdev pci/"${VfPciAddr}" max_vqp 32" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what this means? max_vqp 32
it will always be this number? should it be configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you're right, this will be configurable in the future by extending the API? For the time being, the idea is to enable the multiqueue support by providing by default 32 queues. We think it is the best value, considering that the virtio-net driver will use the number of cores as a maximum. The reason for enabling multiqueue is to increase the performances of the card and be ready to demonstrate it to Verizon
if err := p.HostManager.LoadKernelModule("vfio_pci"); err != nil { | ||
glog.Errorf("generic-plugin Apply(): fail to load vfio_pci kmod: %v", err) | ||
return err | ||
func syncDriverState(p *GenericPlugin) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please have this function as part of the struct no need to pass it
for _, driverState := range p.DriverStateMap { | ||
if !driverState.DriverLoaded && driverState.NeedDriverFunc(p.DesireState, driverState) { | ||
if err := p.HostManager.LoadKernelModule(driverState.DriverName); err != nil { | ||
glog.Errorf("generic-plugin Apply(): fail to load %s kmod: %v", driverState.DriverName, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please change the function on the log generic-plugin Apply()
func syncDriverState(p *GenericPlugin) error { | ||
for _, driverState := range p.DriverStateMap { | ||
if !driverState.DriverLoaded && driverState.NeedDriverFunc(p.DesireState, driverState) { | ||
if err := p.HostManager.LoadKernelModule(driverState.DriverName); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a V(2) log say what driver it's loading (good for debug)
@@ -104,6 +144,11 @@ func (p *GenericPlugin) Apply() error { | |||
} | |||
} | |||
|
|||
err := syncDriverState(p) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please just simply this like
if err := p.syncDrvierStates(); err != nil {
return err
}
return | ||
} | ||
|
||
// ////////////// for testing purposes only /////////////////////// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you have a space here :)
Thanks for your PR,
To skip the vendors CIs use one of:
|
Thanks for your PR,
To skip the vendors CIs use one of:
|
Thanks for your PR,
To skip the vendors CIs use one of:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm from my side.
one small comment RE log verbosity
func (p *GenericPlugin) syncDriverState() error { | ||
for _, driverState := range p.DriverStateMap { | ||
if !driverState.DriverLoaded && driverState.NeedDriverFunc(p.DesireState, driverState) { | ||
glog.Infof("loading driver %s", driverState.DriverName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use glog.V(2).Infof as @SchSeba suggested below.
Thanks for your PR,
To skip the vendors CIs use one of:
|
thanks @lmilleri ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one nit comment
Overall, LGTM and good testing coverage. Thanks
Signed-off-by: Leonardo Milleri <[email protected]>
Max number of queues is set to 32 by default. This setting should help with the performance of NVIDIA network cards. Signed-off-by: Leonardo Milleri <[email protected]>
Thanks for your PR,
To skip the vendors CIs use one of:
|
SriovNetworkNodePolicy API: VDPA device type can take now the following values: "virtio" and "vhost" Vhost/vdpa: Same behaviour of virtio/vdpa apart for the drivers that are loaded in the kernel (vhost-vdpa instead of virtio-vdpa) Signed-off-by: Leonardo Milleri <[email protected]>
Thanks for your PR,
To skip the vendors CIs use one of:
|
This PR implements the changes for vhost-vdpa support.