Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not skip VFs configuration for BlueField 2 NICs in legacy mode #235

Closed
wants to merge 1 commit into from

Conversation

e0ne
Copy link
Collaborator

@e0ne e0ne commented Jan 21, 2022

Let's configure VFs for BlueField 2 NICs in legacy mode if user
wants to use these NICs in a such way.

Signed-off-by: Ivan Kolodyazhny [email protected]

Let's configure VFs for BlueField 2 NICs in legacy mode if user
wants to use these NICs in a such way.

Signed-off-by: Ivan Kolodyazhny <[email protected]>
@github-actions
Copy link

Thanks for your PR,
To run vendors CIs use one of:

  • /test-all: To run all tests for all vendors.
  • /test-e2e-all: To run all E2E tests for all vendors.
  • /test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.

To skip the vendors CIs use one of:

  • /skip-all: To skip all tests for all vendors.
  • /skip-e2e-all: To skip all E2E tests for all vendors.
  • /skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor.
    Best regards.

@adrianchiris
Copy link
Collaborator

As background, the use-case is using DPU as a "regular NIC".

a mode which is called "separated" for Bluefield DPU

@@ -36,7 +36,6 @@ const (
ClusterTypeOpenshift = "openshift"
ClusterTypeKubernetes = "kubernetes"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This const isn't used anywhere and should be cleaned up sometime

@zshi-redhat
Copy link
Collaborator

zshi-redhat commented Jan 26, 2022

@e0ne @adrianchiris We may still want to skip BF2 in SkipConfigVf for non-separated mode for the following reasons:

  1. BF2 VF configuration is moved to use systemd service (similar to switchdev device configuration)
  2. BF2 VF configuration doesn't require eswitchdev mode set to switchdev in BF2 embedded mode.
  3. Use systemd service for BF2 would allow VF with index 0 to become ready before any ovn-k8s pods are running on the host (ovnkube-node renames VF0 to ovn-k8s-mp0)

With this commit, BF2 configuration may be erased or reset once config-daemon starts.
IIUC, we can still configure BF2 in separated mode w/o this commit, right?

@e0ne
Copy link
Collaborator Author

e0ne commented Jan 26, 2022

@zshi-redhat, thanks for you feedback

  1. BF2 VF configuration is moved to use systemd service (similar to switchdev device configuration)

Unfortunately, this assumption is not valid now. I'm going to fix it in scope of #81. It works for a switchdev only now

  1. BF2 VF configuration doesn't require eswitchdev mode set to switchdev in BF2 embedded mode.

Could you please elaborate more on this case? It's not clear to me.

  1. Use systemd service for BF2 would allow VF with index 0 to become ready before any ovn-k8s pods are running on the host (ovnkube-node renames VF0 to ovn-k8s-mp0)

How does it work/supposed to be working with SR-IOV Operator?

With this commit, BF2 configuration may be erased or reset once config-daemon starts.

The current implementation resets all found devices. We need to fix it in a PR to not reset devices which aren't included in policy.

IIUC, we can still configure BF2 in separated mode w/o this commit, right?

Without this commit we can't configure BF2 in a separated mode.

@zshi-redhat
Copy link
Collaborator

@zshi-redhat, thanks for you feedback

  1. BF2 VF configuration is moved to use systemd service (similar to switchdev device configuration)

Unfortunately, this assumption is not valid now. I'm going to fix it in scope of #81. It works for a switchdev only now

It was done through this PR: #201
Currently systemd service only applied to switchdev device and BF2.
The #81 will enable systemd service for all other devices.

  1. BF2 VF configuration doesn't require eswitchdev mode set to switchdev in BF2 embedded mode.

Could you please elaborate more on this case? It's not clear to me.

Switchdev mode is required to enable OVS HWOL feature for CX- series cards, but in BF2 case (embedded cpu mode), offload happens in BF2 arm core and switchdev mode configuration is moved from host to BF2 system. On the corresponding x86 host, switchdev mode is not needed.

  1. Use systemd service for BF2 would allow VF with index 0 to become ready before any ovn-k8s pods are running on the host (ovnkube-node renames VF0 to ovn-k8s-mp0)

How does it work/supposed to be working with SR-IOV Operator?

There is no big difference on how sriov operator is used here, user just needs to create a sriovnetworknodepolicy (w/o specifying switchdev mode), all the rest configuration in sriovnetworknodepolicy are the same.

With this commit, BF2 configuration may be erased or reset once config-daemon starts.

The current implementation resets all found devices. We need to fix it in a PR to not reset devices which aren't included in policy.

IIUC, we can still configure BF2 in separated mode w/o this commit, right?

Without this commit we can't configure BF2 in a separated mode.

I think the config-daemon needs to know which mode BF2 is being configured, then use systemd for embedded cpu mode and config-daemon runtime config for separated mode.

@e0ne
Copy link
Collaborator Author

e0ne commented Jan 27, 2022

@zshi-redhat PR #201 confused me too. Unfortunately it doesn't work as expected. In a current implementation BF2 configuration is ignored if we don't use switchdev mode.

@e0ne
Copy link
Collaborator Author

e0ne commented Jan 27, 2022

I think the config-daemon needs to know which mode BF2 is being configured, then use systemd for embedded cpu mode and config-daemon runtime config for separated mode.

It's what my PR does, isn't it?

@zshi-redhat
Copy link
Collaborator

@zshi-redhat PR #201 confused me too. Unfortunately it doesn't work as expected. In a current implementation BF2 configuration is ignored if we don't use switchdev mode.

In current implementation, bluefield-2 is ignored w/ or w/o switchdev config.
If it is switchdev, then it will be ignore at:

if ifSpec.EswitchMode == sriovnetworkv1.ESWITCHMODE_SWITCHDEV

if it is not switchdev, then it will be ignore at:

if ifStatus.Vendor == VendorMellanox && ifStatus.DeviceID == DeviceBF2

Here ignore means not to run configSriovDevice in config-daemon.

@zshi-redhat
Copy link
Collaborator

zshi-redhat commented Jan 27, 2022

I think the config-daemon needs to know which mode BF2 is being configured, then use systemd for embedded cpu mode and config-daemon runtime config for separated mode.

It's what my PR does, isn't it?

For BlueFIeld in ConnectX mode, I think we may need the following as a temp solution:

if ifSpec.EswitchMode == sriovnetworkv1.ESWITCHMODE_SWITCHDEV && ifStatus.Vendor == VendorMellanox && ifStatus.DeviceID == DeviceBF2
{
        return false
}


@e0ne
Copy link
Collaborator Author

e0ne commented Jan 27, 2022

@zshi-redhat I understood your point now. Will test your suggestion and update PR shortly

@e0ne
Copy link
Collaborator Author

e0ne commented Jan 27, 2022

For BlueFIeld in ConnectX mode, I think we may need the following as a temp solution:

if ifSpec.EswitchMode == sriovnetworkv1.ESWITCHMODE_SWITCHDEV && ifStatus.Vendor == VendorMellanox && ifStatus.DeviceID == DeviceBF2
{
        return false
}

It doesn't work to me. We it even breaks BF2 configuration for switchdev

@zshi-redhat
Copy link
Collaborator

For BlueFIeld in ConnectX mode, I think we may need the following as a temp solution:

if ifSpec.EswitchMode == sriovnetworkv1.ESWITCHMODE_SWITCHDEV && ifStatus.Vendor == VendorMellanox && ifStatus.DeviceID == DeviceBF2
{
        return false
}

It doesn't work to me. We it even breaks BF2 configuration for switchdev

Could you paste the change or issue ?

@e0ne
Copy link
Collaborator Author

e0ne commented Feb 2, 2022

Will be fixed in #240

@e0ne e0ne closed this Feb 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants