This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

Unable to form fastdp with 10Ge Broadcom NIC nodes #3746

Open
yatsura opened this issue Dec 9, 2019 · 7 comments

Comments

@yatsura

yatsura commented Dec 9, 2019

What you expected to happen?

Weave should form a fastdp connection for the new nodes that have been joined to the cluster.

What happened?

Weave falls back to sleeve for the new nodes, while the other nodes remain connected via fastdp.

How to reproduce it?

The issue only appears to be related to the Broadcom BCM57416 NetXtreme-E Dual-Media 10G NIC. Connecting the same node with the Broadcom BCM5720 NetXtreme Gigabit forms a fastdp network. No other changes were made to the node.

Anything else we need to know?

NIC details:
(BCM57416)
driver: bnxt_en
version: 1.8.0
firmware-version: 214.0.253.1/pkg 21.40.25.31
expansion-rom-version:
bus-info: 0000:18:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: no
supports-priv-flags: no

(BCM5720)
driver: tg3
version: 3.137
firmware-version: FFV20.8.4 bc 5720-v1.39
expansion-rom-version:
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

Versions:

$ weave version
2.6.0
$ docker version
18.09.7
$ uname -a
Linux pm-dell-01 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ kubectl version
v1.16.3

Logs:

First 1000 lines for failing node:
pm-dell-01 weave-net debug log

@murali-reddy
Contributor

@yatsura could you please share the MTU details for both devices?

As noted at https://www.weave.works/docs/net/latest/tasks/manage/fastdp/#packet-size-mtu, if WEAVE_MTU-sized packets cannot be sent, Weave Net falls back to sleeve mode.
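To gather those MTU details, a sysfs listing plus a don't-fragment ping probe is usually enough. A minimal sketch (the peer address 10.0.0.2 is a placeholder, not taken from this issue):

```shell
# Print the configured MTU of every interface (via sysfs; no iproute2 required).
for dev in /sys/class/net/*; do
  printf '%-12s %s\n' "$(basename "$dev")" "$(cat "$dev/mtu")"
done

# Probe whether WEAVE_MTU-sized packets survive the path with the
# "do not fragment" bit set. -s is the ICMP payload size, so an
# 8916-byte IP packet needs -s 8888 (8916 minus 28 bytes of IP+ICMP headers).
# Placeholder peer address; substitute another node's IP before running.
#ping -M do -s 8888 -c 3 10.0.0.2
```

If the ping fails with "message too long" while a smaller -s succeeds, something on the path (device, driver, or switch) is clamping the MTU below WEAVE_MTU.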

@yatsura
Author

yatsura commented Dec 17, 2019

Thanks @murali-reddy,

I tried changing it to 1316 as a small test value; however, the current value is 8196. The cluster is communicating via fastdp using the 1GbE Broadcom devices that are also available on the servers.

The switch supports jumbo frames and is working with the 1GbE Broadcom device. I can ping with a packet size of 8916 (ping -s 8888 IP.ADDRESS) from the 10GbE cards to other 10GbE cards of the same type.

I suspect that the card device settings are the issue. I tried enabling OVS offloading support but it didn't make any difference.

@rescuetrack-florian-schenk

We are experiencing the same problem with a node (A) using a BCM57416 NIC. This node is unable to establish a fastdp connection to another node (B) which itself can establish fastdp connections to other nodes except node A.

We narrowed it down to the heartbeat response sent by node B not being received on node A. Tcpdump on node A is not showing the packet at all on the inbound interface.

It seems that offloading causes this issue. After setting rx-udp_tunnel-port-offload to off, the packet is received and a fastdp connection is established.

This is not a viable solution for us, since turning off offloading reduces throughput too much.
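For anyone hitting the same symptom, the workaround described above can be checked and applied per device with ethtool. The interface name enp24s0f1 below is a placeholder, and this is only the diagnostic step, not a recommended permanent setting given the throughput cost:

```shell
# Check whether the driver currently offloads UDP tunnel (VXLAN) port handling.
ethtool --show-offload enp24s0f1 | grep -i 'udp_tunnel'

# Disable the offload so VXLAN heartbeat replies reach the host network
# stack instead of being consumed by the NIC.
ethtool -K enp24s0f1 rx-udp_tunnel-port-offload off
```

Note the setting does not persist across reboots; it would need to be reapplied via a boot script or network configuration hook.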

@bboreham
Contributor

Thanks for the note, @florian2323.

We had something similar with checksum offload back in the beginning - see #1255.
It appears that tunnel offload is under active development in the kernel - which version are you running?

@rescuetrack-florian-schenk

rescuetrack-florian-schenk commented Jan 25, 2021

@bboreham we are using weaveworks/weave-kube:2.6.2 and kernel 5.4.0-62-generic #70-Ubuntu SMP. The funny thing is that fastdp is working correctly until the next heartbeat times out. Is there any filter table offloaded to the NIC filtering VINs?

@bboreham
Contributor

Is there any filter table offloaded to the NIC filtering VINs?

Sadly I didn't understand this question, and attempting to Google it resulted in lots of things to do with automobiles.

Weave Net communicates not with the NIC but with the OpenVSwitch datapath kernel module (which we abbreviate as ODP).
You can get a dump of that info by running weave report (it is verbose).

@yatsura
Author

yatsura commented Jan 27, 2021

@florian2323 Sorry for the late reply. We discovered the same thing with the heartbeat response, and turning off the offload option seemed to fix it. The cluster is only a small development cluster, so we've left it running in sleeve mode. I noticed that there are other problems reported with that Broadcom card and OpenVSwitch. I tried various versions of the firmware without success.

The cluster is currently running on:
driver: bnxt_en
version: 1.10.0
firmware-version: 214.4.65.4/pkg 21.60.29.38
expansion-rom-version:
bus-info: 0000:18:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: no
supports-priv-flags: no

And a 5.4.0 kernel. Sorry that this isn't really any help.
