Unable to form fastdp with 10Ge Broadcom NIC nodes #3746
Comments
@yatsura, can you please share the MTU details for both devices? As noted at https://www.weave.works/docs/net/latest/tasks/manage/fastdp/#packet-size-mtu, if WEAVE_MTU-sized packets cannot be sent, Weave Net falls back to sleeve mode.
Thanks @murali-reddy. I tried changing it to a small value, 1316; the current value is 8196. The cluster is communicating via fastdp using the 1Gb Broadcom devices that are also available on the servers. The switch supports jumbo frames, and they work on the 1Gb Broadcom device. I can ping with a packet size of 8916 (ping IP.ADDRESS -c 8888) from the 10Gb cards to other 10Gb cards of the same type. I suspect that the card's device settings are the issue. I tried enabling OVS offloading support, but it didn't make any difference.
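For anyone repeating the MTU check above, here is a minimal sketch of the arithmetic, assuming IPv4 without options and the Linux iputils `ping`, where `-s` sets the ICMP payload size, `-c` sets the number of packets, and `-M do` forbids fragmentation. The function names are illustrative only and are not part of Weave Net.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options (assumption)
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER - ICMP_HEADER
    if payload < 0:
        raise ValueError("MTU %d is too small for an ICMP echo" % mtu)
    return payload


def ping_command(target, mtu, count=3):
    """Build an iputils ping command that probes `mtu` with fragmentation forbidden."""
    return [
        "ping",
        "-M", "do",                            # set the don't-fragment bit
        "-s", str(icmp_payload_for_mtu(mtu)),  # payload size, not packet count
        "-c", str(count),                      # number of echo requests
        target,
    ]


if __name__ == "__main__":
    # IP.ADDRESS is the placeholder used in this thread, not a real host.
    for mtu in (1500, 9000):
        print(mtu, " ".join(ping_command("IP.ADDRESS", mtu)))
```

If the ping for a given MTU fails while smaller sizes succeed, the path between the two NICs probably cannot carry packets of that size. Note that this only tests plain ICMP, not the encapsulated traffic that fastdp sends.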
We are experiencing the same problem with a node (A) using a BCM57416 NIC. This node is unable to establish a fastdp connection to another node (B), which itself can establish fastdp connections to other nodes except node A. We narrowed it down to the heartbeat response sent by node B not being received on node A. Tcpdump on node A is not showing the packet at all on the inbound interface. It seems that offloading causes this issue. After setting This is not a viable solution since turning off offloading decreases speed too much.
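To compare offload settings between a working and a failing NIC, the output of `ethtool -k DEVICE` can be parsed into a table. The sketch below is one possible way to do that; the sample text and the interface name `eth0` are made up for illustration and are not taken from the nodes in this issue, and it does not say which feature was the one changed above.

```python
def parse_offload_features(ethtool_k_output):
    """Map feature name -> (enabled, fixed) from `ethtool -k` style text."""
    features = {}
    for line in ethtool_k_output.splitlines():
        name, sep, value = line.strip().partition(": ")
        if not sep:
            continue  # header line such as "Features for eth0:"
        tokens = value.split()
        if not tokens or tokens[0] not in ("on", "off"):
            continue  # not a feature line
        features[name] = (tokens[0] == "on", "[fixed]" in tokens)
    return features


def enabled_changeable(features):
    """Names of features that are on and not marked [fixed] by the driver."""
    return sorted(n for n, (on, fixed) in features.items() if on and not fixed)


# Illustrative sample only, not captured from a BCM57416.
SAMPLE = """Features for eth0:
rx-checksumming: on
tx-checksumming: on
tcp-segmentation-offload: on
generic-receive-offload: on
rx-vlan-filter: on [fixed]
large-receive-offload: off
"""

if __name__ == "__main__":
    for name in enabled_changeable(parse_offload_features(SAMPLE)):
        print(name)
```

Features in the resulting list can be toggled one at a time with `ethtool -K DEVICE FEATURE off`, which may help identify a single offload to disable rather than turning everything off.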
Thanks for the note, @florian2323. We had something similar with checksum offload back in the beginning; see #1255.
Sadly, I didn't understand this question, and attempting to Google it turned up lots of results about automobiles. Weave Net communicates not with the NIC but with the Open vSwitch datapath kernel module (which we abbreviate as ODP).
@florian2323 Sorry for the late reply. We discovered the same thing with the heartbeat response, and turning off the offload option seemed to fix it. The cluster is only a small development cluster, so we've left it running in sleeve mode. I noticed that there are other problems reported with that Broadcom card and Open vSwitch. I tried various versions of the firmware without success. The cluster is currently running on: And a 5.4.0 kernel. Sorry that this isn't really any help.
What you expected to happen?
Weave should form a fastdp connection for the new nodes that have been joined to the cluster.
What happened?
Weave falls back to sleeve for the new nodes, while the other nodes remain connected via fastdp.
How to reproduce it?
The issue appears to be related only to the Broadcom BCM57416 NetXtreme-E Dual-Media 10G NIC. Connecting the same node with the Broadcom BCM5720 NetXtreme Gigabit NIC forms a fastdp network. No other changes were made to the node.
Anything else we need to know?
NIC details:
(BCM57416)
driver: bnxt_en
version: 1.8.0
firmware-version: 214.0.253.1/pkg 21.40.25.31
expansion-rom-version:
bus-info: 0000:18:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: no
supports-priv-flags: no
(BCM5720)
driver: tg3
version: 3.137
firmware-version: FFV20.8.4 bc 5720-v1.39
expansion-rom-version:
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
Versions:
Logs:
First 1000 lines for failing node:
pm-dell-01 weave-net debug log