Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[action] [PR:14859] image_config: copp: Enable rate limiting for bgp, lacp, dhcp, lldp, macsec and udld #17111

Merged
merged 1 commit into from
Nov 7, 2023

Conversation

mssonicbld
Copy link
Collaborator

Why I did it

It was observed that a flood of DHCP packets without rate-limiting can cause BGP flaps or lacp keepalive losses.
This change attempts to prevent or reduce such BGP flaps by enabling appropriate rate-limiting in SONiC for all traffic types.

Work item tracking
  • Microsoft ADO 17964421:

How I did it

Set a reasonable CIR/CBS value of 300 for queue4_group3 (dhcp, lldp, macsec) and 6000 for queue4_group1.
The value 300 was arrived at after testing with dhcp flooding using ptf (using multiple threads). Throttling at this rate was necessary to ensure that dhcp flooding does not cause BGP flaps.

How to verify it

  1. Verified with this script running from ptf, that BGP flaps don't happen when CBS/CIR is set at 300 for queue4_group3.

import threading
from scapy.all import *

def send_dhcp_discover(intf):
dhcp_discover = Ether(dst='ff:ff:ff:ff:ff:ff',src=RandMAC())
/IP(src='1.1.1.1',dst='255.255.255.255')
/UDP(sport=68,dport=67)
/DHCP(options=[('message-type','discover'),('end')])
sendp(dhcp_discover,count=100000,iface=intf)

if name == "main":
t1 = threading.Thread(target=send_dhcp_discover, args=("eth1",))
t2 = threading.Thread(target=send_dhcp_discover, args=("eth2",))
t1.start()
t2.start()
t1.join()
t2.join()

  1. Verified on Arista-7260CX3-D108C8 running 202012 that the copp rule for queue4_group1 and queue4_group3 do NOT affect BGP packets. To verify this using PTF, the copp rules were modified to set the "CBS" and "CIR" for queue4_group1 and queue4_group3 at 600pps and 50k packets each of "BGP open" and "DHCP Discover" were simultaneously sent from the same PTF port to the DUT. It was verified using "show c cpu" that packets are hitting the cpu queue at 1200 pps (double the configured CIR/CBS for these packet types). This helped conclude that throttling rate is per trap (or packet type) and not per queue.

  2. Verified with updated sonic-mgmt tests ([tests/copp]: Update copp mgmt tests to support new rate-limits sonic-mgmt#8199) on broadcom and mellanox platforms that these traffic types are rate-limited.

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211

Tested branch (Please provide the tested image version)

  • 20220531.33
  • 20201231.107

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

…ld (sonic-net#14859)

Why I did it
It was observed that a flood of DHCP packets without rate-limiting can cause BGP flaps or lacp keepalive losses.
This change attempts to prevent or reduce such BGP flaps by enabling appropriate rate-limiting in SONiC for all traffic types.

Work item tracking
Microsoft ADO 17964421:

How I did it
Set a reasonable CIR/CBS value of 300 for queue4_group3 (dhcp, lldp, macsec) and 6000 for queue4_group1.
The value 300 was arrived at after testing with dhcp flooding using ptf (using multiple threads). Throttling at this rate was necessary to ensure that dhcp flooding does not cause BGP flaps.

How to verify it
Verified with this script running from ptf, that BGP flaps don't happen when CBS/CIR is set at 300 for queue4_group3.

 import threading
 from scapy.all import *
 
 def send_dhcp_discover(intf):
     dhcp_discover = Ether(dst='ff:ff:ff:ff:ff:ff',src=RandMAC()) \
                         /IP(src='1.1.1.1',dst='255.255.255.255') \
                         /UDP(sport=68,dport=67) \
                         /DHCP(options=[('message-type','discover'),('end')])
     sendp(dhcp_discover,count=100000,iface=intf)
 
 
 if __name__ == "__main__":
     t1 = threading.Thread(target=send_dhcp_discover, args=("eth1",))
     t2 = threading.Thread(target=send_dhcp_discover, args=("eth2",))
     t1.start()
     t2.start()
     t1.join()
     t2.join()

Verified on Arista-7260CX3-D108C8 running 202012 that the copp rule for queue4_group1 and queue4_group3 do NOT affect BGP packets. To verify this using PTF, the copp rules were modified to set the "CBS" and "CIR" for queue4_group1 and queue4_group3 at 600pps and 50k packets each of "BGP open" and "DHCP Discover" were simultaneously sent from the same PTF port to the DUT. It was verified using "show c cpu" that packets are hitting the cpu queue at 1200 pps (double the configured CIR/CBS for these packet types). This helped conclude that throttling rate is per trap (or packet type) and not per queue.

Verified with updated sonic-mgmt tests ([tests/copp]: Update copp mgmt tests to support new rate-limits sonic-mgmt#8199) on broadcom and mellanox platforms that these traffic types are rate-limited.

Signed-off-by: Prabhat Aravind <[email protected]>
@mssonicbld
Copy link
Collaborator Author

Original PR: #14859

@mssonicbld mssonicbld merged commit 78cc6cf into sonic-net:202305 Nov 7, 2023
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants