zebra: backpressure - Zebra push back on Buffer/Stream creation #15411
Conversation
Is this related to #15390?
It is, insofar as #15390 has changes for both zebra and bgp. I believe we are running into some internal discussions regarding the bgp changes, and these zebra changes are independent of them. So the goal is to get these in while we continue to work on the bgp changes. They are independent enough that, at this point in time, all this would do is add more data to the buffer side of bgp, as opposed to the backup currently being spread between bgp and zebra.
Force-pushed from b8a20dd to f8d4565
Testing/observation: Testing.pdf
Looking at testing.pdf, the question that comes to mind is: why is the heap increasing with the code change while the streams size remains OK?
The idea in testing.pdf is to compare the Streams of the "initial state" and "after trigger" for the baseline and the code changes separately. The heap is larger in one case than the other because the topology used differs between the runs with and without the code changes. The heap increases because dplane_ctx and other modules take extra memory, but the increase before and after the trigger is ~200-250 MB. Apologies for the confusion. Here are the new test results, where the setup is the same for the baseline and the code changes, and the complete show memory output is provided.
LGTM, but I'll let others look at this also. /cc @mjstapp
looks good
Force-pushed from f8d4565 to 20cf48d
Force-pushed from a24eaa4 to 7efe441
Final Test Results @
hmm, gosh, do you really think the CLI changes are necessary? I don't really think we need to expose the inner workings of the module.
Force-pushed from 7efe441 to 34cbec9
Force-pushed from 34cbec9 to 7e99688
Final test results @
Currently, the way zebra works is that it creates a pthread per client (BGP is of interest in this case), and this thread loops in zserv_read() to check for any incoming data. If there is some, it reads and validates it, adds it to the ibuf_fifo, and signals the main thread to process the message. The main thread, when it gets a chance, processes the message and invokes the function pointer registered in the header command (e.g. zserv_handlers).

Finally, if all of this was successful, this task reschedules itself and loops in zserv_read() again.

However, if there are already items on the ibuf FIFO, that means zebra is slow in processing. And with the current mechanism, if Zebra main is busy, the ibuf FIFO keeps growing, holding up the memory.

Show memory zebra: (Example: 15k streams hoarding ~160 MB of data)
--- qmem libfrr ---
Stream : 44 variable 3432352 15042 161243800

Fix:
Client IO Thread: (zserv_read)
- Stop doing the read events when we know there are X number of items on the FIFO already (X = zebra zapi-packets <1-10000>, default 1000).
- Determine the number of items on the zserv->ibuf_fifo. Subtract this from the work items and only pull the number of items off that would take us to X items on the ibuf_fifo again.
- If the number of items in the ibuf_fifo has reached the maximum, either initially when zserv_read() is called, or when processing the remainder of the incoming buffer, the client IO thread is woken by the zebra main thread.

Main thread: (zserv_process_message)
After processing the client's ibuf, always schedule a wakeup to the client IO thread to read more items from the socket buffer. This way we ensure:
- The client IO thread always tries to read the socket buffer and add more items to the ibuf_fifo (until the max limit).
- The hidden config change (zebra zapi-packets <>) is taken into account.

Ticket: #3390099
Signed-off-by: Donald Sharp <[email protected]>
Signed-off-by: Rajasekar Raja <[email protected]>
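A minimal sketch of the scheme described in this commit message, using plain pthreads instead of FRR's event library; every name here (msg_fifo, IBUF_LIMIT, io_thread, process_messages) is illustrative, not FRR's actual API:

```c
/* Illustrative sketch only: plain pthreads rather than FRR's event
 * library, and all names here are made up, not FRR's actual API. */
#include <pthread.h>
#include <stdlib.h>

#define IBUF_LIMIT 1000 /* stands in for "zebra zapi-packets", default 1000 */

struct msg { struct msg *next; };

struct msg_fifo {
	pthread_mutex_t lock;
	pthread_cond_t can_read; /* main thread -> client IO wakeup */
	struct msg *head, *tail;
	size_t count;
};

/* Client IO thread: only read from the socket while below the limit. */
static void *io_thread(void *arg)
{
	struct msg_fifo *f = arg;

	for (;;) {
		pthread_mutex_lock(&f->lock);
		/* Backpressure: at the limit, park until the main
		 * thread signals after processing the ibuf. */
		while (f->count >= IBUF_LIMIT)
			pthread_cond_wait(&f->can_read, &f->lock);
		size_t budget = IBUF_LIMIT - f->count;
		pthread_mutex_unlock(&f->lock);

		while (budget--) {
			/* calloc() stands in for reading and validating
			 * one message off the client socket. */
			struct msg *m = calloc(1, sizeof(*m));

			pthread_mutex_lock(&f->lock);
			if (f->tail)
				f->tail->next = m;
			else
				f->head = m;
			f->tail = m;
			f->count++;
			pthread_mutex_unlock(&f->lock);
		}
	}
	return NULL;
}

/* Main thread: drain the FIFO, then always wake the IO thread so it
 * resumes reading from the socket buffer. */
static void process_messages(struct msg_fifo *f)
{
	pthread_mutex_lock(&f->lock);
	while (f->head) {
		struct msg *m = f->head;

		f->head = m->next;
		if (!f->head)
			f->tail = NULL;
		f->count--;
		free(m); /* a real handler would dispatch here */
	}
	pthread_cond_signal(&f->can_read);
	pthread_mutex_unlock(&f->lock);
}
```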
Force-pushed from 7e99688 to a8efa99
Thanks - looks good to me now
…rns (#19717)

Added the below patches, which are part of the BGP Zebra back pressure feature required to keep the memory usage in check during route churns.

How I did it

New patches that were added (patch, then FRR pull request):

0030-zebra-backpressure-Zebra-push-back-on-Buffer-Stream-.patch FRRouting/frr#15411
0031-bgpd-backpressure-Add-a-typesafe-list-for-Zebra-Anno.patch FRRouting/frr#15524
0032-bgpd-fix-flushing-ipv6-flowspec-entries-when-peering.patch FRRouting/frr#15326
0033-bgpd-backpressure-cleanup-bgp_zebra_XX-func-args.patch FRRouting/frr#15524
0034-gpd-backpressure-Handle-BGP-Zebra-Install-evt-Creat.patch FRRouting/frr#15524
0035-bgpd-backpressure-Handle-BGP-Zebra-EPVN-Install-evt-.patch FRRouting/frr#15624
0036-zebra-backpressure-Fix-Null-ptr-access-Coverity-Issu.patch FRRouting/frr#15728
0037-bgpd-Increase-install-uninstall-speed-of-evpn-vpn-vn.patch FRRouting/frr#15727
0038-zebra-Actually-display-I-O-buffer-sizes.patch FRRouting/frr#15708
0039-zebra-Actually-display-I-O-buffer-sizes-part-2.patch FRRouting/frr#15769
0040-bgpd-backpressure-Fix-to-withdraw-evpn-type-5-routes.patch FRRouting/frr#16034
0041-bgpd-backpressure-Fix-to-avoid-CPU-hog.patch FRRouting/frr#16035
0042-zebra-Use-built-in-data-structure-counter.patch FRRouting/frr#16221
0043-zebra-Use-the-ctx-queue-counters.patch FRRouting/frr#16220
0044-zebra-Modify-dplane-loop-to-allow-backpressure-to-fi.patch FRRouting/frr#16220
0045-zebra-Limit-queue-depth-in-dplane_fpm_nl.patch FRRouting/frr#16220
0046-zebra-Modify-show-zebra-dplane-providers-to-give-mor.patch FRRouting/frr#16220
0047-bgpd-backpressure-fix-evpn-route-sync-to-zebra.patch FRRouting/frr#16234
0048-bgpd-backpressure-fix-to-properly-remove-dest-for-bg.patch FRRouting/frr#16368
0049-bgpd-backpressure-Improve-debuggability.patch FRRouting/frr#16368
0050-bgpd-backpressure-Avoid-use-after-free.patch FRRouting/frr#16437
0051-bgpd-backpressure-fix-ret-value-evpn_route_select_in.patch FRRouting/frr#16416
0052-bgpd-backpressure-log-error-for-evpn-when-route-inst.patch FRRouting/frr#16416
Currently, the way zebra works is that it creates a pthread per client (BGP is of interest in this case), and this thread loops in zserv_read() to check for any incoming data. If there is some, it reads and validates it, adds it to the ibuf_fifo, and signals the main thread to process the message. The main thread, when it gets a chance, processes the message and invokes the function pointer registered in the header command (e.g. zserv_handlers).
Finally, if all of this was successful, this task reschedules itself and loops in zserv_read() again.
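For contrast, a compressed sketch of this baseline flow (illustrative only, not FRR's actual zserv_read()): the IO thread reads, validates, queues, signals, and immediately reschedules, never checking how deep the queue already is:

```c
/* Illustrative baseline only -- not FRR's actual zserv_read(). */
#include <pthread.h>
#include <stdlib.h>

struct node { struct node *next; };

struct fifo {
	pthread_mutex_t lock;
	pthread_cond_t work; /* "main thread, wake up" */
	struct node *head, *tail;
};

static void *client_io_loop(void *arg)
{
	struct fifo *f = arg;

	for (;;) {
		/* calloc() stands in for reading and validating one
		 * message off the client socket. */
		struct node *m = calloc(1, sizeof(*m));

		pthread_mutex_lock(&f->lock);
		if (f->tail)
			f->tail->next = m;
		else
			f->head = m;
		f->tail = m;
		pthread_cond_signal(&f->work);
		pthread_mutex_unlock(&f->lock);
		/* Reschedule immediately: if the main thread is busy,
		 * the FIFO -- and its memory -- grow without bound. */
	}
	return NULL;
}
```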
However, if there are already items on the ibuf FIFO, that means zebra is slow in processing. And with the current mechanism, if Zebra main is busy, the ibuf FIFO keeps growing, holding up the memory.
Show memory zebra: (Example: 15k streams hoarding ~160 MB of data)
--- qmem libfrr ---
Stream : 44 variable 3432352 15042 161243800
Stream FIFO : 45 72 3240 48 3456
Fix:
Stop doing the read events when we know there are X number of items on the FIFO already (X = zebra zapi-packets <1-10000>, default 1000).
In zserv_read(), determine the number of items on the zserv->ibuf_fifo. Subtract this from the work items and only pull the number of items off that would take us to X items on the ibuf_fifo again (see the budget sketch after this list).
If the number of items in the ibuf_fifo has reached the maximum, either initially when zserv_read() is called, or when processing the remainder of the incoming buffer, then zserv_read(), which runs in the client's pthread, expects zebra main to schedule a wakeup in zserv_process_message once it has processed all of the items on the buffer.
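A tiny sketch of the budget arithmetic in the first two points; the helper name and signature are hypothetical, not FRR's actual code:

```c
#include <stddef.h>

/* Hypothetical helper: given the configured cap (zebra zapi-packets
 * <1-10000>, default 1000) and the current ibuf_fifo depth, return
 * how many messages the IO thread may still pull off the socket. */
static size_t read_budget(size_t configured_limit, size_t fifo_depth)
{
	if (fifo_depth >= configured_limit)
		return 0; /* at the cap: park until zebra main's wakeup */
	return configured_limit - fifo_depth;
}
```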
NOTE
These code changes are on the server (Zebra) side.
The code changes for the client (BGP) will be raised in a separate review.
Ticket: #3390099