Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
zebra: fix assert in process_subq_route
Bug is reporoduced in case of switching interfaces betwean VRFs. ospf6d is enabled and configured in each VRF. 'dest' can be removed from the route node in the time when the same route node waiting processing in another sub-queue. A route node must only be in one sub-queue at a time. Details: 1. Config: interface if0 ipv6 address 2001:db8:cafe:2::2/64 ipv6 nat inside ipv6 ospf6 area 0.0.0.51 ipv6 ospf6 cost 10 vrf test2 exit ! interface if1 ipv6 address 2001:db8:cafe:4::1/64 ipv6 nat outside ipv6 ospf6 area 0.0.0.0 ipv6 ospf6 cost 10 vrf test2 exit ! router ospf6 ospf6 router-id 2.2.2.2 exit ! router ospf6 vrf test1 ospf6 router-id 2.2.2.2 exit ! router ospf6 vrf test2 ospf6 router-id 2.2.2.2 exit I just quickly switched interfaces between different VRFs (default/test1/test2). 2. Log messages: Aug 02 16:51:56 ubuntu zebra[386985]: [MFYWV-KH3MC] process_subq_early_route_add: (0:?):2001:db8:cafe:2::/64: Inserting route rn 0x56267593de90, re 0x56267595ae40 (connected) existing 0x0, same_count 0 Aug 02 16:51:56 ubuntu zebra[386985]: [Q4T2G-E2SQF] process_subq_early_route_add: dumping RE entry 0x56267595ae40 for 2001:db8:cafe:2::/64 vrf default(0) Aug 02 16:51:56 ubuntu zebra[386985]: [GCGMT-SQR82] rib_link: (0:?):2001:db8:cafe:2::/64: rn 0x56267593de90 adding dest Aug 02 16:51:56 ubuntu zebra[386985]: [JF0K0-DVHWH] rib_meta_queue_add: (0:254):2001:db8:cafe:2::/64: queued rn 0x56267593de90 into sub-queue Connected Routes Aug 02 16:51:56 ubuntu zebra[386985]: [QE6V0-J8BG5] rib_delnode: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, re 0x56267595ae40, removing Aug 02 16:51:56 ubuntu zebra[386985]: [KMPGN-JBRKW] rib_meta_queue_add: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90 is already queued in sub-queue Connected Routes Aug 02 16:51:56 ubuntu zebra[386985]: [MFYWV-KH3MC] process_subq_early_route_add: (0:254):2001:db8:cafe:2::/64: Inserting route rn 0x56267593de90, re 0x56267595abf0 (ospf6) existing 0x0, same_count 1 Aug 02 16:51:56 ubuntu zebra[386985]: [Q4T2G-E2SQF] process_subq_early_route_add: dumping RE entry 0x56267595abf0 for 2001:db8:cafe:2::/64 vrf default(0) Aug 02 16:51:56 ubuntu zebra[386985]: [KMPGN-JBRKW] rib_meta_queue_add: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90 is already queued in sub-queue Connected Routes Aug 02 16:51:56 ubuntu zebra[386985]: [YEYFX-TDSC2] process_subq_early_route_add: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, removing unneeded re 0x56267595ae40 Aug 02 16:51:56 ubuntu zebra[386985]: [Y53JX-CBC5H] rib_unlink: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, re 0x56267595ae40 Aug 02 16:51:56 ubuntu zebra[386985]: [QE6V0-J8BG5] rib_delnode: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, re 0x56267595abf0, removing Aug 02 16:51:56 ubuntu zebra[386985]: [JF0K0-DVHWH] rib_meta_queue_add: (0:254):2001:db8:cafe:2::/64: queued rn 0x56267593de90 into sub-queue RIP/OSPF/ISIS/EIGRP/NHRP Routes Aug 02 16:51:56 ubuntu zebra[386985]: [NZNZ4-7P54Y] default(0:254):2001:db8:cafe:2::/64: Processing rn 0x56267593de90 Aug 02 16:51:56 ubuntu zebra[386985]: [ZJVZ4-XEGPF] default(0:254):2001:db8:cafe:2::/64: Examine re 0x56267595abf0 (ospf6) status: Removed Changed flags: None dist 110 metric 10 Aug 02 16:51:56 ubuntu zebra[386985]: [NM15X-X83N9] rib_process: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, removing re 0x56267595abf0 Aug 02 16:51:56 ubuntu zebra[386985]: [Y53JX-CBC5H] rib_unlink: (0:254):2001:db8:cafe:2::/64: rn 0x56267593de90, re 0x56267595abf0 Aug 02 16:51:56 ubuntu zebra[386985]: [KT8QQ-45WQ0] rib_gc_dest: (0:?):2001:db8:cafe:2::/64: removing dest from table Aug 02 16:51:56 ubuntu zebra[386985]: [HH6N2-PDCJS] default(0:0):2001:db8:cafe:2::/64 rn 0x56267593de90 dequeued from sub-queue Connected Routes 3. ...and then assert: (gdb) bt #0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140662163115136) at ./nptl/pthread_kill.c:44 sonic-net#1 __pthread_kill_internal (signo=6, threadid=140662163115136) at ./nptl/pthread_kill.c:78 sonic-net#2 __GI___pthread_kill (threadid=140662163115136, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 sonic-net#3 0x00007fee76753476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 sonic-net#4 0x00007fee767397f3 in __GI_abort () at ./stdlib/abort.c:79 sonic-net#5 0x00007fee76a420fd in _zlog_assert_failed () from target:/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 sonic-net#6 0x0000562674efe0f0 in process_subq_route (qindex=7 '\a', lnode=0x562675940c60) at zebra/zebra_rib.c:2540 sonic-net#7 process_subq (qindex=META_QUEUE_NOTBGP, subq=0x562675574580) at zebra/zebra_rib.c:3055 sonic-net#8 meta_queue_process (dummy=<optimized out>, data=0x56267556d430) at zebra/zebra_rib.c:3091 sonic-net#9 0x00007fee76a386e8 in work_queue_run () from target:/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 sonic-net#10 0x00007fee76a31c91 in thread_call () from target:/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 sonic-net#11 0x00007fee769ee528 in frr_run () from target:/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0 sonic-net#12 0x0000562674e97ec5 in main (argc=5, argv=0x7ffd1e275958) at zebra/main.c:478 (gdb) print lnode->data $10 = (void *) 0x56267593de90 (gdb) p/x *(struct route_node *)0x56267593de90 $11 = { p = { family = 0xa, prefixlen = 0x40, u = { prefix = 0x20, prefix4 = { s_addr = 0xb80d0120 }, prefix6 = { __in6_u = { __u6_addr8 = {0x20, 0x1, 0xd, 0xb8, 0xca, 0xfe, 0x0, 0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, __u6_addr16 = {0x120, 0xb80d, 0xfeca, 0x200, 0x0, 0x0, 0x0, 0x0}, __u6_addr32 = {0xb80d0120, 0x200feca, 0x0, 0x0} } }, ... table = 0x5626755ae010, parent = 0x5626755ae070, link = {0x0, 0x0}, lock = 0x4, nodehash = { hi = { next = 0x5626755ae0d0, hashval = 0xebe8bdbf } }, info = 0x0 3. What's happen: We removed unneeded re 0x56267595ae40 while adding re 0x56267595abf0. It was the last connected re, but rn 0x56267593de90 is still in the connected sub-queue. Then rib_delnode was called for 0x56267595abf0. (rn 0x56267593de90 is still in the connected sub-queue). rib_delnode have called rib_meta_queue_add which have checked, that rn is absent in sub-queue RIP/OSPF/ISIS/EIGRP/NHRP and have added rn in the second sub-queue. Fixes: d7ac4c4 ("zebra: Introduce early route processing on the MetaQ") Signed-off-by: Pavel Ivashchenko <[email protected]>
- Loading branch information