[linux-fslc]: libtraceevent: Fix build with binutils 2.35 #95

zandrey · 2020-08-03T21:27:22Z

This is a cherry-pick of upstream commit to resolve build issues with upcoming update of binutils to version 2.35.

In binutils 2.35, 'nm -D' changed to show symbol versions along with
symbol names, with the usual @@ separator. When generating
libtraceevent-dynamic-list we need just the names, so strip off the
version suffix if present.

Signed-off-by: Ben Hutchings [email protected]
Tested-by: Salvatore Bonaccorso [email protected]
Reviewed-by: Steven Rostedt [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Arnaldo Carvalho de Melo [email protected]
(cherry picked from commit 39efdd9)
Signed-off-by: Andrey Zhizhikin [email protected]

In binutils 2.35, 'nm -D' changed to show symbol versions along with symbol names, with the usual @@ separator. When generating libtraceevent-dynamic-list we need just the names, so strip off the version suffix if present. Signed-off-by: Ben Hutchings <[email protected]> Tested-by: Salvatore Bonaccorso <[email protected]> Reviewed-by: Steven Rostedt <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> (cherry picked from commit 39efdd9) Signed-off-by: Andrey Zhizhikin <[email protected]>

commit fca3a45 upstream. The minimum reserve size was adjusted to take into account the height of the tree we are merging, however we can have a root with a level == 0. What we want is root_level + 1 to get the number of nodes we may have to cow. This fixes the enospc_debug warning pops with btrfs/101. Nikolay: this fixes failures on btrfs/060 btrfs/062 btrfs/063 and btrfs/195 That I was seeing, the call trace was: [ 3680.515564] ------------[ cut here ]------------ [ 3680.515566] BTRFS: block rsv returned -28 [ 3680.515585] WARNING: CPU: 2 PID: 8339 at fs/btrfs/block-rsv.c:521 btrfs_use_block_rsv+0x162/0x180 [ 3680.515587] Modules linked in: [ 3680.515591] CPU: 2 PID: 8339 Comm: btrfs Tainted: G W 5.9.0-rc8-default Freescale#95 [ 3680.515593] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014 [ 3680.515595] RIP: 0010:btrfs_use_block_rsv+0x162/0x180 [ 3680.515600] RSP: 0018:ffffa01ac9753910 EFLAGS: 00010282 [ 3680.515602] RAX: 0000000000000000 RBX: ffff984b34200000 RCX: 0000000000000027 [ 3680.515604] RDX: 0000000000000027 RSI: 0000000000000000 RDI: ffff984b3bd19e28 [ 3680.515606] RBP: 0000000000004000 R08: ffff984b3bd19e20 R09: 0000000000000001 [ 3680.515608] R10: 0000000000000004 R11: 0000000000000046 R12: ffff984b264fdc00 [ 3680.515609] R13: ffff984b13149000 R14: 00000000ffffffe4 R15: ffff984b34200000 [ 3680.515613] FS: 00007f4e2912b8c0(0000) GS:ffff984b3bd00000(0000) knlGS:0000000000000000 [ 3680.515615] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3680.515617] CR2: 00007fab87122150 CR3: 0000000118e42000 CR4: 00000000000006e0 [ 3680.515620] Call Trace: [ 3680.515627] btrfs_alloc_tree_block+0x8b/0x340 [ 3680.515633] ? __lock_acquire+0x51a/0xac0 [ 3680.515646] alloc_tree_block_no_bg_flush+0x4f/0x60 [ 3680.515651] __btrfs_cow_block+0x14e/0x7e0 [ 3680.515662] btrfs_cow_block+0x144/0x2c0 [ 3680.515670] merge_reloc_root+0x4d4/0x610 [ 3680.515675] ? btrfs_lookup_fs_root+0x78/0x90 [ 3680.515686] merge_reloc_roots+0xee/0x280 [ 3680.515695] relocate_block_group+0x2ce/0x5e0 [ 3680.515704] btrfs_relocate_block_group+0x16e/0x310 [ 3680.515711] btrfs_relocate_chunk+0x38/0xf0 [ 3680.515716] btrfs_shrink_device+0x200/0x560 [ 3680.515728] btrfs_rm_device+0x1ae/0x6a6 [ 3680.515744] ? _copy_from_user+0x6e/0xb0 [ 3680.515750] btrfs_ioctl+0x1afe/0x28c0 [ 3680.515755] ? find_held_lock+0x2b/0x80 [ 3680.515760] ? do_user_addr_fault+0x1f8/0x418 [ 3680.515773] ? __x64_sys_ioctl+0x77/0xb0 [ 3680.515775] __x64_sys_ioctl+0x77/0xb0 [ 3680.515781] do_syscall_64+0x31/0x70 [ 3680.515785] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported-by: Nikolay Borisov <[email protected]> Fixes: 44d354a ("btrfs: relocation: review the call sites which can be interrupted by signal") CC: [email protected] # 5.4+ Reviewed-by: Nikolay Borisov <[email protected]> Tested-by: Nikolay Borisov <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

…address into one operation commit 9e9e085 upstream. When compiling kernel source 'make -j $(nproc)' with the up-and-running KASAN-enabled kernel on a 256-core machine, the following soft lockup is shown: watchdog: BUG: soft lockup - CPU#28 stuck for 22s! [kworker/28:1:1760] CPU: 28 PID: 1760 Comm: kworker/28:1 Kdump: loaded Not tainted 6.10.0-rc5 #95 Workqueue: events drain_vmap_area_work RIP: 0010:smp_call_function_many_cond+0x1d8/0xbb0 Code: 38 c8 7c 08 84 c9 0f 85 49 08 00 00 8b 45 08 a8 01 74 2e 48 89 f1 49 89 f7 48 c1 e9 03 41 83 e7 07 4c 01 e9 41 83 c7 03 f3 90 <0f> b6 01 41 38 c7 7c 08 84 c0 0f 85 d4 06 00 00 8b 45 08 a8 01 75 RSP: 0018:ffffc9000cb3fb60 EFLAGS: 00000202 RAX: 0000000000000011 RBX: ffff8883bc4469c0 RCX: ffffed10776e9949 RDX: 0000000000000002 RSI: ffff8883bb74ca48 RDI: ffffffff8434dc50 RBP: ffff8883bb74ca40 R08: ffff888103585dc0 R09: ffff8884533a1800 R10: 0000000000000004 R11: ffffffffffffffff R12: ffffed1077888d39 R13: dffffc0000000000 R14: ffffed1077888d38 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff8883bc400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005577b5c8d158 CR3: 0000000004850000 CR4: 0000000000350ef0 Call Trace: <IRQ> ? watchdog_timer_fn+0x2cd/0x390 ? __pfx_watchdog_timer_fn+0x10/0x10 ? __hrtimer_run_queues+0x300/0x6d0 ? sched_clock_cpu+0x69/0x4e0 ? __pfx___hrtimer_run_queues+0x10/0x10 ? srso_return_thunk+0x5/0x5f ? ktime_get_update_offsets_now+0x7f/0x2a0 ? srso_return_thunk+0x5/0x5f ? srso_return_thunk+0x5/0x5f ? hrtimer_interrupt+0x2ca/0x760 ? __sysvec_apic_timer_interrupt+0x8c/0x2b0 ? sysvec_apic_timer_interrupt+0x6a/0x90 </IRQ> <TASK> ? asm_sysvec_apic_timer_interrupt+0x16/0x20 ? smp_call_function_many_cond+0x1d8/0xbb0 ? __pfx_do_kernel_range_flush+0x10/0x10 on_each_cpu_cond_mask+0x20/0x40 flush_tlb_kernel_range+0x19b/0x250 ? srso_return_thunk+0x5/0x5f ? kasan_release_vmalloc+0xa7/0xc0 purge_vmap_node+0x357/0x820 ? __pfx_purge_vmap_node+0x10/0x10 __purge_vmap_area_lazy+0x5b8/0xa10 drain_vmap_area_work+0x21/0x30 process_one_work+0x661/0x10b0 worker_thread+0x844/0x10e0 ? srso_return_thunk+0x5/0x5f ? __kthread_parkme+0x82/0x140 ? __pfx_worker_thread+0x10/0x10 kthread+0x2a5/0x370 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x30/0x70 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Debugging Analysis: 1. The following ftrace log shows that the lockup CPU spends too much time iterating vmap_nodes and flushing TLB when purging vm_area structures. (Some info is trimmed). kworker: funcgraph_entry: | drain_vmap_area_work() { kworker: funcgraph_entry: | mutex_lock() { kworker: funcgraph_entry: 1.092 us | __cond_resched(); kworker: funcgraph_exit: 3.306 us | } ... ... kworker: funcgraph_entry: | flush_tlb_kernel_range() { ... ... kworker: funcgraph_exit: # 7533.649 us | } ... ... kworker: funcgraph_entry: 2.344 us | mutex_unlock(); kworker: funcgraph_exit: $ 23871554 us | } The drain_vmap_area_work() spends over 23 seconds. There are 2805 flush_tlb_kernel_range() calls in the ftrace log. * One is called in __purge_vmap_area_lazy(). * Others are called by purge_vmap_node->kasan_release_vmalloc. purge_vmap_node() iteratively releases kasan vmalloc allocations and flushes TLB for each vmap_area. - [Rough calculation] Each flush_tlb_kernel_range() runs about 7.5ms. -- 2804 * 7.5ms = 21.03 seconds. -- That's why a soft lock is triggered. 2. Extending the soft lockup time can work around the issue (For example, # echo 60 > /proc/sys/kernel/watchdog_thresh). This confirms the above-mentioned speculation: drain_vmap_area_work() spends too much time. If we combine all TLB flush operations of the KASAN shadow virtual address into one operation in the call path 'purge_vmap_node()->kasan_release_vmalloc()', the running time of drain_vmap_area_work() can be saved greatly. The idea is from the flush_tlb_kernel_range() call in __purge_vmap_area_lazy(). And, the soft lockup won't be triggered. Here is the test result based on 6.10: [6.10 wo/ the patch] 1. ftrace latency profiling (record a trace if the latency > 20s). echo 20000000 > /sys/kernel/debug/tracing/tracing_thresh echo drain_vmap_area_work > /sys/kernel/debug/tracing/set_graph_function echo function_graph > /sys/kernel/debug/tracing/current_tracer echo 1 > /sys/kernel/debug/tracing/tracing_on 2. Run `make -j $(nproc)` to compile the kernel source 3. Once the soft lockup is reproduced, check the ftrace log: cat /sys/kernel/debug/tracing/trace # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 76) $ 50412985 us | } /* __purge_vmap_area_lazy */ 76) $ 50412997 us | } /* drain_vmap_area_work */ 76) $ 29165911 us | } /* __purge_vmap_area_lazy */ 76) $ 29165926 us | } /* drain_vmap_area_work */ 91) $ 53629423 us | } /* __purge_vmap_area_lazy */ 91) $ 53629434 us | } /* drain_vmap_area_work */ 91) $ 28121014 us | } /* __purge_vmap_area_lazy */ 91) $ 28121026 us | } /* drain_vmap_area_work */ [6.10 w/ the patch] 1. Repeat step 1-2 in "[6.10 wo/ the patch]" 2. The soft lockup is not triggered and ftrace log is empty. cat /sys/kernel/debug/tracing/trace # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 3. Setting 'tracing_thresh' to 10/5 seconds does not get any ftrace log. 4. Setting 'tracing_thresh' to 1 second gets ftrace log. cat /sys/kernel/debug/tracing/trace # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 23) $ 1074942 us | } /* __purge_vmap_area_lazy */ 23) $ 1074950 us | } /* drain_vmap_area_work */ The worst execution time of drain_vmap_area_work() is about 1 second. Link: https://lore.kernel.org/lkml/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Fixes: 282631c ("mm: vmalloc: remove global purge_vmap_area_root rb-tree") Signed-off-by: Adrian Huang <[email protected]> Co-developed-by: Uladzislau Rezki (Sony) <[email protected]> Signed-off-by: Uladzislau Rezki (Sony) <[email protected]> Tested-by: Jiwei Sun <[email protected]> Reviewed-by: Baoquan He <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Andrey Konovalov <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Vincenzo Frascino <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

zandrey changed the title ~~libtraceevent: Fix build with binutils 2.35~~ [linux-fslc]: libtraceevent: Fix build with binutils 2.35 Aug 3, 2020

otavio merged commit 96e9737 into Freescale:5.4.x+fslc Aug 3, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[linux-fslc]: libtraceevent: Fix build with binutils 2.35 #95

[linux-fslc]: libtraceevent: Fix build with binutils 2.35 #95

zandrey commented Aug 3, 2020

[linux-fslc]: libtraceevent: Fix build with binutils 2.35 #95

[linux-fslc]: libtraceevent: Fix build with binutils 2.35 #95

Conversation

zandrey commented Aug 3, 2020