-
Notifications
You must be signed in to change notification settings - Fork 54.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Repaired broken USB ID 0x15E to 0x015E #22
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The USB device with ID 0x015E is known as Huwai WIXUBB116. The module works fine when the ID is changed to 0x015E.
that doesn't change anything... |
tklauser
pushed a commit
to tklauser/linux
that referenced
this pull request
Sep 18, 2012
…d reasons We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] torvalds#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 torvalds#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Cc: [email protected]
wrobelda
referenced
this pull request
in wrobelda/linux-sunxi
Sep 19, 2012
Fix build using O= (issue linux-sunxi#21) and inline build on CM9
wrobelda
referenced
this pull request
in wrobelda/linux-sunxi
Sep 19, 2012
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 linux-sunxi#1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] linux-sunxi#2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f linux-sunxi#3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 linux-sunxi#4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] linux-sunxi#5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] linux-sunxi#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 linux-sunxi#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 linux-sunxi#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 linux-sunxi#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f linux-sunxi#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e linux-sunxi#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f linux-sunxi#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad linux-sunxi#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 linux-sunxi#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a linux-sunxi#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 linux-sunxi#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b linux-sunxi#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 linux-sunxi#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c linux-sunxi#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 linux-sunxi#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 linux-sunxi#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] linux-sunxi#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] linux-sunxi#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 linux-sunxi#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 linux-sunxi#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
heftig
referenced
this pull request
in zen-kernel/zen-kernel
Sep 29, 2012
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 #24 [ffff8810343bfee8] kthread at ffffffff8108dd96 #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
koenkooi
pushed a commit
to koenkooi/linux
that referenced
this pull request
Oct 2, 2012
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Ben Hutchings <[email protected]>
koenkooi
pushed a commit
to koenkooi/linux
that referenced
this pull request
Oct 4, 2012
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Ben Hutchings <[email protected]>
noamc
referenced
this pull request
in Mellanox/linux
Oct 16, 2012
…fpga] IPI fixes 1. IPI was not going from cpu1 -> cpu0. Turns out that the IRQ mode programmed IDU_IRQ_MOD_TCPU_ALLRECP mismatched in value in kernel and ISS. kernel (per IDU specs) defines it as '3' whereas ISS believes it shd be 7 (which is clearly a BUG). 2. the IDU IRQs are hardwaired hence need to be setup as IRQF_PERCPU to disallow any irq-affinity changes. Signed-off-by: Vineet Gupta <[email protected]>
koenkooi
pushed a commit
to koenkooi/linux
that referenced
this pull request
Oct 17, 2012
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Ben Hutchings <[email protected]>
hknkkn
pushed a commit
to hknkkn/linux-dynticks
that referenced
this pull request
Oct 29, 2012
Printing the "start_ip" for every secondary cpu is very noisy on a large system - and doesn't add any value. Drop this message. Console log before: Booting Node 0, Processors #1 smpboot cpu 1: start_ip = 96000 #2 smpboot cpu 2: start_ip = 96000 #3 smpboot cpu 3: start_ip = 96000 #4 smpboot cpu 4: start_ip = 96000 ... torvalds#31 smpboot cpu 31: start_ip = 96000 Brought up 32 CPUs Console log after: Booting Node 0, Processors #1 #2 #3 #4 #5 torvalds#6 torvalds#7 Ok. Booting Node 1, Processors torvalds#8 torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 Ok. Booting Node 0, Processors torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 Ok. Booting Node 1, Processors torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 Brought up 32 CPUs Acked-by: Borislav Petkov <[email protected]> Signed-off-by: Tony Luck <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]>
koenkooi
pushed a commit
to koenkooi/linux
that referenced
this pull request
Oct 31, 2012
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Ben Hutchings <[email protected]>
vineetgarc
referenced
this pull request
in foss-for-synopsys-dwc-arc-processors/linux
Oct 31, 2012
1. IPI was not going from cpu1 -> cpu0. Turns out that the IRQ mode programmed IDU_IRQ_MOD_TCPU_ALLRECP mismatched in value in kernel and ISS. kernel (per IDU specs) defines it as '3' whereas ISS believes it shd be 7 (which is clearly a BUG). 2. the IDU IRQs are hardwaired hence need to be setup as IRQF_PERCPU to disallow any irq-affinity changes. Signed-off-by: Vineet Gupta <[email protected]>
koenkooi
pushed a commit
to koenkooi/linux
that referenced
this pull request
Nov 14, 2012
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Ben Hutchings <[email protected]>
kees
pushed a commit
to kees/linux
that referenced
this pull request
Nov 16, 2012
…d reasons BugLink: http://bugs.launchpad.net/bugs/1035435 commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] torvalds#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 torvalds#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: Herton Ronaldo Krzesinski <[email protected]>
koenkooi
pushed a commit
to koenkooi/linux
that referenced
this pull request
Nov 21, 2012
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Ben Hutchings <[email protected]>
vineetgarc
referenced
this pull request
in foss-for-synopsys-dwc-arc-processors/linux
Dec 31, 2012
1. IPI was not going from cpu1 -> cpu0. Turns out that the IRQ mode programmed IDU_IRQ_MOD_TCPU_ALLRECP mismatched in value in kernel and ISS. kernel (per IDU specs) defines it as '3' whereas ISS believes it shd be 7 (which is clearly a BUG). 2. the IDU IRQs are hardwaired hence need to be setup as IRQF_PERCPU to disallow any irq-affinity changes. Signed-off-by: Vineet Gupta <[email protected]>
cianmcgovern
pushed a commit
to cianmcgovern/linux
that referenced
this pull request
Mar 10, 2013
…d reasons commit 5cf02d0 upstream. We've had some reports of a deadlock where rpciod ends up with a stack trace like this: PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14" #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9 #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs] #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8 #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs] #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs] torvalds#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670 torvalds#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271 torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638 torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942 torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9 torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808 torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6 torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7 torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc] torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc] torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0 torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96 torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca rpciod is trying to allocate memory for a new socket to talk to the server. The VM ends up calling ->releasepage to get more memory, and it tries to do a blocking commit. That commit can't succeed however without a connected socket, so we deadlock. Fix this by setting PF_FSTRANS on the workqueue task prior to doing the socket allocation, and having nfs_release_page check for that flag when deciding whether to do a commit call. Also, set PF_FSTRANS unconditionally in rpc_async_schedule since that function can also do allocations sometimes. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Ben Hutchings <[email protected]>
torvalds
pushed a commit
that referenced
this pull request
Jul 10, 2013
Several people reported the warning: "kernel BUG at kernel/timer.c:729!" and the stack trace is: #7 [ffff880214d25c10] mod_timer+501 at ffffffff8106d905 #8 [ffff880214d25c50] br_multicast_del_pg.isra.20+261 at ffffffffa0731d25 [bridge] #9 [ffff880214d25c80] br_multicast_disable_port+88 at ffffffffa0732948 [bridge] #10 [ffff880214d25cb0] br_stp_disable_port+154 at ffffffffa072bcca [bridge] #11 [ffff880214d25ce8] br_device_event+520 at ffffffffa072a4e8 [bridge] #12 [ffff880214d25d18] notifier_call_chain+76 at ffffffff8164aafc #13 [ffff880214d25d50] raw_notifier_call_chain+22 at ffffffff810858f6 #14 [ffff880214d25d60] call_netdevice_notifiers+45 at ffffffff81536aad #15 [ffff880214d25d80] dev_close_many+183 at ffffffff81536d17 #16 [ffff880214d25dc0] rollback_registered_many+168 at ffffffff81537f68 #17 [ffff880214d25de8] rollback_registered+49 at ffffffff81538101 #18 [ffff880214d25e10] unregister_netdevice_queue+72 at ffffffff815390d8 #19 [ffff880214d25e30] __tun_detach+272 at ffffffffa074c2f0 [tun] #20 [ffff880214d25e88] tun_chr_close+45 at ffffffffa074c4bd [tun] #21 [ffff880214d25ea8] __fput+225 at ffffffff8119b1f1 #22 [ffff880214d25ef0] ____fput+14 at ffffffff8119b3fe #23 [ffff880214d25f00] task_work_run+159 at ffffffff8107cf7f #24 [ffff880214d25f30] do_notify_resume+97 at ffffffff810139e1 #25 [ffff880214d25f50] int_signal+18 at ffffffff8164f292 this is due to I forgot to check if mp->timer is armed in br_multicast_del_pg(). This bug is introduced by commit 9f00b2e (bridge: only expire the mdb entry when query is received). Same for __br_mdb_del(). Tested-by: poma <[email protected]> Reported-by: LiYonghua <[email protected]> Reported-by: Robert Hancock <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: "David S. Miller" <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
swarren
pushed a commit
to swarren/linux-tegra
that referenced
this pull request
Sep 11, 2013
When booting secondary CPUs, announce_cpu() is called to show which cpu has been brought up. For example: [ 0.402751] smpboot: Booting Node 0, Processors #1 #2 #3 #4 #5 OK [ 0.525667] smpboot: Booting Node 1, Processors torvalds#6 torvalds#7 torvalds#8 torvalds#9 torvalds#10 torvalds#11 OK [ 0.755592] smpboot: Booting Node 0, Processors torvalds#12 torvalds#13 torvalds#14 torvalds#15 torvalds#16 torvalds#17 OK [ 0.890495] smpboot: Booting Node 1, Processors torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 But the last "OK" is lost, because 'nr_cpu_ids-1' represents the maximum possible cpu id. It should use the maximum present cpu id in case not all CPUs booted up. Signed-off-by: Libin <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ tweaked the changelog, removed unnecessary line break, tweaked the format to align the fields vertically. ] Signed-off-by: Ingo Molnar <[email protected]>
swarren
pushed a commit
to swarren/linux-tegra
that referenced
this pull request
Oct 1, 2013
If memory allocation of in pcpu_embed_first_chunk() fails, the allocated memory is not released correctly. In the release loop also the non-allocated elements are released which leads to the following kernel BUG on systems with very little memory: [ 0.000000] kernel BUG at mm/bootmem.c:307! [ 0.000000] illegal operation: 0001 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.0 torvalds#22 [ 0.000000] task: 0000000000a20ae0 ti: 0000000000a08000 task.ti: 0000000000a08000 [ 0.000000] Krnl PSW : 0400000180000000 0000000000abda7a (__free+0x116/0x154) [ 0.000000] R:0 T:1 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:0 CC:0 PM:0 EA:3 ... [ 0.000000] [<0000000000abdce2>] mark_bootmem_node+0xde/0xf0 [ 0.000000] [<0000000000abdd9c>] mark_bootmem+0xa8/0x118 [ 0.000000] [<0000000000abcbba>] pcpu_embed_first_chunk+0xe7a/0xf0c [ 0.000000] [<0000000000abcc96>] setup_per_cpu_areas+0x4a/0x28c To fix the problem now only allocated elements are released. This then leads to the correct kernel panic: [ 0.000000] Kernel panic - not syncing: Failed to initialize percpu areas. ... [ 0.000000] Call Trace: [ 0.000000] ([<000000000011307e>] show_trace+0x132/0x150) [ 0.000000] [<0000000000113160>] show_stack+0xc4/0xd4 [ 0.000000] [<00000000007127dc>] dump_stack+0x74/0xd8 [ 0.000000] [<00000000007123fe>] panic+0xea/0x264 [ 0.000000] [<0000000000b14814>] setup_per_cpu_areas+0x5c/0x28c tj: Flipped if conditional so that it doesn't need "continue". Signed-off-by: Michael Holzheu <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
swarren
pushed a commit
to swarren/linux-tegra
that referenced
this pull request
Oct 14, 2013
As the new x86 CPU bootup printout format code maintainer, I am taking immediate action to improve and clean (and thus indulge my OCD) the reporting of the cores when coming up online. Fix padding to a right-hand alignment, cleanup code and bind reporting width to the max number of supported CPUs on the system, like this: [ 0.074509] smpboot: Booting Node 0, Processors: #1 #2 #3 #4 #5 torvalds#6 torvalds#7 OK [ 0.644008] smpboot: Booting Node 1, Processors: torvalds#8 torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 OK [ 1.245006] smpboot: Booting Node 2, Processors: torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 OK [ 1.864005] smpboot: Booting Node 3, Processors: torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 OK [ 2.489005] smpboot: Booting Node 4, Processors: torvalds#32 torvalds#33 torvalds#34 torvalds#35 torvalds#36 torvalds#37 torvalds#38 torvalds#39 OK [ 3.093005] smpboot: Booting Node 5, Processors: torvalds#40 torvalds#41 torvalds#42 torvalds#43 torvalds#44 torvalds#45 torvalds#46 torvalds#47 OK [ 3.698005] smpboot: Booting Node 6, Processors: torvalds#48 torvalds#49 torvalds#50 torvalds#51 #52 #53 torvalds#54 torvalds#55 OK [ 4.304005] smpboot: Booting Node 7, Processors: torvalds#56 torvalds#57 #58 torvalds#59 torvalds#60 torvalds#61 torvalds#62 torvalds#63 OK [ 4.961413] Brought up 64 CPUs and this: [ 0.072367] smpboot: Booting Node 0, Processors: #1 #2 #3 #4 #5 torvalds#6 torvalds#7 OK [ 0.686329] Brought up 8 CPUs Signed-off-by: Borislav Petkov <[email protected]> Cc: Libin <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
swarren
pushed a commit
to swarren/linux-tegra
that referenced
this pull request
Oct 14, 2013
Turn it into (for example): [ 0.073380] x86: Booting SMP configuration: [ 0.074005] .... node #0, CPUs: #1 #2 #3 #4 #5 torvalds#6 torvalds#7 [ 0.603005] .... node #1, CPUs: torvalds#8 torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 [ 1.200005] .... node #2, CPUs: torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 [ 1.796005] .... node #3, CPUs: torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 [ 2.393005] .... node #4, CPUs: torvalds#32 torvalds#33 torvalds#34 torvalds#35 torvalds#36 torvalds#37 torvalds#38 torvalds#39 [ 2.996005] .... node #5, CPUs: torvalds#40 torvalds#41 torvalds#42 torvalds#43 torvalds#44 torvalds#45 torvalds#46 torvalds#47 [ 3.600005] .... node torvalds#6, CPUs: torvalds#48 torvalds#49 torvalds#50 torvalds#51 #52 #53 torvalds#54 torvalds#55 [ 4.202005] .... node torvalds#7, CPUs: torvalds#56 torvalds#57 #58 torvalds#59 torvalds#60 torvalds#61 torvalds#62 torvalds#63 [ 4.811005] .... node torvalds#8, CPUs: torvalds#64 torvalds#65 torvalds#66 torvalds#67 torvalds#68 torvalds#69 #70 torvalds#71 [ 5.421006] .... node torvalds#9, CPUs: torvalds#72 torvalds#73 torvalds#74 torvalds#75 torvalds#76 torvalds#77 torvalds#78 torvalds#79 [ 6.032005] .... node torvalds#10, CPUs: torvalds#80 torvalds#81 torvalds#82 torvalds#83 torvalds#84 torvalds#85 torvalds#86 torvalds#87 [ 6.648006] .... node torvalds#11, CPUs: torvalds#88 torvalds#89 torvalds#90 torvalds#91 torvalds#92 torvalds#93 torvalds#94 torvalds#95 [ 7.262005] .... node torvalds#12, CPUs: torvalds#96 torvalds#97 torvalds#98 torvalds#99 torvalds#100 torvalds#101 torvalds#102 torvalds#103 [ 7.865005] .... node torvalds#13, CPUs: torvalds#104 torvalds#105 torvalds#106 torvalds#107 torvalds#108 torvalds#109 torvalds#110 torvalds#111 [ 8.466005] .... node torvalds#14, CPUs: torvalds#112 torvalds#113 torvalds#114 torvalds#115 torvalds#116 torvalds#117 torvalds#118 torvalds#119 [ 9.073006] .... node torvalds#15, CPUs: torvalds#120 torvalds#121 torvalds#122 torvalds#123 torvalds#124 torvalds#125 torvalds#126 torvalds#127 [ 9.679901] x86: Booted up 16 nodes, 128 CPUs and drop useless elements. Change num_digits() to hpa's division-avoiding, cell-phone-typed version which he went at great lengths and pains to submit on a Saturday evening. Signed-off-by: Borislav Petkov <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
torvalds
pushed a commit
that referenced
this pull request
Dec 2, 2013
…culation Currently mx53 (CortexA8) running at 1GHz reports: Calibrating delay loop... 663.55 BogoMIPS (lpj=3317760) Tom Evans verified that alignments of 0x0 and 0x8 run the two instructions of __loop_delay in one clock cycle (1 clock/loop), while alignments of 0x4 and 0xc take 3 clocks to run the loop twice. (1.5 clock/loop) The original object code looks like this: 00000010 <__loop_const_udelay>: 10: e3e01000 mvn r1, #0 14: e51f201c ldr r2, [pc, #-28] ; 0 <__loop_udelay-0x8> 18: e5922000 ldr r2, [r2] 1c: e0800921 add r0, r0, r1, lsr #18 20: e1a00720 lsr r0, r0, #14 24: e0822b21 add r2, r2, r1, lsr #22 28: e1a02522 lsr r2, r2, #10 2c: e0000092 mul r0, r2, r0 30: e0800d21 add r0, r0, r1, lsr #26 34: e1b00320 lsrs r0, r0, #6 38: 01a0f00e moveq pc, lr 0000003c <__loop_delay>: 3c: e2500001 subs r0, r0, #1 40: 8afffffe bhi 3c <__loop_delay> 44: e1a0f00e mov pc, lr After adding the 'align 3' directive to __loop_delay (align to 8 bytes): 00000010 <__loop_const_udelay>: 10: e3e01000 mvn r1, #0 14: e51f201c ldr r2, [pc, #-28] ; 0 <__loop_udelay-0x8> 18: e5922000 ldr r2, [r2] 1c: e0800921 add r0, r0, r1, lsr #18 20: e1a00720 lsr r0, r0, #14 24: e0822b21 add r2, r2, r1, lsr #22 28: e1a02522 lsr r2, r2, #10 2c: e0000092 mul r0, r2, r0 30: e0800d21 add r0, r0, r1, lsr #26 34: e1b00320 lsrs r0, r0, #6 38: 01a0f00e moveq pc, lr 3c: e320f000 nop {0} 00000040 <__loop_delay>: 40: e2500001 subs r0, r0, #1 44: 8afffffe bhi 40 <__loop_delay> 48: e1a0f00e mov pc, lr 4c: e320f000 nop {0} , which now reports: Calibrating delay loop... 996.14 BogoMIPS (lpj=4980736) Some more test results: On mx31 (ARM1136) running at 532 MHz, before the patch: Calibrating delay loop... 351.43 BogoMIPS (lpj=1757184) On mx31 (ARM1136) running at 532 MHz after the patch: Calibrating delay loop... 528.79 BogoMIPS (lpj=2643968) Also tested on mx6 (CortexA9) and on mx27 (ARM926), which shows the same BogoMIPS value before and after this patch. Reported-by: Tom Evans <[email protected]> Suggested-by: Tom Evans <[email protected]> Signed-off-by: Fabio Estevam <[email protected]> Signed-off-by: Russell King <[email protected]>
gregnietsky
pushed a commit
to Distrotech/linux
that referenced
this pull request
Apr 9, 2014
commit 2cab86b upstream. Sometime the ASCONF_ACK parameters can equal to the fourfold of ASCONF parameters, this only happend in some special case: ASCONF parameter is : Unrecognized Parameter (4 bytes) ASCONF_ACK parameter should be: Error Cause Indication parameter (8 bytes header) + Error Cause (4 bytes header) + Unrecognized Parameter (4bytes) Four 4bytes Unrecognized Parameters in ASCONF chunk will cause panic. Pid: 0, comm: swapper Not tainted 2.6.38-next+ torvalds#22 Bochs Bochs EIP: 0060:[<c0717eae>] EFLAGS: 00010246 CPU: 0 EIP is at skb_put+0x60/0x70 EAX: 00000077 EBX: c09060e2 ECX: dec1dc30 EDX: c09469c0 ESI: 00000000 EDI: de3c8d40 EBP: dec1dc58 ESP: dec1dc2c DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Process swapper (pid: 0, ti=dec1c000 task=c09aef20 task.ti=c0980000) Stack: c09469c0 e1894fa4 00000044 00000004 de3c8d00 de3c8d00 de3c8d44 de3c8d40 c09060e2 de25dd80 de3c8d40 dec1dc7c e1894fa4 dec1dcb0 00000040 00000004 00000000 00000800 00000004 00000004 dec1dce0 e1895a2b dec1dcb4 de25d960 Call Trace: [<e1894fa4>] ? sctp_addto_chunk+0x4e/0x89 [sctp] [<e1894fa4>] sctp_addto_chunk+0x4e/0x89 [sctp] [<e1895a2b>] sctp_process_asconf+0x32f/0x3d1 [sctp] [<e188d554>] sctp_sf_do_asconf+0xf8/0x173 [sctp] [<e1890b02>] sctp_do_sm+0xb8/0x159 [sctp] [<e18a2248>] ? sctp_cname+0x0/0x52 [sctp] [<e189392d>] sctp_assoc_bh_rcv+0xac/0xe3 [sctp] [<e1897d76>] sctp_inq_push+0x2d/0x30 [sctp] [<e18a21b2>] sctp_rcv+0x7a7/0x83d [sctp] [<c077a95c>] ? ipv4_confirm+0x118/0x125 [<c073a970>] ? nf_iterate+0x34/0x62 [<c074789d>] ? ip_local_deliver_finish+0x0/0x194 [<c074789d>] ? ip_local_deliver_finish+0x0/0x194 [<c0747992>] ip_local_deliver_finish+0xf5/0x194 [<c074789d>] ? ip_local_deliver_finish+0x0/0x194 [<c0747a6e>] NF_HOOK.clone.1+0x3d/0x44 [<c0747ab3>] ip_local_deliver+0x3e/0x44 [<c074789d>] ? ip_local_deliver_finish+0x0/0x194 [<c074775c>] ip_rcv_finish+0x29f/0x2c7 [<c07474bd>] ? ip_rcv_finish+0x0/0x2c7 [<c0747a6e>] NF_HOOK.clone.1+0x3d/0x44 [<c0747cae>] ip_rcv+0x1f5/0x233 [<c07474bd>] ? ip_rcv_finish+0x0/0x2c7 [<c071dce3>] __netif_receive_skb+0x310/0x336 [<c07221f3>] netif_receive_skb+0x4b/0x51 [<e0a4ed3d>] cp_rx_poll+0x1e7/0x29c [8139cp] [<c072275e>] net_rx_action+0x65/0x13a [<c0445a54>] __do_softirq+0xa1/0x149 [<c04459b3>] ? __do_softirq+0x0/0x149 <IRQ> [<c0445891>] ? irq_exit+0x37/0x72 [<c040a7e9>] ? do_IRQ+0x81/0x95 [<c07b3670>] ? common_interrupt+0x30/0x38 [<c0428058>] ? native_safe_halt+0xa/0xc [<c040f5d7>] ? default_idle+0x58/0x92 [<c0408fb0>] ? cpu_idle+0x96/0xb2 [<c0797989>] ? rest_init+0x5d/0x5f [<c09fd90c>] ? start_kernel+0x34b/0x350 [<c09fd0cb>] ? i386_start_kernel+0xba/0xc1 Signed-off-by: Wei Yongjun <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Paul Gortmaker <[email protected]>
ystk
pushed a commit
to ystk/linux-ltsi-work
that referenced
this pull request
Apr 29, 2014
Mike Galbraith captered the following: | >torvalds#11 [ffff88017b243e90] _raw_spin_lock at ffffffff815d2596 | >torvalds#12 [ffff88017b243e90] rt_mutex_trylock at ffffffff815d15be | >torvalds#13 [ffff88017b243eb0] get_next_timer_interrupt at ffffffff81063b42 | >torvalds#14 [ffff88017b243f00] tick_nohz_stop_sched_tick at ffffffff810bd1fd | >torvalds#15 [ffff88017b243f70] tick_nohz_irq_exit at ffffffff810bd7d2 | >torvalds#16 [ffff88017b243f90] irq_exit at ffffffff8105b02d | >torvalds#17 [ffff88017b243fb0] reschedule_interrupt at ffffffff815db3dd | >--- <IRQ stack> --- | >torvalds#18 [ffff88017a2a9bc8] reschedule_interrupt at ffffffff815db3dd | > [exception RIP: task_blocks_on_rt_mutex+51] | >torvalds#19 [ffff88017a2a9ce0] rt_spin_lock_slowlock at ffffffff815d183c | >torvalds#20 [ffff88017a2a9da0] lock_timer_base.isra.35 at ffffffff81061cbf | >torvalds#21 [ffff88017a2a9dd0] schedule_timeout at ffffffff815cf1ce | >torvalds#22 [ffff88017a2a9e50] rcu_gp_kthread at ffffffff810f9bbb | >torvalds#23 [ffff88017a2a9ed0] kthread at ffffffff810796d5 | >torvalds#24 [ffff88017a2a9f50] ret_from_fork at ffffffff815da04c lock_timer_base() does a try_lock() which deadlocks on the waiter lock not the lock itself. This patch takes the waiter_lock with trylock so it should work from interrupt context as well. If the fastpath doesn't work and the waiter_lock itself is taken then it seems that the lock itself taken. This patch also adds a "rt_spin_try_unlock" to keep lockdep happy. If we managed to take the wait_lock in the first place we should also be able to take it in the unlock path. Cc: [email protected] Reported-by: Mike Galbraith <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
ystk
pushed a commit
to ystk/linux-ltsi-work
that referenced
this pull request
May 23, 2014
Mike Galbraith captered the following: | >torvalds#11 [ffff88017b243e90] _raw_spin_lock at ffffffff815d2596 | >torvalds#12 [ffff88017b243e90] rt_mutex_trylock at ffffffff815d15be | >torvalds#13 [ffff88017b243eb0] get_next_timer_interrupt at ffffffff81063b42 | >torvalds#14 [ffff88017b243f00] tick_nohz_stop_sched_tick at ffffffff810bd1fd | >torvalds#15 [ffff88017b243f70] tick_nohz_irq_exit at ffffffff810bd7d2 | >torvalds#16 [ffff88017b243f90] irq_exit at ffffffff8105b02d | >torvalds#17 [ffff88017b243fb0] reschedule_interrupt at ffffffff815db3dd | >--- <IRQ stack> --- | >torvalds#18 [ffff88017a2a9bc8] reschedule_interrupt at ffffffff815db3dd | > [exception RIP: task_blocks_on_rt_mutex+51] | >torvalds#19 [ffff88017a2a9ce0] rt_spin_lock_slowlock at ffffffff815d183c | >torvalds#20 [ffff88017a2a9da0] lock_timer_base.isra.35 at ffffffff81061cbf | >torvalds#21 [ffff88017a2a9dd0] schedule_timeout at ffffffff815cf1ce | >torvalds#22 [ffff88017a2a9e50] rcu_gp_kthread at ffffffff810f9bbb | >torvalds#23 [ffff88017a2a9ed0] kthread at ffffffff810796d5 | >torvalds#24 [ffff88017a2a9f50] ret_from_fork at ffffffff815da04c lock_timer_base() does a try_lock() which deadlocks on the waiter lock not the lock itself. This patch takes the waiter_lock with trylock so it should work from interrupt context as well. If the fastpath doesn't work and the waiter_lock itself is taken then it seems that the lock itself taken. This patch also adds a "rt_spin_try_unlock" to keep lockdep happy. If we managed to take the wait_lock in the first place we should also be able to take it in the unlock path. Cc: [email protected] Reported-by: Mike Galbraith <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
damentz
referenced
this pull request
in zen-kernel/zen-kernel
May 29, 2014
Mike Galbraith captered the following: | >#11 [ffff88017b243e90] _raw_spin_lock at ffffffff815d2596 | >#12 [ffff88017b243e90] rt_mutex_trylock at ffffffff815d15be | >#13 [ffff88017b243eb0] get_next_timer_interrupt at ffffffff81063b42 | >#14 [ffff88017b243f00] tick_nohz_stop_sched_tick at ffffffff810bd1fd | >#15 [ffff88017b243f70] tick_nohz_irq_exit at ffffffff810bd7d2 | >#16 [ffff88017b243f90] irq_exit at ffffffff8105b02d | >#17 [ffff88017b243fb0] reschedule_interrupt at ffffffff815db3dd | >--- <IRQ stack> --- | >#18 [ffff88017a2a9bc8] reschedule_interrupt at ffffffff815db3dd | > [exception RIP: task_blocks_on_rt_mutex+51] | >#19 [ffff88017a2a9ce0] rt_spin_lock_slowlock at ffffffff815d183c | >#20 [ffff88017a2a9da0] lock_timer_base.isra.35 at ffffffff81061cbf | >#21 [ffff88017a2a9dd0] schedule_timeout at ffffffff815cf1ce | >#22 [ffff88017a2a9e50] rcu_gp_kthread at ffffffff810f9bbb | >#23 [ffff88017a2a9ed0] kthread at ffffffff810796d5 | >#24 [ffff88017a2a9f50] ret_from_fork at ffffffff815da04c lock_timer_base() does a try_lock() which deadlocks on the waiter lock not the lock itself. This patch takes the waiter_lock with trylock so it should work from interrupt context as well. If the fastpath doesn't work and the waiter_lock itself is taken then it seems that the lock itself taken. This patch also adds "rt_spin_unlock_after_trylock_in_irq" to keep lockdep happy. If we managed to take the wait_lock in the first place we should also be able to take it in the unlock path. Cc: [email protected] Reported-by: Mike Galbraith <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
May 23, 2024
…uddy pages When I did memory failure tests recently, below panic occurs: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:1009! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:__del_page_from_free_list+0x151/0x180 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0 Call Trace: <TASK> __rmqueue_pcplist+0x23b/0x520 get_page_from_freelist+0x26b/0xe40 __alloc_pages_noprof+0x113/0x1120 __folio_alloc_noprof+0x11/0xb0 alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130 __alloc_fresh_hugetlb_folio+0xe7/0x140 alloc_pool_huge_folio+0x68/0x100 set_max_huge_pages+0x13d/0x340 hugetlb_sysctl_handler_common+0xe8/0x110 proc_sys_call_handler+0x194/0x280 vfs_write+0x387/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff916114887 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]--- And before the panic, there had an warning about bad page state: BUG: Bad page state in process page-types pfn:8cee00 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) page_type: 0xffffff7f(buddy) raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty torvalds#22 Call Trace: <TASK> dump_stack_lvl+0x83/0xa0 bad_page+0x63/0xf0 free_unref_page+0x36e/0x5c0 unpoison_memory+0x50b/0x630 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xcd/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f189a514887 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040 </TASK> The root cause should be the below race: memory_failure try_memory_failure_hugetlb me_huge_page __page_handle_poison dissolve_free_hugetlb_folio drain_all_pages -- Buddy page can be isolated e.g. for compaction. take_page_off_buddy -- Failed as page is not in the buddy list. -- Page can be putback into buddy after compaction. page_ref_inc -- Leads to buddy page with refcnt = 1. Then unpoison_memory() can unpoison the page and send the buddy page back into buddy list again leading to the above bad page state warning. And bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to allocate this page. Fix this issue by only treating __page_handle_poison() as successful when it returns 1. Signed-off-by: Miaohe Lin <[email protected]> Fixes: ceaf8fb ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage") Cc: <[email protected]>
ioworker0
pushed a commit
to ioworker0/linux
that referenced
this pull request
May 24, 2024
…uddy pages When I did memory failure tests recently, below panic occurs: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:1009! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:__del_page_from_free_list+0x151/0x180 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0 Call Trace: <TASK> __rmqueue_pcplist+0x23b/0x520 get_page_from_freelist+0x26b/0xe40 __alloc_pages_noprof+0x113/0x1120 __folio_alloc_noprof+0x11/0xb0 alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130 __alloc_fresh_hugetlb_folio+0xe7/0x140 alloc_pool_huge_folio+0x68/0x100 set_max_huge_pages+0x13d/0x340 hugetlb_sysctl_handler_common+0xe8/0x110 proc_sys_call_handler+0x194/0x280 vfs_write+0x387/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff916114887 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]--- And before the panic, there had an warning about bad page state: BUG: Bad page state in process page-types pfn:8cee00 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) page_type: 0xffffff7f(buddy) raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty torvalds#22 Call Trace: <TASK> dump_stack_lvl+0x83/0xa0 bad_page+0x63/0xf0 free_unref_page+0x36e/0x5c0 unpoison_memory+0x50b/0x630 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xcd/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f189a514887 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040 </TASK> The root cause should be the below race: memory_failure try_memory_failure_hugetlb me_huge_page __page_handle_poison dissolve_free_hugetlb_folio drain_all_pages -- Buddy page can be isolated e.g. for compaction. take_page_off_buddy -- Failed as page is not in the buddy list. -- Page can be putback into buddy after compaction. page_ref_inc -- Leads to buddy page with refcnt = 1. Then unpoison_memory() can unpoison the page and send the buddy page back into buddy list again leading to the above bad page state warning. And bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to allocate this page. Fix this issue by only treating __page_handle_poison() as successful when it returns 1. Link: https://lkml.kernel.org/r/[email protected] Fixes: ceaf8fb ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage") Signed-off-by: Miaohe Lin <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
ioworker0
pushed a commit
to ioworker0/linux
that referenced
this pull request
May 24, 2024
…uddy pages When I did memory failure tests recently, below panic occurs: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:1009! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:__del_page_from_free_list+0x151/0x180 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0 Call Trace: <TASK> __rmqueue_pcplist+0x23b/0x520 get_page_from_freelist+0x26b/0xe40 __alloc_pages_noprof+0x113/0x1120 __folio_alloc_noprof+0x11/0xb0 alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130 __alloc_fresh_hugetlb_folio+0xe7/0x140 alloc_pool_huge_folio+0x68/0x100 set_max_huge_pages+0x13d/0x340 hugetlb_sysctl_handler_common+0xe8/0x110 proc_sys_call_handler+0x194/0x280 vfs_write+0x387/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff916114887 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]--- And before the panic, there had an warning about bad page state: BUG: Bad page state in process page-types pfn:8cee00 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) page_type: 0xffffff7f(buddy) raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty torvalds#22 Call Trace: <TASK> dump_stack_lvl+0x83/0xa0 bad_page+0x63/0xf0 free_unref_page+0x36e/0x5c0 unpoison_memory+0x50b/0x630 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xcd/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f189a514887 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040 </TASK> The root cause should be the below race: memory_failure try_memory_failure_hugetlb me_huge_page __page_handle_poison dissolve_free_hugetlb_folio drain_all_pages -- Buddy page can be isolated e.g. for compaction. take_page_off_buddy -- Failed as page is not in the buddy list. -- Page can be putback into buddy after compaction. page_ref_inc -- Leads to buddy page with refcnt = 1. Then unpoison_memory() can unpoison the page and send the buddy page back into buddy list again leading to the above bad page state warning. And bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to allocate this page. Fix this issue by only treating __page_handle_poison() as successful when it returns 1. Link: https://lkml.kernel.org/r/[email protected] Fixes: ceaf8fb ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage") Signed-off-by: Miaohe Lin <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
ioworker0
pushed a commit
to ioworker0/linux
that referenced
this pull request
May 24, 2024
…uddy pages When I did memory failure tests recently, below panic occurs: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:1009! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:__del_page_from_free_list+0x151/0x180 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0 Call Trace: <TASK> __rmqueue_pcplist+0x23b/0x520 get_page_from_freelist+0x26b/0xe40 __alloc_pages_noprof+0x113/0x1120 __folio_alloc_noprof+0x11/0xb0 alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130 __alloc_fresh_hugetlb_folio+0xe7/0x140 alloc_pool_huge_folio+0x68/0x100 set_max_huge_pages+0x13d/0x340 hugetlb_sysctl_handler_common+0xe8/0x110 proc_sys_call_handler+0x194/0x280 vfs_write+0x387/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff916114887 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]--- And before the panic, there had an warning about bad page state: BUG: Bad page state in process page-types pfn:8cee00 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) page_type: 0xffffff7f(buddy) raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty torvalds#22 Call Trace: <TASK> dump_stack_lvl+0x83/0xa0 bad_page+0x63/0xf0 free_unref_page+0x36e/0x5c0 unpoison_memory+0x50b/0x630 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xcd/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f189a514887 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040 </TASK> The root cause should be the below race: memory_failure try_memory_failure_hugetlb me_huge_page __page_handle_poison dissolve_free_hugetlb_folio drain_all_pages -- Buddy page can be isolated e.g. for compaction. take_page_off_buddy -- Failed as page is not in the buddy list. -- Page can be putback into buddy after compaction. page_ref_inc -- Leads to buddy page with refcnt = 1. Then unpoison_memory() can unpoison the page and send the buddy page back into buddy list again leading to the above bad page state warning. And bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to allocate this page. Fix this issue by only treating __page_handle_poison() as successful when it returns 1. Link: https://lkml.kernel.org/r/[email protected] Fixes: ceaf8fb ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage") Signed-off-by: Miaohe Lin <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
Kaz205
pushed a commit
to Kaz205/linux
that referenced
this pull request
Jun 3, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561 torvalds#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Jun 3, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561 torvalds#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Jun 3, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561 torvalds#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Kaz205
pushed a commit
to Kaz205/linux
that referenced
this pull request
Jun 5, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561 torvalds#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Jun 6, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561 torvalds#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Jun 6, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561 torvalds#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Jun 9, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561 torvalds#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Jun 9, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561 torvalds#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
hdeller
pushed a commit
to hdeller/linux
that referenced
this pull request
Jun 12, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561 torvalds#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
staging-kernelci-org
pushed a commit
to kernelci/linux
that referenced
this pull request
Jun 12, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561 torvalds#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
staging-kernelci-org
pushed a commit
to kernelci/linux
that referenced
this pull request
Jun 12, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 #5 0x55c94045c776 in ui_browser__show ui/browser.c:288 torvalds#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 torvalds#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 torvalds#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 torvalds#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 torvalds#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 torvalds#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 torvalds#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 torvalds#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 torvalds#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 torvalds#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 torvalds#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 torvalds#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 torvalds#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 torvalds#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 torvalds#20 0x55c9400e53ad in main tools/perf/perf.c:561 torvalds#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 torvalds#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 torvalds#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
intersectRaven
pushed a commit
to intersectRaven/linux
that referenced
this pull request
Jun 16, 2024
…uddy pages commit 8cf360b upstream. When I did memory failure tests recently, below panic occurs: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:1009! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:__del_page_from_free_list+0x151/0x180 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0 Call Trace: <TASK> __rmqueue_pcplist+0x23b/0x520 get_page_from_freelist+0x26b/0xe40 __alloc_pages_noprof+0x113/0x1120 __folio_alloc_noprof+0x11/0xb0 alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130 __alloc_fresh_hugetlb_folio+0xe7/0x140 alloc_pool_huge_folio+0x68/0x100 set_max_huge_pages+0x13d/0x340 hugetlb_sysctl_handler_common+0xe8/0x110 proc_sys_call_handler+0x194/0x280 vfs_write+0x387/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff916114887 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]--- And before the panic, there had an warning about bad page state: BUG: Bad page state in process page-types pfn:8cee00 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) page_type: 0xffffff7f(buddy) raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty torvalds#22 Call Trace: <TASK> dump_stack_lvl+0x83/0xa0 bad_page+0x63/0xf0 free_unref_page+0x36e/0x5c0 unpoison_memory+0x50b/0x630 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xcd/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f189a514887 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040 </TASK> The root cause should be the below race: memory_failure try_memory_failure_hugetlb me_huge_page __page_handle_poison dissolve_free_hugetlb_folio drain_all_pages -- Buddy page can be isolated e.g. for compaction. take_page_off_buddy -- Failed as page is not in the buddy list. -- Page can be putback into buddy after compaction. page_ref_inc -- Leads to buddy page with refcnt = 1. Then unpoison_memory() can unpoison the page and send the buddy page back into buddy list again leading to the above bad page state warning. And bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to allocate this page. Fix this issue by only treating __page_handle_poison() as successful when it returns 1. Link: https://lkml.kernel.org/r/[email protected] Fixes: ceaf8fb ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage") Signed-off-by: Miaohe Lin <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Jun 19, 2024
…uddy pages [ Upstream commit 8cf360b ] When I did memory failure tests recently, below panic occurs: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:1009! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:__del_page_from_free_list+0x151/0x180 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0 Call Trace: <TASK> __rmqueue_pcplist+0x23b/0x520 get_page_from_freelist+0x26b/0xe40 __alloc_pages_noprof+0x113/0x1120 __folio_alloc_noprof+0x11/0xb0 alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130 __alloc_fresh_hugetlb_folio+0xe7/0x140 alloc_pool_huge_folio+0x68/0x100 set_max_huge_pages+0x13d/0x340 hugetlb_sysctl_handler_common+0xe8/0x110 proc_sys_call_handler+0x194/0x280 vfs_write+0x387/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff916114887 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]--- And before the panic, there had an warning about bad page state: BUG: Bad page state in process page-types pfn:8cee00 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) page_type: 0xffffff7f(buddy) raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty torvalds#22 Call Trace: <TASK> dump_stack_lvl+0x83/0xa0 bad_page+0x63/0xf0 free_unref_page+0x36e/0x5c0 unpoison_memory+0x50b/0x630 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xcd/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f189a514887 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040 </TASK> The root cause should be the below race: memory_failure try_memory_failure_hugetlb me_huge_page __page_handle_poison dissolve_free_hugetlb_folio drain_all_pages -- Buddy page can be isolated e.g. for compaction. take_page_off_buddy -- Failed as page is not in the buddy list. -- Page can be putback into buddy after compaction. page_ref_inc -- Leads to buddy page with refcnt = 1. Then unpoison_memory() can unpoison the page and send the buddy page back into buddy list again leading to the above bad page state warning. And bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to allocate this page. Fix this issue by only treating __page_handle_poison() as successful when it returns 1. Link: https://lkml.kernel.org/r/[email protected] Fixes: ceaf8fb ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage") Signed-off-by: Miaohe Lin <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
mj22226
pushed a commit
to mj22226/linux
that referenced
this pull request
Jun 19, 2024
…uddy pages commit 8cf360b upstream. When I did memory failure tests recently, below panic occurs: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:1009! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:__del_page_from_free_list+0x151/0x180 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0 Call Trace: <TASK> __rmqueue_pcplist+0x23b/0x520 get_page_from_freelist+0x26b/0xe40 __alloc_pages_noprof+0x113/0x1120 __folio_alloc_noprof+0x11/0xb0 alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130 __alloc_fresh_hugetlb_folio+0xe7/0x140 alloc_pool_huge_folio+0x68/0x100 set_max_huge_pages+0x13d/0x340 hugetlb_sysctl_handler_common+0xe8/0x110 proc_sys_call_handler+0x194/0x280 vfs_write+0x387/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff916114887 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]--- And before the panic, there had an warning about bad page state: BUG: Bad page state in process page-types pfn:8cee00 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) page_type: 0xffffff7f(buddy) raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty torvalds#22 Call Trace: <TASK> dump_stack_lvl+0x83/0xa0 bad_page+0x63/0xf0 free_unref_page+0x36e/0x5c0 unpoison_memory+0x50b/0x630 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xcd/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f189a514887 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040 </TASK> The root cause should be the below race: memory_failure try_memory_failure_hugetlb me_huge_page __page_handle_poison dissolve_free_hugetlb_folio drain_all_pages -- Buddy page can be isolated e.g. for compaction. take_page_off_buddy -- Failed as page is not in the buddy list. -- Page can be putback into buddy after compaction. page_ref_inc -- Leads to buddy page with refcnt = 1. Then unpoison_memory() can unpoison the page and send the buddy page back into buddy list again leading to the above bad page state warning. And bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to allocate this page. Fix this issue by only treating __page_handle_poison() as successful when it returns 1. Link: https://lkml.kernel.org/r/[email protected] Fixes: ceaf8fb ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage") Signed-off-by: Miaohe Lin <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
staging-kernelci-org
pushed a commit
to kernelci/linux
that referenced
this pull request
Jun 21, 2024
…uddy pages commit 8cf360b upstream. When I did memory failure tests recently, below panic occurs: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:1009! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:__del_page_from_free_list+0x151/0x180 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0 Call Trace: <TASK> __rmqueue_pcplist+0x23b/0x520 get_page_from_freelist+0x26b/0xe40 __alloc_pages_noprof+0x113/0x1120 __folio_alloc_noprof+0x11/0xb0 alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130 __alloc_fresh_hugetlb_folio+0xe7/0x140 alloc_pool_huge_folio+0x68/0x100 set_max_huge_pages+0x13d/0x340 hugetlb_sysctl_handler_common+0xe8/0x110 proc_sys_call_handler+0x194/0x280 vfs_write+0x387/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff916114887 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]--- And before the panic, there had an warning about bad page state: BUG: Bad page state in process page-types pfn:8cee00 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) page_type: 0xffffff7f(buddy) raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty torvalds#22 Call Trace: <TASK> dump_stack_lvl+0x83/0xa0 bad_page+0x63/0xf0 free_unref_page+0x36e/0x5c0 unpoison_memory+0x50b/0x630 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xcd/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f189a514887 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040 </TASK> The root cause should be the below race: memory_failure try_memory_failure_hugetlb me_huge_page __page_handle_poison dissolve_free_hugetlb_folio drain_all_pages -- Buddy page can be isolated e.g. for compaction. take_page_off_buddy -- Failed as page is not in the buddy list. -- Page can be putback into buddy after compaction. page_ref_inc -- Leads to buddy page with refcnt = 1. Then unpoison_memory() can unpoison the page and send the buddy page back into buddy list again leading to the above bad page state warning. And bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to allocate this page. Fix this issue by only treating __page_handle_poison() as successful when it returns 1. Link: https://lkml.kernel.org/r/[email protected] Fixes: ceaf8fb ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage") Signed-off-by: Miaohe Lin <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
staging-kernelci-org
pushed a commit
to kernelci/linux
that referenced
this pull request
Jun 21, 2024
…uddy pages [ Upstream commit 8cf360b ] When I did memory failure tests recently, below panic occurs: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:1009! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:__del_page_from_free_list+0x151/0x180 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0 Call Trace: <TASK> __rmqueue_pcplist+0x23b/0x520 get_page_from_freelist+0x26b/0xe40 __alloc_pages_noprof+0x113/0x1120 __folio_alloc_noprof+0x11/0xb0 alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130 __alloc_fresh_hugetlb_folio+0xe7/0x140 alloc_pool_huge_folio+0x68/0x100 set_max_huge_pages+0x13d/0x340 hugetlb_sysctl_handler_common+0xe8/0x110 proc_sys_call_handler+0x194/0x280 vfs_write+0x387/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff916114887 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004 R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00 </TASK> Modules linked in: mce_inject hwpoison_inject ---[ end trace 0000000000000000 ]--- And before the panic, there had an warning about bad page state: BUG: Bad page state in process page-types pfn:8cee00 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff) page_type: 0xffffff7f(buddy) raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty torvalds#22 Call Trace: <TASK> dump_stack_lvl+0x83/0xa0 bad_page+0x63/0xf0 free_unref_page+0x36e/0x5c0 unpoison_memory+0x50b/0x630 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xcd/0x550 ksys_write+0x64/0xe0 do_syscall_64+0xc2/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f189a514887 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040 </TASK> The root cause should be the below race: memory_failure try_memory_failure_hugetlb me_huge_page __page_handle_poison dissolve_free_hugetlb_folio drain_all_pages -- Buddy page can be isolated e.g. for compaction. take_page_off_buddy -- Failed as page is not in the buddy list. -- Page can be putback into buddy after compaction. page_ref_inc -- Leads to buddy page with refcnt = 1. Then unpoison_memory() can unpoison the page and send the buddy page back into buddy list again leading to the above bad page state warning. And bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to allocate this page. Fix this issue by only treating __page_handle_poison() as successful when it returns 1. Link: https://lkml.kernel.org/r/[email protected] Fixes: ceaf8fb ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage") Signed-off-by: Miaohe Lin <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
shunghsiyu
pushed a commit
to shunghsiyu/linux
that referenced
this pull request
Jun 27, 2024
commit fde2a38 upstream. Add support precision backtracking in the presence of subprogram frames in jump history. This means supporting a few different kinds of subprogram invocation situations, all requiring a slightly different handling in precision backtracking handling logic: - static subprogram calls; - global subprogram calls; - callback-calling helpers/kfuncs. For each of those we need to handle a few precision propagation cases: - what to do with precision of subprog returns (r0); - what to do with precision of input arguments; - for all of them callee-saved registers in caller function should be propagated ignoring subprog/callback part of jump history. N.B. Async callback-calling helpers (currently only bpf_timer_set_callback()) are transparent to all this because they set a separate async callback environment and thus callback's history is not shared with main program's history. So as far as all the changes in this commit goes, such helper is just a regular helper. Let's look at all these situation in more details. Let's start with static subprogram being called, using an exxerpt of a simple main program and its static subprog, indenting subprog's frame slightly to make everything clear. frame 0 frame 1 precision set ======= ======= ============= 9: r6 = 456; 10: r1 = 123; fr0: r6 11: call pc+10; fr0: r1, r6 22: r0 = r1; fr0: r6; fr1: r1 23: exit fr0: r6; fr1: r0 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6 15: exit As can be seen above main function is passing 123 as single argument to an identity (`return x;`) subprog. Returned value is used to adjust map pointer offset, which forces r0 to be marked as precise. Then instruction torvalds#14 does the same for callee-saved r6, which will have to be backtracked all the way to instruction torvalds#9. For brevity, precision sets for instruction torvalds#13 and torvalds#14 are combined in the diagram above. First, for subprog calls, r0 returned from subprog (in frame 0) has to go into subprog's frame 1, and should be cleared from frame 0. So we go back into subprog's frame knowing we need to mark r0 precise. We then see that insn torvalds#22 sets r0 from r1, so now we care about marking r1 precise. When we pop up from subprog's frame back into caller at insn torvalds#11 we keep r1, as it's an argument-passing register, so we eventually find `10: r1 = 123;` and satify precision propagation chain for insn torvalds#13. This example demonstrates two sets of rules: - r0 returned after subprog call has to be moved into subprog's r0 set; - *static* subprog arguments (r1-r5) are moved back to caller precision set. Let's look at what happens with callee-saved precision propagation. Insn torvalds#14 mark r6 as precise. When we get into subprog's frame, we keep r6 in frame 0's precision set *only*. Subprog itself has its own set of independent r6-r10 registers and is not affected. When we eventually made our way out of subprog frame we keep r6 in precision set until we reach `9: r6 = 456;`, satisfying propagation. r6-r10 propagation is perhaps the simplest aspect, it always stays in its original frame. That's pretty much all we have to do to support precision propagation across *static subprog* invocation. Let's look at what happens when we have global subprog invocation. frame 0 frame 1 precision set ======= ======= ============= 9: r6 = 456; 10: r1 = 123; fr0: r6 11: call pc+10; # global subprog fr0: r6 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6; 15: exit Starting from insn torvalds#13, r0 has to be precise. We backtrack all the way to insn torvalds#11 (call pc+10) and see that subprog is global, so was already validated in isolation. As opposed to static subprog, global subprog always returns unknown scalar r0, so that satisfies precision propagation and we drop r0 from precision set. We are done for insns torvalds#13. Now for insn torvalds#14. r6 is in precision set, we backtrack to `call pc+10;`. Here we need to recognize that this is effectively both exit and entry to global subprog, which means we stay in caller's frame. So we carry on with r6 still in precision set, until we satisfy it at insn torvalds#9. The only hard part with global subprogs is just knowing when it's a global func. Lastly, callback-calling helpers and kfuncs do simulate subprog calls, so jump history will have subprog instructions in between caller program's instructions, but the rules of propagating r0 and r1-r5 differ, because we don't actually directly call callback. We actually call helper/kfunc, which at runtime will call subprog, so the only difference between normal helper/kfunc handling is that we need to make sure to skip callback simulatinog part of jump history. Let's look at an example to make this clearer. frame 0 frame 1 precision set ======= ======= ============= 8: r6 = 456; 9: r1 = 123; fr0: r6 10: r2 = &callback; fr0: r6 11: call bpf_loop; fr0: r6 22: r0 = r1; fr0: r6 fr1: 23: exit fr0: r6 fr1: 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6; 15: exit Again, insn torvalds#13 forces r0 to be precise. As soon as we get to `23: exit` we see that this isn't actually a static subprog call (it's `call bpf_loop;` helper call instead). So we clear r0 from precision set. For callee-saved register, there is no difference: it stays in frame 0's precision set, we go through insn torvalds#22 and torvalds#23, ignoring them until we get back to caller frame 0, eventually satisfying precision backtrack logic at insn torvalds#8 (`r6 = 456;`). Assuming callback needed to set r0 as precise at insn torvalds#23, we'd backtrack to insn torvalds#22, switching from r0 to r1, and then at the point when we pop back to frame 0 at insn torvalds#11, we'll clear r1-r5 from precision set, as we don't really do a subprog call directly, so there is no input argument precision propagation. That's pretty much it. With these changes, it seems like the only still unsupported situation for precision backpropagation is the case when program is accessing stack through registers other than r10. This is still left as unsupported (though rare) case for now. As for results. For selftests, few positive changes for bigger programs, cls_redirect in dynptr variant benefitting the most: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results.csv ~/subprog-precise-after-results.csv -f @veristat.cfg -e file,prog,insns -f 'insns_diff!=0' File Program Insns (A) Insns (B) Insns (DIFF) ---------------------------------------- ------------- --------- --------- ---------------- pyperf600_bpf_loop.bpf.linked1.o on_event 2060 2002 -58 (-2.82%) test_cls_redirect_dynptr.bpf.linked1.o cls_redirect 15660 2914 -12746 (-81.39%) test_cls_redirect_subprogs.bpf.linked1.o cls_redirect 61620 59088 -2532 (-4.11%) xdp_synproxy_kern.bpf.linked1.o syncookie_tc 109980 86278 -23702 (-21.55%) xdp_synproxy_kern.bpf.linked1.o syncookie_xdp 97716 85147 -12569 (-12.86%) Cilium progress don't really regress. They don't use subprogs and are mostly unaffected, but some other fixes and improvements could have changed something. This doesn't appear to be the case: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-cilium.csv ~/subprog-precise-after-results-cilium.csv -e file,prog,insns -f 'insns_diff!=0' File Program Insns (A) Insns (B) Insns (DIFF) ------------- ------------------------------ --------- --------- ------------ bpf_host.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_lxc.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_overlay.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_xdp.o tail_handle_nat_fwd_ipv6 12475 12504 +29 (+0.23%) bpf_xdp.o tail_nodeport_nat_ingress_ipv6 6363 6371 +8 (+0.13%) Looking at (somewhat anonymized) Meta production programs, we see mostly insignificant variation in number of instructions, with one program (syar_bind6_protect6) benefitting the most at -17%. [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-fbcode.csv ~/subprog-precise-after-results-fbcode.csv -e prog,insns -f 'insns_diff!=0' Program Insns (A) Insns (B) Insns (DIFF) ------------------------ --------- --------- ---------------- on_request_context_event 597 585 -12 (-2.01%) read_async_py_stack 43789 43657 -132 (-0.30%) read_sync_py_stack 35041 37599 +2558 (+7.30%) rrm_usdt 946 940 -6 (-0.63%) sysarmor_inet6_bind 28863 28249 -614 (-2.13%) sysarmor_inet_bind 28845 28240 -605 (-2.10%) syar_bind4_protect4 154145 147640 -6505 (-4.22%) syar_bind6_protect6 165242 137088 -28154 (-17.04%) syar_task_exit_setgid 21289 19720 -1569 (-7.37%) syar_task_exit_setuid 21290 19721 -1569 (-7.37%) do_uprobe 19967 19413 -554 (-2.77%) tw_twfw_ingress 215877 204833 -11044 (-5.12%) tw_twfw_tc_in 215877 204833 -11044 (-5.12%) But checking duration (wall clock) differences, that is the actual time taken by verifier to validate programs, we see a sometimes dramatic improvements, all the way to about 16x improvements: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-meta.csv ~/subprog-precise-after-results-meta.csv -e prog,duration -s duration_diff^ | head -n20 Program Duration (us) (A) Duration (us) (B) Duration (us) (DIFF) ---------------------------------------- ----------------- ----------------- -------------------- tw_twfw_ingress 4488374 272836 -4215538 (-93.92%) tw_twfw_tc_in 4339111 268175 -4070936 (-93.82%) tw_twfw_egress 3521816 270751 -3251065 (-92.31%) tw_twfw_tc_eg 3472878 284294 -3188584 (-91.81%) balancer_ingress 343119 291391 -51728 (-15.08%) syar_bind6_protect6 78992 64782 -14210 (-17.99%) ttls_tc_ingress 11739 8176 -3563 (-30.35%) kprobe__security_inode_link 13864 11341 -2523 (-18.20%) read_sync_py_stack 21927 19442 -2485 (-11.33%) read_async_py_stack 30444 28136 -2308 (-7.58%) syar_task_exit_setuid 10256 8440 -1816 (-17.71%) Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Shung-Hsi Yu <[email protected]>
shunghsiyu
pushed a commit
to shunghsiyu/linux
that referenced
this pull request
Jun 28, 2024
commit fde2a38 upstream. Add support precision backtracking in the presence of subprogram frames in jump history. This means supporting a few different kinds of subprogram invocation situations, all requiring a slightly different handling in precision backtracking handling logic: - static subprogram calls; - global subprogram calls; - callback-calling helpers/kfuncs. For each of those we need to handle a few precision propagation cases: - what to do with precision of subprog returns (r0); - what to do with precision of input arguments; - for all of them callee-saved registers in caller function should be propagated ignoring subprog/callback part of jump history. N.B. Async callback-calling helpers (currently only bpf_timer_set_callback()) are transparent to all this because they set a separate async callback environment and thus callback's history is not shared with main program's history. So as far as all the changes in this commit goes, such helper is just a regular helper. Let's look at all these situation in more details. Let's start with static subprogram being called, using an exxerpt of a simple main program and its static subprog, indenting subprog's frame slightly to make everything clear. frame 0 frame 1 precision set ======= ======= ============= 9: r6 = 456; 10: r1 = 123; fr0: r6 11: call pc+10; fr0: r1, r6 22: r0 = r1; fr0: r6; fr1: r1 23: exit fr0: r6; fr1: r0 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6 15: exit As can be seen above main function is passing 123 as single argument to an identity (`return x;`) subprog. Returned value is used to adjust map pointer offset, which forces r0 to be marked as precise. Then instruction torvalds#14 does the same for callee-saved r6, which will have to be backtracked all the way to instruction torvalds#9. For brevity, precision sets for instruction torvalds#13 and torvalds#14 are combined in the diagram above. First, for subprog calls, r0 returned from subprog (in frame 0) has to go into subprog's frame 1, and should be cleared from frame 0. So we go back into subprog's frame knowing we need to mark r0 precise. We then see that insn torvalds#22 sets r0 from r1, so now we care about marking r1 precise. When we pop up from subprog's frame back into caller at insn torvalds#11 we keep r1, as it's an argument-passing register, so we eventually find `10: r1 = 123;` and satify precision propagation chain for insn torvalds#13. This example demonstrates two sets of rules: - r0 returned after subprog call has to be moved into subprog's r0 set; - *static* subprog arguments (r1-r5) are moved back to caller precision set. Let's look at what happens with callee-saved precision propagation. Insn torvalds#14 mark r6 as precise. When we get into subprog's frame, we keep r6 in frame 0's precision set *only*. Subprog itself has its own set of independent r6-r10 registers and is not affected. When we eventually made our way out of subprog frame we keep r6 in precision set until we reach `9: r6 = 456;`, satisfying propagation. r6-r10 propagation is perhaps the simplest aspect, it always stays in its original frame. That's pretty much all we have to do to support precision propagation across *static subprog* invocation. Let's look at what happens when we have global subprog invocation. frame 0 frame 1 precision set ======= ======= ============= 9: r6 = 456; 10: r1 = 123; fr0: r6 11: call pc+10; # global subprog fr0: r6 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6; 15: exit Starting from insn torvalds#13, r0 has to be precise. We backtrack all the way to insn torvalds#11 (call pc+10) and see that subprog is global, so was already validated in isolation. As opposed to static subprog, global subprog always returns unknown scalar r0, so that satisfies precision propagation and we drop r0 from precision set. We are done for insns torvalds#13. Now for insn torvalds#14. r6 is in precision set, we backtrack to `call pc+10;`. Here we need to recognize that this is effectively both exit and entry to global subprog, which means we stay in caller's frame. So we carry on with r6 still in precision set, until we satisfy it at insn torvalds#9. The only hard part with global subprogs is just knowing when it's a global func. Lastly, callback-calling helpers and kfuncs do simulate subprog calls, so jump history will have subprog instructions in between caller program's instructions, but the rules of propagating r0 and r1-r5 differ, because we don't actually directly call callback. We actually call helper/kfunc, which at runtime will call subprog, so the only difference between normal helper/kfunc handling is that we need to make sure to skip callback simulatinog part of jump history. Let's look at an example to make this clearer. frame 0 frame 1 precision set ======= ======= ============= 8: r6 = 456; 9: r1 = 123; fr0: r6 10: r2 = &callback; fr0: r6 11: call bpf_loop; fr0: r6 22: r0 = r1; fr0: r6 fr1: 23: exit fr0: r6 fr1: 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6; 15: exit Again, insn torvalds#13 forces r0 to be precise. As soon as we get to `23: exit` we see that this isn't actually a static subprog call (it's `call bpf_loop;` helper call instead). So we clear r0 from precision set. For callee-saved register, there is no difference: it stays in frame 0's precision set, we go through insn torvalds#22 and torvalds#23, ignoring them until we get back to caller frame 0, eventually satisfying precision backtrack logic at insn torvalds#8 (`r6 = 456;`). Assuming callback needed to set r0 as precise at insn torvalds#23, we'd backtrack to insn torvalds#22, switching from r0 to r1, and then at the point when we pop back to frame 0 at insn torvalds#11, we'll clear r1-r5 from precision set, as we don't really do a subprog call directly, so there is no input argument precision propagation. That's pretty much it. With these changes, it seems like the only still unsupported situation for precision backpropagation is the case when program is accessing stack through registers other than r10. This is still left as unsupported (though rare) case for now. As for results. For selftests, few positive changes for bigger programs, cls_redirect in dynptr variant benefitting the most: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results.csv ~/subprog-precise-after-results.csv -f @veristat.cfg -e file,prog,insns -f 'insns_diff!=0' File Program Insns (A) Insns (B) Insns (DIFF) ---------------------------------------- ------------- --------- --------- ---------------- pyperf600_bpf_loop.bpf.linked1.o on_event 2060 2002 -58 (-2.82%) test_cls_redirect_dynptr.bpf.linked1.o cls_redirect 15660 2914 -12746 (-81.39%) test_cls_redirect_subprogs.bpf.linked1.o cls_redirect 61620 59088 -2532 (-4.11%) xdp_synproxy_kern.bpf.linked1.o syncookie_tc 109980 86278 -23702 (-21.55%) xdp_synproxy_kern.bpf.linked1.o syncookie_xdp 97716 85147 -12569 (-12.86%) Cilium progress don't really regress. They don't use subprogs and are mostly unaffected, but some other fixes and improvements could have changed something. This doesn't appear to be the case: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-cilium.csv ~/subprog-precise-after-results-cilium.csv -e file,prog,insns -f 'insns_diff!=0' File Program Insns (A) Insns (B) Insns (DIFF) ------------- ------------------------------ --------- --------- ------------ bpf_host.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_lxc.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_overlay.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_xdp.o tail_handle_nat_fwd_ipv6 12475 12504 +29 (+0.23%) bpf_xdp.o tail_nodeport_nat_ingress_ipv6 6363 6371 +8 (+0.13%) Looking at (somewhat anonymized) Meta production programs, we see mostly insignificant variation in number of instructions, with one program (syar_bind6_protect6) benefitting the most at -17%. [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-fbcode.csv ~/subprog-precise-after-results-fbcode.csv -e prog,insns -f 'insns_diff!=0' Program Insns (A) Insns (B) Insns (DIFF) ------------------------ --------- --------- ---------------- on_request_context_event 597 585 -12 (-2.01%) read_async_py_stack 43789 43657 -132 (-0.30%) read_sync_py_stack 35041 37599 +2558 (+7.30%) rrm_usdt 946 940 -6 (-0.63%) sysarmor_inet6_bind 28863 28249 -614 (-2.13%) sysarmor_inet_bind 28845 28240 -605 (-2.10%) syar_bind4_protect4 154145 147640 -6505 (-4.22%) syar_bind6_protect6 165242 137088 -28154 (-17.04%) syar_task_exit_setgid 21289 19720 -1569 (-7.37%) syar_task_exit_setuid 21290 19721 -1569 (-7.37%) do_uprobe 19967 19413 -554 (-2.77%) tw_twfw_ingress 215877 204833 -11044 (-5.12%) tw_twfw_tc_in 215877 204833 -11044 (-5.12%) But checking duration (wall clock) differences, that is the actual time taken by verifier to validate programs, we see a sometimes dramatic improvements, all the way to about 16x improvements: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-meta.csv ~/subprog-precise-after-results-meta.csv -e prog,duration -s duration_diff^ | head -n20 Program Duration (us) (A) Duration (us) (B) Duration (us) (DIFF) ---------------------------------------- ----------------- ----------------- -------------------- tw_twfw_ingress 4488374 272836 -4215538 (-93.92%) tw_twfw_tc_in 4339111 268175 -4070936 (-93.82%) tw_twfw_egress 3521816 270751 -3251065 (-92.31%) tw_twfw_tc_eg 3472878 284294 -3188584 (-91.81%) balancer_ingress 343119 291391 -51728 (-15.08%) syar_bind6_protect6 78992 64782 -14210 (-17.99%) ttls_tc_ingress 11739 8176 -3563 (-30.35%) kprobe__security_inode_link 13864 11341 -2523 (-18.20%) read_sync_py_stack 21927 19442 -2485 (-11.33%) read_async_py_stack 30444 28136 -2308 (-7.58%) syar_task_exit_setuid 10256 8440 -1816 (-17.71%) Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> [shung-hsi.yu: use of is_callback_calling_kfunc() is entirely dropped because commit 5d92ddc ("bpf: Add callback validation to kfunc verifier logic") is not present] Signed-off-by: Shung-Hsi Yu <[email protected]>
shunghsiyu
pushed a commit
to shunghsiyu/linux
that referenced
this pull request
Jul 9, 2024
commit fde2a38 upstream. Add support precision backtracking in the presence of subprogram frames in jump history. This means supporting a few different kinds of subprogram invocation situations, all requiring a slightly different handling in precision backtracking handling logic: - static subprogram calls; - global subprogram calls; - callback-calling helpers/kfuncs. For each of those we need to handle a few precision propagation cases: - what to do with precision of subprog returns (r0); - what to do with precision of input arguments; - for all of them callee-saved registers in caller function should be propagated ignoring subprog/callback part of jump history. N.B. Async callback-calling helpers (currently only bpf_timer_set_callback()) are transparent to all this because they set a separate async callback environment and thus callback's history is not shared with main program's history. So as far as all the changes in this commit goes, such helper is just a regular helper. Let's look at all these situation in more details. Let's start with static subprogram being called, using an exxerpt of a simple main program and its static subprog, indenting subprog's frame slightly to make everything clear. frame 0 frame 1 precision set ======= ======= ============= 9: r6 = 456; 10: r1 = 123; fr0: r6 11: call pc+10; fr0: r1, r6 22: r0 = r1; fr0: r6; fr1: r1 23: exit fr0: r6; fr1: r0 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6 15: exit As can be seen above main function is passing 123 as single argument to an identity (`return x;`) subprog. Returned value is used to adjust map pointer offset, which forces r0 to be marked as precise. Then instruction torvalds#14 does the same for callee-saved r6, which will have to be backtracked all the way to instruction torvalds#9. For brevity, precision sets for instruction torvalds#13 and torvalds#14 are combined in the diagram above. First, for subprog calls, r0 returned from subprog (in frame 0) has to go into subprog's frame 1, and should be cleared from frame 0. So we go back into subprog's frame knowing we need to mark r0 precise. We then see that insn torvalds#22 sets r0 from r1, so now we care about marking r1 precise. When we pop up from subprog's frame back into caller at insn torvalds#11 we keep r1, as it's an argument-passing register, so we eventually find `10: r1 = 123;` and satify precision propagation chain for insn torvalds#13. This example demonstrates two sets of rules: - r0 returned after subprog call has to be moved into subprog's r0 set; - *static* subprog arguments (r1-r5) are moved back to caller precision set. Let's look at what happens with callee-saved precision propagation. Insn torvalds#14 mark r6 as precise. When we get into subprog's frame, we keep r6 in frame 0's precision set *only*. Subprog itself has its own set of independent r6-r10 registers and is not affected. When we eventually made our way out of subprog frame we keep r6 in precision set until we reach `9: r6 = 456;`, satisfying propagation. r6-r10 propagation is perhaps the simplest aspect, it always stays in its original frame. That's pretty much all we have to do to support precision propagation across *static subprog* invocation. Let's look at what happens when we have global subprog invocation. frame 0 frame 1 precision set ======= ======= ============= 9: r6 = 456; 10: r1 = 123; fr0: r6 11: call pc+10; # global subprog fr0: r6 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6; 15: exit Starting from insn torvalds#13, r0 has to be precise. We backtrack all the way to insn torvalds#11 (call pc+10) and see that subprog is global, so was already validated in isolation. As opposed to static subprog, global subprog always returns unknown scalar r0, so that satisfies precision propagation and we drop r0 from precision set. We are done for insns torvalds#13. Now for insn torvalds#14. r6 is in precision set, we backtrack to `call pc+10;`. Here we need to recognize that this is effectively both exit and entry to global subprog, which means we stay in caller's frame. So we carry on with r6 still in precision set, until we satisfy it at insn torvalds#9. The only hard part with global subprogs is just knowing when it's a global func. Lastly, callback-calling helpers and kfuncs do simulate subprog calls, so jump history will have subprog instructions in between caller program's instructions, but the rules of propagating r0 and r1-r5 differ, because we don't actually directly call callback. We actually call helper/kfunc, which at runtime will call subprog, so the only difference between normal helper/kfunc handling is that we need to make sure to skip callback simulatinog part of jump history. Let's look at an example to make this clearer. frame 0 frame 1 precision set ======= ======= ============= 8: r6 = 456; 9: r1 = 123; fr0: r6 10: r2 = &callback; fr0: r6 11: call bpf_loop; fr0: r6 22: r0 = r1; fr0: r6 fr1: 23: exit fr0: r6 fr1: 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6; 15: exit Again, insn torvalds#13 forces r0 to be precise. As soon as we get to `23: exit` we see that this isn't actually a static subprog call (it's `call bpf_loop;` helper call instead). So we clear r0 from precision set. For callee-saved register, there is no difference: it stays in frame 0's precision set, we go through insn torvalds#22 and torvalds#23, ignoring them until we get back to caller frame 0, eventually satisfying precision backtrack logic at insn torvalds#8 (`r6 = 456;`). Assuming callback needed to set r0 as precise at insn torvalds#23, we'd backtrack to insn torvalds#22, switching from r0 to r1, and then at the point when we pop back to frame 0 at insn torvalds#11, we'll clear r1-r5 from precision set, as we don't really do a subprog call directly, so there is no input argument precision propagation. That's pretty much it. With these changes, it seems like the only still unsupported situation for precision backpropagation is the case when program is accessing stack through registers other than r10. This is still left as unsupported (though rare) case for now. As for results. For selftests, few positive changes for bigger programs, cls_redirect in dynptr variant benefitting the most: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results.csv ~/subprog-precise-after-results.csv -f @veristat.cfg -e file,prog,insns -f 'insns_diff!=0' File Program Insns (A) Insns (B) Insns (DIFF) ---------------------------------------- ------------- --------- --------- ---------------- pyperf600_bpf_loop.bpf.linked1.o on_event 2060 2002 -58 (-2.82%) test_cls_redirect_dynptr.bpf.linked1.o cls_redirect 15660 2914 -12746 (-81.39%) test_cls_redirect_subprogs.bpf.linked1.o cls_redirect 61620 59088 -2532 (-4.11%) xdp_synproxy_kern.bpf.linked1.o syncookie_tc 109980 86278 -23702 (-21.55%) xdp_synproxy_kern.bpf.linked1.o syncookie_xdp 97716 85147 -12569 (-12.86%) Cilium progress don't really regress. They don't use subprogs and are mostly unaffected, but some other fixes and improvements could have changed something. This doesn't appear to be the case: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-cilium.csv ~/subprog-precise-after-results-cilium.csv -e file,prog,insns -f 'insns_diff!=0' File Program Insns (A) Insns (B) Insns (DIFF) ------------- ------------------------------ --------- --------- ------------ bpf_host.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_lxc.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_overlay.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_xdp.o tail_handle_nat_fwd_ipv6 12475 12504 +29 (+0.23%) bpf_xdp.o tail_nodeport_nat_ingress_ipv6 6363 6371 +8 (+0.13%) Looking at (somewhat anonymized) Meta production programs, we see mostly insignificant variation in number of instructions, with one program (syar_bind6_protect6) benefitting the most at -17%. [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-fbcode.csv ~/subprog-precise-after-results-fbcode.csv -e prog,insns -f 'insns_diff!=0' Program Insns (A) Insns (B) Insns (DIFF) ------------------------ --------- --------- ---------------- on_request_context_event 597 585 -12 (-2.01%) read_async_py_stack 43789 43657 -132 (-0.30%) read_sync_py_stack 35041 37599 +2558 (+7.30%) rrm_usdt 946 940 -6 (-0.63%) sysarmor_inet6_bind 28863 28249 -614 (-2.13%) sysarmor_inet_bind 28845 28240 -605 (-2.10%) syar_bind4_protect4 154145 147640 -6505 (-4.22%) syar_bind6_protect6 165242 137088 -28154 (-17.04%) syar_task_exit_setgid 21289 19720 -1569 (-7.37%) syar_task_exit_setuid 21290 19721 -1569 (-7.37%) do_uprobe 19967 19413 -554 (-2.77%) tw_twfw_ingress 215877 204833 -11044 (-5.12%) tw_twfw_tc_in 215877 204833 -11044 (-5.12%) But checking duration (wall clock) differences, that is the actual time taken by verifier to validate programs, we see a sometimes dramatic improvements, all the way to about 16x improvements: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-meta.csv ~/subprog-precise-after-results-meta.csv -e prog,duration -s duration_diff^ | head -n20 Program Duration (us) (A) Duration (us) (B) Duration (us) (DIFF) ---------------------------------------- ----------------- ----------------- -------------------- tw_twfw_ingress 4488374 272836 -4215538 (-93.92%) tw_twfw_tc_in 4339111 268175 -4070936 (-93.82%) tw_twfw_egress 3521816 270751 -3251065 (-92.31%) tw_twfw_tc_eg 3472878 284294 -3188584 (-91.81%) balancer_ingress 343119 291391 -51728 (-15.08%) syar_bind6_protect6 78992 64782 -14210 (-17.99%) ttls_tc_ingress 11739 8176 -3563 (-30.35%) kprobe__security_inode_link 13864 11341 -2523 (-18.20%) read_sync_py_stack 21927 19442 -2485 (-11.33%) read_async_py_stack 30444 28136 -2308 (-7.58%) syar_task_exit_setuid 10256 8440 -1816 (-17.71%) Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> [shung-hsi.yu: use of is_callback_calling_kfunc() is entirely dropped because commit 5d92ddc ("bpf: Add callback validation to kfunc verifier logic") is not present] Signed-off-by: Shung-Hsi Yu <[email protected]>
shunghsiyu
pushed a commit
to shunghsiyu/linux
that referenced
this pull request
Jul 9, 2024
commit fde2a38 upstream. Add support precision backtracking in the presence of subprogram frames in jump history. This means supporting a few different kinds of subprogram invocation situations, all requiring a slightly different handling in precision backtracking handling logic: - static subprogram calls; - global subprogram calls; - callback-calling helpers/kfuncs. For each of those we need to handle a few precision propagation cases: - what to do with precision of subprog returns (r0); - what to do with precision of input arguments; - for all of them callee-saved registers in caller function should be propagated ignoring subprog/callback part of jump history. N.B. Async callback-calling helpers (currently only bpf_timer_set_callback()) are transparent to all this because they set a separate async callback environment and thus callback's history is not shared with main program's history. So as far as all the changes in this commit goes, such helper is just a regular helper. Let's look at all these situation in more details. Let's start with static subprogram being called, using an exxerpt of a simple main program and its static subprog, indenting subprog's frame slightly to make everything clear. frame 0 frame 1 precision set ======= ======= ============= 9: r6 = 456; 10: r1 = 123; fr0: r6 11: call pc+10; fr0: r1, r6 22: r0 = r1; fr0: r6; fr1: r1 23: exit fr0: r6; fr1: r0 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6 15: exit As can be seen above main function is passing 123 as single argument to an identity (`return x;`) subprog. Returned value is used to adjust map pointer offset, which forces r0 to be marked as precise. Then instruction torvalds#14 does the same for callee-saved r6, which will have to be backtracked all the way to instruction torvalds#9. For brevity, precision sets for instruction torvalds#13 and torvalds#14 are combined in the diagram above. First, for subprog calls, r0 returned from subprog (in frame 0) has to go into subprog's frame 1, and should be cleared from frame 0. So we go back into subprog's frame knowing we need to mark r0 precise. We then see that insn torvalds#22 sets r0 from r1, so now we care about marking r1 precise. When we pop up from subprog's frame back into caller at insn torvalds#11 we keep r1, as it's an argument-passing register, so we eventually find `10: r1 = 123;` and satify precision propagation chain for insn torvalds#13. This example demonstrates two sets of rules: - r0 returned after subprog call has to be moved into subprog's r0 set; - *static* subprog arguments (r1-r5) are moved back to caller precision set. Let's look at what happens with callee-saved precision propagation. Insn torvalds#14 mark r6 as precise. When we get into subprog's frame, we keep r6 in frame 0's precision set *only*. Subprog itself has its own set of independent r6-r10 registers and is not affected. When we eventually made our way out of subprog frame we keep r6 in precision set until we reach `9: r6 = 456;`, satisfying propagation. r6-r10 propagation is perhaps the simplest aspect, it always stays in its original frame. That's pretty much all we have to do to support precision propagation across *static subprog* invocation. Let's look at what happens when we have global subprog invocation. frame 0 frame 1 precision set ======= ======= ============= 9: r6 = 456; 10: r1 = 123; fr0: r6 11: call pc+10; # global subprog fr0: r6 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6; 15: exit Starting from insn torvalds#13, r0 has to be precise. We backtrack all the way to insn torvalds#11 (call pc+10) and see that subprog is global, so was already validated in isolation. As opposed to static subprog, global subprog always returns unknown scalar r0, so that satisfies precision propagation and we drop r0 from precision set. We are done for insns torvalds#13. Now for insn torvalds#14. r6 is in precision set, we backtrack to `call pc+10;`. Here we need to recognize that this is effectively both exit and entry to global subprog, which means we stay in caller's frame. So we carry on with r6 still in precision set, until we satisfy it at insn torvalds#9. The only hard part with global subprogs is just knowing when it's a global func. Lastly, callback-calling helpers and kfuncs do simulate subprog calls, so jump history will have subprog instructions in between caller program's instructions, but the rules of propagating r0 and r1-r5 differ, because we don't actually directly call callback. We actually call helper/kfunc, which at runtime will call subprog, so the only difference between normal helper/kfunc handling is that we need to make sure to skip callback simulatinog part of jump history. Let's look at an example to make this clearer. frame 0 frame 1 precision set ======= ======= ============= 8: r6 = 456; 9: r1 = 123; fr0: r6 10: r2 = &callback; fr0: r6 11: call bpf_loop; fr0: r6 22: r0 = r1; fr0: r6 fr1: 23: exit fr0: r6 fr1: 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6; 15: exit Again, insn torvalds#13 forces r0 to be precise. As soon as we get to `23: exit` we see that this isn't actually a static subprog call (it's `call bpf_loop;` helper call instead). So we clear r0 from precision set. For callee-saved register, there is no difference: it stays in frame 0's precision set, we go through insn torvalds#22 and torvalds#23, ignoring them until we get back to caller frame 0, eventually satisfying precision backtrack logic at insn torvalds#8 (`r6 = 456;`). Assuming callback needed to set r0 as precise at insn torvalds#23, we'd backtrack to insn torvalds#22, switching from r0 to r1, and then at the point when we pop back to frame 0 at insn torvalds#11, we'll clear r1-r5 from precision set, as we don't really do a subprog call directly, so there is no input argument precision propagation. That's pretty much it. With these changes, it seems like the only still unsupported situation for precision backpropagation is the case when program is accessing stack through registers other than r10. This is still left as unsupported (though rare) case for now. As for results. For selftests, few positive changes for bigger programs, cls_redirect in dynptr variant benefitting the most: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results.csv ~/subprog-precise-after-results.csv -f @veristat.cfg -e file,prog,insns -f 'insns_diff!=0' File Program Insns (A) Insns (B) Insns (DIFF) ---------------------------------------- ------------- --------- --------- ---------------- pyperf600_bpf_loop.bpf.linked1.o on_event 2060 2002 -58 (-2.82%) test_cls_redirect_dynptr.bpf.linked1.o cls_redirect 15660 2914 -12746 (-81.39%) test_cls_redirect_subprogs.bpf.linked1.o cls_redirect 61620 59088 -2532 (-4.11%) xdp_synproxy_kern.bpf.linked1.o syncookie_tc 109980 86278 -23702 (-21.55%) xdp_synproxy_kern.bpf.linked1.o syncookie_xdp 97716 85147 -12569 (-12.86%) Cilium progress don't really regress. They don't use subprogs and are mostly unaffected, but some other fixes and improvements could have changed something. This doesn't appear to be the case: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-cilium.csv ~/subprog-precise-after-results-cilium.csv -e file,prog,insns -f 'insns_diff!=0' File Program Insns (A) Insns (B) Insns (DIFF) ------------- ------------------------------ --------- --------- ------------ bpf_host.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_lxc.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_overlay.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_xdp.o tail_handle_nat_fwd_ipv6 12475 12504 +29 (+0.23%) bpf_xdp.o tail_nodeport_nat_ingress_ipv6 6363 6371 +8 (+0.13%) Looking at (somewhat anonymized) Meta production programs, we see mostly insignificant variation in number of instructions, with one program (syar_bind6_protect6) benefitting the most at -17%. [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-fbcode.csv ~/subprog-precise-after-results-fbcode.csv -e prog,insns -f 'insns_diff!=0' Program Insns (A) Insns (B) Insns (DIFF) ------------------------ --------- --------- ---------------- on_request_context_event 597 585 -12 (-2.01%) read_async_py_stack 43789 43657 -132 (-0.30%) read_sync_py_stack 35041 37599 +2558 (+7.30%) rrm_usdt 946 940 -6 (-0.63%) sysarmor_inet6_bind 28863 28249 -614 (-2.13%) sysarmor_inet_bind 28845 28240 -605 (-2.10%) syar_bind4_protect4 154145 147640 -6505 (-4.22%) syar_bind6_protect6 165242 137088 -28154 (-17.04%) syar_task_exit_setgid 21289 19720 -1569 (-7.37%) syar_task_exit_setuid 21290 19721 -1569 (-7.37%) do_uprobe 19967 19413 -554 (-2.77%) tw_twfw_ingress 215877 204833 -11044 (-5.12%) tw_twfw_tc_in 215877 204833 -11044 (-5.12%) But checking duration (wall clock) differences, that is the actual time taken by verifier to validate programs, we see a sometimes dramatic improvements, all the way to about 16x improvements: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-meta.csv ~/subprog-precise-after-results-meta.csv -e prog,duration -s duration_diff^ | head -n20 Program Duration (us) (A) Duration (us) (B) Duration (us) (DIFF) ---------------------------------------- ----------------- ----------------- -------------------- tw_twfw_ingress 4488374 272836 -4215538 (-93.92%) tw_twfw_tc_in 4339111 268175 -4070936 (-93.82%) tw_twfw_egress 3521816 270751 -3251065 (-92.31%) tw_twfw_tc_eg 3472878 284294 -3188584 (-91.81%) balancer_ingress 343119 291391 -51728 (-15.08%) syar_bind6_protect6 78992 64782 -14210 (-17.99%) ttls_tc_ingress 11739 8176 -3563 (-30.35%) kprobe__security_inode_link 13864 11341 -2523 (-18.20%) read_sync_py_stack 21927 19442 -2485 (-11.33%) read_async_py_stack 30444 28136 -2308 (-7.58%) syar_task_exit_setuid 10256 8440 -1816 (-17.71%) Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> [shung-hsi.yu: use of is_callback_calling_kfunc() is entirely dropped because commit 5d92ddc ("bpf: Add callback validation to kfunc verifier logic") is not present] Signed-off-by: Shung-Hsi Yu <[email protected]>
shunghsiyu
pushed a commit
to shunghsiyu/linux
that referenced
this pull request
Jul 11, 2024
commit fde2a38 upstream. Add support precision backtracking in the presence of subprogram frames in jump history. This means supporting a few different kinds of subprogram invocation situations, all requiring a slightly different handling in precision backtracking handling logic: - static subprogram calls; - global subprogram calls; - callback-calling helpers/kfuncs. For each of those we need to handle a few precision propagation cases: - what to do with precision of subprog returns (r0); - what to do with precision of input arguments; - for all of them callee-saved registers in caller function should be propagated ignoring subprog/callback part of jump history. N.B. Async callback-calling helpers (currently only bpf_timer_set_callback()) are transparent to all this because they set a separate async callback environment and thus callback's history is not shared with main program's history. So as far as all the changes in this commit goes, such helper is just a regular helper. Let's look at all these situation in more details. Let's start with static subprogram being called, using an exxerpt of a simple main program and its static subprog, indenting subprog's frame slightly to make everything clear. frame 0 frame 1 precision set ======= ======= ============= 9: r6 = 456; 10: r1 = 123; fr0: r6 11: call pc+10; fr0: r1, r6 22: r0 = r1; fr0: r6; fr1: r1 23: exit fr0: r6; fr1: r0 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6 15: exit As can be seen above main function is passing 123 as single argument to an identity (`return x;`) subprog. Returned value is used to adjust map pointer offset, which forces r0 to be marked as precise. Then instruction torvalds#14 does the same for callee-saved r6, which will have to be backtracked all the way to instruction torvalds#9. For brevity, precision sets for instruction torvalds#13 and torvalds#14 are combined in the diagram above. First, for subprog calls, r0 returned from subprog (in frame 0) has to go into subprog's frame 1, and should be cleared from frame 0. So we go back into subprog's frame knowing we need to mark r0 precise. We then see that insn torvalds#22 sets r0 from r1, so now we care about marking r1 precise. When we pop up from subprog's frame back into caller at insn torvalds#11 we keep r1, as it's an argument-passing register, so we eventually find `10: r1 = 123;` and satify precision propagation chain for insn torvalds#13. This example demonstrates two sets of rules: - r0 returned after subprog call has to be moved into subprog's r0 set; - *static* subprog arguments (r1-r5) are moved back to caller precision set. Let's look at what happens with callee-saved precision propagation. Insn torvalds#14 mark r6 as precise. When we get into subprog's frame, we keep r6 in frame 0's precision set *only*. Subprog itself has its own set of independent r6-r10 registers and is not affected. When we eventually made our way out of subprog frame we keep r6 in precision set until we reach `9: r6 = 456;`, satisfying propagation. r6-r10 propagation is perhaps the simplest aspect, it always stays in its original frame. That's pretty much all we have to do to support precision propagation across *static subprog* invocation. Let's look at what happens when we have global subprog invocation. frame 0 frame 1 precision set ======= ======= ============= 9: r6 = 456; 10: r1 = 123; fr0: r6 11: call pc+10; # global subprog fr0: r6 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6; 15: exit Starting from insn torvalds#13, r0 has to be precise. We backtrack all the way to insn torvalds#11 (call pc+10) and see that subprog is global, so was already validated in isolation. As opposed to static subprog, global subprog always returns unknown scalar r0, so that satisfies precision propagation and we drop r0 from precision set. We are done for insns torvalds#13. Now for insn torvalds#14. r6 is in precision set, we backtrack to `call pc+10;`. Here we need to recognize that this is effectively both exit and entry to global subprog, which means we stay in caller's frame. So we carry on with r6 still in precision set, until we satisfy it at insn torvalds#9. The only hard part with global subprogs is just knowing when it's a global func. Lastly, callback-calling helpers and kfuncs do simulate subprog calls, so jump history will have subprog instructions in between caller program's instructions, but the rules of propagating r0 and r1-r5 differ, because we don't actually directly call callback. We actually call helper/kfunc, which at runtime will call subprog, so the only difference between normal helper/kfunc handling is that we need to make sure to skip callback simulatinog part of jump history. Let's look at an example to make this clearer. frame 0 frame 1 precision set ======= ======= ============= 8: r6 = 456; 9: r1 = 123; fr0: r6 10: r2 = &callback; fr0: r6 11: call bpf_loop; fr0: r6 22: r0 = r1; fr0: r6 fr1: 23: exit fr0: r6 fr1: 12: r1 = <map_pointer> fr0: r0, r6 13: r1 += r0; fr0: r0, r6 14: r1 += r6; fr0: r6; 15: exit Again, insn torvalds#13 forces r0 to be precise. As soon as we get to `23: exit` we see that this isn't actually a static subprog call (it's `call bpf_loop;` helper call instead). So we clear r0 from precision set. For callee-saved register, there is no difference: it stays in frame 0's precision set, we go through insn torvalds#22 and torvalds#23, ignoring them until we get back to caller frame 0, eventually satisfying precision backtrack logic at insn torvalds#8 (`r6 = 456;`). Assuming callback needed to set r0 as precise at insn torvalds#23, we'd backtrack to insn torvalds#22, switching from r0 to r1, and then at the point when we pop back to frame 0 at insn torvalds#11, we'll clear r1-r5 from precision set, as we don't really do a subprog call directly, so there is no input argument precision propagation. That's pretty much it. With these changes, it seems like the only still unsupported situation for precision backpropagation is the case when program is accessing stack through registers other than r10. This is still left as unsupported (though rare) case for now. As for results. For selftests, few positive changes for bigger programs, cls_redirect in dynptr variant benefitting the most: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results.csv ~/subprog-precise-after-results.csv -f @veristat.cfg -e file,prog,insns -f 'insns_diff!=0' File Program Insns (A) Insns (B) Insns (DIFF) ---------------------------------------- ------------- --------- --------- ---------------- pyperf600_bpf_loop.bpf.linked1.o on_event 2060 2002 -58 (-2.82%) test_cls_redirect_dynptr.bpf.linked1.o cls_redirect 15660 2914 -12746 (-81.39%) test_cls_redirect_subprogs.bpf.linked1.o cls_redirect 61620 59088 -2532 (-4.11%) xdp_synproxy_kern.bpf.linked1.o syncookie_tc 109980 86278 -23702 (-21.55%) xdp_synproxy_kern.bpf.linked1.o syncookie_xdp 97716 85147 -12569 (-12.86%) Cilium progress don't really regress. They don't use subprogs and are mostly unaffected, but some other fixes and improvements could have changed something. This doesn't appear to be the case: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-cilium.csv ~/subprog-precise-after-results-cilium.csv -e file,prog,insns -f 'insns_diff!=0' File Program Insns (A) Insns (B) Insns (DIFF) ------------- ------------------------------ --------- --------- ------------ bpf_host.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_lxc.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_overlay.o tail_nodeport_nat_ingress_ipv6 4983 5003 +20 (+0.40%) bpf_xdp.o tail_handle_nat_fwd_ipv6 12475 12504 +29 (+0.23%) bpf_xdp.o tail_nodeport_nat_ingress_ipv6 6363 6371 +8 (+0.13%) Looking at (somewhat anonymized) Meta production programs, we see mostly insignificant variation in number of instructions, with one program (syar_bind6_protect6) benefitting the most at -17%. [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-fbcode.csv ~/subprog-precise-after-results-fbcode.csv -e prog,insns -f 'insns_diff!=0' Program Insns (A) Insns (B) Insns (DIFF) ------------------------ --------- --------- ---------------- on_request_context_event 597 585 -12 (-2.01%) read_async_py_stack 43789 43657 -132 (-0.30%) read_sync_py_stack 35041 37599 +2558 (+7.30%) rrm_usdt 946 940 -6 (-0.63%) sysarmor_inet6_bind 28863 28249 -614 (-2.13%) sysarmor_inet_bind 28845 28240 -605 (-2.10%) syar_bind4_protect4 154145 147640 -6505 (-4.22%) syar_bind6_protect6 165242 137088 -28154 (-17.04%) syar_task_exit_setgid 21289 19720 -1569 (-7.37%) syar_task_exit_setuid 21290 19721 -1569 (-7.37%) do_uprobe 19967 19413 -554 (-2.77%) tw_twfw_ingress 215877 204833 -11044 (-5.12%) tw_twfw_tc_in 215877 204833 -11044 (-5.12%) But checking duration (wall clock) differences, that is the actual time taken by verifier to validate programs, we see a sometimes dramatic improvements, all the way to about 16x improvements: [vmuser@archvm bpf]$ ./veristat -C ~/subprog-precise-before-results-meta.csv ~/subprog-precise-after-results-meta.csv -e prog,duration -s duration_diff^ | head -n20 Program Duration (us) (A) Duration (us) (B) Duration (us) (DIFF) ---------------------------------------- ----------------- ----------------- -------------------- tw_twfw_ingress 4488374 272836 -4215538 (-93.92%) tw_twfw_tc_in 4339111 268175 -4070936 (-93.82%) tw_twfw_egress 3521816 270751 -3251065 (-92.31%) tw_twfw_tc_eg 3472878 284294 -3188584 (-91.81%) balancer_ingress 343119 291391 -51728 (-15.08%) syar_bind6_protect6 78992 64782 -14210 (-17.99%) ttls_tc_ingress 11739 8176 -3563 (-30.35%) kprobe__security_inode_link 13864 11341 -2523 (-18.20%) read_sync_py_stack 21927 19442 -2485 (-11.33%) read_async_py_stack 30444 28136 -2308 (-7.58%) syar_task_exit_setuid 10256 8440 -1816 (-17.71%) Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> [shung-hsi.yu: use of is_callback_calling_kfunc() is entirely dropped because commit 5d92ddc ("bpf: Add callback validation to kfunc verifier logic") is not present] Signed-off-by: Shung-Hsi Yu <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Aug 6, 2024
The dead lock can happen if we try to use printk(), such as a call of SCHED_WARN_ON(), during the rq->__lock is held. The printk() will try to print the message to the console, and the console driver can call queue_work_on(), which will try to obtain rq->__lock again. This means that any WARN during the kernel function that hold the rq->__lock, such as schedule(), sched_ttwu_pending(), etc, can cause dead lock. Following is the call trace of the deadlock case that I encounter: PID: 0 TASK: ff36bfda010c8000 CPU: 156 COMMAND: "swapper/156" #0 crash_nmi_callback+30 #1 nmi_handle+85 #2 default_do_nmi+66 #3 exc_nmi+291 #4 end_repeat_nmi+22 [exception RIP: native_queued_spin_lock_slowpath+96] #5 native_queued_spin_lock_slowpath+96 torvalds#6 _raw_spin_lock+30 torvalds#7 ttwu_queue+111 torvalds#8 try_to_wake_up+375 torvalds#9 __queue_work+462 torvalds#10 queue_work_on+32 torvalds#11 soft_cursor+420 torvalds#12 bit_cursor+898 torvalds#13 hide_cursor+39 torvalds#14 vt_console_print+995 torvalds#15 call_console_drivers.constprop.0+204 torvalds#16 console_unlock+374 torvalds#17 vprintk_emit+280 torvalds#18 printk+88 torvalds#19 __warn_printk+71 torvalds#20 enqueue_task_fair+1779 torvalds#21 activate_task+102 torvalds#22 ttwu_do_activate+155 torvalds#23 sched_ttwu_pending+177 torvalds#24 flush_smp_call_function_from_idle+42 torvalds#25 do_idle+161 torvalds#26 cpu_startup_entry+25 torvalds#27 secondary_startup_64_no_verify+194 Fix this by using __printk_safe_enter()/__printk_safe_exit() in rq_pin_lock()/rq_unpin_lock(). Then, printk will defer to print out the buffers to the console. Signed-off-by: Menglong Dong <[email protected]> Signed-off-by: Bin Lai <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Sep 24, 2024
When a binder reference is cleaned up, any freeze work queued in the associated process should also be removed. Otherwise, the reference is freed while its ref->freeze.work is still queued in proc->work leading to a use-after-free issue as shown by the following KASAN report: ================================================================== BUG: KASAN: slab-use-after-free in binder_release_work+0x398/0x3d0 Read of size 8 at addr ffff31600ee91488 by task kworker/5:1/211 CPU: 5 UID: 0 PID: 211 Comm: kworker/5:1 Not tainted 6.11.0-rc7-00382-gfc6c92196396 torvalds#22 Hardware name: linux,dummy-virt (DT) Workqueue: events binder_deferred_func Call trace: binder_release_work+0x398/0x3d0 binder_deferred_func+0xb60/0x109c process_one_work+0x51c/0xbd4 worker_thread+0x608/0xee8 Allocated by task 703: __kmalloc_cache_noprof+0x130/0x280 binder_thread_write+0xdb4/0x42a0 binder_ioctl+0x18f0/0x25ac __arm64_sys_ioctl+0x124/0x190 invoke_syscall+0x6c/0x254 Freed by task 211: kfree+0xc4/0x230 binder_deferred_func+0xae8/0x109c process_one_work+0x51c/0xbd4 worker_thread+0x608/0xee8 ================================================================== This commit fixes the issue by ensuring any queued freeze work is removed when cleaning up a binder reference. Fixes: d579b04 ("binder: frozen notification") Cc: [email protected] Signed-off-by: Carlos Llamas <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Sep 26, 2024
When a binder reference is cleaned up, any freeze work queued in the associated process should also be removed. Otherwise, the reference is freed while its ref->freeze.work is still queued in proc->work leading to a use-after-free issue as shown by the following KASAN report: ================================================================== BUG: KASAN: slab-use-after-free in binder_release_work+0x398/0x3d0 Read of size 8 at addr ffff31600ee91488 by task kworker/5:1/211 CPU: 5 UID: 0 PID: 211 Comm: kworker/5:1 Not tainted 6.11.0-rc7-00382-gfc6c92196396 torvalds#22 Hardware name: linux,dummy-virt (DT) Workqueue: events binder_deferred_func Call trace: binder_release_work+0x398/0x3d0 binder_deferred_func+0xb60/0x109c process_one_work+0x51c/0xbd4 worker_thread+0x608/0xee8 Allocated by task 703: __kmalloc_cache_noprof+0x130/0x280 binder_thread_write+0xdb4/0x42a0 binder_ioctl+0x18f0/0x25ac __arm64_sys_ioctl+0x124/0x190 invoke_syscall+0x6c/0x254 Freed by task 211: kfree+0xc4/0x230 binder_deferred_func+0xae8/0x109c process_one_work+0x51c/0xbd4 worker_thread+0x608/0xee8 ================================================================== This commit fixes the issue by ensuring any queued freeze work is removed when cleaning up a binder reference. Fixes: d579b04 ("binder: frozen notification") Cc: [email protected] Acked-by: Todd Kjos <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Signed-off-by: Carlos Llamas <[email protected]>
intel-lab-lkp
pushed a commit
to intel-lab-lkp/linux
that referenced
this pull request
Oct 14, 2024
When a binder reference is cleaned up, any freeze work queued in the associated process should also be removed. Otherwise, the reference is freed while its ref->freeze.work is still queued in proc->work leading to a use-after-free issue as shown by the following KASAN report: ================================================================== BUG: KASAN: slab-use-after-free in binder_release_work+0x398/0x3d0 Read of size 8 at addr ffff31600ee91488 by task kworker/5:1/211 CPU: 5 UID: 0 PID: 211 Comm: kworker/5:1 Not tainted 6.11.0-rc7-00382-gfc6c92196396 torvalds#22 Hardware name: linux,dummy-virt (DT) Workqueue: events binder_deferred_func Call trace: binder_release_work+0x398/0x3d0 binder_deferred_func+0xb60/0x109c process_one_work+0x51c/0xbd4 worker_thread+0x608/0xee8 Allocated by task 703: __kmalloc_cache_noprof+0x130/0x280 binder_thread_write+0xdb4/0x42a0 binder_ioctl+0x18f0/0x25ac __arm64_sys_ioctl+0x124/0x190 invoke_syscall+0x6c/0x254 Freed by task 211: kfree+0xc4/0x230 binder_deferred_func+0xae8/0x109c process_one_work+0x51c/0xbd4 worker_thread+0x608/0xee8 ================================================================== This commit fixes the issue by ensuring any queued freeze work is removed when cleaning up a binder reference. Fixes: d579b04 ("binder: frozen notification") Cc: [email protected] Acked-by: Todd Kjos <[email protected]> Reviewed-by: Alice Ryhl <[email protected]> Signed-off-by: Carlos Llamas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Kaz205
pushed a commit
to Kaz205/linux
that referenced
this pull request
Dec 3, 2024
Some devices seem to produce invalid (wrong checksum/length/reply code) responses for the identification command but provide a valid capability string and otherwise interact normally. Related to issue torvalds#22
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The USB device with ID 0x015E is known as Huwai WIXUBB116. The module works fine when the ID is changed to 0x015E.