-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Compiler] Failed Checking missing-syscalls for O32 #4
Comments
heiher
pushed a commit
that referenced
this issue
Jul 10, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 10, 2018
Daniel Petre reported crashes in icmp_dst_unreach() with following call graph: #3 [ffff88003fc03938] __stack_chk_fail at ffffffff81037f77 #4 [ffff88003fc03948] icmp_send at ffffffff814d5fec #5 [ffff88003fc03ae8] ipv4_link_failure at ffffffff814a1795 #6 [ffff88003fc03af8] ipgre_tunnel_xmit at ffffffff814e7965 #7 [ffff88003fc03b78] dev_hard_start_xmit at ffffffff8146e032 #8 [ffff88003fc03bc8] sch_direct_xmit at ffffffff81487d66 #9 [ffff88003fc03c08] __qdisc_run at ffffffff81487efd #10 [ffff88003fc03c48] dev_queue_xmit at ffffffff8146e5a7 #11 [ffff88003fc03c88] ip_finish_output at ffffffff814ab596 Daniel found a similar problem mentioned in http://lkml.indiana.edu/hypermail/linux/kernel/1007.0/00961.html And indeed this is the root cause : skb->cb[] contains data fooling IP stack. We must clear IPCB in ip_tunnel_xmit() sooner in case dst_link_failure() is called. Or else skb->cb[] might contain garbage from GSO segmentation layer. A similar fix was tested on linux-3.9, but gre code was refactored in linux-3.10. I'll send patches for stable kernels as well. Many thanks to Daniel for providing reports, patches and testing ! Reported-by: Daniel Petre <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 10, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 10, 2018
Daniel Petre reported crashes in icmp_dst_unreach() with following call graph: #3 [ffff88003fc03938] __stack_chk_fail at ffffffff81037f77 #4 [ffff88003fc03948] icmp_send at ffffffff814d5fec #5 [ffff88003fc03ae8] ipv4_link_failure at ffffffff814a1795 #6 [ffff88003fc03af8] ipgre_tunnel_xmit at ffffffff814e7965 #7 [ffff88003fc03b78] dev_hard_start_xmit at ffffffff8146e032 #8 [ffff88003fc03bc8] sch_direct_xmit at ffffffff81487d66 #9 [ffff88003fc03c08] __qdisc_run at ffffffff81487efd #10 [ffff88003fc03c48] dev_queue_xmit at ffffffff8146e5a7 #11 [ffff88003fc03c88] ip_finish_output at ffffffff814ab596 Daniel found a similar problem mentioned in http://lkml.indiana.edu/hypermail/linux/kernel/1007.0/00961.html And indeed this is the root cause : skb->cb[] contains data fooling IP stack. We must clear IPCB in ip_tunnel_xmit() sooner in case dst_link_failure() is called. Or else skb->cb[] might contain garbage from GSO segmentation layer. A similar fix was tested on linux-3.9, but gre code was refactored in linux-3.10. I'll send patches for stable kernels as well. Many thanks to Daniel for providing reports, patches and testing ! Reported-by: Daniel Petre <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 10, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 10, 2018
Since commit 1bb8866 ("mtd: nand: denali: handle timing parameters by setup_data_interface()"), denali_dt.c gets the clock rate from the clock driver. The driver expects the frequency of the bus interface clock, whereas the clock driver of SOCFPGA provides the core clock. Thus, the setup_data_interface() hook calculates timing parameters based on a wrong frequency. To make it work without relying on the clock driver, hard-code the clock frequency, 200MHz. This is fine for existing DT of UniPhier, and also fixes the issue of SOCFPGA because both platforms use 200 MHz for the bus interface clock. Fixes: 1bb8866 ("mtd: nand: denali: handle timing parameters by setup_data_interface()") Cc: linux-stable <[email protected]> #4.14+ Reported-by: Philipp Rosenberger <[email protected]> Suggested-by: Boris Brezillon <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]> Tested-by: Richard Weinberger <[email protected]> Signed-off-by: Boris Brezillon <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 10, 2018
A previous patch removed OMAP clock aliases that were perceived to be unnecessary. Unfortunately, it broke the ethernet on the am3517-evm. This patch enables the MDIO clock and EMAC clock. Fixes: 0ed266d ("clk: ti: omap3: cleanup unnecessary clock aliases") Cc: [email protected] #4.16+ Signed-off-by: Adam Ford <[email protected]> Signed-off-by: Tony Lindgren <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 13, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 16, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 16, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 16, 2018
Daniel Petre reported crashes in icmp_dst_unreach() with following call graph: #3 [ffff88003fc03938] __stack_chk_fail at ffffffff81037f77 #4 [ffff88003fc03948] icmp_send at ffffffff814d5fec #5 [ffff88003fc03ae8] ipv4_link_failure at ffffffff814a1795 #6 [ffff88003fc03af8] ipgre_tunnel_xmit at ffffffff814e7965 #7 [ffff88003fc03b78] dev_hard_start_xmit at ffffffff8146e032 #8 [ffff88003fc03bc8] sch_direct_xmit at ffffffff81487d66 #9 [ffff88003fc03c08] __qdisc_run at ffffffff81487efd #10 [ffff88003fc03c48] dev_queue_xmit at ffffffff8146e5a7 #11 [ffff88003fc03c88] ip_finish_output at ffffffff814ab596 Daniel found a similar problem mentioned in http://lkml.indiana.edu/hypermail/linux/kernel/1007.0/00961.html And indeed this is the root cause : skb->cb[] contains data fooling IP stack. We must clear IPCB in ip_tunnel_xmit() sooner in case dst_link_failure() is called. Or else skb->cb[] might contain garbage from GSO segmentation layer. A similar fix was tested on linux-3.9, but gre code was refactored in linux-3.10. I'll send patches for stable kernels as well. Many thanks to Daniel for providing reports, patches and testing ! Reported-by: Daniel Petre <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 16, 2018
Daniel Petre reported crashes in icmp_dst_unreach() with following call graph: #3 [ffff88003fc03938] __stack_chk_fail at ffffffff81037f77 #4 [ffff88003fc03948] icmp_send at ffffffff814d5fec #5 [ffff88003fc03ae8] ipv4_link_failure at ffffffff814a1795 #6 [ffff88003fc03af8] ipgre_tunnel_xmit at ffffffff814e7965 #7 [ffff88003fc03b78] dev_hard_start_xmit at ffffffff8146e032 #8 [ffff88003fc03bc8] sch_direct_xmit at ffffffff81487d66 #9 [ffff88003fc03c08] __qdisc_run at ffffffff81487efd #10 [ffff88003fc03c48] dev_queue_xmit at ffffffff8146e5a7 #11 [ffff88003fc03c88] ip_finish_output at ffffffff814ab596 Daniel found a similar problem mentioned in http://lkml.indiana.edu/hypermail/linux/kernel/1007.0/00961.html And indeed this is the root cause : skb->cb[] contains data fooling IP stack. We must clear IPCB in ip_tunnel_xmit() sooner in case dst_link_failure() is called. Or else skb->cb[] might contain garbage from GSO segmentation layer. A similar fix was tested on linux-3.9, but gre code was refactored in linux-3.10. I'll send patches for stable kernels as well. Many thanks to Daniel for providing reports, patches and testing ! Reported-by: Daniel Petre <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 16, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 18, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 21, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 21, 2018
The RX SGL in processing is already registered with the RX SGL tracking list to support proper cleanup. The cleanup code path uses the sg_num_bytes variable which must therefore be always initialized, even in the error code path. Signed-off-by: Stephan Mueller <[email protected]> Reported-by: [email protected] #syz test: https://github.com/google/kmsan.git master CC: <[email protected]> #4.14 Fixes: e870456 ("crypto: algif_skcipher - overhaul memory management") Fixes: d887c52 ("crypto: algif_aead - overhaul memory management") Signed-off-by: Herbert Xu <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 25, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 25, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 25, 2018
Daniel Petre reported crashes in icmp_dst_unreach() with following call graph: #3 [ffff88003fc03938] __stack_chk_fail at ffffffff81037f77 #4 [ffff88003fc03948] icmp_send at ffffffff814d5fec #5 [ffff88003fc03ae8] ipv4_link_failure at ffffffff814a1795 #6 [ffff88003fc03af8] ipgre_tunnel_xmit at ffffffff814e7965 #7 [ffff88003fc03b78] dev_hard_start_xmit at ffffffff8146e032 #8 [ffff88003fc03bc8] sch_direct_xmit at ffffffff81487d66 #9 [ffff88003fc03c08] __qdisc_run at ffffffff81487efd #10 [ffff88003fc03c48] dev_queue_xmit at ffffffff8146e5a7 #11 [ffff88003fc03c88] ip_finish_output at ffffffff814ab596 Daniel found a similar problem mentioned in http://lkml.indiana.edu/hypermail/linux/kernel/1007.0/00961.html And indeed this is the root cause : skb->cb[] contains data fooling IP stack. We must clear IPCB in ip_tunnel_xmit() sooner in case dst_link_failure() is called. Or else skb->cb[] might contain garbage from GSO segmentation layer. A similar fix was tested on linux-3.9, but gre code was refactored in linux-3.10. I'll send patches for stable kernels as well. Many thanks to Daniel for providing reports, patches and testing ! Reported-by: Daniel Petre <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 25, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 25, 2018
Crash dump shows following instructions crash> bt PID: 0 TASK: ffffffffbe412480 CPU: 0 COMMAND: "swapper/0" #0 [ffff891ee0003868] machine_kexec at ffffffffbd063ef1 #1 [ffff891ee00038c8] __crash_kexec at ffffffffbd12b6f2 #2 [ffff891ee0003998] crash_kexec at ffffffffbd12c84c #3 [ffff891ee00039b8] oops_end at ffffffffbd030f0a #4 [ffff891ee00039e0] no_context at ffffffffbd074643 #5 [ffff891ee0003a40] __bad_area_nosemaphore at ffffffffbd07496e #6 [ffff891ee0003a90] bad_area_nosemaphore at ffffffffbd074a64 #7 [ffff891ee0003aa0] __do_page_fault at ffffffffbd074b0a #8 [ffff891ee0003b18] do_page_fault at ffffffffbd074fc8 #9 [ffff891ee0003b50] page_fault at ffffffffbda01925 [exception RIP: qlt_schedule_sess_for_deletion+15] RIP: ffffffffc02e526f RSP: ffff891ee0003c08 RFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffc0307847 RDX: 00000000000020e6 RSI: ffff891edbc377c8 RDI: 0000000000000000 RBP: ffff891ee0003c18 R8: ffffffffc02f0b20 R9: 0000000000000250 R10: 0000000000000258 R11: 000000000000b780 R12: ffff891ed9b43000 R13: 00000000000000f0 R14: 0000000000000006 R15: ffff891edbc377c8 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff891ee0003c20] qla2x00_fcport_event_handler at ffffffffc02853d3 [qla2xxx] #11 [ffff891ee0003cf0] __dta_qla24xx_async_gnl_sp_done_333 at ffffffffc0285a1d [qla2xxx] #12 [ffff891ee0003de8] qla24xx_process_response_queue at ffffffffc02a2eb5 [qla2xxx] #13 [ffff891ee0003e88] qla24xx_msix_rsp_q at ffffffffc02a5403 [qla2xxx] #14 [ffff891ee0003ec0] __handle_irq_event_percpu at ffffffffbd0f4c59 #15 [ffff891ee0003f10] handle_irq_event_percpu at ffffffffbd0f4e02 #16 [ffff891ee0003f40] handle_irq_event at ffffffffbd0f4e90 #17 [ffff891ee0003f68] handle_edge_irq at ffffffffbd0f8984 #18 [ffff891ee0003f88] handle_irq at ffffffffbd0305d5 #19 [ffff891ee0003fb8] do_IRQ at ffffffffbda02a18 --- <IRQ stack> --- #20 [ffffffffbe403d30] ret_from_intr at ffffffffbda0094e [exception RIP: unknown or invalid address] RIP: 000000000000001f RSP: 0000000000000000 RFLAGS: fff3b8c2091ebb3f RAX: ffffbba5a0000200 RBX: 0000be8cdfa8f9fa RCX: 0000000000000018 RDX: 0000000000000101 RSI: 000000000000015d RDI: 0000000000000193 RBP: 0000000000000083 R8: ffffffffbe403e38 R9: 0000000000000002 R10: 0000000000000000 R11: ffffffffbe56b820 R12: ffff891ee001cf00 R13: ffffffffbd11c0a4 R14: ffffffffbe403d60 R15: 0000000000000001 ORIG_RAX: ffff891ee0022ac0 CS: 0000 SS: ffffffffffffffb9 bt: WARNING: possibly bogus exception frame #21 [ffffffffbe403dd8] cpuidle_enter_state at ffffffffbd67c6fd #22 [ffffffffbe403e40] cpuidle_enter at ffffffffbd67c907 #23 [ffffffffbe403e50] call_cpuidle at ffffffffbd0d98f3 #24 [ffffffffbe403e60] do_idle at ffffffffbd0d9b42 #25 [ffffffffbe403e98] cpu_startup_entry at ffffffffbd0d9da3 #26 [ffffffffbe403ec0] rest_init at ffffffffbd81d4aa #27 [ffffffffbe403ed0] start_kernel at ffffffffbe67d2ca #28 [ffffffffbe403f28] x86_64_start_reservations at ffffffffbe67c675 #29 [ffffffffbe403f38] x86_64_start_kernel at ffffffffbe67c6eb #30 [ffffffffbe403f50] secondary_startup_64 at ffffffffbd0000d5 Fixes: 040036b ("scsi: qla2xxx: Delay loop id allocation at login") Cc: <[email protected]> # v4.17+ Signed-off-by: Chuck Anderson <[email protected]> Signed-off-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 25, 2018
This manifsted as strace segfaulting on HSDK because gcc was targetting the accumulator registers as GPRs, which kernek was not saving/restoring by default. Cc: [email protected] #4.14+ Signed-off-by: Vineet Gupta <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jul 26, 2018
Daniel Petre reported crashes in icmp_dst_unreach() with following call graph: #3 [ffff88003fc03938] __stack_chk_fail at ffffffff81037f77 #4 [ffff88003fc03948] icmp_send at ffffffff814d5fec #5 [ffff88003fc03ae8] ipv4_link_failure at ffffffff814a1795 #6 [ffff88003fc03af8] ipgre_tunnel_xmit at ffffffff814e7965 #7 [ffff88003fc03b78] dev_hard_start_xmit at ffffffff8146e032 #8 [ffff88003fc03bc8] sch_direct_xmit at ffffffff81487d66 #9 [ffff88003fc03c08] __qdisc_run at ffffffff81487efd #10 [ffff88003fc03c48] dev_queue_xmit at ffffffff8146e5a7 #11 [ffff88003fc03c88] ip_finish_output at ffffffff814ab596 Daniel found a similar problem mentioned in http://lkml.indiana.edu/hypermail/linux/kernel/1007.0/00961.html And indeed this is the root cause : skb->cb[] contains data fooling IP stack. We must clear IPCB in ip_tunnel_xmit() sooner in case dst_link_failure() is called. Or else skb->cb[] might contain garbage from GSO segmentation layer. A similar fix was tested on linux-3.9, but gre code was refactored in linux-3.10. I'll send patches for stable kernels as well. Many thanks to Daniel for providing reports, patches and testing ! Reported-by: Daniel Petre <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Aug 18, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Aug 18, 2018
Daniel Petre reported crashes in icmp_dst_unreach() with following call graph: #3 [ffff88003fc03938] __stack_chk_fail at ffffffff81037f77 #4 [ffff88003fc03948] icmp_send at ffffffff814d5fec #5 [ffff88003fc03ae8] ipv4_link_failure at ffffffff814a1795 #6 [ffff88003fc03af8] ipgre_tunnel_xmit at ffffffff814e7965 #7 [ffff88003fc03b78] dev_hard_start_xmit at ffffffff8146e032 #8 [ffff88003fc03bc8] sch_direct_xmit at ffffffff81487d66 #9 [ffff88003fc03c08] __qdisc_run at ffffffff81487efd #10 [ffff88003fc03c48] dev_queue_xmit at ffffffff8146e5a7 #11 [ffff88003fc03c88] ip_finish_output at ffffffff814ab596 Daniel found a similar problem mentioned in http://lkml.indiana.edu/hypermail/linux/kernel/1007.0/00961.html And indeed this is the root cause : skb->cb[] contains data fooling IP stack. We must clear IPCB in ip_tunnel_xmit() sooner in case dst_link_failure() is called. Or else skb->cb[] might contain garbage from GSO segmentation layer. A similar fix was tested on linux-3.9, but gre code was refactored in linux-3.10. I'll send patches for stable kernels as well. Many thanks to Daniel for providing reports, patches and testing ! Reported-by: Daniel Petre <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Aug 18, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Aug 18, 2018
Daniel Petre reported crashes in icmp_dst_unreach() with following call graph: #3 [ffff88003fc03938] __stack_chk_fail at ffffffff81037f77 #4 [ffff88003fc03948] icmp_send at ffffffff814d5fec #5 [ffff88003fc03ae8] ipv4_link_failure at ffffffff814a1795 #6 [ffff88003fc03af8] ipgre_tunnel_xmit at ffffffff814e7965 #7 [ffff88003fc03b78] dev_hard_start_xmit at ffffffff8146e032 #8 [ffff88003fc03bc8] sch_direct_xmit at ffffffff81487d66 #9 [ffff88003fc03c08] __qdisc_run at ffffffff81487efd #10 [ffff88003fc03c48] dev_queue_xmit at ffffffff8146e5a7 #11 [ffff88003fc03c88] ip_finish_output at ffffffff814ab596 Daniel found a similar problem mentioned in http://lkml.indiana.edu/hypermail/linux/kernel/1007.0/00961.html And indeed this is the root cause : skb->cb[] contains data fooling IP stack. We must clear IPCB in ip_tunnel_xmit() sooner in case dst_link_failure() is called. Or else skb->cb[] might contain garbage from GSO segmentation layer. A similar fix was tested on linux-3.9, but gre code was refactored in linux-3.10. I'll send patches for stable kernels as well. Many thanks to Daniel for providing reports, patches and testing ! Reported-by: Daniel Petre <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Aug 18, 2018
When the core THP code is modifying the permissions of a huge page it calls pmd_modify(), which unfortunately was clearing the _PAGE_HUGE bit of the page table entry. The result can be kernel messages like: mm/memory.c:397: bad pmd 000000040080004d. mm/memory.c:397: bad pmd 00000003ff00004d. mm/memory.c:397: bad pmd 000000040100004d. or: ------------[ cut here ]------------ WARNING: at mm/mmap.c:3200 exit_mmap+0x150/0x158() Modules linked in: ipv6 at24 octeon3_ethernet octeon_srio_nexus m25p80 CPU: 12 PID: 1295 Comm: pmderr Not tainted 3.10.87-rt80-Cavium-Octeon #4 Stack : 0000000040808000 0000000014009ce1 0000000000400004 ffffffff81076ba0 0000000000000000 0000000000000000 ffffffff85110000 0000000000000119 0000000000000004 0000000000000000 0000000000000119 43617669756d2d4f 0000000000000000 ffffffff850fda40 ffffffff85110000 0000000000000000 0000000000000000 0000000000000009 ffffffff809207a0 0000000000000c80 ffffffff80f1bf20 0000000000000001 000000ffeca36828 0000000000000001 0000000000000000 0000000000000001 000000ffeca7e700 ffffffff80886924 80000003fd7a0000 80000003fd7a39b0 80000003fdea8000 ffffffff80885780 80000003fdea8000 ffffffff80f12218 000000000000000c 000000000000050f 0000000000000000 ffffffff80865c4c 0000000000000000 0000000000000000 ... Call Trace: [<ffffffff80865c4c>] show_stack+0x6c/0xf8 [<ffffffff80885780>] warn_slowpath_common+0x78/0xa8 [<ffffffff809207a0>] exit_mmap+0x150/0x158 [<ffffffff80882d44>] mmput+0x5c/0x110 [<ffffffff8088b450>] do_exit+0x230/0xa68 [<ffffffff8088be34>] do_group_exit+0x54/0x1d0 [<ffffffff8088bfc0>] __wake_up_parent+0x0/0x18 ---[ end trace c7b38293191c57dc ]--- BUG: Bad rss-counter state mm:80000003fa168000 idx:1 val:1536 Fix by not clearing _PAGE_HUGE bit. Signed-off-by: David Daney <[email protected]> Tested-by: Aaro Koskinen <[email protected]> Cc: [email protected] Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/13687/ Signed-off-by: Ralf Baechle <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jun 18, 2020
When SCSI-MQ is enabled, in some case system would present nr_possible_cpus() which is greater than requested vectors by the driver. This results into driver being able to get larger number of MSI-X vectors than actual online CPUs. Driver then uses pci_alloc_irq_vectors_affinity() to assign 1:1 mapping and affinity for each MSI-x vector to CPUs. When the command is submitted using MSI-x vector, assigned to offline CPU, it results in an ABTS and system hang. This hang is result of a driver not being able to process interrupt on a vector assigned to an Off-line CPUs This patch fixes this issue by setting irq_offset value for the blk_mq_pci_map_queues() to use only those CPUs which has CPU mask affinity assigned and are online. By using the irq_offset value, driver will allow online cpumask to decide which vectors are used in blk_mq_pci_map_queues(). Fixes: 5601236 ("scsi: qla2xxx: Add Block Multi Queue functionality.") Cc: <[email protected]> #4.19 Signed-off-by: Ming Lei <[email protected]> Reviewed-by: Himanshu Madhani <[email protected]> Tested-by: Himanshu Madhani <[email protected]> Reviewed-by: Ewan D. Milne <[email protected]> Signed-off-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
heiher
pushed a commit
that referenced
this issue
Jun 18, 2020
…supported This patch fixes warning seen when BLK-MQ is enabled and hardware does not support MQ. This will result into driver requesting MSIx vectors which are equal or less than pre_desc via PCI IRQ Affinity infrastructure. [ 19.746300] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.12-k. [ 19.746599] qla2xxx [0000:02:00.0]-001d: : Found an ISP2432 irq 18 iobase 0x(____ptrval____). [ 20.203186] ------------[ cut here ]------------ [ 20.203306] WARNING: CPU: 8 PID: 268 at drivers/pci/msi.c:1273 pci_irq_get_affinity+0xf4/0x120 [ 20.203481] Modules linked in: tg3 ptp qla2xxx(+) pps_core sg libphy scsi_transport_fc flash loop autofs4 [ 20.203700] CPU: 8 PID: 268 Comm: systemd-udevd Not tainted 5.0.0-rc5-00358-gdf3865f #113 [ 20.203830] Call Trace: [ 20.203933] [0000000000461bb0] __warn+0xb0/0xe0 [ 20.204090] [00000000006c8f34] pci_irq_get_affinity+0xf4/0x120 [ 20.204219] [000000000068c764] blk_mq_pci_map_queues+0x24/0x120 [ 20.204396] [00000000007162f4] scsi_map_queues+0x14/0x40 [ 20.204626] [0000000000673654] blk_mq_update_queue_map+0x94/0xe0 [ 20.204698] [0000000000676ce0] blk_mq_alloc_tag_set+0x120/0x300 [ 20.204869] [000000000071077c] scsi_add_host_with_dma+0x7c/0x300 [ 20.205419] [00000000100ead54] qla2x00_probe_one+0x19d4/0x2640 [qla2xxx] [ 20.205621] [00000000006b3c88] pci_device_probe+0xc8/0x160 [ 20.205697] [0000000000701c0c] really_probe+0x1ac/0x2e0 [ 20.205770] [0000000000701f90] driver_probe_device+0x50/0x100 [ 20.205843] [0000000000702134] __driver_attach+0xf4/0x120 [ 20.205913] [0000000000700644] bus_for_each_dev+0x44/0x80 [ 20.206081] [0000000000700c98] bus_add_driver+0x198/0x220 [ 20.206300] [0000000000702950] driver_register+0x70/0x120 [ 20.206582] [0000000010248224] qla2x00_module_init+0x224/0x284 [qla2xxx] [ 20.206857] ---[ end trace b1de7a3f79fab2c2 ]--- The fix is to check if the hardware does not have Multi Queue capabiltiy, use pci_alloc_irq_vectors() call instead of pci_alloc_irq_affinity(). Fixes: f664a3c ("scsi: kill off the legacy IO path") Cc: [email protected] #4.19 Signed-off-by: Giridhar Malavali <[email protected]> Signed-off-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
lshw
pushed a commit
to lshw/linux-stable-loongson3
that referenced
this issue
Mar 20, 2023
This reverts commit d76c743. While commit d76c743 ("serial: 8250_dw: Fix runtime PM handling") fixes runtime PM handling when using kgdb, it introduces a traceback for everyone else. BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/next/drivers/base/power/runtime.c:1034 in_atomic(): 1, irqs_disabled(): 1, pid: 1, name: swapper/0 7 locks held by swapper/0/1: #0: 000000005ec5bc72 (&dev->mutex){....}, at: __driver_attach+0xb5/0x12b loongson-community#1: 000000005d5fa9e5 (&dev->mutex){....}, at: __device_attach+0x3e/0x15b loongson-community#2: 0000000047e93286 (serial_mutex){+.+.}, at: serial8250_register_8250_port+0x51/0x8bb loongson-community#3: 000000003b328f07 (port_mutex){+.+.}, at: uart_add_one_port+0xab/0x8b0 loongson-community#4: 00000000fa313d4d (&port->mutex){+.+.}, at: uart_add_one_port+0xcc/0x8b0 loongson-community#5: 00000000090983ca (console_lock){+.+.}, at: vprintk_emit+0xdb/0x217 loongson-community#6: 00000000c743e583 (console_owner){-...}, at: console_unlock+0x211/0x60f irq event stamp: 735222 __down_trylock_console_sem+0x4a/0x84 console_unlock+0x338/0x60f __do_softirq+0x4a4/0x50d irq_exit+0x64/0xe2 CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc5 loongson-community#6 Hardware name: Google Caroline/Caroline, BIOS Google_Caroline.7820.286.0 03/15/2017 Call Trace: dump_stack+0x7d/0xbd ___might_sleep+0x238/0x259 __pm_runtime_resume+0x4e/0xa4 ? serial8250_rpm_get+0x2e/0x44 serial8250_console_write+0x44/0x301 ? lock_acquire+0x1b8/0x1fa console_unlock+0x577/0x60f vprintk_emit+0x1f0/0x217 printk+0x52/0x6e register_console+0x43b/0x524 uart_add_one_port+0x672/0x8b0 ? set_io_from_upio+0x150/0x162 serial8250_register_8250_port+0x825/0x8bb dw8250_probe+0x80c/0x8b0 ? dw8250_serial_inq+0x8e/0x8e ? dw8250_check_lcr+0x108/0x108 ? dw8250_runtime_resume+0x5b/0x5b ? dw8250_serial_outq+0xa1/0xa1 ? dw8250_remove+0x115/0x115 platform_drv_probe+0x76/0xc5 really_probe+0x1f1/0x3ee ? driver_allows_async_probing+0x5d/0x5d driver_probe_device+0xd6/0x112 ? driver_allows_async_probing+0x5d/0x5d bus_for_each_drv+0xbe/0xe5 __device_attach+0xdd/0x15b bus_probe_device+0x5a/0x10b device_add+0x501/0x894 ? _raw_write_unlock+0x27/0x3a platform_device_add+0x224/0x2b7 mfd_add_device+0x718/0x75b ? __kmalloc+0x144/0x16a ? mfd_add_devices+0x38/0xdb mfd_add_devices+0x9b/0xdb intel_lpss_probe+0x7d4/0x8ee intel_lpss_pci_probe+0xac/0xd4 pci_device_probe+0x101/0x18e ... Revert the offending patch until a more comprehensive solution is available. Cc: Tony Lindgren <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Phil Edworthy <[email protected]> Fixes: d76c743 ("serial: 8250_dw: Fix runtime PM handling") Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
lshw
pushed a commit
to lshw/linux-stable-loongson3
that referenced
this issue
Mar 20, 2023
Fixes a crash when the report encounters an address that could not be associated with an mmaped region: #0 0x00005555557bdc4a in callchain_srcline (ip=<error reading variable: Cannot access memory at address 0x38>, sym=0x0, map=0x0) at util/machine.c:2329 loongson-community#1 unwind_entry (entry=entry@entry=0x7fffffff9180, arg=arg@entry=0x7ffff5642498) at util/machine.c:2329 loongson-community#2 0x00005555558370af in entry (arg=0x7ffff5642498, cb=0x5555557bdb50 <unwind_entry>, thread=<optimized out>, ip=18446744073709551615) at util/unwind-libunwind-local.c:586 loongson-community#3 get_entries (ui=ui@entry=0x7fffffff9620, cb=0x5555557bdb50 <unwind_entry>, arg=0x7ffff5642498, max_stack=<optimized out>) at util/unwind-libunwind-local.c:703 loongson-community#4 0x0000555555837192 in _unwind__get_entries (cb=<optimized out>, arg=<optimized out>, thread=<optimized out>, data=<optimized out>, max_stack=<optimized out>) at util/unwind-libunwind-local.c:725 loongson-community#5 0x00005555557c310f in thread__resolve_callchain_unwind (max_stack=127, sample=0x7fffffff9830, evsel=0x555555c7b3b0, cursor=0x7ffff5642498, thread=0x555555c7f6f0) at util/machine.c:2351 loongson-community#6 thread__resolve_callchain (thread=0x555555c7f6f0, cursor=0x7ffff5642498, evsel=0x555555c7b3b0, sample=0x7fffffff9830, parent=0x7fffffff97b8, root_al=0x7fffffff9750, max_stack=127) at util/machine.c:2378 loongson-community#7 0x00005555557ba4ee in sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fffffff97b8, evsel=<optimized out>, al=al@entry=0x7fffffff9750, max_stack=<optimized out>) at util/callchain.c:1085 Signed-off-by: Milian Wolff <[email protected]> Tested-by: Sandipan Das <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Jin Yao <[email protected]> Cc: Namhyung Kim <[email protected]> Fixes: 2a9d505 ("perf script: Show correct offsets for DWARF-based unwinding") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
lshw
pushed a commit
to lshw/linux-stable-loongson3
that referenced
this issue
Mar 20, 2023
The function that puts back the MR in cache also removes the DMA address from the HCA. Therefore we need to call this function before we remove the DMA mapping from MMU. Otherwise the HCA may access a memory that is no longer DMA mapped. Call trace: NMI: IOCK error (debug interrupt?) for reason 71 on CPU 0. CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc6+ loongson-community#4 Hardware name: HP ProLiant DL360p Gen8, BIOS P71 08/20/2012 RIP: 0010:intel_idle+0x73/0x120 Code: 80 5c 01 00 0f ae 38 0f ae f0 31 d2 65 48 8b 04 25 80 5c 01 00 48 89 d1 0f 60 02 RSP: 0018:ffffffff9a403e38 EFLAGS: 00000046 RAX: 0000000000000030 RBX: 0000000000000005 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ffffffff9a5790c0 RDI: 0000000000000000 RBP: 0000000000000030 R08: 0000000000000000 R09: 0000000000007cf9 R10: 000000000000030a R11: 0000000000000018 R12: 0000000000000000 R13: ffffffff9a5792b8 R14: ffffffff9a5790c0 R15: 0000002b48471e4d FS: 0000000000000000(0000) GS:ffff9c6caf400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f5737185000 CR3: 0000000590c0a002 CR4: 00000000000606f0 Call Trace: cpuidle_enter_state+0x7e/0x2e0 do_idle+0x1ed/0x290 cpu_startup_entry+0x6f/0x80 start_kernel+0x524/0x544 ? set_init_arg+0x55/0x55 secondary_startup_64+0xa4/0xb0 DMAR: DRHD: handling fault status reg 2 DMAR: [DMA Read] Request device [04:00.0] fault addr b34d2000 [fault reason 06] PTE Read access is not set DMAR: [DMA Read] Request device [01:00.2] fault addr bff8b000 [fault reason 06] PTE Read access is not set Fixes: f3f134f ("RDMA/mlx5: Fix crash while accessing garbage pointer and freed memory") Signed-off-by: Valentine Fatiev <[email protected]> Reviewed-by: Moni Shoua <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
lshw
pushed a commit
to lshw/linux-stable-loongson3
that referenced
this issue
Mar 20, 2023
When the function name for an inline frame is invalid, we must not try to demangle this symbol, otherwise we crash with: #0 0x0000555555895c01 in bfd_demangle () loongson-community#1 0x0000555555823262 in demangle_sym (dso=0x555555d92b90, elf_name=0x0, kmodule=0) at util/symbol-elf.c:215 loongson-community#2 dso__demangle_sym (dso=dso@entry=0x555555d92b90, kmodule=<optimized out>, kmodule@entry=0, elf_name=elf_name@entry=0x0) at util/symbol-elf.c:400 loongson-community#3 0x00005555557fef4b in new_inline_sym (funcname=0x0, base_sym=0x555555d92b90, dso=0x555555d92b90) at util/srcline.c:89 loongson-community#4 inline_list__append_dso_a2l (dso=dso@entry=0x555555c7bb00, node=node@entry=0x555555e31810, sym=sym@entry=0x555555d92b90) at util/srcline.c:264 loongson-community#5 0x00005555557ff27f in addr2line (dso_name=dso_name@entry=0x555555d92430 "/home/milian/.debug/.build-id/f7/186d14bb94f3c6161c010926da66033d24fce5/elf", addr=addr@entry=2888, file=file@entry=0x0, line=line@entry=0x0, dso=dso@entry=0x555555c7bb00, unwind_inlines=unwind_inlines@entry=true, node=0x555555e31810, sym=0x555555d92b90) at util/srcline.c:313 loongson-community#6 0x00005555557ffe7c in addr2inlines (sym=0x555555d92b90, dso=0x555555c7bb00, addr=2888, dso_name=0x555555d92430 "/home/milian/.debug/.build-id/f7/186d14bb94f3c6161c010926da66033d24fce5/elf") at util/srcline.c:358 So instead handle the case where we get invalid function names for inlined frames and use a fallback '??' function name instead. While this crash was originally reported by Hadrien for rust code, I can now also reproduce it with trivial C++ code. Indeed, it seems like libbfd fails to interpret the debug information for the inline frame symbol name: $ addr2line -e /home/milian/.debug/.build-id/f7/186d14bb94f3c6161c010926da66033d24fce5/elf -if b48 main /usr/include/c++/8.2.1/complex:610 ?? /usr/include/c++/8.2.1/complex:618 ?? /usr/include/c++/8.2.1/complex:675 ?? /usr/include/c++/8.2.1/complex:685 main /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 I've reported this bug upstream and also attached a patch there which should fix this issue: https://sourceware.org/bugzilla/show_bug.cgi?id=23715 Reported-by: Hadrien Grasland <[email protected]> Signed-off-by: Milian Wolff <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Fixes: a64489c ("perf report: Find the inline stack for a given address") [ The above 'Fixes:' cset is where originally the problem was introduced, i.e. using a2l->funcname without checking if it is NULL, but this current patch fixes the current codebase, i.e. multiple csets were applied after a64489c before the problem was reported by Hadrien ] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
lshw
pushed a commit
to lshw/linux-stable-loongson3
that referenced
this issue
Mar 20, 2023
When SCSI-MQ is enabled, in some case system would present nr_possible_cpus() which is greater than requested vectors by the driver. This results into driver being able to get larger number of MSI-X vectors than actual online CPUs. Driver then uses pci_alloc_irq_vectors_affinity() to assign 1:1 mapping and affinity for each MSI-x vector to CPUs. When the command is submitted using MSI-x vector, assigned to offline CPU, it results in an ABTS and system hang. This hang is result of a driver not being able to process interrupt on a vector assigned to an Off-line CPUs This patch fixes this issue by setting irq_offset value for the blk_mq_pci_map_queues() to use only those CPUs which has CPU mask affinity assigned and are online. By using the irq_offset value, driver will allow online cpumask to decide which vectors are used in blk_mq_pci_map_queues(). Fixes: 5601236 ("scsi: qla2xxx: Add Block Multi Queue functionality.") Cc: <[email protected]> loongson-community#4.19 Signed-off-by: Ming Lei <[email protected]> Reviewed-by: Himanshu Madhani <[email protected]> Tested-by: Himanshu Madhani <[email protected]> Reviewed-by: Ewan D. Milne <[email protected]> Signed-off-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
lshw
pushed a commit
to lshw/linux-stable-loongson3
that referenced
this issue
Mar 20, 2023
…supported This patch fixes warning seen when BLK-MQ is enabled and hardware does not support MQ. This will result into driver requesting MSIx vectors which are equal or less than pre_desc via PCI IRQ Affinity infrastructure. [ 19.746300] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.12-k. [ 19.746599] qla2xxx [0000:02:00.0]-001d: : Found an ISP2432 irq 18 iobase 0x(____ptrval____). [ 20.203186] ------------[ cut here ]------------ [ 20.203306] WARNING: CPU: 8 PID: 268 at drivers/pci/msi.c:1273 pci_irq_get_affinity+0xf4/0x120 [ 20.203481] Modules linked in: tg3 ptp qla2xxx(+) pps_core sg libphy scsi_transport_fc flash loop autofs4 [ 20.203700] CPU: 8 PID: 268 Comm: systemd-udevd Not tainted 5.0.0-rc5-00358-gdf3865f #113 [ 20.203830] Call Trace: [ 20.203933] [0000000000461bb0] __warn+0xb0/0xe0 [ 20.204090] [00000000006c8f34] pci_irq_get_affinity+0xf4/0x120 [ 20.204219] [000000000068c764] blk_mq_pci_map_queues+0x24/0x120 [ 20.204396] [00000000007162f4] scsi_map_queues+0x14/0x40 [ 20.204626] [0000000000673654] blk_mq_update_queue_map+0x94/0xe0 [ 20.204698] [0000000000676ce0] blk_mq_alloc_tag_set+0x120/0x300 [ 20.204869] [000000000071077c] scsi_add_host_with_dma+0x7c/0x300 [ 20.205419] [00000000100ead54] qla2x00_probe_one+0x19d4/0x2640 [qla2xxx] [ 20.205621] [00000000006b3c88] pci_device_probe+0xc8/0x160 [ 20.205697] [0000000000701c0c] really_probe+0x1ac/0x2e0 [ 20.205770] [0000000000701f90] driver_probe_device+0x50/0x100 [ 20.205843] [0000000000702134] __driver_attach+0xf4/0x120 [ 20.205913] [0000000000700644] bus_for_each_dev+0x44/0x80 [ 20.206081] [0000000000700c98] bus_add_driver+0x198/0x220 [ 20.206300] [0000000000702950] driver_register+0x70/0x120 [ 20.206582] [0000000010248224] qla2x00_module_init+0x224/0x284 [qla2xxx] [ 20.206857] ---[ end trace b1de7a3f79fab2c2 ]--- The fix is to check if the hardware does not have Multi Queue capabiltiy, use pci_alloc_irq_vectors() call instead of pci_alloc_irq_affinity(). Fixes: f664a3c ("scsi: kill off the legacy IO path") Cc: [email protected] loongson-community#4.19 Signed-off-by: Giridhar Malavali <[email protected]> Signed-off-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi Loonger,
I reported PR38063 to LLVM, because GCC is able to work, but I argue that ugly workaround will not be merged by upstream:
Please give me some hint to make the patch looks good, thanks a lot!
Regards,
Leslie Zhai
The text was updated successfully, but these errors were encountered: