Update README #101

raintean · 2014-06-27T15:54:37Z

No description provided.

younishd · 2014-06-27T16:45:39Z

test

ComradePhilos · 2014-06-27T16:58:56Z

dude, it worked! :D

raintean · 2014-06-28T07:23:47Z

Turn it into (for example): [ 0.073380] x86: Booting SMP configuration: [ 0.074005] .... node #0, CPUs: #1 #2 #3 #4 #5 torvalds#6 torvalds#7 [ 0.603005] .... node #1, CPUs: torvalds#8 torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 [ 1.200005] .... node #2, CPUs: torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 [ 1.796005] .... node #3, CPUs: torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 [ 2.393005] .... node #4, CPUs: torvalds#32 torvalds#33 torvalds#34 torvalds#35 torvalds#36 torvalds#37 torvalds#38 torvalds#39 [ 2.996005] .... node #5, CPUs: torvalds#40 torvalds#41 torvalds#42 torvalds#43 torvalds#44 torvalds#45 torvalds#46 torvalds#47 [ 3.600005] .... node torvalds#6, CPUs: torvalds#48 torvalds#49 torvalds#50 torvalds#51 #52 #53 torvalds#54 torvalds#55 [ 4.202005] .... node torvalds#7, CPUs: torvalds#56 torvalds#57 #58 torvalds#59 torvalds#60 torvalds#61 torvalds#62 torvalds#63 [ 4.811005] .... node torvalds#8, CPUs: torvalds#64 torvalds#65 torvalds#66 torvalds#67 torvalds#68 torvalds#69 #70 torvalds#71 [ 5.421006] .... node torvalds#9, CPUs: torvalds#72 torvalds#73 torvalds#74 torvalds#75 torvalds#76 torvalds#77 torvalds#78 torvalds#79 [ 6.032005] .... node torvalds#10, CPUs: torvalds#80 torvalds#81 torvalds#82 torvalds#83 torvalds#84 torvalds#85 torvalds#86 torvalds#87 [ 6.648006] .... node torvalds#11, CPUs: torvalds#88 torvalds#89 torvalds#90 torvalds#91 torvalds#92 torvalds#93 torvalds#94 torvalds#95 [ 7.262005] .... node torvalds#12, CPUs: torvalds#96 torvalds#97 torvalds#98 torvalds#99 torvalds#100 torvalds#101 torvalds#102 torvalds#103 [ 7.865005] .... node torvalds#13, CPUs: torvalds#104 torvalds#105 torvalds#106 torvalds#107 torvalds#108 torvalds#109 torvalds#110 torvalds#111 [ 8.466005] .... node torvalds#14, CPUs: torvalds#112 torvalds#113 torvalds#114 torvalds#115 torvalds#116 torvalds#117 torvalds#118 torvalds#119 [ 9.073006] .... node torvalds#15, CPUs: torvalds#120 torvalds#121 torvalds#122 torvalds#123 torvalds#124 torvalds#125 torvalds#126 torvalds#127 [ 9.679901] x86: Booted up 16 nodes, 128 CPUs and drop useless elements. Change num_digits() to hpa's division-avoiding, cell-phone-typed version which he went at great lengths and pains to submit on a Saturday evening. Signed-off-by: Borislav Petkov <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>

This reverts commit 3617904. IRQ_CROSSBAR_96 maps to irq torvalds#101 and not torvalds#96, which is mmc4's irq. Change-Id: I9aecf8e08abeda5917817aea38634edd0f84b721 Signed-off-by: Ido Yariv <[email protected]> Signed-off-by: bvijay <[email protected]>

nf_ct_gre_keymap_flush() removes a nf_ct_gre_keymap object from net_gre->keymap_list and frees the object. But it doesn't clean a reference on this object from ct_pptp_info->keymap[dir]. Then nf_ct_gre_keymap_destroy() may release the same object again. So nf_ct_gre_keymap_flush() can be called only when we are sure that when nf_ct_gre_keymap_destroy will not be called. nf_ct_gre_keymap is created by nf_ct_gre_keymap_add() and the right way to destroy it is to call nf_ct_gre_keymap_destroy(). This patch marks nf_ct_gre_keymap_flush() as static, so this patch can break compilation of third party modules, which use nf_ct_gre_keymap_flush. I'm not sure this is the right way to deprecate this function. [ 226.540793] general protection fault: 0000 [#1] SMP [ 226.541750] Modules linked in: nf_nat_pptp nf_nat_proto_gre nf_conntrack_pptp nf_conntrack_proto_gre ip_gre ip_tunnel gre ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack veth tun bridge stp llc ppdev microcode joydev pcspkr serio_raw virtio_console virtio_balloon floppy parport_pc parport pvpanic i2c_piix4 virtio_net drm_kms_helper ttm ata_generic virtio_pci virtio_ring virtio drm i2c_core pata_acpi [last unloaded: ip_tunnel] [ 226.541776] CPU: 0 PID: 49 Comm: kworker/u4:2 Not tainted 3.14.0-rc8+ torvalds#101 [ 226.541776] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 226.541776] Workqueue: netns cleanup_net [ 226.541776] task: ffff8800371e0000 ti: ffff88003730c000 task.ti: ffff88003730c000 [ 226.541776] RIP: 0010:[<ffffffff81389ba9>] [<ffffffff81389ba9>] __list_del_entry+0x29/0xd0 [ 226.541776] RSP: 0018:ffff88003730dbd0 EFLAGS: 00010a83 [ 226.541776] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8800374e6c40 RCX: dead000000200200 [ 226.541776] RDX: 6b6b6b6b6b6b6b6b RSI: ffff8800371e07d0 RDI: ffff8800374e6c40 [ 226.541776] RBP: ffff88003730dbd0 R08: 0000000000000000 R09: 0000000000000000 [ 226.541776] R10: 0000000000000001 R11: ffff88003730d92e R12: 0000000000000002 [ 226.541776] R13: ffff88007a4c42d0 R14: ffff88007aef0000 R15: ffff880036cf0018 [ 226.541776] FS: 0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 [ 226.541776] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 226.541776] CR2: 00007f07f643f7d0 CR3: 0000000036fd2000 CR4: 00000000000006f0 [ 226.541776] Stack: [ 226.541776] ffff88003730dbe8 ffffffff81389c5d ffff8800374ffbe4 ffff88003730dc28 [ 226.541776] ffffffffa0162a43 ffffffffa01627c5 ffff88007a4c42d0 ffff88007aef0000 [ 226.541776] ffffffffa01651c0 ffff88007a4c45e0 ffff88007aef0000 ffff88003730dc40 [ 226.541776] Call Trace: [ 226.541776] [<ffffffff81389c5d>] list_del+0xd/0x30 [ 226.541776] [<ffffffffa0162a43>] nf_ct_gre_keymap_destroy+0x283/0x2d0 [nf_conntrack_proto_gre] [ 226.541776] [<ffffffffa01627c5>] ? nf_ct_gre_keymap_destroy+0x5/0x2d0 [nf_conntrack_proto_gre] [ 226.541776] [<ffffffffa0162ab7>] gre_destroy+0x27/0x70 [nf_conntrack_proto_gre] [ 226.541776] [<ffffffffa0117de3>] destroy_conntrack+0x83/0x200 [nf_conntrack] [ 226.541776] [<ffffffffa0117d87>] ? destroy_conntrack+0x27/0x200 [nf_conntrack] [ 226.541776] [<ffffffffa0117d60>] ? nf_conntrack_hash_check_insert+0x2e0/0x2e0 [nf_conntrack] [ 226.541776] [<ffffffff81630142>] nf_conntrack_destroy+0x72/0x180 [ 226.541776] [<ffffffff816300d5>] ? nf_conntrack_destroy+0x5/0x180 [ 226.541776] [<ffffffffa011ef80>] ? kill_l3proto+0x20/0x20 [nf_conntrack] [ 226.541776] [<ffffffffa011847e>] nf_ct_iterate_cleanup+0x14e/0x170 [nf_conntrack] [ 226.541776] [<ffffffffa011f74b>] nf_ct_l4proto_pernet_unregister+0x5b/0x90 [nf_conntrack] [ 226.541776] [<ffffffffa0162409>] proto_gre_net_exit+0x19/0x30 [nf_conntrack_proto_gre] [ 226.541776] [<ffffffff815edf89>] ops_exit_list.isra.1+0x39/0x60 [ 226.541776] [<ffffffff815eecc0>] cleanup_net+0x100/0x1d0 [ 226.541776] [<ffffffff810a608a>] process_one_work+0x1ea/0x4f0 [ 226.541776] [<ffffffff810a6028>] ? process_one_work+0x188/0x4f0 [ 226.541776] [<ffffffff810a64ab>] worker_thread+0x11b/0x3a0 [ 226.541776] [<ffffffff810a6390>] ? process_one_work+0x4f0/0x4f0 [ 226.541776] [<ffffffff810af42d>] kthread+0xed/0x110 [ 226.541776] [<ffffffff8173d4dc>] ? _raw_spin_unlock_irq+0x2c/0x40 [ 226.541776] [<ffffffff810af340>] ? kthread_create_on_node+0x200/0x200 [ 226.541776] [<ffffffff8174747c>] ret_from_fork+0x7c/0xb0 [ 226.541776] [<ffffffff810af340>] ? kthread_create_on_node+0x200/0x200 [ 226.541776] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48 39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89 42 08 [ 226.541776] RIP [<ffffffff81389ba9>] __list_del_entry+0x29/0xd0 [ 226.541776] RSP <ffff88003730dbd0> [ 226.612193] ---[ end trace 985ae23ddfcc357c ]--- Cc: Pablo Neira Ayuso <[email protected]> Cc: Patrick McHardy <[email protected]> Cc: Jozsef Kadlecsik <[email protected]> Cc: "David S. Miller" <[email protected]> Signed-off-by: Andrey Vagin <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>

Added support for the DVB devices: DVBSky USB, TechoTrend CT2 4400 and TechnoTrend CT2 4650

…echnoTrend CT2 4650 See PR torvalds#101 for more information Change-Id: I245214a6d2c37f6e7d365461e564d5a7374ea6c0

In certain probe conditions the interrupt came right after registering the handler causing a NULL pointer exception because of uninitialized waitqueue: $ udevadm trigger i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL) i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL) Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = e8b38000 [00000000] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) (...) (__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c) (__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30) (ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c) (handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68) (exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194) (handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4) (__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94) (gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80) The bug was reproduced on exynos4412-trats2 (with a max77693 device also using i2c-gpio) after building max77693 as a module. Cc: <[email protected]> Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling") Signed-off-by: Krzysztof Kozlowski <[email protected]>

In certain probe conditions the interrupt came right after registering the handler causing a NULL pointer exception because of uninitialized waitqueue: $ udevadm trigger i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL) i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL) Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = e8b38000 [00000000] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) (...) (__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c) (__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30) (ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c) (handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68) (exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194) (handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4) (__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94) (gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80) The bug was reproduced on exynos4412-trats2 (with a max77693 device also using i2c-gpio) after building max77693 as a module. Cc: <[email protected]> Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling") Signed-off-by: Krzysztof Kozlowski <[email protected]> Tested-by: Gregor Boirie <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]>

commit 07d2390 upstream. In certain probe conditions the interrupt came right after registering the handler causing a NULL pointer exception because of uninitialized waitqueue: $ udevadm trigger i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL) i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL) Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = e8b38000 [00000000] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) (...) (__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c) (__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30) (ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c) (handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68) (exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194) (handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4) (__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94) (gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80) The bug was reproduced on exynos4412-trats2 (with a max77693 device also using i2c-gpio) after building max77693 as a module. Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling") Signed-off-by: Krzysztof Kozlowski <[email protected]> Tested-by: Gregor Boirie <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 07d2390 upstream. In certain probe conditions the interrupt came right after registering the handler causing a NULL pointer exception because of uninitialized waitqueue: $ udevadm trigger i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL) i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL) Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = e8b38000 [00000000] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty #101 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) (...) (__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c) (__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30) (ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c) (handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68) (exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194) (handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4) (__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94) (gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80) The bug was reproduced on exynos4412-trats2 (with a max77693 device also using i2c-gpio) after building max77693 as a module. Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling") Signed-off-by: Krzysztof Kozlowski <[email protected]> Tested-by: Gregor Boirie <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

On Fri, May 13, 2016 at 07:23:34AM +0200, Marc Haber wrote: > How do I apply this? I'm attaching it. $ patch -p1 --dry-run -i /tmp/01-mm-thp-calculate_the_mapcount_correctly_for_thp_pages_during_wp_faults.patch checking file include/linux/mm.h checking file include/linux/swap.h checking file mm/huge_memory.c checking file mm/memory.c checking file mm/swapfile.c $ patch -p1 -i /tmp/01-mm-thp-calculate_the_mapcount_correctly_for_thp_pages_during_wp_faults.patch patching file include/linux/mm.h patching file include/linux/swap.h patching file mm/huge_memory.c patching file mm/memory.c patching file mm/swapfile.c The --dry-run is to check whether it applies first. That's on 4.6-rc7+ here. HTH. -- Regards/Gruss, Boris. ECO tip torvalds#101: Trim your mails when you reply. From [email protected] Tue May 10 21:21:32 2016 From: Andrea Arcangeli <[email protected]> To: Andrew Morton <[email protected]>, [email protected], [email protected] Cc: Alex Williamson <[email protected]>, "Kirill A. Shutemov" <[email protected]> Subject: [PATCH 1/1] mm: thp: calculate the mapcount correctly for THP pages during WP faults Date: Tue, 10 May 2016 21:21:22 +0200 Message-Id: <[email protected]> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=utf-8 Status: RO This will provide fully accuracy to the mapcount calculation in the write protect faults, so page pinning will not get broken by false positive copy-on-writes. total_mapcount() isn't the right calculation needed in reuse_swap_page(), so this introduces a page_trans_huge_mapcount() that is effectively the full accurate return value for page_mapcount() if dealing with Transparent Hugepages, however we only use the page_trans_huge_mapcount() during COW faults where it strictly needed, due to its higher runtime cost. This also provide at practical zero cost the total_mapcount information which is needed to know if we can still relocate the page anon_vma to the local vma. If page_trans_huge_mapcount() returns 1 we can reuse the page no matter if it's a pte or a pmd_trans_huge triggering the fault, but we can only relocate the page anon_vma to the local vma->anon_vma if we're sure it's only this "vma" mapping the whole THP physical range. Kirill A. Shutemov discovered the problem with moving the page anon_vma to the local vma->anon_vma in a previous version of this patch and another problem in the way page_move_anon_rmap() was called. Andrew Morton discovered that CONFIG_SWAP=n wouldn't build in a previous version, because reuse_swap_page must be a macro to call page_trans_huge_mapcount from swap.h, so this uses a macro again instead of an inline function. With this change at least it's a less dangerous usage than it was before, because "page" is used only once now, while with the previous code reuse_swap_page(page++) would have called page_mapcount on page+1 and it would have increased page twice instead of just once. Reviewed-by: "Kirill A. Shutemov" <[email protected]> Signed-off-by: Andrea Arcangeli <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Al Viro <[email protected]> Cc: Alex Williamson <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Ebru Akagunduz <[email protected]> Cc: Geliang Tang <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jan Kara <[email protected]> Cc: Jerome Marchand <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Miklos Szeredi <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Xie XiuQi <[email protected]> Cc: linux-mm <[email protected]> Cc: lkml <[email protected]> Link: http://lkml.kernel.org/r/[email protected]

[ Upstream commit 07d2390 ] In certain probe conditions the interrupt came right after registering the handler causing a NULL pointer exception because of uninitialized waitqueue: $ udevadm trigger i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL) i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL) Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = e8b38000 [00000000] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) (...) (__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c) (__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30) (ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c) (handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68) (exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194) (handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4) (__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94) (gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80) The bug was reproduced on exynos4412-trats2 (with a max77693 device also using i2c-gpio) after building max77693 as a module. Cc: <[email protected]> Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling") Signed-off-by: Krzysztof Kozlowski <[email protected]> Tested-by: Gregor Boirie <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit 07d2390 upstream. In certain probe conditions the interrupt came right after registering the handler causing a NULL pointer exception because of uninitialized waitqueue: $ udevadm trigger i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL) i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL) Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = e8b38000 [00000000] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) (...) (__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c) (__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30) (ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c) (handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68) (exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194) (handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4) (__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94) (gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80) The bug was reproduced on exynos4412-trats2 (with a max77693 device also using i2c-gpio) after building max77693 as a module. Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling") Signed-off-by: Krzysztof Kozlowski <[email protected]> Tested-by: Gregor Boirie <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Jiri Slaby <[email protected]>

[ Upstream commit 07d2390 ] In certain probe conditions the interrupt came right after registering the handler causing a NULL pointer exception because of uninitialized waitqueue: $ udevadm trigger i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL) i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL) Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = e8b38000 [00000000] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) (...) (__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c) (__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30) (ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c) (handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68) (exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194) (handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4) (__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94) (gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80) The bug was reproduced on exynos4412-trats2 (with a max77693 device also using i2c-gpio) after building max77693 as a module. Cc: <[email protected]> Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling") Signed-off-by: Krzysztof Kozlowski <[email protected]> Tested-by: Gregor Boirie <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit 07d2390 upstream. In certain probe conditions the interrupt came right after registering the handler causing a NULL pointer exception because of uninitialized waitqueue: $ udevadm trigger i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL) i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL) Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = e8b38000 [00000000] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) (...) (__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c) (__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30) (ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c) (handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68) (exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194) (handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4) (__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94) (gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80) The bug was reproduced on exynos4412-trats2 (with a max77693 device also using i2c-gpio) after building max77693 as a module. Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling") Signed-off-by: Krzysztof Kozlowski <[email protected]> Tested-by: Gregor Boirie <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit 07d2390 upstream. In certain probe conditions the interrupt came right after registering the handler causing a NULL pointer exception because of uninitialized waitqueue: $ udevadm trigger i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL) i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL) Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = e8b38000 [00000000] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) (...) (__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c) (__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30) (ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c) (handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68) (exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194) (handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4) (__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94) (gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80) The bug was reproduced on exynos4412-trats2 (with a max77693 device also using i2c-gpio) after building max77693 as a module. Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling") Signed-off-by: Krzysztof Kozlowski <[email protected]> Tested-by: Gregor Boirie <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Jiri Slaby <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 07d2390 ] In certain probe conditions the interrupt came right after registering the handler causing a NULL pointer exception because of uninitialized waitqueue: $ udevadm trigger i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL) i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL) Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = e8b38000 [00000000] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) (...) (__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c) (__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30) (ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c) (handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68) (exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194) (handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4) (__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94) (gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80) The bug was reproduced on exynos4412-trats2 (with a max77693 device also using i2c-gpio) after building max77693 as a module. Cc: <[email protected]> Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling") Signed-off-by: Krzysztof Kozlowski <[email protected]> Tested-by: Gregor Boirie <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

commit 07d2390 upstream. In certain probe conditions the interrupt came right after registering the handler causing a NULL pointer exception because of uninitialized waitqueue: $ udevadm trigger i2c-gpio i2c-gpio-1: using pins 143 (SDA) and 144 (SCL) i2c-gpio i2c-gpio-3: using pins 53 (SDA) and 52 (SCL) Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = e8b38000 [00000000] *pgd=00000000 Internal error: Oops: 5 [#1] SMP ARM Modules linked in: snd_soc_i2s(+) i2c_gpio(+) snd_soc_idma snd_soc_s3c_dma snd_soc_core snd_pcm_dmaengine snd_pcm snd_timer snd soundcore ac97_bus spi_s3c64xx pwm_samsung dwc2 exynos_adc phy_exynos_usb2 exynosdrm exynos_rng rng_core rtc_s3c CPU: 0 PID: 717 Comm: data-provider-m Not tainted 4.6.0-rc1-next-20160401-00011-g1b8d87473b9e-dirty torvalds#101 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) (...) (__wake_up_common) from [<c0379624>] (__wake_up+0x38/0x4c) (__wake_up) from [<c0a41d30>] (ak8975_irq_handler+0x28/0x30) (ak8975_irq_handler) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389c40>] (handle_edge_irq+0xf0/0x19c) (handle_edge_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c05ee360>] (exynos_eint_gpio_irq+0x50/0x68) (exynos_eint_gpio_irq) from [<c0386720>] (handle_irq_event_percpu+0x88/0x140) (handle_irq_event_percpu) from [<c038681c>] (handle_irq_event+0x44/0x68) (handle_irq_event) from [<c0389a70>] (handle_fasteoi_irq+0xb4/0x194) (handle_fasteoi_irq) from [<c0385e04>] (generic_handle_irq+0x24/0x34) (generic_handle_irq) from [<c03860b4>] (__handle_domain_irq+0x5c/0xb4) (__handle_domain_irq) from [<c0301774>] (gic_handle_irq+0x54/0x94) (gic_handle_irq) from [<c030c910>] (__irq_usr+0x50/0x80) The bug was reproduced on exynos4412-trats2 (with a max77693 device also using i2c-gpio) after building max77693 as a module. Fixes: 94a6d5c ("iio:ak8975 Implement data ready interrupt handling") Signed-off-by: Krzysztof Kozlowski <[email protected]> Tested-by: Gregor Boirie <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

On 07/08/16 09:13, Petr Mladek wrote: > On Thu 2016-07-07 20:27:13, Topi Miettinen wrote: >> On 07/07/16 09:16, Petr Mladek wrote: >>> On Sun 2016-07-03 15:08:07, Topi Miettinen wrote: >>>> The attached patch would make any uses of capabilities generate audit >>>> messages. It works for simple tests as you can see from the commit >>>> message, but unfortunately the call to audit_cgroup_list() deadlocks the >>>> system when booting a full blown OS. There's no deadlock when the call >>>> is removed. >>>> >>>> I guess that in some cases, cgroup_mutex and/or css_set_lock could be >>>> already held earlier before entering audit_cgroup_list(). Holding the >>>> locks is however required by task_cgroup_from_root(). Is there any way >>>> to avoid this? For example, only print some kind of cgroup ID numbers >>>> (are there unique and stable IDs, available without locks?) for those >>>> cgroups where the task is registered in the audit message? >>> >>> I am not sure if anyone know what really happens here. I suggest to >>> enable lockdep. It might detect possible deadlock even before it >>> really happens, see Documentation/locking/lockdep-design.txt >>> >>> It can be enabled by >>> >>> CONFIG_PROVE_LOCKING=y >>> >>> It depends on >>> >>> CONFIG_DEBUG_KERNEL=y >>> >>> and maybe some more options, see lib/Kconfig.debug >> >> Thanks a lot! I caught this stack dump: >> >> starting version 230 >> [ 3.416647] ------------[ cut here ]------------ >> [ 3.417310] WARNING: CPU: 0 PID: 95 at >> /home/topi/d/linux.git/kernel/locking/lockdep.c:2871 >> lockdep_trace_alloc+0xb4/0xc0 >> [ 3.417605] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)) >> [ 3.417923] Modules linked in: >> [ 3.418288] CPU: 0 PID: 95 Comm: systemd-udevd Not tainted 4.7.0-rc5+ torvalds#97 >> [ 3.418444] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), >> BIOS Debian-1.8.2-1 04/01/2014 >> [ 3.418726] 0000000000000086 000000007970f3b0 ffff88000016fb00 >> ffffffff813c9c45 >> [ 3.418993] ffff88000016fb50 0000000000000000 ffff88000016fb40 >> ffffffff81091e9b >> [ 3.419176] 00000b3705e2c798 0000000000000046 0000000000000410 >> 00000000ffffffff >> [ 3.419374] Call Trace: >> [ 3.419511] [<ffffffff813c9c45>] dump_stack+0x67/0x92 >> [ 3.419644] [<ffffffff81091e9b>] __warn+0xcb/0xf0 >> [ 3.419745] [<ffffffff81091f1f>] warn_slowpath_fmt+0x5f/0x80 >> [ 3.419868] [<ffffffff810e9a84>] lockdep_trace_alloc+0xb4/0xc0 >> [ 3.419988] [<ffffffff8120dc42>] kmem_cache_alloc_node+0x42/0x600 >> [ 3.420156] [<ffffffff8110432d>] ? debug_lockdep_rcu_enabled+0x1d/0x20 >> [ 3.420170] [<ffffffff8163183b>] __alloc_skb+0x5b/0x1d0 >> [ 3.420170] [<ffffffff81144f6b>] audit_log_start+0x29b/0x480 >> [ 3.420170] [<ffffffff810a2925>] ? __lock_task_sighand+0x95/0x270 >> [ 3.420170] [<ffffffff81145cc9>] audit_log_cap_use+0x39/0xf0 >> [ 3.420170] [<ffffffff8109cd75>] ns_capable+0x45/0x70 >> [ 3.420170] [<ffffffff8109cdb7>] capable+0x17/0x20 >> [ 3.420170] [<ffffffff812a2f50>] oom_score_adj_write+0x150/0x2f0 >> [ 3.420170] [<ffffffff81230997>] __vfs_write+0x37/0x160 >> [ 3.420170] [<ffffffff810e33b7>] ? update_fast_ctr+0x17/0x30 >> [ 3.420170] [<ffffffff810e3449>] ? percpu_down_read+0x49/0x90 >> [ 3.420170] [<ffffffff81233d47>] ? __sb_start_write+0xb7/0xf0 >> [ 3.420170] [<ffffffff81233d47>] ? __sb_start_write+0xb7/0xf0 >> [ 3.420170] [<ffffffff81231048>] vfs_write+0xb8/0x1b0 >> [ 3.420170] [<ffffffff812533c6>] ? __fget_light+0x66/0x90 >> [ 3.420170] [<ffffffff81232078>] SyS_write+0x58/0xc0 >> [ 3.420170] [<ffffffff81001f2c>] do_syscall_64+0x5c/0x300 >> [ 3.420170] [<ffffffff81849c9a>] entry_SYSCALL64_slow_path+0x25/0x25 >> [ 3.420170] ---[ end trace fb586899fb556a5e ]--- >> [ 3.447922] random: systemd-udevd urandom read with 3 bits of entropy >> available >> [ 4.014078] clocksource: Switched to clocksource tsc >> Begin: Loading essential drivers ... done. >> >> This is with qemu and the boot continues normally. With real computer, >> there's no such output and system just seems to freeze. >> >> Could it be possible that the deadlock happens because there's some IO >> towards /sys/fs/cgroup, which causes a capability check and that in turn >> causes locking problems when we try to print cgroup list? > > The above warning is printed by the code from > kernel/locking/lockdep.c:2871 > > static void __lockdep_trace_alloc(gfp_t gfp_mask, unsigned long flags) > { > [...] > /* We're only interested __GFP_FS allocations for now */ > if (!(gfp_mask & __GFP_FS)) > return; > > /* > * Oi! Can't be having __GFP_FS allocations with IRQs disabled. > */ > if (DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))) > return; > > > The backtrace shows that your new audit_log_cap_use() is called > from vfs_write(). You might try to use audit_log_start() with > GFP_NOFS instead of GFP_KERNEL. > > Note that this is rather intuitive advice. I still need to learn a lot > about memory management and kernel in general to be more sure about > a correct solution. > > Best Regards, > Petr > With the attached patch, the system boots without deadlock. Locking problems still remain: [ 3.652221] ====================================================== [ 3.652221] [ INFO: possible circular locking dependency detected ] [ 3.652221] 4.7.0-rc5+ torvalds#101 Not tainted [ 3.652221] ------------------------------------------------------- [ 3.652221] systemd/1 is trying to acquire lock: [ 3.652221] (tasklist_lock){.+.+..}, at: [<ffffffff81137ddd>] cgroup_mount+0x7ed/0xbc0 [ 3.652221] but task is already holding lock: [ 3.652221] (css_set_lock){......}, at: [<ffffffff81137c59>] cgroup_mount+0x669/0xbc0 [ 3.652221] which lock already depends on the new lock. [ 3.652221] the existing dependency chain (in reverse order) is: [ 3.652221] -> #3 (css_set_lock){......}: [ 3.652221] [<ffffffff810e92b3>] lock_acquire+0xe3/0x1c0 [ 3.652221] [<ffffffff8184e137>] _raw_spin_lock_irq+0x37/0x50 [ 3.652221] [<ffffffff811374be>] cgroup_setup_root+0x19e/0x2d0 [ 3.652221] [<ffffffff821911fc>] cgroup_init+0xec/0x41d [ 3.652221] [<ffffffff82171f68>] start_kernel+0x40c/0x465 [ 3.652221] [<ffffffff82171294>] x86_64_start_reservations+0x2f/0x31 [ 3.652221] [<ffffffff8217140e>] x86_64_start_kernel+0x178/0x18b [ 3.652221] -> #2 (cgroup_mutex){+.+...}: [ 3.652221] [<ffffffff810e92b3>] lock_acquire+0xe3/0x1c0 [ 3.652221] [<ffffffff8184af5f>] mutex_lock_nested+0x5f/0x350 [ 3.652221] [<ffffffff8113962a>] audit_cgroup_list+0x4a/0x2f0 [ 3.652221] [<ffffffff81145d19>] audit_log_cap_use+0xd9/0xf0 [ 3.652221] [<ffffffff8109cd75>] ns_capable+0x45/0x70 [ 3.652221] [<ffffffff8109cdb7>] capable+0x17/0x20 [ 3.652221] [<ffffffff812a2f00>] oom_score_adj_write+0x150/0x2f0 [ 3.652221] [<ffffffff81230947>] __vfs_write+0x37/0x160 [ 3.652221] [<ffffffff81230ff8>] vfs_write+0xb8/0x1b0 [ 3.652221] [<ffffffff81232028>] SyS_write+0x58/0xc0 [ 3.652221] [<ffffffff81001f2c>] do_syscall_64+0x5c/0x300 [ 3.652221] [<ffffffff8184ea1a>] return_from_SYSCALL_64+0x0/0x7a [ 3.652221] -> #1 (&(&sighand->siglock)->rlock){+.+...}: [ 3.652221] [<ffffffff810e92b3>] lock_acquire+0xe3/0x1c0 [ 3.652221] [<ffffffff8184dfc1>] _raw_spin_lock+0x31/0x40 [ 3.652221] [<ffffffff810901d9>] copy_process.part.34+0x10f9/0x1b40 [ 3.652221] [<ffffffff81090e23>] _do_fork+0xf3/0x6b0 [ 3.652221] [<ffffffff81091409>] kernel_thread+0x29/0x30 [ 3.652221] [<ffffffff810b71d7>] kthreadd+0x187/0x1e0 [ 3.652221] [<ffffffff8184eb7f>] ret_from_fork+0x1f/0x40 [ 3.652221] -> #0 (tasklist_lock){.+.+..}: [ 3.652221] [<ffffffff810e8dfb>] __lock_acquire+0x13cb/0x1440 [ 3.652221] [<ffffffff810e92b3>] lock_acquire+0xe3/0x1c0 [ 3.652221] [<ffffffff8184e3f4>] _raw_read_lock+0x34/0x50 [ 3.652221] [<ffffffff81137ddd>] cgroup_mount+0x7ed/0xbc0 [ 3.652221] [<ffffffff81234d98>] mount_fs+0x38/0x170 [ 3.652221] [<ffffffff8125626b>] vfs_kern_mount+0x6b/0x150 [ 3.652221] [<ffffffff81258f8c>] do_mount+0x24c/0xe30 [ 3.652221] [<ffffffff81259ea5>] SyS_mount+0x95/0xe0 [ 3.652221] [<ffffffff8184e965>] entry_SYSCALL_64_fastpath+0x18/0xa8 [ 3.652221] other info that might help us debug this: [ 3.652221] Chain exists of: tasklist_lock --> cgroup_mutex --> css_set_lock [ 3.652221] Possible unsafe locking scenario: [ 3.652221] CPU0 CPU1 [ 3.652221] ---- ---- [ 3.652221] lock(css_set_lock); [ 3.652221] lock(cgroup_mutex); [ 3.652221] lock(css_set_lock); [ 3.652221] lock(tasklist_lock); [ 3.652221] *** DEADLOCK *** [ 3.652221] 1 lock held by systemd/1: [ 3.652221] #0: (css_set_lock){......}, at: [<ffffffff81137c59>] cgroup_mount+0x669/0xbc0 [ 3.652221] stack backtrace: [ 3.652221] CPU: 0 PID: 1 Comm: systemd Not tainted 4.7.0-rc5+ torvalds#101 [ 3.652221] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014 [ 3.652221] 0000000000000086 0000000024b7c1ed ffff880006d13bb0 ffffffff813c9bf5 [ 3.652221] ffffffff829dbd20 ffffffff829cf2a0 ffff880006d13bf0 ffffffff810e60a3 [ 3.652221] ffff880006d13c30 ffff880006d067b0 ffff880006d06040 0000000000000001 [ 3.652221] Call Trace: [ 3.652221] [<ffffffff813c9bf5>] dump_stack+0x67/0x92 [ 3.652221] [<ffffffff810e60a3>] print_circular_bug+0x1e3/0x250 [ 3.652221] [<ffffffff810e8dfb>] __lock_acquire+0x13cb/0x1440 [ 3.652221] [<ffffffff81210bcd>] ? __kmalloc_track_caller+0x2bd/0x630 [ 3.652221] [<ffffffff810e92b3>] lock_acquire+0xe3/0x1c0 [ 3.652221] [<ffffffff81137ddd>] ? cgroup_mount+0x7ed/0xbc0 [ 3.652221] [<ffffffff8184e3f4>] _raw_read_lock+0x34/0x50 [ 3.652221] [<ffffffff81137ddd>] ? cgroup_mount+0x7ed/0xbc0 [ 3.652221] [<ffffffff81137ddd>] cgroup_mount+0x7ed/0xbc0 [ 3.652221] [<ffffffff810e5637>] ? lockdep_init_map+0x57/0x1f0 [ 3.652221] [<ffffffff81234d98>] mount_fs+0x38/0x170 [ 3.652221] [<ffffffff8125626b>] vfs_kern_mount+0x6b/0x150 [ 3.652221] [<ffffffff81258f8c>] do_mount+0x24c/0xe30 [ 3.652221] [<ffffffff812105bb>] ? kmem_cache_alloc_trace+0x28b/0x5e0 [ 3.652221] [<ffffffff811cc176>] ? strndup_user+0x46/0x80 [ 3.652221] [<ffffffff81259ea5>] SyS_mount+0x95/0xe0 [ 3.652221] [<ffffffff8184e965>] entry_SYSCALL_64_fastpath+0x18/0xa8 Rate limiting would not be a bad idea, there were 329 audit log entries about capability use. -Topi From b857ec7d436f6fbb8c33e8d6a16d2f16175517d8 Mon Sep 17 00:00:00 2001 From: Topi Miettinen <[email protected]> Date: Sun, 10 Jul 2016 11:45:30 +0300 Subject: [PATCH] cgroup: check options and capabilities before locking to avoid a deadlock Signed-off-by: Topi Miettinen <[email protected]>

WARNING: please, no spaces at the start of a line torvalds#101: FILE: kernel/sysctl.c:2297: + return do_proc_dointvec(table,write,buffer,lenp,ppos,$ ERROR: space required after that ',' (ctx:VxV) torvalds#101: FILE: kernel/sysctl.c:2297: + return do_proc_dointvec(table,write,buffer,lenp,ppos, ^ ERROR: space required after that ',' (ctx:VxV) torvalds#101: FILE: kernel/sysctl.c:2297: + return do_proc_dointvec(table,write,buffer,lenp,ppos, ^ ERROR: space required after that ',' (ctx:VxV) torvalds#101: FILE: kernel/sysctl.c:2297: + return do_proc_dointvec(table,write,buffer,lenp,ppos, ^ ERROR: space required after that ',' (ctx:VxV) torvalds#101: FILE: kernel/sysctl.c:2297: + return do_proc_dointvec(table,write,buffer,lenp,ppos, ^ WARNING: ENOSYS means 'invalid syscall nr' and nothing else torvalds#115: FILE: kernel/sysctl.c:2899: + return -ENOSYS; total: 4 errors, 2 warnings, 74 lines checked NOTE: For some of the reported defects, checkpatch may be able to mechanically convert to the typical style using --fix or --fix-inplace. ./patches/sysctl-handle-error-writing-uint_max-to-u32-fields.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Subash Abhinov Kasiviswanathan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

For leaf dir, In most cases, there should be as many bestfree slots as the dir data blocks that can fit under i_size (except for [1]). Root cause is we don't examin the number bestfree slots, when the slots number less than dir data blocks, if we need to allocate new dir data block and update the bestfree array, we will use the dir block number as index to assign bestfree array, while we did not check the leaf buf boundary which may cause UAF or other memory access problem. This issue can also triggered with test cases xfs/473 from fstests. According to Dave Chinner & Darrick's suggestion, adding buffer verifier to detect this abnormal situation in time. Simplify the testcase for fstest xfs/554 [1] The error log is shown as follows: ================================================================== BUG: KASAN: use-after-free in xfs_dir2_leaf_addname+0x1995/0x1ac0 Write of size 2 at addr ffff88810168b000 by task touch/1552 CPU: 5 PID: 1552 Comm: touch Not tainted 6.0.0-rc3+ torvalds#101 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4d/0x66 print_report.cold+0xf6/0x691 kasan_report+0xa8/0x120 xfs_dir2_leaf_addname+0x1995/0x1ac0 xfs_dir_createname+0x58c/0x7f0 xfs_create+0x7af/0x1010 xfs_generic_create+0x270/0x5e0 path_openat+0x270b/0x3450 do_filp_open+0x1cf/0x2b0 do_sys_openat2+0x46b/0x7a0 do_sys_open+0xb7/0x130 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fe4d9e9312b Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b RDX: 0000000000000941 RSI: 00007ffda4c17f33 RDI: 00000000ffffff9c RBP: 00007ffda4c17f33 R08: 0000000000000000 R09: 0000000000000000 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941 R13: 00007fe4d9f631a4 R14: 00007ffda4c17f33 R15: 0000000000000000 </TASK> The buggy address belongs to the physical page: page:ffffea000405a2c0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10168b flags: 0x2fffff80000000(node=0|zone=2|lastcpupid=0x1fffff) raw: 002fffff80000000 ffffea0004057788 ffffea000402dbc8 0000000000000000 raw: 0000000000000000 0000000000170000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88810168af00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88810168af80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff88810168b000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff88810168b080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff88810168b100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ================================================================== Disabling lock debugging due to kernel taint 00000000: 58 44 44 33 5b 53 35 c2 00 00 00 00 00 00 00 78 XDD3[S5........x XFS (sdb): Internal error xfs_dir2_data_use_free at line 1200 of file fs/xfs/libxfs/xfs_dir2_data.c. Caller xfs_dir2_data_use_free+0x28a/0xeb0 CPU: 5 PID: 1552 Comm: touch Tainted: G B 6.0.0-rc3+ Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4d/0x66 xfs_corruption_error+0x132/0x150 xfs_dir2_data_use_free+0x198/0xeb0 xfs_dir2_leaf_addname+0xa59/0x1ac0 xfs_dir_createname+0x58c/0x7f0 xfs_create+0x7af/0x1010 xfs_generic_create+0x270/0x5e0 path_openat+0x270b/0x3450 do_filp_open+0x1cf/0x2b0 do_sys_openat2+0x46b/0x7a0 do_sys_open+0xb7/0x130 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fe4d9e9312b Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b RDX: 0000000000000941 RSI: 00007ffda4c17f46 RDI: 00000000ffffff9c RBP: 00007ffda4c17f46 R08: 0000000000000000 R09: 0000000000000001 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941 R13: 00007fe4d9f631a4 R14: 00007ffda4c17f46 R15: 0000000000000000 </TASK> XFS (sdb): Corruption detected. Unmount and run xfs_repair [1] https://lore.kernel.org/all/[email protected]/ Reviewed-by: Hou Tao <[email protected]> Signed-off-by: Guo Xuenan <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>

Ilya Leoshkevich says: ==================== v2: https://lore.kernel.org/bpf/[email protected]/#t v2 -> v3: - Make __arch_prepare_bpf_trampoline static. (Reported-by: kernel test robot <[email protected]>) - Support both old- and new- style map definitions in sk_assign. (Alexei) - Trim DENYLIST.s390x. (Alexei) - Adjust s390x vmlinux path in vmtest.sh. - Drop merged fixes. v1: https://lore.kernel.org/bpf/[email protected]/#t v1 -> v2: - Fix core_read_macros, sk_assign, test_profiler, test_bpffs (24/31; I'm not quite happy with the fix, but don't have better ideas), and xdp_synproxy. (Andrii) - Prettify liburandom_read and verify_pkcs7_sig fixes. (Andrii) - Fix bpf_usdt_arg using barrier_var(); prettify barrier_var(). (Andrii) - Change BPF_MAX_TRAMP_LINKS to enum and query it using BTF. (Andrii) - Improve bpf_jit_supports_kfunc_call() description. (Alexei) - Always check sign_extend() return value. - Cc: Alexander Gordeev. Hi, This series implements poke, trampoline, kfunc, and mixing subprogs and tailcalls on s390x. The following failures still remain: torvalds#82 get_stack_raw_tp:FAIL get_stack_print_output:FAIL:user_stack corrupted user stack Known issue: We cannot reliably unwind userspace on s390x without DWARF. torvalds#101 ksyms_module:FAIL address of kernel function bpf_testmod_test_mod_kfunc is out of range Known issue: Kernel and modules are too far away from each other on s390x. torvalds#190 stacktrace_build_id:FAIL Known issue: We cannot reliably unwind userspace on s390x without DWARF. torvalds#281 xdp_metadata:FAIL See patch 6. None of these seem to be due to the new changes. Best regards, Ilya ==================== Signed-off-by: Alexei Starovoitov <[email protected]>

Currently, test_progs outputs all stdout/stderr as it runs, and when it is done, prints a summary. It is non-trivial for tooling to parse that output and extract meaningful information from it. This change adds a new option, `--json-summary`/`-J` that let the caller specify a file where `test_progs{,-no_alu32}` can write a summary of the run in a json format that can later be parsed by tooling. Currently, it creates a summary section with successes/skipped/failures followed by a list of failed tests and subtests. A test contains the following fields: - name: the name of the test - number: the number of the test - message: the log message that was printed by the test. - failed: A boolean indicating whether the test failed or not. Currently we only output failed tests, but in the future, successful tests could be added. - subtests: A list of subtests associated with this test. A subtest contains the following fields: - name: same as above - number: sanme as above - message: the log message that was printed by the subtest. - failed: same as above but for the subtest An example run and json content below: ``` $ sudo ./test_progs -a $(grep -v '^#' ./DENYLIST.aarch64 | awk '{print $1","}' | tr -d '\n') -j -J /tmp/test_progs.json $ jq < /tmp/test_progs.json | head -n 30 { "success": 29, "success_subtest": 23, "skipped": 3, "failed": 28, "results": [ { "name": "bpf_cookie", "number": 10, "message": "test_bpf_cookie:PASS:skel_open 0 nsec\n", "failed": true, "subtests": [ { "name": "multi_kprobe_link_api", "number": 2, "message": "kprobe_multi_link_api_subtest:PASS:load_kallsyms 0 nsec\nlibbpf: extern 'bpf_testmod_fentry_test1' (strong): not resolved\nlibbpf: failed to load object 'kprobe_multi'\nlibbpf: failed to load BPF skeleton 'kprobe_multi': -3\nkprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3\n", "failed": true }, { "name": "multi_kprobe_attach_api", "number": 3, "message": "libbpf: extern 'bpf_testmod_fentry_test1' (strong): not resolved\nlibbpf: failed to load object 'kprobe_multi'\nlibbpf: failed to load BPF skeleton 'kprobe_multi': -3\nkprobe_multi_attach_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3\n", "failed": true }, { "name": "lsm", "number": 8, "message": "lsm_subtest:PASS:lsm.link_create 0 nsec\nlsm_subtest:FAIL:stack_mprotect unexpected stack_mprotect: actual 0 != expected -1\n", "failed": true } ``` The file can then be used to print a summary of the test run and list of failing tests/subtests: ``` $ jq -r < /tmp/test_progs.json '"Success: \(.success)/\(.success_subtest), Skipped: \(.skipped), Failed: \(.failed)"' Success: 29/23, Skipped: 3, Failed: 28 $ jq -r < /tmp/test_progs.json '.results | map([ if .failed then "#\(.number) \(.name)" else empty end, ( . as {name: $tname, number: $tnum} | .subtests | map( if .failed then "#\($tnum)/\(.number) \($tname)/\(.name)" else empty end ) ) ]) | flatten | .[]' | head -n 20 torvalds#10 bpf_cookie torvalds#10/2 bpf_cookie/multi_kprobe_link_api torvalds#10/3 bpf_cookie/multi_kprobe_attach_api torvalds#10/8 bpf_cookie/lsm torvalds#15 bpf_mod_race torvalds#15/1 bpf_mod_race/ksym (used_btfs UAF) torvalds#15/2 bpf_mod_race/kfunc (kfunc_btf_tab UAF) torvalds#36 cgroup_hierarchical_stats torvalds#61 deny_namespace torvalds#61/1 deny_namespace/unpriv_userns_create_no_bpf torvalds#73 fexit_stress torvalds#83 get_func_ip_test torvalds#99 kfunc_dynptr_param torvalds#99/1 kfunc_dynptr_param/dynptr_data_null torvalds#99/4 kfunc_dynptr_param/dynptr_data_null torvalds#100 kprobe_multi_bench_attach torvalds#100/1 kprobe_multi_bench_attach/kernel torvalds#100/2 kprobe_multi_bench_attach/modules torvalds#101 kprobe_multi_test torvalds#101/1 kprobe_multi_test/skel_api ``` Signed-off-by: Manu Bretelle <[email protected]>

Currently, test_progs outputs all stdout/stderr as it runs, and when it is done, prints a summary. It is non-trivial for tooling to parse that output and extract meaningful information from it. This change adds a new option, `--json-summary`/`-J` that let the caller specify a file where `test_progs{,-no_alu32}` can write a summary of the run in a json format that can later be parsed by tooling. Currently, it creates a summary section with successes/skipped/failures followed by a list of failed tests and subtests. A test contains the following fields: - name: the name of the test - number: the number of the test - message: the log message that was printed by the test. - failed: A boolean indicating whether the test failed or not. Currently we only output failed tests, but in the future, successful tests could be added. - subtests: A list of subtests associated with this test. A subtest contains the following fields: - name: same as above - number: sanme as above - message: the log message that was printed by the subtest. - failed: same as above but for the subtest An example run and json content below: ``` $ sudo ./test_progs -a $(grep -v '^#' ./DENYLIST.aarch64 | awk '{print $1","}' | tr -d '\n') -j -J /tmp/test_progs.json $ jq < /tmp/test_progs.json | head -n 30 { "success": 29, "success_subtest": 23, "skipped": 3, "failed": 28, "results": [ { "name": "bpf_cookie", "number": 10, "message": "test_bpf_cookie:PASS:skel_open 0 nsec\n", "failed": true, "subtests": [ { "name": "multi_kprobe_link_api", "number": 2, "message": "kprobe_multi_link_api_subtest:PASS:load_kallsyms 0 nsec\nlibbpf: extern 'bpf_testmod_fentry_test1' (strong): not resolved\nlibbpf: failed to load object 'kprobe_multi'\nlibbpf: failed to load BPF skeleton 'kprobe_multi': -3\nkprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3\n", "failed": true }, { "name": "multi_kprobe_attach_api", "number": 3, "message": "libbpf: extern 'bpf_testmod_fentry_test1' (strong): not resolved\nlibbpf: failed to load object 'kprobe_multi'\nlibbpf: failed to load BPF skeleton 'kprobe_multi': -3\nkprobe_multi_attach_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3\n", "failed": true }, { "name": "lsm", "number": 8, "message": "lsm_subtest:PASS:lsm.link_create 0 nsec\nlsm_subtest:FAIL:stack_mprotect unexpected stack_mprotect: actual 0 != expected -1\n", "failed": true } ``` The file can then be used to print a summary of the test run and list of failing tests/subtests: ``` $ jq -r < /tmp/test_progs.json '"Success: \(.success)/\(.success_subtest), Skipped: \(.skipped), Failed: \(.failed)"' Success: 29/23, Skipped: 3, Failed: 28 $ jq -r < /tmp/test_progs.json '.results | map([ if .failed then "#\(.number) \(.name)" else empty end, ( . as {name: $tname, number: $tnum} | .subtests | map( if .failed then "#\($tnum)/\(.number) \($tname)/\(.name)" else empty end ) ) ]) | flatten | .[]' | head -n 20 torvalds#10 bpf_cookie torvalds#10/2 bpf_cookie/multi_kprobe_link_api torvalds#10/3 bpf_cookie/multi_kprobe_attach_api torvalds#10/8 bpf_cookie/lsm torvalds#15 bpf_mod_race torvalds#15/1 bpf_mod_race/ksym (used_btfs UAF) torvalds#15/2 bpf_mod_race/kfunc (kfunc_btf_tab UAF) torvalds#36 cgroup_hierarchical_stats torvalds#61 deny_namespace torvalds#61/1 deny_namespace/unpriv_userns_create_no_bpf torvalds#73 fexit_stress torvalds#83 get_func_ip_test torvalds#99 kfunc_dynptr_param torvalds#99/1 kfunc_dynptr_param/dynptr_data_null torvalds#99/4 kfunc_dynptr_param/dynptr_data_null torvalds#100 kprobe_multi_bench_attach torvalds#100/1 kprobe_multi_bench_attach/kernel torvalds#100/2 kprobe_multi_bench_attach/modules torvalds#101 kprobe_multi_test torvalds#101/1 kprobe_multi_test/skel_api ``` Signed-off-by: Manu Bretelle <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]

[ Upstream commit 13cf24e ] For leaf dir, In most cases, there should be as many bestfree slots as the dir data blocks that can fit under i_size (except for [1]). Root cause is we don't examin the number bestfree slots, when the slots number less than dir data blocks, if we need to allocate new dir data block and update the bestfree array, we will use the dir block number as index to assign bestfree array, while we did not check the leaf buf boundary which may cause UAF or other memory access problem. This issue can also triggered with test cases xfs/473 from fstests. According to Dave Chinner & Darrick's suggestion, adding buffer verifier to detect this abnormal situation in time. Simplify the testcase for fstest xfs/554 [1] The error log is shown as follows: ================================================================== BUG: KASAN: use-after-free in xfs_dir2_leaf_addname+0x1995/0x1ac0 Write of size 2 at addr ffff88810168b000 by task touch/1552 CPU: 5 PID: 1552 Comm: touch Not tainted 6.0.0-rc3+ torvalds#101 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4d/0x66 print_report.cold+0xf6/0x691 kasan_report+0xa8/0x120 xfs_dir2_leaf_addname+0x1995/0x1ac0 xfs_dir_createname+0x58c/0x7f0 xfs_create+0x7af/0x1010 xfs_generic_create+0x270/0x5e0 path_openat+0x270b/0x3450 do_filp_open+0x1cf/0x2b0 do_sys_openat2+0x46b/0x7a0 do_sys_open+0xb7/0x130 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fe4d9e9312b Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b RDX: 0000000000000941 RSI: 00007ffda4c17f33 RDI: 00000000ffffff9c RBP: 00007ffda4c17f33 R08: 0000000000000000 R09: 0000000000000000 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941 R13: 00007fe4d9f631a4 R14: 00007ffda4c17f33 R15: 0000000000000000 </TASK> The buggy address belongs to the physical page: page:ffffea000405a2c0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10168b flags: 0x2fffff80000000(node=0|zone=2|lastcpupid=0x1fffff) raw: 002fffff80000000 ffffea0004057788 ffffea000402dbc8 0000000000000000 raw: 0000000000000000 0000000000170000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88810168af00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88810168af80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff88810168b000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff88810168b080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff88810168b100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ================================================================== Disabling lock debugging due to kernel taint 00000000: 58 44 44 33 5b 53 35 c2 00 00 00 00 00 00 00 78 XDD3[S5........x XFS (sdb): Internal error xfs_dir2_data_use_free at line 1200 of file fs/xfs/libxfs/xfs_dir2_data.c. Caller xfs_dir2_data_use_free+0x28a/0xeb0 CPU: 5 PID: 1552 Comm: touch Tainted: G B 6.0.0-rc3+ Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4d/0x66 xfs_corruption_error+0x132/0x150 xfs_dir2_data_use_free+0x198/0xeb0 xfs_dir2_leaf_addname+0xa59/0x1ac0 xfs_dir_createname+0x58c/0x7f0 xfs_create+0x7af/0x1010 xfs_generic_create+0x270/0x5e0 path_openat+0x270b/0x3450 do_filp_open+0x1cf/0x2b0 do_sys_openat2+0x46b/0x7a0 do_sys_open+0xb7/0x130 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fe4d9e9312b Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b RDX: 0000000000000941 RSI: 00007ffda4c17f46 RDI: 00000000ffffff9c RBP: 00007ffda4c17f46 R08: 0000000000000000 R09: 0000000000000001 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941 R13: 00007fe4d9f631a4 R14: 00007ffda4c17f46 R15: 0000000000000000 </TASK> XFS (sdb): Corruption detected. Unmount and run xfs_repair [1] https://lore.kernel.org/all/[email protected]/ Reviewed-by: Hou Tao <[email protected]> Signed-off-by: Guo Xuenan <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Signed-off-by: Leah Rumancik <[email protected]> Acked-by: Chandan Babu R <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe deferral of the vqmmc-supply regulator: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101 Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT) epc : __run_timers.part.0+0x1d0/0x1e8 ra : __run_timers.part.0+0x134/0x1e8 epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60 gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000 t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20 s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000 a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60 a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68 s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000 s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0 s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000 s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52 t5 : ffffffd8024ae018 t6 : ffffffd8024ae038 status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8 [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a [<ffffffff80809092>] __do_softirq+0xc6/0x1fa [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84 [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28 ---[ end trace 0000000000000000 ]--- What happens? renesas_sdhi_probe() { tmio_mmc_host_alloc() mmc_alloc_host() INIT_DELAYED_WORK(&host->detect, mmc_rescan); devm_request_irq(tmio_mmc_irq); /* * After this, the interrupt handler may be invoked at any time * * tmio_mmc_irq() * { * __tmio_mmc_card_detect_irq() * mmc_detect_change() * _mmc_detect_change() * mmc_schedule_delayed_work(&host->detect, delay); * } */ tmio_mmc_host_probe() tmio_mmc_init_ocr() -EPROBE_DEFER tmio_mmc_host_free() mmc_free_host() } When expire_timers() runs later, it warns because the MMC host structure containing the delayed work was freed, and now contains an invalid work function pointer. Fix this by cancelling any pending delayed work before releasing the MMC host structure. Signed-off-by: Geert Uytterhoeven <[email protected]>

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe deferral of the vqmmc-supply regulator: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101 Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT) epc : __run_timers.part.0+0x1d0/0x1e8 ra : __run_timers.part.0+0x134/0x1e8 epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60 gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000 t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20 s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000 a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60 a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68 s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000 s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0 s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000 s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52 t5 : ffffffd8024ae018 t6 : ffffffd8024ae038 status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8 [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a [<ffffffff80809092>] __do_softirq+0xc6/0x1fa [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84 [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28 ---[ end trace 0000000000000000 ]--- What happens? renesas_sdhi_probe() { tmio_mmc_host_alloc() mmc_alloc_host() INIT_DELAYED_WORK(&host->detect, mmc_rescan); devm_request_irq(tmio_mmc_irq); /* * After this, the interrupt handler may be invoked at any time * * tmio_mmc_irq() * { * __tmio_mmc_card_detect_irq() * mmc_detect_change() * _mmc_detect_change() * mmc_schedule_delayed_work(&host->detect, delay); * } */ tmio_mmc_host_probe() tmio_mmc_init_ocr() -EPROBE_DEFER tmio_mmc_host_free() mmc_free_host() } When expire_timers() runs later, it warns because the MMC host structure containing the delayed work was freed, and now contains an invalid work function pointer. Fix this by cancelling any pending delayed work before releasing the MMC host structure. Signed-off-by: Geert Uytterhoeven <[email protected]> Tested-by: Lad Prabhakar <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be Signed-off-by: Ulf Hansson <[email protected]>

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe deferral of the vqmmc-supply regulator: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101 Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT) epc : __run_timers.part.0+0x1d0/0x1e8 ra : __run_timers.part.0+0x134/0x1e8 epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60 gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000 t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20 s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000 a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60 a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68 s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000 s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0 s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000 s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52 t5 : ffffffd8024ae018 t6 : ffffffd8024ae038 status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8 [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a [<ffffffff80809092>] __do_softirq+0xc6/0x1fa [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84 [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28 ---[ end trace 0000000000000000 ]--- What happens? renesas_sdhi_probe() { tmio_mmc_host_alloc() mmc_alloc_host() INIT_DELAYED_WORK(&host->detect, mmc_rescan); devm_request_irq(tmio_mmc_irq); /* * After this, the interrupt handler may be invoked at any time * * tmio_mmc_irq() * { * __tmio_mmc_card_detect_irq() * mmc_detect_change() * _mmc_detect_change() * mmc_schedule_delayed_work(&host->detect, delay); * } */ tmio_mmc_host_probe() tmio_mmc_init_ocr() -EPROBE_DEFER tmio_mmc_host_free() mmc_free_host() } When expire_timers() runs later, it warns because the MMC host structure containing the delayed work was freed, and now contains an invalid work function pointer. Fix this by cancelling any pending delayed work before releasing the MMC host structure. Signed-off-by: Geert Uytterhoeven <[email protected]>

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe deferral of the vqmmc-supply regulator: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101 Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT) epc : __run_timers.part.0+0x1d0/0x1e8 ra : __run_timers.part.0+0x134/0x1e8 epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60 gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000 t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20 s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000 a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60 a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68 s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000 s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0 s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000 s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52 t5 : ffffffd8024ae018 t6 : ffffffd8024ae038 status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8 [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a [<ffffffff80809092>] __do_softirq+0xc6/0x1fa [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84 [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28 ---[ end trace 0000000000000000 ]--- What happens? renesas_sdhi_probe() { tmio_mmc_host_alloc() mmc_alloc_host() INIT_DELAYED_WORK(&host->detect, mmc_rescan); devm_request_irq(tmio_mmc_irq); /* * After this, the interrupt handler may be invoked at any time * * tmio_mmc_irq() * { * __tmio_mmc_card_detect_irq() * mmc_detect_change() * _mmc_detect_change() * mmc_schedule_delayed_work(&host->detect, delay); * } */ tmio_mmc_host_probe() tmio_mmc_init_ocr() -EPROBE_DEFER tmio_mmc_host_free() mmc_free_host() } When expire_timers() runs later, it warns because the MMC host structure containing the delayed work was freed, and now contains an invalid work function pointer. Fix this by cancelling any pending delayed work before releasing the MMC host structure. Signed-off-by: Geert Uytterhoeven <[email protected]> Tested-by: Lad Prabhakar <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be Signed-off-by: Ulf Hansson <[email protected]>

commit 1036f69 upstream. On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe deferral of the vqmmc-supply regulator: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101 Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT) epc : __run_timers.part.0+0x1d0/0x1e8 ra : __run_timers.part.0+0x134/0x1e8 epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60 gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000 t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20 s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000 a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60 a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68 s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000 s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0 s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000 s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52 t5 : ffffffd8024ae018 t6 : ffffffd8024ae038 status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8 [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a [<ffffffff80809092>] __do_softirq+0xc6/0x1fa [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84 [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28 ---[ end trace 0000000000000000 ]--- What happens? renesas_sdhi_probe() { tmio_mmc_host_alloc() mmc_alloc_host() INIT_DELAYED_WORK(&host->detect, mmc_rescan); devm_request_irq(tmio_mmc_irq); /* * After this, the interrupt handler may be invoked at any time * * tmio_mmc_irq() * { * __tmio_mmc_card_detect_irq() * mmc_detect_change() * _mmc_detect_change() * mmc_schedule_delayed_work(&host->detect, delay); * } */ tmio_mmc_host_probe() tmio_mmc_init_ocr() -EPROBE_DEFER tmio_mmc_host_free() mmc_free_host() } When expire_timers() runs later, it warns because the MMC host structure containing the delayed work was freed, and now contains an invalid work function pointer. Fix this by cancelling any pending delayed work before releasing the MMC host structure. Signed-off-by: Geert Uytterhoeven <[email protected]> Tested-by: Lad Prabhakar <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

…cise exception (torvalds#101) To print error messages for imprecise exception, extract the field sdcause in the CSR sdcause instead of reading the whole register. Signed-off-by: Ben Zong-You Xie <[email protected]> Reviewed-on: https://gitea.andestech.com/RD-SW/linux/pulls/101 Reviewed-by: Tim Shih-Ting OuYang <[email protected]> Reviewed-by: Leo Yu-Chi Liang <[email protected]> Reviewed-by: Dylan Dai-Rong Jhong <[email protected]> Co-authored-by: Ben Zong-You Xie <[email protected]> Co-committed-by: Ben Zong-You Xie <[email protected]>

commit 1036f69 upstream. On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe deferral of the vqmmc-supply regulator: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101 Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT) epc : __run_timers.part.0+0x1d0/0x1e8 ra : __run_timers.part.0+0x134/0x1e8 epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60 gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000 t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20 s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000 a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60 a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68 s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000 s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0 s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000 s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52 t5 : ffffffd8024ae018 t6 : ffffffd8024ae038 status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8 [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a [<ffffffff80809092>] __do_softirq+0xc6/0x1fa [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84 [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28 ---[ end trace 0000000000000000 ]--- What happens? renesas_sdhi_probe() { tmio_mmc_host_alloc() mmc_alloc_host() INIT_DELAYED_WORK(&host->detect, mmc_rescan); devm_request_irq(tmio_mmc_irq); /* * After this, the interrupt handler may be invoked at any time * * tmio_mmc_irq() * { * __tmio_mmc_card_detect_irq() * mmc_detect_change() * _mmc_detect_change() * mmc_schedule_delayed_work(&host->detect, delay); * } */ tmio_mmc_host_probe() tmio_mmc_init_ocr() -EPROBE_DEFER tmio_mmc_host_free() mmc_free_host() } When expire_timers() runs later, it warns because the MMC host structure containing the delayed work was freed, and now contains an invalid work function pointer. Fix this by cancelling any pending delayed work before releasing the MMC host structure. Signed-off-by: Geert Uytterhoeven <[email protected]> Tested-by: Lad Prabhakar <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

sched_ext: fix race in scx_move_task() with exiting tasks

Why: The PCI error slot reset maybe triggered after inject ue to UMC multi times, this caused system hang. [ 557.371857] amdgpu 0000:af:00.0: amdgpu: GPU reset succeeded, trying to resume [ 557.373718] [drm] PCIE GART of 512M enabled. [ 557.373722] [drm] PTB located at 0x0000031FED700000 [ 557.373788] [drm] VRAM is lost due to GPU reset! [ 557.373789] [drm] PSP is resuming... [ 557.547012] mlx5_core 0000:55:00.0: mlx5_pci_err_detected Device state = 1 pci_status: 0. Exit, result = 3, need reset [ 557.547067] [drm] PCI error: detected callback, state(1)!! [ 557.547069] [drm] No support for XGMI hive yet... [ 557.548125] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 0. Enter [ 557.607763] mlx5_core 0000:55:00.0: wait vital counter value 0x16b5b after 1 iterations [ 557.607777] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 1. Exit, err = 0, result = 5, recovered [ 557.610492] [drm] PCI error: slot reset callback!! ... [ 560.689382] amdgpu 0000:3f:00.0: amdgpu: GPU reset(2) succeeded! [ 560.689546] amdgpu 0000:5a:00.0: amdgpu: GPU reset(2) succeeded! [ 560.689562] general protection fault, probably for non-canonical address 0x5f080b54534f611f: 0000 [#1] SMP NOPTI [ 560.701008] CPU: 16 PID: 2361 Comm: kworker/u448:9 Tainted: G OE 5.15.0-91-generic torvalds#101-Ubuntu [ 560.712057] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C11.AG.1 11/08/2023 [ 560.720959] Workqueue: amdgpu-reset-hive amdgpu_ras_do_recovery [amdgpu] [ 560.728887] RIP: 0010:amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu] [ 560.736891] Code: ff 41 89 c6 e9 1b ff ff ff 44 0f b6 45 b0 e9 4f ff ff ff be 01 00 00 00 4c 89 e7 e8 76 c9 8b ff 44 0f b6 45 b0 e9 3c fd ff ff <48> 83 ba 18 02 00 00 00 0f 84 6a f8 ff ff 48 8d 7a 78 be 01 00 00 [ 560.757967] RSP: 0018:ffa0000032e53d80 EFLAGS: 00010202 [ 560.763848] RAX: ffa00000001dfd10 RBX: ffa0000000197090 RCX: ffa0000032e53db0 [ 560.771856] RDX: 5f080b54534f5f07 RSI: 0000000000000000 RDI: ff11000128100010 [ 560.779867] RBP: ffa0000032e53df0 R08: 0000000000000000 R09: ffffffffffe77f08 [ 560.787879] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000 [ 560.795889] R13: ffa0000032e53e00 R14: 0000000000000000 R15: 0000000000000000 [ 560.803889] FS: 0000000000000000(0000) GS:ff11007e7e800000(0000) knlGS:0000000000000000 [ 560.812973] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 560.819422] CR2: 000055a04c118e68 CR3: 0000000007410005 CR4: 0000000000771ee0 [ 560.827433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 560.835433] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 560.843444] PKRU: 55555554 [ 560.846480] Call Trace: [ 560.849225] <TASK> [ 560.851580] ? show_trace_log_lvl+0x1d6/0x2ea [ 560.856488] ? show_trace_log_lvl+0x1d6/0x2ea [ 560.861379] ? amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu] [ 560.867778] ? show_regs.part.0+0x23/0x29 [ 560.872293] ? __die_body.cold+0x8/0xd [ 560.876502] ? die_addr+0x3e/0x60 [ 560.880238] ? exc_general_protection+0x1c5/0x410 [ 560.885532] ? asm_exc_general_protection+0x27/0x30 [ 560.891025] ? amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu] [ 560.898323] amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu] [ 560.904520] process_one_work+0x228/0x3d0 How: In RAS recovery, mode-1 reset is issued from RAS fatal error handling and expected all the nodes in a hive to be reset. no need to issue another mode-1 during this procedure. Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe deferral of the vqmmc-supply regulator: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101 Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT) epc : __run_timers.part.0+0x1d0/0x1e8 ra : __run_timers.part.0+0x134/0x1e8 epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60 gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000 t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20 s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000 a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60 a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68 s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000 s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0 s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000 s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52 t5 : ffffffd8024ae018 t6 : ffffffd8024ae038 status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8 [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a [<ffffffff80809092>] __do_softirq+0xc6/0x1fa [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84 [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28 ---[ end trace 0000000000000000 ]--- What happens? renesas_sdhi_probe() { tmio_mmc_host_alloc() mmc_alloc_host() INIT_DELAYED_WORK(&host->detect, mmc_rescan); devm_request_irq(tmio_mmc_irq); /* * After this, the interrupt handler may be invoked at any time * * tmio_mmc_irq() * { * __tmio_mmc_card_detect_irq() * mmc_detect_change() * _mmc_detect_change() * mmc_schedule_delayed_work(&host->detect, delay); * } */ tmio_mmc_host_probe() tmio_mmc_init_ocr() -EPROBE_DEFER tmio_mmc_host_free() mmc_free_host() } When expire_timers() runs later, it warns because the MMC host structure containing the delayed work was freed, and now contains an invalid work function pointer. Fix this by cancelling any pending delayed work before releasing the MMC host structure. Signed-off-by: Geert Uytterhoeven <[email protected]> Tested-by: Lad Prabhakar <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be Signed-off-by: Ulf Hansson <[email protected]>

[ Upstream commit 601429c ] Why: The PCI error slot reset maybe triggered after inject ue to UMC multi times, this caused system hang. [ 557.371857] amdgpu 0000:af:00.0: amdgpu: GPU reset succeeded, trying to resume [ 557.373718] [drm] PCIE GART of 512M enabled. [ 557.373722] [drm] PTB located at 0x0000031FED700000 [ 557.373788] [drm] VRAM is lost due to GPU reset! [ 557.373789] [drm] PSP is resuming... [ 557.547012] mlx5_core 0000:55:00.0: mlx5_pci_err_detected Device state = 1 pci_status: 0. Exit, result = 3, need reset [ 557.547067] [drm] PCI error: detected callback, state(1)!! [ 557.547069] [drm] No support for XGMI hive yet... [ 557.548125] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 0. Enter [ 557.607763] mlx5_core 0000:55:00.0: wait vital counter value 0x16b5b after 1 iterations [ 557.607777] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 1. Exit, err = 0, result = 5, recovered [ 557.610492] [drm] PCI error: slot reset callback!! ... [ 560.689382] amdgpu 0000:3f:00.0: amdgpu: GPU reset(2) succeeded! [ 560.689546] amdgpu 0000:5a:00.0: amdgpu: GPU reset(2) succeeded! [ 560.689562] general protection fault, probably for non-canonical address 0x5f080b54534f611f: 0000 [#1] SMP NOPTI [ 560.701008] CPU: 16 PID: 2361 Comm: kworker/u448:9 Tainted: G OE 5.15.0-91-generic torvalds#101-Ubuntu [ 560.712057] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C11.AG.1 11/08/2023 [ 560.720959] Workqueue: amdgpu-reset-hive amdgpu_ras_do_recovery [amdgpu] [ 560.728887] RIP: 0010:amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu] [ 560.736891] Code: ff 41 89 c6 e9 1b ff ff ff 44 0f b6 45 b0 e9 4f ff ff ff be 01 00 00 00 4c 89 e7 e8 76 c9 8b ff 44 0f b6 45 b0 e9 3c fd ff ff <48> 83 ba 18 02 00 00 00 0f 84 6a f8 ff ff 48 8d 7a 78 be 01 00 00 [ 560.757967] RSP: 0018:ffa0000032e53d80 EFLAGS: 00010202 [ 560.763848] RAX: ffa00000001dfd10 RBX: ffa0000000197090 RCX: ffa0000032e53db0 [ 560.771856] RDX: 5f080b54534f5f07 RSI: 0000000000000000 RDI: ff11000128100010 [ 560.779867] RBP: ffa0000032e53df0 R08: 0000000000000000 R09: ffffffffffe77f08 [ 560.787879] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000 [ 560.795889] R13: ffa0000032e53e00 R14: 0000000000000000 R15: 0000000000000000 [ 560.803889] FS: 0000000000000000(0000) GS:ff11007e7e800000(0000) knlGS:0000000000000000 [ 560.812973] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 560.819422] CR2: 000055a04c118e68 CR3: 0000000007410005 CR4: 0000000000771ee0 [ 560.827433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 560.835433] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 560.843444] PKRU: 55555554 [ 560.846480] Call Trace: [ 560.849225] <TASK> [ 560.851580] ? show_trace_log_lvl+0x1d6/0x2ea [ 560.856488] ? show_trace_log_lvl+0x1d6/0x2ea [ 560.861379] ? amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu] [ 560.867778] ? show_regs.part.0+0x23/0x29 [ 560.872293] ? __die_body.cold+0x8/0xd [ 560.876502] ? die_addr+0x3e/0x60 [ 560.880238] ? exc_general_protection+0x1c5/0x410 [ 560.885532] ? asm_exc_general_protection+0x27/0x30 [ 560.891025] ? amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu] [ 560.898323] amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu] [ 560.904520] process_one_work+0x228/0x3d0 How: In RAS recovery, mode-1 reset is issued from RAS fatal error handling and expected all the nodes in a hive to be reset. no need to issue another mode-1 during this procedure. Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe deferral of the vqmmc-supply regulator: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 torvalds#101 Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT) epc : __run_timers.part.0+0x1d0/0x1e8 ra : __run_timers.part.0+0x134/0x1e8 epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60 gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000 t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20 s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000 a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60 a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68 s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000 s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0 s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000 s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52 t5 : ffffffd8024ae018 t6 : ffffffd8024ae038 status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8 [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a [<ffffffff80809092>] __do_softirq+0xc6/0x1fa [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84 [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28 ---[ end trace 0000000000000000 ]--- What happens? renesas_sdhi_probe() { tmio_mmc_host_alloc() mmc_alloc_host() INIT_DELAYED_WORK(&host->detect, mmc_rescan); devm_request_irq(tmio_mmc_irq); /* * After this, the interrupt handler may be invoked at any time * * tmio_mmc_irq() * { * __tmio_mmc_card_detect_irq() * mmc_detect_change() * _mmc_detect_change() * mmc_schedule_delayed_work(&host->detect, delay); * } */ tmio_mmc_host_probe() tmio_mmc_init_ocr() -EPROBE_DEFER tmio_mmc_host_free() mmc_free_host() } When expire_timers() runs later, it warns because the MMC host structure containing the delayed work was freed, and now contains an invalid work function pointer. Fix this by cancelling any pending delayed work before releasing the MMC host structure. Signed-off-by: Geert Uytterhoeven <[email protected]> Tested-by: Lad Prabhakar <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be Signed-off-by: Ulf Hansson <[email protected]> (cherry picked from commit 1036f69) Signed-off-by: Cristian Ciocaltea <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2065899 [ Upstream commit 601429c ] Why: The PCI error slot reset maybe triggered after inject ue to UMC multi times, this caused system hang. [ 557.371857] amdgpu 0000:af:00.0: amdgpu: GPU reset succeeded, trying to resume [ 557.373718] [drm] PCIE GART of 512M enabled. [ 557.373722] [drm] PTB located at 0x0000031FED700000 [ 557.373788] [drm] VRAM is lost due to GPU reset! [ 557.373789] [drm] PSP is resuming... [ 557.547012] mlx5_core 0000:55:00.0: mlx5_pci_err_detected Device state = 1 pci_status: 0. Exit, result = 3, need reset [ 557.547067] [drm] PCI error: detected callback, state(1)!! [ 557.547069] [drm] No support for XGMI hive yet... [ 557.548125] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 0. Enter [ 557.607763] mlx5_core 0000:55:00.0: wait vital counter value 0x16b5b after 1 iterations [ 557.607777] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 1. Exit, err = 0, result = 5, recovered [ 557.610492] [drm] PCI error: slot reset callback!! ... [ 560.689382] amdgpu 0000:3f:00.0: amdgpu: GPU reset(2) succeeded! [ 560.689546] amdgpu 0000:5a:00.0: amdgpu: GPU reset(2) succeeded! [ 560.689562] general protection fault, probably for non-canonical address 0x5f080b54534f611f: 0000 [#1] SMP NOPTI [ 560.701008] CPU: 16 PID: 2361 Comm: kworker/u448:9 Tainted: G OE 5.15.0-91-generic torvalds#101-Ubuntu [ 560.712057] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C11.AG.1 11/08/2023 [ 560.720959] Workqueue: amdgpu-reset-hive amdgpu_ras_do_recovery [amdgpu] [ 560.728887] RIP: 0010:amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu] [ 560.736891] Code: ff 41 89 c6 e9 1b ff ff ff 44 0f b6 45 b0 e9 4f ff ff ff be 01 00 00 00 4c 89 e7 e8 76 c9 8b ff 44 0f b6 45 b0 e9 3c fd ff ff <48> 83 ba 18 02 00 00 00 0f 84 6a f8 ff ff 48 8d 7a 78 be 01 00 00 [ 560.757967] RSP: 0018:ffa0000032e53d80 EFLAGS: 00010202 [ 560.763848] RAX: ffa00000001dfd10 RBX: ffa0000000197090 RCX: ffa0000032e53db0 [ 560.771856] RDX: 5f080b54534f5f07 RSI: 0000000000000000 RDI: ff11000128100010 [ 560.779867] RBP: ffa0000032e53df0 R08: 0000000000000000 R09: ffffffffffe77f08 [ 560.787879] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000 [ 560.795889] R13: ffa0000032e53e00 R14: 0000000000000000 R15: 0000000000000000 [ 560.803889] FS: 0000000000000000(0000) GS:ff11007e7e800000(0000) knlGS:0000000000000000 [ 560.812973] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 560.819422] CR2: 000055a04c118e68 CR3: 0000000007410005 CR4: 0000000000771ee0 [ 560.827433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 560.835433] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 560.843444] PKRU: 55555554 [ 560.846480] Call Trace: [ 560.849225] <TASK> [ 560.851580] ? show_trace_log_lvl+0x1d6/0x2ea [ 560.856488] ? show_trace_log_lvl+0x1d6/0x2ea [ 560.861379] ? amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu] [ 560.867778] ? show_regs.part.0+0x23/0x29 [ 560.872293] ? __die_body.cold+0x8/0xd [ 560.876502] ? die_addr+0x3e/0x60 [ 560.880238] ? exc_general_protection+0x1c5/0x410 [ 560.885532] ? asm_exc_general_protection+0x27/0x30 [ 560.891025] ? amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu] [ 560.898323] amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu] [ 560.904520] process_one_work+0x228/0x3d0 How: In RAS recovery, mode-1 reset is issued from RAS fatal error handling and expected all the nodes in a hive to be reset. no need to issue another mode-1 during this procedure. Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Manuel Diewald <[email protected]> Signed-off-by: Roxana Nicolescu <[email protected]>

Fix uniprocessor build issue torvalds#101

…cise exception (torvalds#101) To print error messages for imprecise exception, extract the field sdcause in the CSR sdcause instead of reading the whole register. Signed-off-by: Ben Zong-You Xie <[email protected]> Reviewed-on: https://gitea.andestech.com/RD-SW/linux/pulls/101 Reviewed-by: Tim Shih-Ting OuYang <[email protected]> Reviewed-by: Leo Yu-Chi Liang <[email protected]> Reviewed-by: Dylan Dai-Rong Jhong <[email protected]> Co-authored-by: Ben Zong-You Xie <[email protected]> Co-committed-by: Ben Zong-You Xie <[email protected]>

Update README

5285388

raintean closed this Jun 27, 2014

apxii pushed a commit to apxii/linux that referenced this pull request May 6, 2015

Merge pull request torvalds#101 from joukestoel/odroidc-3.10.y

af68aa6

Added support for the DVB devices: DVBSky USB, TechoTrend CT2 4400 and TechnoTrend CT2 4650

apxii pushed a commit to apxii/linux that referenced this pull request May 6, 2015

defconfig: enabling support for DVBSky USB, TechoTrend CT2 4400 and T…

2600f0f

…echnoTrend CT2 4650 See PR torvalds#101 for more information Change-Id: I245214a6d2c37f6e7d365461e564d5a7374ea6c0

logic10492 pushed a commit to logic10492/linux-amd-zen2 that referenced this pull request Jan 18, 2024

Merge pull request torvalds#101 from arighi/fix-move-task-race

79d694e

sched_ext: fix race in scx_move_task() with exiting tasks

honjow pushed a commit to 3003n/linux that referenced this pull request Oct 11, 2024

sched/alt: Fix build fail with CONFIG_SMP=n

ea33d81

Fix uniprocessor build issue torvalds#101

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update README #101

Update README #101

raintean commented Jun 27, 2014

younishd commented Jun 27, 2014

ComradePhilos commented Jun 27, 2014

raintean commented Jun 28, 2014

Update README #101

Update README #101

Conversation

raintean commented Jun 27, 2014

younishd commented Jun 27, 2014

ComradePhilos commented Jun 27, 2014

raintean commented Jun 28, 2014