Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mwifiex: split large commit into three commits #44

Merged

Conversation

kitakar5525
Copy link
Member

(The first revert commit does not apply cleanly on v4.19 branch. So, I'll make the same change to v5.4 and v4.19 branch once this PR gets approved. Or let me know if you guys want to do it by yourself.)

This PR

  1. reverts the large commit 456cdc4 ("wireless/mwifiex: Fix S0ix / suspend")
  2. split the commit into three small commits

So, no functional changes are intended. Just split the commit to describe what each part does.

There are small changes from the reverted commit.

  1. added a description to the top of suspend()/resume() functions (starting with 'XXX:')
  2. used pci_upstream_bridge() instead of bus->self-> and placed bridge_d3 thing to the top of probe()
  3. removed description regarding auto_ds from the top of the function
  4. not changing the location of int ret;; the previous commit changed the location (I feel this is rather cleaner, but I followed the upstream code)
  5. added a comment where the removed code existed
  6. added commit message to each commit

This reverts commit 456cdc4.

Reason for revert:
  This commit will be split into smaller commits to describe what each
  commit does. Currently, this commit does the following things within
  this one commit:
  a) modify mwifiex_pcie_suspend/mwifiex_pcie_resume functions to
     achieve S0ix without user unloading mwifiex module manually
     and also to fix "(sometimes) scanning for APs doesn't work after
     suspend"
  b) disable bridge_d3 to fix mwifiex module crashing after suspend
  c) disable "auto deep sleep" (auto_ds). auto_ds is reportedly causing
     "suspend/resume fails when not connected to an Access Point."

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
(made based on commit https://github.com/sebanc/linux-kernel/commit/0f5283b42247cb8978ed9493936f9b5b72350121)

On upstream mwifiex driver, it's reported that unloading the driver
manually [1] fixes S0ix achievement during s2idle (suspend).

Also, unloading/reloading the driver fixes "AP scanning (sometimes) not
working after suspend".

This commit removes/probes the card on suspend/resume on the driver
side to fix such issues.

[1] Run the following command for example
    $ sudo modprobe -r mwifiex_pcie
      sudo modprobe -r mwifiex

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
On Windows, it seems that wifi parent bridge will not enter D3 state
(stay on D0 state) [1]. And on Linux, disabling the D3 state for the
bridge fixes driver crashing after suspend.

This commit disables the parent bridge D3 state on driver initialization
to fix driver crashing after suspend.

NOTE_1: The driver does not crash on LTS 4.4 series but does crash on
        LTS 4.9 series. Maybe commit 9d26d3a ("PCI: Put PCIe ports
        into D3 during suspend") changed the bridge D-state behavior.

NOTE_2: As a side effect, it reports parent bridge is in D3hot. I'm not
        sure why.

        acpi device:4b: Cannot transition to power state D0 for parent in D3hot

        Power state of the parent bridge on SB1 before this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D0
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

        Power state of the parent bridge on SB1 after this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D3hot
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

NOTE_3: This change is needed for SP4 and later generations that use
        mwifiex (i.e. SP4/SP5/SP6, SB1/SB2 and SL1/SL2). Not needed for
        Surface 3 but safe to apply as this change doesn't break S0ix.
        Not sure if needed for SP1/SP2 and SP3. Anyway, I think applying
        this change to all the devices may be safe regarding S0ix because
        Surface 3 can achieve S0ix anyway even when this patch applied.
        So, let's apply this change for all the devices for now.

[1] jakeday/linux-surface#554 (comment)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
(made based on commit https://github.com/sebanc/linux-kernel/commit/5cd03f3e1c01440d0e0bd0cd00a51732c07f99da)

According to the commit this commit is based on, enabling "auto deep
sleep" (auto_ds) causes "suspend/resume fails when not connected to
an Access Point.".

This commit stops enabling auto_ds on driver initialization to fix the
issue.

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
@qzed
Copy link
Member

qzed commented Apr 24, 2020

Thanks!

@qzed qzed merged commit e7e8db7 into linux-surface:v5.6-surface-devel Apr 24, 2020
kitakar5525 added a commit that referenced this pull request Apr 24, 2020
This reverts commit c8da40c.

Reason for revert:
  This commit will be split into smaller commits to describe what each
  commit does. Currently, this commit does the following things within
  this one commit:
  a) modify mwifiex_pcie_suspend/mwifiex_pcie_resume functions to
     achieve S0ix without user unloading mwifiex module manually
     and also to fix "(sometimes) scanning for APs doesn't work after
     suspend"
  b) disable bridge_d3 to fix mwifiex module crashing after suspend
  c) disable "auto deep sleep" (auto_ds). auto_ds is reportedly causing
     "suspend/resume fails when not connected to an Access Point."

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(corresponds to commit dd9a57a
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
kitakar5525 added a commit that referenced this pull request Apr 24, 2020
(made based on commit https://github.com/sebanc/linux-kernel/commit/0f5283b42247cb8978ed9493936f9b5b72350121)

On upstream mwifiex driver, it's reported that unloading the driver
manually [1] fixes S0ix achievement during s2idle (suspend).

Also, unloading/reloading the driver fixes "AP scanning (sometimes) not
working after suspend".

This commit removes/probes the card on suspend/resume on the driver
side to fix such issues.

[1] Run the following command for example
    $ sudo modprobe -r mwifiex_pcie
      sudo modprobe -r mwifiex

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit 82204f8
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
kitakar5525 added a commit that referenced this pull request Apr 24, 2020
On Windows, it seems that wifi parent bridge will not enter D3 state
(stay on D0 state) [1]. And on Linux, disabling the D3 state for the
bridge fixes driver crashing after suspend.

This commit disables the parent bridge D3 state on driver initialization
to fix driver crashing after suspend.

NOTE_1: The driver does not crash on LTS 4.4 series but does crash on
        LTS 4.9 series. Maybe commit 9d26d3a ("PCI: Put PCIe ports
        into D3 during suspend") changed the bridge D-state behavior.

NOTE_2: As a side effect, it reports parent bridge is in D3hot. I'm not
        sure why.

        acpi device:4b: Cannot transition to power state D0 for parent in D3hot

        Power state of the parent bridge on SB1 before this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D0
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

        Power state of the parent bridge on SB1 after this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D3hot
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

NOTE_3: This change is needed for SP4 and later generations that use
        mwifiex (i.e. SP4/SP5/SP6, SB1/SB2 and SL1/SL2). Not needed for
        Surface 3 but safe to apply as this change doesn't break S0ix.
        Not sure if needed for SP1/SP2 and SP3. Anyway, I think applying
        this change to all the devices may be safe regarding S0ix because
        Surface 3 can achieve S0ix anyway even when this patch applied.
        So, let's apply this change for all the devices for now.

[1] jakeday/linux-surface#554 (comment)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit 028e378
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
kitakar5525 added a commit that referenced this pull request Apr 24, 2020
(made based on commit https://github.com/sebanc/linux-kernel/commit/5cd03f3e1c01440d0e0bd0cd00a51732c07f99da)

According to the commit this commit is based on, enabling "auto deep
sleep" (auto_ds) causes "suspend/resume fails when not connected to
an Access Point.".

This commit stops enabling auto_ds on driver initialization to fix the
issue.

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit cecd107
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
kitakar5525 added a commit that referenced this pull request Apr 24, 2020
This reverts commit d3e3527.

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(corresponds to commit dd9a57a
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
kitakar5525 added a commit that referenced this pull request Apr 24, 2020
(made based on commit https://github.com/sebanc/linux-kernel/commit/0f5283b42247cb8978ed9493936f9b5b72350121)

On upstream mwifiex driver, it's reported that unloading the driver
manually [1] fixes S0ix achievement during s2idle (suspend).

Also, unloading/reloading the driver fixes "AP scanning (sometimes) not
working after suspend".

This commit removes/probes the card on suspend/resume on the driver
side to fix such issues.

[1] Run the following command for example
    $ sudo modprobe -r mwifiex_pcie
      sudo modprobe -r mwifiex

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit 82204f8
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
kitakar5525 added a commit that referenced this pull request Apr 24, 2020
On Windows, it seems that wifi parent bridge will not enter D3 state
(stay on D0 state) [1]. And on Linux, disabling the D3 state for the
bridge fixes driver crashing after suspend.

This commit disables the parent bridge D3 state on driver initialization
to fix driver crashing after suspend.

NOTE_1: The driver does not crash on LTS 4.4 series but does crash on
        LTS 4.9 series. Maybe commit 9d26d3a ("PCI: Put PCIe ports
        into D3 during suspend") changed the bridge D-state behavior.

NOTE_2: As a side effect, it reports parent bridge is in D3hot. I'm not
        sure why.

        acpi device:4b: Cannot transition to power state D0 for parent in D3hot

        Power state of the parent bridge on SB1 before this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D0
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

        Power state of the parent bridge on SB1 after this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D3hot
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

NOTE_3: This change is needed for SP4 and later generations that use
        mwifiex (i.e. SP4/SP5/SP6, SB1/SB2 and SL1/SL2). Not needed for
        Surface 3 but safe to apply as this change doesn't break S0ix.
        Not sure if needed for SP1/SP2 and SP3. Anyway, I think applying
        this change to all the devices may be safe regarding S0ix because
        Surface 3 can achieve S0ix anyway even when this patch applied.
        So, let's apply this change for all the devices for now.

[1] jakeday/linux-surface#554 (comment)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit 028e378
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
kitakar5525 added a commit that referenced this pull request Apr 24, 2020
(made based on commit https://github.com/sebanc/linux-kernel/commit/5cd03f3e1c01440d0e0bd0cd00a51732c07f99da)

According to the commit this commit is based on, enabling "auto deep
sleep" (auto_ds) causes "suspend/resume fails when not connected to
an Access Point.".

This commit stops enabling auto_ds on driver initialization to fix the
issue.

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit cecd107
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
@kitakar5525 kitakar5525 deleted the mwifiex-split-patch/v5.6 branch April 25, 2020 02:03
qzed pushed a commit that referenced this pull request Aug 3, 2020
The lifetime of EMAD transactions (i.e., 'struct mlxsw_reg_trans') is
managed using RCU. They are freed using kfree_rcu() once the transaction
ends.

However, in case the transaction failed it is freed immediately after being
removed from the active transactions list. This is problematic because it is
still possible for a different CPU to dereference the transaction from an RCU
read-side critical section while traversing the active transaction list in
mlxsw_emad_rx_listener_func(). In which case, a use-after-free is triggered
[1].

Fix this by freeing the transaction after a grace period by calling
kfree_rcu().

[1]
BUG: KASAN: use-after-free in mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671
Read of size 8 at addr ffff88800b7964e8 by task syz-executor.2/2881

CPU: 0 PID: 2881 Comm: syz-executor.2 Not tainted 5.8.0-rc4+ #44
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xf6/0x16e lib/dump_stack.c:118
 print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671
 mlxsw_core_skb_receive+0x571/0x700 drivers/net/ethernet/mellanox/mlxsw/core.c:2061
 mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline]
 mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651
 tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550
 __do_softirq+0x223/0x964 kernel/softirq.c:292
 asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
 do_softirq_own_stack+0x109/0x140 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:387 [inline]
 __irq_exit_rcu kernel/softirq.c:417 [inline]
 irq_exit_rcu+0x16f/0x1a0 kernel/softirq.c:429
 sysvec_apic_timer_interrupt+0x4e/0xd0 arch/x86/kernel/apic/apic.c:1091
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:587
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/irqflags.h:85 [inline]
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x3b/0x40 kernel/locking/spinlock.c:191
Code: e8 2a c3 f4 fc 48 89 ef e8 12 96 f5 fc f6 c7 02 75 11 53 9d e8 d6 db 11 fd 65 ff 0d 1f 21 b3 56 5b 5d c3 e8 a7 d7 11 fd 53 9d <eb> ed 0f 1f 00 55 48 89 fd 65 ff 05 05 21 b3 56 ff 74 24 08 48 8d
RSP: 0018:ffff8880446ffd80 EFLAGS: 00000286
RAX: 0000000000000006 RBX: 0000000000000286 RCX: 0000000000000006
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa94ecea9
RBP: ffff888012934408 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: fffffbfff57be301 R12: 1ffff110088dffc1
R13: ffff888037b817c0 R14: ffff88802442415a R15: ffff888024424000
 __do_sys_perf_event_open+0x1b5d/0x2bd0 kernel/events/core.c:11874
 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x473dbd
Code: Bad RIP value.
RSP: 002b:00007f21e5e9cc28 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
RAX: ffffffffffffffda RBX: 000000000057bf00 RCX: 0000000000473dbd
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000040
RBP: 000000000057bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000003 R11: 0000000000000246 R12: 000000000057bf0c
R13: 00007ffd0493503f R14: 00000000004d0f46 R15: 00007f21e5e9cd80

Allocated by task 871:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc mm/kasan/common.c:494 [inline]
 __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467
 kmalloc include/linux/slab.h:555 [inline]
 kzalloc include/linux/slab.h:669 [inline]
 mlxsw_core_reg_access_emad+0x70/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1812
 mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991
 mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130
 update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173
 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
 kthread+0x355/0x470 kernel/kthread.c:291
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

Freed by task 871:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455
 slab_free_hook mm/slub.c:1474 [inline]
 slab_free_freelist_hook mm/slub.c:1507 [inline]
 slab_free mm/slub.c:3072 [inline]
 kfree+0xe6/0x320 mm/slub.c:4052
 mlxsw_core_reg_access_emad+0xd45/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1819
 mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991
 mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130
 update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173
 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
 kthread+0x355/0x470 kernel/kthread.c:291
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

The buggy address belongs to the object at ffff88800b796400
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 232 bytes inside of
 512-byte region [ffff88800b796400, ffff88800b796600)
The buggy address belongs to the page:
page:ffffea00002de500 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 head:ffffea00002de500 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x100000000010200(slab|head)
raw: 0100000000010200 dead000000000100 dead000000000122 ffff88806c402500
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88800b796380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88800b796400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88800b796480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                          ^
 ffff88800b796500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88800b796580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: caf7297 ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
kitakar5525 pushed a commit to kitakar5525/linux-kernel that referenced this pull request Aug 8, 2020
[ Upstream commit 3c8ce24 ]

The lifetime of EMAD transactions (i.e., 'struct mlxsw_reg_trans') is
managed using RCU. They are freed using kfree_rcu() once the transaction
ends.

However, in case the transaction failed it is freed immediately after being
removed from the active transactions list. This is problematic because it is
still possible for a different CPU to dereference the transaction from an RCU
read-side critical section while traversing the active transaction list in
mlxsw_emad_rx_listener_func(). In which case, a use-after-free is triggered
[1].

Fix this by freeing the transaction after a grace period by calling
kfree_rcu().

[1]
BUG: KASAN: use-after-free in mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671
Read of size 8 at addr ffff88800b7964e8 by task syz-executor.2/2881

CPU: 0 PID: 2881 Comm: syz-executor.2 Not tainted 5.8.0-rc4+ linux-surface#44
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xf6/0x16e lib/dump_stack.c:118
 print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671
 mlxsw_core_skb_receive+0x571/0x700 drivers/net/ethernet/mellanox/mlxsw/core.c:2061
 mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline]
 mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651
 tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550
 __do_softirq+0x223/0x964 kernel/softirq.c:292
 asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
 do_softirq_own_stack+0x109/0x140 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:387 [inline]
 __irq_exit_rcu kernel/softirq.c:417 [inline]
 irq_exit_rcu+0x16f/0x1a0 kernel/softirq.c:429
 sysvec_apic_timer_interrupt+0x4e/0xd0 arch/x86/kernel/apic/apic.c:1091
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:587
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/irqflags.h:85 [inline]
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x3b/0x40 kernel/locking/spinlock.c:191
Code: e8 2a c3 f4 fc 48 89 ef e8 12 96 f5 fc f6 c7 02 75 11 53 9d e8 d6 db 11 fd 65 ff 0d 1f 21 b3 56 5b 5d c3 e8 a7 d7 11 fd 53 9d <eb> ed 0f 1f 00 55 48 89 fd 65 ff 05 05 21 b3 56 ff 74 24 08 48 8d
RSP: 0018:ffff8880446ffd80 EFLAGS: 00000286
RAX: 0000000000000006 RBX: 0000000000000286 RCX: 0000000000000006
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa94ecea9
RBP: ffff888012934408 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: fffffbfff57be301 R12: 1ffff110088dffc1
R13: ffff888037b817c0 R14: ffff88802442415a R15: ffff888024424000
 __do_sys_perf_event_open+0x1b5d/0x2bd0 kernel/events/core.c:11874
 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x473dbd
Code: Bad RIP value.
RSP: 002b:00007f21e5e9cc28 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
RAX: ffffffffffffffda RBX: 000000000057bf00 RCX: 0000000000473dbd
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000040
RBP: 000000000057bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000003 R11: 0000000000000246 R12: 000000000057bf0c
R13: 00007ffd0493503f R14: 00000000004d0f46 R15: 00007f21e5e9cd80

Allocated by task 871:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc mm/kasan/common.c:494 [inline]
 __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467
 kmalloc include/linux/slab.h:555 [inline]
 kzalloc include/linux/slab.h:669 [inline]
 mlxsw_core_reg_access_emad+0x70/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1812
 mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991
 mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130
 update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173
 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
 kthread+0x355/0x470 kernel/kthread.c:291
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

Freed by task 871:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455
 slab_free_hook mm/slub.c:1474 [inline]
 slab_free_freelist_hook mm/slub.c:1507 [inline]
 slab_free mm/slub.c:3072 [inline]
 kfree+0xe6/0x320 mm/slub.c:4052
 mlxsw_core_reg_access_emad+0xd45/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1819
 mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991
 mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130
 update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173
 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
 kthread+0x355/0x470 kernel/kthread.c:291
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

The buggy address belongs to the object at ffff88800b796400
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 232 bytes inside of
 512-byte region [ffff88800b796400, ffff88800b796600)
The buggy address belongs to the page:
page:ffffea00002de500 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 head:ffffea00002de500 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x100000000010200(slab|head)
raw: 0100000000010200 dead000000000100 dead000000000122 ffff88806c402500
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88800b796380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88800b796400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88800b796480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                          ^
 ffff88800b796500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88800b796580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: caf7297 ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
kitakar5525 pushed a commit to kitakar5525/linux-kernel that referenced this pull request Aug 8, 2020
[ Upstream commit 3c8ce24 ]

The lifetime of EMAD transactions (i.e., 'struct mlxsw_reg_trans') is
managed using RCU. They are freed using kfree_rcu() once the transaction
ends.

However, in case the transaction failed it is freed immediately after being
removed from the active transactions list. This is problematic because it is
still possible for a different CPU to dereference the transaction from an RCU
read-side critical section while traversing the active transaction list in
mlxsw_emad_rx_listener_func(). In which case, a use-after-free is triggered
[1].

Fix this by freeing the transaction after a grace period by calling
kfree_rcu().

[1]
BUG: KASAN: use-after-free in mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671
Read of size 8 at addr ffff88800b7964e8 by task syz-executor.2/2881

CPU: 0 PID: 2881 Comm: syz-executor.2 Not tainted 5.8.0-rc4+ linux-surface#44
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xf6/0x16e lib/dump_stack.c:118
 print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671
 mlxsw_core_skb_receive+0x571/0x700 drivers/net/ethernet/mellanox/mlxsw/core.c:2061
 mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline]
 mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651
 tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550
 __do_softirq+0x223/0x964 kernel/softirq.c:292
 asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
 do_softirq_own_stack+0x109/0x140 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:387 [inline]
 __irq_exit_rcu kernel/softirq.c:417 [inline]
 irq_exit_rcu+0x16f/0x1a0 kernel/softirq.c:429
 sysvec_apic_timer_interrupt+0x4e/0xd0 arch/x86/kernel/apic/apic.c:1091
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:587
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/irqflags.h:85 [inline]
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x3b/0x40 kernel/locking/spinlock.c:191
Code: e8 2a c3 f4 fc 48 89 ef e8 12 96 f5 fc f6 c7 02 75 11 53 9d e8 d6 db 11 fd 65 ff 0d 1f 21 b3 56 5b 5d c3 e8 a7 d7 11 fd 53 9d <eb> ed 0f 1f 00 55 48 89 fd 65 ff 05 05 21 b3 56 ff 74 24 08 48 8d
RSP: 0018:ffff8880446ffd80 EFLAGS: 00000286
RAX: 0000000000000006 RBX: 0000000000000286 RCX: 0000000000000006
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa94ecea9
RBP: ffff888012934408 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: fffffbfff57be301 R12: 1ffff110088dffc1
R13: ffff888037b817c0 R14: ffff88802442415a R15: ffff888024424000
 __do_sys_perf_event_open+0x1b5d/0x2bd0 kernel/events/core.c:11874
 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x473dbd
Code: Bad RIP value.
RSP: 002b:00007f21e5e9cc28 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
RAX: ffffffffffffffda RBX: 000000000057bf00 RCX: 0000000000473dbd
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000040
RBP: 000000000057bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000003 R11: 0000000000000246 R12: 000000000057bf0c
R13: 00007ffd0493503f R14: 00000000004d0f46 R15: 00007f21e5e9cd80

Allocated by task 871:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc mm/kasan/common.c:494 [inline]
 __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467
 kmalloc include/linux/slab.h:555 [inline]
 kzalloc include/linux/slab.h:669 [inline]
 mlxsw_core_reg_access_emad+0x70/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1812
 mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991
 mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130
 update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173
 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
 kthread+0x355/0x470 kernel/kthread.c:291
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

Freed by task 871:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455
 slab_free_hook mm/slub.c:1474 [inline]
 slab_free_freelist_hook mm/slub.c:1507 [inline]
 slab_free mm/slub.c:3072 [inline]
 kfree+0xe6/0x320 mm/slub.c:4052
 mlxsw_core_reg_access_emad+0xd45/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1819
 mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991
 mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130
 update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173
 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
 kthread+0x355/0x470 kernel/kthread.c:291
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

The buggy address belongs to the object at ffff88800b796400
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 232 bytes inside of
 512-byte region [ffff88800b796400, ffff88800b796600)
The buggy address belongs to the page:
page:ffffea00002de500 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 head:ffffea00002de500 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x100000000010200(slab|head)
raw: 0100000000010200 dead000000000100 dead000000000122 ffff88806c402500
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88800b796380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88800b796400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88800b796480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                          ^
 ffff88800b796500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88800b796580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: caf7297 ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Aug 18, 2020
[ Upstream commit 3c8ce24 ]

The lifetime of EMAD transactions (i.e., 'struct mlxsw_reg_trans') is
managed using RCU. They are freed using kfree_rcu() once the transaction
ends.

However, in case the transaction failed it is freed immediately after being
removed from the active transactions list. This is problematic because it is
still possible for a different CPU to dereference the transaction from an RCU
read-side critical section while traversing the active transaction list in
mlxsw_emad_rx_listener_func(). In which case, a use-after-free is triggered
[1].

Fix this by freeing the transaction after a grace period by calling
kfree_rcu().

[1]
BUG: KASAN: use-after-free in mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671
Read of size 8 at addr ffff88800b7964e8 by task syz-executor.2/2881

CPU: 0 PID: 2881 Comm: syz-executor.2 Not tainted 5.8.0-rc4+ #44
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xf6/0x16e lib/dump_stack.c:118
 print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 mlxsw_emad_rx_listener_func+0x969/0xac0 drivers/net/ethernet/mellanox/mlxsw/core.c:671
 mlxsw_core_skb_receive+0x571/0x700 drivers/net/ethernet/mellanox/mlxsw/core.c:2061
 mlxsw_pci_cqe_rdq_handle drivers/net/ethernet/mellanox/mlxsw/pci.c:595 [inline]
 mlxsw_pci_cq_tasklet+0x12a6/0x2520 drivers/net/ethernet/mellanox/mlxsw/pci.c:651
 tasklet_action_common.isra.0+0x13f/0x3e0 kernel/softirq.c:550
 __do_softirq+0x223/0x964 kernel/softirq.c:292
 asm_call_on_stack+0x12/0x20 arch/x86/entry/entry_64.S:711
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
 do_softirq_own_stack+0x109/0x140 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:387 [inline]
 __irq_exit_rcu kernel/softirq.c:417 [inline]
 irq_exit_rcu+0x16f/0x1a0 kernel/softirq.c:429
 sysvec_apic_timer_interrupt+0x4e/0xd0 arch/x86/kernel/apic/apic.c:1091
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:587
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/irqflags.h:85 [inline]
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x3b/0x40 kernel/locking/spinlock.c:191
Code: e8 2a c3 f4 fc 48 89 ef e8 12 96 f5 fc f6 c7 02 75 11 53 9d e8 d6 db 11 fd 65 ff 0d 1f 21 b3 56 5b 5d c3 e8 a7 d7 11 fd 53 9d <eb> ed 0f 1f 00 55 48 89 fd 65 ff 05 05 21 b3 56 ff 74 24 08 48 8d
RSP: 0018:ffff8880446ffd80 EFLAGS: 00000286
RAX: 0000000000000006 RBX: 0000000000000286 RCX: 0000000000000006
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa94ecea9
RBP: ffff888012934408 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: fffffbfff57be301 R12: 1ffff110088dffc1
R13: ffff888037b817c0 R14: ffff88802442415a R15: ffff888024424000
 __do_sys_perf_event_open+0x1b5d/0x2bd0 kernel/events/core.c:11874
 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:384
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x473dbd
Code: Bad RIP value.
RSP: 002b:00007f21e5e9cc28 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
RAX: ffffffffffffffda RBX: 000000000057bf00 RCX: 0000000000473dbd
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000040
RBP: 000000000057bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000003 R11: 0000000000000246 R12: 000000000057bf0c
R13: 00007ffd0493503f R14: 00000000004d0f46 R15: 00007f21e5e9cd80

Allocated by task 871:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc mm/kasan/common.c:494 [inline]
 __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:467
 kmalloc include/linux/slab.h:555 [inline]
 kzalloc include/linux/slab.h:669 [inline]
 mlxsw_core_reg_access_emad+0x70/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1812
 mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991
 mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130
 update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173
 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
 kthread+0x355/0x470 kernel/kthread.c:291
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

Freed by task 871:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0x12c/0x170 mm/kasan/common.c:455
 slab_free_hook mm/slub.c:1474 [inline]
 slab_free_freelist_hook mm/slub.c:1507 [inline]
 slab_free mm/slub.c:3072 [inline]
 kfree+0xe6/0x320 mm/slub.c:4052
 mlxsw_core_reg_access_emad+0xd45/0x1410 drivers/net/ethernet/mellanox/mlxsw/core.c:1819
 mlxsw_core_reg_access+0xeb/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1991
 mlxsw_sp_port_get_hw_xstats+0x335/0x7e0 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1130
 update_stats_cache+0xf4/0x140 drivers/net/ethernet/mellanox/mlxsw/spectrum.c:1173
 process_one_work+0xa3e/0x17a0 kernel/workqueue.c:2269
 worker_thread+0x9e/0x1050 kernel/workqueue.c:2415
 kthread+0x355/0x470 kernel/kthread.c:291
 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:293

The buggy address belongs to the object at ffff88800b796400
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 232 bytes inside of
 512-byte region [ffff88800b796400, ffff88800b796600)
The buggy address belongs to the page:
page:ffffea00002de500 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 head:ffffea00002de500 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x100000000010200(slab|head)
raw: 0100000000010200 dead000000000100 dead000000000122 ffff88806c402500
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88800b796380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88800b796400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88800b796480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                          ^
 ffff88800b796500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88800b796580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: caf7297 ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Aug 28, 2020
[ Upstream commit 6184358 ]

When enable LOCKDEP, static_obj() will cause error. Because some
__initdata static variables is before _stext:

static int static_obj(const void *obj)
{
        unsigned long start = (unsigned long) &_stext,
                      end   = (unsigned long) &_end,
                      addr  = (unsigned long) obj;

        /*
         * static variable?
         */
        if ((addr >= start) && (addr < end))
                return 1;

[    0.067192] INFO: trying to register non-static key.
[    0.067325] the code is fine but needs lockdep annotation.
[    0.067449] turning off the locking correctness validator.
[    0.067718] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-rc7-dirty #44
[    0.067945] Call Trace:
[    0.068369] [<ffffffe00020323c>] walk_stackframe+0x0/0xa4
[    0.068506] [<ffffffe000203422>] show_stack+0x2a/0x34
[    0.068631] [<ffffffe000521e4e>] dump_stack+0x94/0xca
[    0.068757] [<ffffffe000255a4e>] register_lock_class+0x5b8/0x5bc
[    0.068969] [<ffffffe000255abe>] __lock_acquire+0x6c/0x1d5c
[    0.069101] [<ffffffe0002550fe>] lock_acquire+0xae/0x312
[    0.069228] [<ffffffe000989a8e>] _raw_spin_lock_irqsave+0x40/0x5a
[    0.069357] [<ffffffe000247c64>] complete+0x1e/0x50
[    0.069479] [<ffffffe000984c38>] rest_init+0x1b0/0x28a
[    0.069660] [<ffffffe0000016a2>] 0xffffffe0000016a2
[    0.069779] [<ffffffe000001b84>] 0xffffffe000001b84
[    0.069953] [<ffffffe000001092>] 0xffffffe000001092

static __initdata DECLARE_COMPLETION(kthreadd_done);

noinline void __ref rest_init(void)
{
	...
	complete(&kthreadd_done);

Signed-off-by: Guo Ren <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Aug 28, 2020
[ Upstream commit 6184358 ]

When enable LOCKDEP, static_obj() will cause error. Because some
__initdata static variables is before _stext:

static int static_obj(const void *obj)
{
        unsigned long start = (unsigned long) &_stext,
                      end   = (unsigned long) &_end,
                      addr  = (unsigned long) obj;

        /*
         * static variable?
         */
        if ((addr >= start) && (addr < end))
                return 1;

[    0.067192] INFO: trying to register non-static key.
[    0.067325] the code is fine but needs lockdep annotation.
[    0.067449] turning off the locking correctness validator.
[    0.067718] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-rc7-dirty #44
[    0.067945] Call Trace:
[    0.068369] [<ffffffe00020323c>] walk_stackframe+0x0/0xa4
[    0.068506] [<ffffffe000203422>] show_stack+0x2a/0x34
[    0.068631] [<ffffffe000521e4e>] dump_stack+0x94/0xca
[    0.068757] [<ffffffe000255a4e>] register_lock_class+0x5b8/0x5bc
[    0.068969] [<ffffffe000255abe>] __lock_acquire+0x6c/0x1d5c
[    0.069101] [<ffffffe0002550fe>] lock_acquire+0xae/0x312
[    0.069228] [<ffffffe000989a8e>] _raw_spin_lock_irqsave+0x40/0x5a
[    0.069357] [<ffffffe000247c64>] complete+0x1e/0x50
[    0.069479] [<ffffffe000984c38>] rest_init+0x1b0/0x28a
[    0.069660] [<ffffffe0000016a2>] 0xffffffe0000016a2
[    0.069779] [<ffffffe000001b84>] 0xffffffe000001b84
[    0.069953] [<ffffffe000001092>] 0xffffffe000001092

static __initdata DECLARE_COMPLETION(kthreadd_done);

noinline void __ref rest_init(void)
{
	...
	complete(&kthreadd_done);

Signed-off-by: Guo Ren <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Aug 28, 2020
This reverts commit c8da40c.

Reason for revert:
  This commit will be split into smaller commits to describe what each
  commit does. Currently, this commit does the following things within
  this one commit:
  a) modify mwifiex_pcie_suspend/mwifiex_pcie_resume functions to
     achieve S0ix without user unloading mwifiex module manually
     and also to fix "(sometimes) scanning for APs doesn't work after
     suspend"
  b) disable bridge_d3 to fix mwifiex module crashing after suspend
  c) disable "auto deep sleep" (auto_ds). auto_ds is reportedly causing
     "suspend/resume fails when not connected to an Access Point."

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(corresponds to commit dd9a57a
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Aug 28, 2020
(made based on commit https://github.com/sebanc/linux-kernel/commit/0f5283b42247cb8978ed9493936f9b5b72350121)

On upstream mwifiex driver, it's reported that unloading the driver
manually [1] fixes S0ix achievement during s2idle (suspend).

Also, unloading/reloading the driver fixes "AP scanning (sometimes) not
working after suspend".

This commit removes/probes the card on suspend/resume on the driver
side to fix such issues.

[1] Run the following command for example
    $ sudo modprobe -r mwifiex_pcie
      sudo modprobe -r mwifiex

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit 82204f8
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Aug 28, 2020
On Windows, it seems that wifi parent bridge will not enter D3 state
(stay on D0 state) [1]. And on Linux, disabling the D3 state for the
bridge fixes driver crashing after suspend.

This commit disables the parent bridge D3 state on driver initialization
to fix driver crashing after suspend.

NOTE_1: The driver does not crash on LTS 4.4 series but does crash on
        LTS 4.9 series. Maybe commit 9d26d3a ("PCI: Put PCIe ports
        into D3 during suspend") changed the bridge D-state behavior.

NOTE_2: As a side effect, it reports parent bridge is in D3hot. I'm not
        sure why.

        acpi device:4b: Cannot transition to power state D0 for parent in D3hot

        Power state of the parent bridge on SB1 before this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D0
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

        Power state of the parent bridge on SB1 after this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D3hot
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

NOTE_3: This change is needed for SP4 and later generations that use
        mwifiex (i.e. SP4/SP5/SP6, SB1/SB2 and SL1/SL2). Not needed for
        Surface 3 but safe to apply as this change doesn't break S0ix.
        Not sure if needed for SP1/SP2 and SP3. Anyway, I think applying
        this change to all the devices may be safe regarding S0ix because
        Surface 3 can achieve S0ix anyway even when this patch applied.
        So, let's apply this change for all the devices for now.

[1] jakeday/linux-surface#554 (comment)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit 028e378
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Aug 28, 2020
(made based on commit https://github.com/sebanc/linux-kernel/commit/5cd03f3e1c01440d0e0bd0cd00a51732c07f99da)

According to the commit this commit is based on, enabling "auto deep
sleep" (auto_ds) causes "suspend/resume fails when not connected to
an Access Point.".

This commit stops enabling auto_ds on driver initialization to fix the
issue.

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit cecd107
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Aug 28, 2020
This reverts commit d3e3527.

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(corresponds to commit dd9a57a
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Aug 28, 2020
(made based on commit https://github.com/sebanc/linux-kernel/commit/0f5283b42247cb8978ed9493936f9b5b72350121)

On upstream mwifiex driver, it's reported that unloading the driver
manually [1] fixes S0ix achievement during s2idle (suspend).

Also, unloading/reloading the driver fixes "AP scanning (sometimes) not
working after suspend".

This commit removes/probes the card on suspend/resume on the driver
side to fix such issues.

[1] Run the following command for example
    $ sudo modprobe -r mwifiex_pcie
      sudo modprobe -r mwifiex

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit 82204f8
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Aug 28, 2020
On Windows, it seems that wifi parent bridge will not enter D3 state
(stay on D0 state) [1]. And on Linux, disabling the D3 state for the
bridge fixes driver crashing after suspend.

This commit disables the parent bridge D3 state on driver initialization
to fix driver crashing after suspend.

NOTE_1: The driver does not crash on LTS 4.4 series but does crash on
        LTS 4.9 series. Maybe commit 9d26d3a ("PCI: Put PCIe ports
        into D3 during suspend") changed the bridge D-state behavior.

NOTE_2: As a side effect, it reports parent bridge is in D3hot. I'm not
        sure why.

        acpi device:4b: Cannot transition to power state D0 for parent in D3hot

        Power state of the parent bridge on SB1 before this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D0
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

        Power state of the parent bridge on SB1 after this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D3hot
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

NOTE_3: This change is needed for SP4 and later generations that use
        mwifiex (i.e. SP4/SP5/SP6, SB1/SB2 and SL1/SL2). Not needed for
        Surface 3 but safe to apply as this change doesn't break S0ix.
        Not sure if needed for SP1/SP2 and SP3. Anyway, I think applying
        this change to all the devices may be safe regarding S0ix because
        Surface 3 can achieve S0ix anyway even when this patch applied.
        So, let's apply this change for all the devices for now.

[1] jakeday/linux-surface#554 (comment)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit 028e378
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Aug 28, 2020
(made based on commit https://github.com/sebanc/linux-kernel/commit/5cd03f3e1c01440d0e0bd0cd00a51732c07f99da)

According to the commit this commit is based on, enabling "auto deep
sleep" (auto_ds) causes "suspend/resume fails when not connected to
an Access Point.".

This commit stops enabling auto_ds on driver initialization to fix the
issue.

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit cecd107
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
kitakar5525 added a commit to kitakar5525/linux-kernel that referenced this pull request Sep 16, 2020
(made based on commit https://github.com/sebanc/linux-kernel/commit/0f5283b42247cb8978ed9493936f9b5b72350121)

On upstream mwifiex driver, it's reported that unloading the driver
manually [1] fixes S0ix achievement during s2idle (suspend).

Also, unloading/reloading the driver fixes "AP scanning (sometimes) not
working after suspend".

This commit removes/probes the card on suspend/resume on the driver
side to fix such issues.

[1] Run the following command for example
    $ sudo modprobe -r mwifiex_pcie
      sudo modprobe -r mwifiex

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit 82204f8
from PR linux-surface#44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
kitakar5525 added a commit to kitakar5525/linux-kernel that referenced this pull request Sep 16, 2020
On Windows, it seems that wifi parent bridge will not enter D3 state
(stay on D0 state) [1]. And on Linux, disabling the D3 state for the
bridge fixes driver crashing after suspend.

This commit disables the parent bridge D3 state on driver initialization
to fix driver crashing after suspend.

NOTE_1: The driver does not crash on LTS 4.4 series but does crash on
        LTS 4.9 series. Maybe commit 9d26d3a ("PCI: Put PCIe ports
        into D3 during suspend") changed the bridge D-state behavior.

NOTE_2: As a side effect, it reports parent bridge is in D3hot. I'm not
        sure why.

        acpi device:4b: Cannot transition to power state D0 for parent in D3hot

        Power state of the parent bridge on SB1 before this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D0
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

        Power state of the parent bridge on SB1 after this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D3hot
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

NOTE_3: This change is needed for SP4 and later generations that use
        mwifiex (i.e. SP4/SP5/SP6, SB1/SB2 and SL1/SL2). Not needed for
        Surface 3 but safe to apply as this change doesn't break S0ix.
        Not sure if needed for SP1/SP2 and SP3. Anyway, I think applying
        this change to all the devices may be safe regarding S0ix because
        Surface 3 can achieve S0ix anyway even when this patch applied.
        So, let's apply this change for all the devices for now.

[1] jakeday/linux-surface#554 (comment)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit 028e378
from PR linux-surface#44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
kitakar5525 added a commit to kitakar5525/linux-kernel that referenced this pull request Sep 16, 2020
(made based on commit https://github.com/sebanc/linux-kernel/commit/5cd03f3e1c01440d0e0bd0cd00a51732c07f99da)

According to the commit this commit is based on, enabling "auto deep
sleep" (auto_ds) causes "suspend/resume fails when not connected to
an Access Point.".

This commit stops enabling auto_ds on driver initialization to fix the
issue.

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit cecd107
from PR linux-surface#44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Oct 1, 2020
[ Upstream commit 95a3d8f ]

When xfstests generic/451, there is an BUG at mm/memcontrol.c:
  page:ffffea000560f2c0 refcount:2 mapcount:0 mapping:000000008544e0ea
       index:0xf
  mapping->aops:cifs_addr_ops dentry name:"tst-aio-dio-cycle-write.451"
  flags: 0x2fffff80000001(locked)
  raw: 002fffff80000001 ffffc90002023c50 ffffea0005280088 ffff88815cda0210
  raw: 000000000000000f 0000000000000000 00000002ffffffff ffff88817287d000
  page dumped because: VM_BUG_ON_PAGE(page->mem_cgroup)
  page->mem_cgroup:ffff88817287d000
  ------------[ cut here ]------------
  kernel BUG at mm/memcontrol.c:2659!
  invalid opcode: 0000 [#1] SMP
  CPU: 2 PID: 2038 Comm: xfs_io Not tainted 5.8.0-rc1 #44
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_
    073836-buildvm-ppc64le-16.ppc.4
  RIP: 0010:commit_charge+0x35/0x50
  Code: 0d 48 83 05 54 b2 02 05 01 48 89 77 38 c3 48 c7
        c6 78 4a ea ba 48 83 05 38 b2 02 05 01 e8 63 0d9
  RSP: 0018:ffffc90002023a50 EFLAGS: 00010202
  RAX: 0000000000000000 RBX: ffff88817287d000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffff88817ac97ea0 RDI: ffff88817ac97ea0
  RBP: ffffea000560f2c0 R08: 0000000000000203 R09: 0000000000000005
  R10: 0000000000000030 R11: ffffc900020237a8 R12: 0000000000000000
  R13: 0000000000000001 R14: 0000000000000001 R15: ffff88815a1272c0
  FS:  00007f5071ab0800(0000) GS:ffff88817ac80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000055efcd5ca000 CR3: 000000015d312000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   mem_cgroup_charge+0x166/0x4f0
   __add_to_page_cache_locked+0x4a9/0x710
   add_to_page_cache_locked+0x15/0x20
   cifs_readpages+0x217/0x1270
   read_pages+0x29a/0x670
   page_cache_readahead_unbounded+0x24f/0x390
   __do_page_cache_readahead+0x3f/0x60
   ondemand_readahead+0x1f1/0x470
   page_cache_async_readahead+0x14c/0x170
   generic_file_buffered_read+0x5df/0x1100
   generic_file_read_iter+0x10c/0x1d0
   cifs_strict_readv+0x139/0x170
   new_sync_read+0x164/0x250
   __vfs_read+0x39/0x60
   vfs_read+0xb5/0x1e0
   ksys_pread64+0x85/0xf0
   __x64_sys_pread64+0x22/0x30
   do_syscall_64+0x69/0x150
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f5071fcb1af
  Code: Bad RIP value.
  RSP: 002b:00007ffde2cdb8e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
  RAX: ffffffffffffffda RBX: 00007ffde2cdb990 RCX: 00007f5071fcb1af
  RDX: 0000000000001000 RSI: 000055efcd5ca000 RDI: 0000000000000003
  RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000001000 R11: 0000000000000293 R12: 0000000000000001
  R13: 000000000009f000 R14: 0000000000000000 R15: 0000000000001000
  Modules linked in:
  ---[ end trace 725fa14a3e1af65c ]---

Since commit 3fea5a4 ("mm: memcontrol: convert page cache to a new
mem_cgroup_charge() API") not cancel the page charge, the pages maybe
double add to pagecache:
thread1                       | thread2
cifs_readpages
readpages_get_pages
 add_to_page_cache_locked(head,index=n)=0
                              | readpages_get_pages
                              | add_to_page_cache_locked(head,index=n+1)=0
 add_to_page_cache_locked(head, index=n+1)=-EEXIST
 then, will next loop with list head page's
 index=n+1 and the page->mapping not NULL
readpages_get_pages
add_to_page_cache_locked(head, index=n+1)
 commit_charge
  VM_BUG_ON_PAGE

So, we should not do the next loop when any page add to page cache
failed.

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Zhang Xiaoxu <[email protected]>
Signed-off-by: Steve French <[email protected]>
Acked-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Oct 1, 2020
[ Upstream commit 95a3d8f ]

When xfstests generic/451, there is an BUG at mm/memcontrol.c:
  page:ffffea000560f2c0 refcount:2 mapcount:0 mapping:000000008544e0ea
       index:0xf
  mapping->aops:cifs_addr_ops dentry name:"tst-aio-dio-cycle-write.451"
  flags: 0x2fffff80000001(locked)
  raw: 002fffff80000001 ffffc90002023c50 ffffea0005280088 ffff88815cda0210
  raw: 000000000000000f 0000000000000000 00000002ffffffff ffff88817287d000
  page dumped because: VM_BUG_ON_PAGE(page->mem_cgroup)
  page->mem_cgroup:ffff88817287d000
  ------------[ cut here ]------------
  kernel BUG at mm/memcontrol.c:2659!
  invalid opcode: 0000 [#1] SMP
  CPU: 2 PID: 2038 Comm: xfs_io Not tainted 5.8.0-rc1 #44
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_
    073836-buildvm-ppc64le-16.ppc.4
  RIP: 0010:commit_charge+0x35/0x50
  Code: 0d 48 83 05 54 b2 02 05 01 48 89 77 38 c3 48 c7
        c6 78 4a ea ba 48 83 05 38 b2 02 05 01 e8 63 0d9
  RSP: 0018:ffffc90002023a50 EFLAGS: 00010202
  RAX: 0000000000000000 RBX: ffff88817287d000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffff88817ac97ea0 RDI: ffff88817ac97ea0
  RBP: ffffea000560f2c0 R08: 0000000000000203 R09: 0000000000000005
  R10: 0000000000000030 R11: ffffc900020237a8 R12: 0000000000000000
  R13: 0000000000000001 R14: 0000000000000001 R15: ffff88815a1272c0
  FS:  00007f5071ab0800(0000) GS:ffff88817ac80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000055efcd5ca000 CR3: 000000015d312000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   mem_cgroup_charge+0x166/0x4f0
   __add_to_page_cache_locked+0x4a9/0x710
   add_to_page_cache_locked+0x15/0x20
   cifs_readpages+0x217/0x1270
   read_pages+0x29a/0x670
   page_cache_readahead_unbounded+0x24f/0x390
   __do_page_cache_readahead+0x3f/0x60
   ondemand_readahead+0x1f1/0x470
   page_cache_async_readahead+0x14c/0x170
   generic_file_buffered_read+0x5df/0x1100
   generic_file_read_iter+0x10c/0x1d0
   cifs_strict_readv+0x139/0x170
   new_sync_read+0x164/0x250
   __vfs_read+0x39/0x60
   vfs_read+0xb5/0x1e0
   ksys_pread64+0x85/0xf0
   __x64_sys_pread64+0x22/0x30
   do_syscall_64+0x69/0x150
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f5071fcb1af
  Code: Bad RIP value.
  RSP: 002b:00007ffde2cdb8e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
  RAX: ffffffffffffffda RBX: 00007ffde2cdb990 RCX: 00007f5071fcb1af
  RDX: 0000000000001000 RSI: 000055efcd5ca000 RDI: 0000000000000003
  RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000001000 R11: 0000000000000293 R12: 0000000000000001
  R13: 000000000009f000 R14: 0000000000000000 R15: 0000000000001000
  Modules linked in:
  ---[ end trace 725fa14a3e1af65c ]---

Since commit 3fea5a4 ("mm: memcontrol: convert page cache to a new
mem_cgroup_charge() API") not cancel the page charge, the pages maybe
double add to pagecache:
thread1                       | thread2
cifs_readpages
readpages_get_pages
 add_to_page_cache_locked(head,index=n)=0
                              | readpages_get_pages
                              | add_to_page_cache_locked(head,index=n+1)=0
 add_to_page_cache_locked(head, index=n+1)=-EEXIST
 then, will next loop with list head page's
 index=n+1 and the page->mapping not NULL
readpages_get_pages
add_to_page_cache_locked(head, index=n+1)
 commit_charge
  VM_BUG_ON_PAGE

So, we should not do the next loop when any page add to page cache
failed.

Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Zhang Xiaoxu <[email protected]>
Signed-off-by: Steve French <[email protected]>
Acked-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Oct 16, 2020
This reverts commit c8da40c.

Reason for revert:
  This commit will be split into smaller commits to describe what each
  commit does. Currently, this commit does the following things within
  this one commit:
  a) modify mwifiex_pcie_suspend/mwifiex_pcie_resume functions to
     achieve S0ix without user unloading mwifiex module manually
     and also to fix "(sometimes) scanning for APs doesn't work after
     suspend"
  b) disable bridge_d3 to fix mwifiex module crashing after suspend
  c) disable "auto deep sleep" (auto_ds). auto_ds is reportedly causing
     "suspend/resume fails when not connected to an Access Point."

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(corresponds to commit dd9a57a
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Oct 16, 2020
(made based on commit https://github.com/sebanc/linux-kernel/commit/0f5283b42247cb8978ed9493936f9b5b72350121)

On upstream mwifiex driver, it's reported that unloading the driver
manually [1] fixes S0ix achievement during s2idle (suspend).

Also, unloading/reloading the driver fixes "AP scanning (sometimes) not
working after suspend".

This commit removes/probes the card on suspend/resume on the driver
side to fix such issues.

[1] Run the following command for example
    $ sudo modprobe -r mwifiex_pcie
      sudo modprobe -r mwifiex

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit 82204f8
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Oct 16, 2020
On Windows, it seems that wifi parent bridge will not enter D3 state
(stay on D0 state) [1]. And on Linux, disabling the D3 state for the
bridge fixes driver crashing after suspend.

This commit disables the parent bridge D3 state on driver initialization
to fix driver crashing after suspend.

NOTE_1: The driver does not crash on LTS 4.4 series but does crash on
        LTS 4.9 series. Maybe commit 9d26d3a ("PCI: Put PCIe ports
        into D3 during suspend") changed the bridge D-state behavior.

NOTE_2: As a side effect, it reports parent bridge is in D3hot. I'm not
        sure why.

        acpi device:4b: Cannot transition to power state D0 for parent in D3hot

        Power state of the parent bridge on SB1 before this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D0
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

        Power state of the parent bridge on SB1 after this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D3hot
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

NOTE_3: This change is needed for SP4 and later generations that use
        mwifiex (i.e. SP4/SP5/SP6, SB1/SB2 and SL1/SL2). Not needed for
        Surface 3 but safe to apply as this change doesn't break S0ix.
        Not sure if needed for SP1/SP2 and SP3. Anyway, I think applying
        this change to all the devices may be safe regarding S0ix because
        Surface 3 can achieve S0ix anyway even when this patch applied.
        So, let's apply this change for all the devices for now.

[1] jakeday/linux-surface#554 (comment)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit 028e378
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Oct 16, 2020
(made based on commit https://github.com/sebanc/linux-kernel/commit/5cd03f3e1c01440d0e0bd0cd00a51732c07f99da)

According to the commit this commit is based on, enabling "auto deep
sleep" (auto_ds) causes "suspend/resume fails when not connected to
an Access Point.".

This commit stops enabling auto_ds on driver initialization to fix the
issue.

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit cecd107
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Oct 16, 2020
This reverts commit d3e3527.

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(corresponds to commit dd9a57a
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Oct 16, 2020
(made based on commit https://github.com/sebanc/linux-kernel/commit/0f5283b42247cb8978ed9493936f9b5b72350121)

On upstream mwifiex driver, it's reported that unloading the driver
manually [1] fixes S0ix achievement during s2idle (suspend).

Also, unloading/reloading the driver fixes "AP scanning (sometimes) not
working after suspend".

This commit removes/probes the card on suspend/resume on the driver
side to fix such issues.

[1] Run the following command for example
    $ sudo modprobe -r mwifiex_pcie
      sudo modprobe -r mwifiex

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit 82204f8
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Oct 16, 2020
On Windows, it seems that wifi parent bridge will not enter D3 state
(stay on D0 state) [1]. And on Linux, disabling the D3 state for the
bridge fixes driver crashing after suspend.

This commit disables the parent bridge D3 state on driver initialization
to fix driver crashing after suspend.

NOTE_1: The driver does not crash on LTS 4.4 series but does crash on
        LTS 4.9 series. Maybe commit 9d26d3a ("PCI: Put PCIe ports
        into D3 during suspend") changed the bridge D-state behavior.

NOTE_2: As a side effect, it reports parent bridge is in D3hot. I'm not
        sure why.

        acpi device:4b: Cannot transition to power state D0 for parent in D3hot

        Power state of the parent bridge on SB1 before this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D0
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

        Power state of the parent bridge on SB1 after this commit:

        $ grep -H . /sys/bus/pci/devices/0000:00:1d.3/firmware_node/{power_state,real_power_state}
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/power_state:D3hot
        /sys/bus/pci/devices/0000:00:1d.3/firmware_node/real_power_state:D3hot

NOTE_3: This change is needed for SP4 and later generations that use
        mwifiex (i.e. SP4/SP5/SP6, SB1/SB2 and SL1/SL2). Not needed for
        Surface 3 but safe to apply as this change doesn't break S0ix.
        Not sure if needed for SP1/SP2 and SP3. Anyway, I think applying
        this change to all the devices may be safe regarding S0ix because
        Surface 3 can achieve S0ix anyway even when this patch applied.
        So, let's apply this change for all the devices for now.

[1] jakeday/linux-surface#554 (comment)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit 028e378
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request Oct 16, 2020
(made based on commit https://github.com/sebanc/linux-kernel/commit/5cd03f3e1c01440d0e0bd0cd00a51732c07f99da)

According to the commit this commit is based on, enabling "auto deep
sleep" (auto_ds) causes "suspend/resume fails when not connected to
an Access Point.".

This commit stops enabling auto_ds on driver initialization to fix the
issue.

Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>

(cherry picked from commit cecd107
from PR #44)
Signed-off-by: Tsuchiya Yuto (kitakar5525) <[email protected]>
qzed pushed a commit that referenced this pull request May 24, 2021
[ Upstream commit d5027ca ]

Ritesh reported a bug [1] against UML, noting that it crashed on
startup. The backtrace shows the following (heavily redacted):

(gdb) bt
...
 #26 0x0000000060015b5d in sem_init () at ipc/sem.c:268
 #27 0x00007f89906d92f7 in ?? () from /lib/x86_64-linux-gnu/libcom_err.so.2
 #28 0x00007f8990ab8fb2 in call_init (...) at dl-init.c:72
...
 #40 0x00007f89909bf3a6 in nss_load_library (...) at nsswitch.c:359
...
 #44 0x00007f8990895e35 in _nss_compat_getgrnam_r (...) at nss_compat/compat-grp.c:486
 #45 0x00007f8990968b85 in __getgrnam_r [...]
 #46 0x00007f89909d6b77 in grantpt [...]
 #47 0x00007f8990a9394e in __GI_openpty [...]
 #48 0x00000000604a1f65 in openpty_cb (...) at arch/um/os-Linux/sigio.c:407
 #49 0x00000000604a58d0 in start_idle_thread (...) at arch/um/os-Linux/skas/process.c:598
 #50 0x0000000060004a3d in start_uml () at arch/um/kernel/skas/process.c:45
 #51 0x00000000600047b2 in linux_main (...) at arch/um/kernel/um_arch.c:334
 #52 0x000000006000574f in main (...) at arch/um/os-Linux/main.c:144

indicating that the UML function openpty_cb() calls openpty(),
which internally calls __getgrnam_r(), which causes the nsswitch
machinery to get started.

This loads, through lots of indirection that I snipped, the
libcom_err.so.2 library, which (in an unknown function, "??")
calls sem_init().

Now, of course it wants to get libpthread's sem_init(), since
it's linked against libpthread. However, the dynamic linker
looks up that symbol against the binary first, and gets the
kernel's sem_init().

Hajime Tazaki noted that "objcopy -L" can localize a symbol,
so the dynamic linker wouldn't do the lookup this way. I tried,
but for some reason that didn't seem to work.

Doing the same thing in the linker script instead does seem to
work, though I cannot entirely explain - it *also* works if I
just add "VERSION { { global: *; }; }" instead, indicating that
something else is happening that I don't really understand. It
may be that explicitly doing that marks them with some kind of
empty version, and that's different from the default.

Explicitly marking them with a version breaks kallsyms, so that
doesn't seem to be possible.

Marking all the symbols as local seems correct, and does seem
to address the issue, so do that. Also do it for static link,
nsswitch libraries could still be loaded there.

[1] https://bugs.debian.org/983379

Reported-by: Ritesh Raj Sarraf <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
Acked-By: Anton Ivanov <[email protected]>
Tested-By: Ritesh Raj Sarraf <[email protected]>
Signed-off-by: Richard Weinberger <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request May 24, 2021
[ Upstream commit d5027ca ]

Ritesh reported a bug [1] against UML, noting that it crashed on
startup. The backtrace shows the following (heavily redacted):

(gdb) bt
...
 #26 0x0000000060015b5d in sem_init () at ipc/sem.c:268
 #27 0x00007f89906d92f7 in ?? () from /lib/x86_64-linux-gnu/libcom_err.so.2
 #28 0x00007f8990ab8fb2 in call_init (...) at dl-init.c:72
...
 #40 0x00007f89909bf3a6 in nss_load_library (...) at nsswitch.c:359
...
 #44 0x00007f8990895e35 in _nss_compat_getgrnam_r (...) at nss_compat/compat-grp.c:486
 #45 0x00007f8990968b85 in __getgrnam_r [...]
 #46 0x00007f89909d6b77 in grantpt [...]
 #47 0x00007f8990a9394e in __GI_openpty [...]
 #48 0x00000000604a1f65 in openpty_cb (...) at arch/um/os-Linux/sigio.c:407
 #49 0x00000000604a58d0 in start_idle_thread (...) at arch/um/os-Linux/skas/process.c:598
 #50 0x0000000060004a3d in start_uml () at arch/um/kernel/skas/process.c:45
 #51 0x00000000600047b2 in linux_main (...) at arch/um/kernel/um_arch.c:334
 #52 0x000000006000574f in main (...) at arch/um/os-Linux/main.c:144

indicating that the UML function openpty_cb() calls openpty(),
which internally calls __getgrnam_r(), which causes the nsswitch
machinery to get started.

This loads, through lots of indirection that I snipped, the
libcom_err.so.2 library, which (in an unknown function, "??")
calls sem_init().

Now, of course it wants to get libpthread's sem_init(), since
it's linked against libpthread. However, the dynamic linker
looks up that symbol against the binary first, and gets the
kernel's sem_init().

Hajime Tazaki noted that "objcopy -L" can localize a symbol,
so the dynamic linker wouldn't do the lookup this way. I tried,
but for some reason that didn't seem to work.

Doing the same thing in the linker script instead does seem to
work, though I cannot entirely explain - it *also* works if I
just add "VERSION { { global: *; }; }" instead, indicating that
something else is happening that I don't really understand. It
may be that explicitly doing that marks them with some kind of
empty version, and that's different from the default.

Explicitly marking them with a version breaks kallsyms, so that
doesn't seem to be possible.

Marking all the symbols as local seems correct, and does seem
to address the issue, so do that. Also do it for static link,
nsswitch libraries could still be loaded there.

[1] https://bugs.debian.org/983379

Reported-by: Ritesh Raj Sarraf <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
Acked-By: Anton Ivanov <[email protected]>
Tested-By: Ritesh Raj Sarraf <[email protected]>
Signed-off-by: Richard Weinberger <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request May 29, 2021
[ Upstream commit d5027ca ]

Ritesh reported a bug [1] against UML, noting that it crashed on
startup. The backtrace shows the following (heavily redacted):

(gdb) bt
...
 #26 0x0000000060015b5d in sem_init () at ipc/sem.c:268
 #27 0x00007f89906d92f7 in ?? () from /lib/x86_64-linux-gnu/libcom_err.so.2
 #28 0x00007f8990ab8fb2 in call_init (...) at dl-init.c:72
...
 #40 0x00007f89909bf3a6 in nss_load_library (...) at nsswitch.c:359
...
 #44 0x00007f8990895e35 in _nss_compat_getgrnam_r (...) at nss_compat/compat-grp.c:486
 #45 0x00007f8990968b85 in __getgrnam_r [...]
 #46 0x00007f89909d6b77 in grantpt [...]
 #47 0x00007f8990a9394e in __GI_openpty [...]
 #48 0x00000000604a1f65 in openpty_cb (...) at arch/um/os-Linux/sigio.c:407
 #49 0x00000000604a58d0 in start_idle_thread (...) at arch/um/os-Linux/skas/process.c:598
 #50 0x0000000060004a3d in start_uml () at arch/um/kernel/skas/process.c:45
 #51 0x00000000600047b2 in linux_main (...) at arch/um/kernel/um_arch.c:334
 #52 0x000000006000574f in main (...) at arch/um/os-Linux/main.c:144

indicating that the UML function openpty_cb() calls openpty(),
which internally calls __getgrnam_r(), which causes the nsswitch
machinery to get started.

This loads, through lots of indirection that I snipped, the
libcom_err.so.2 library, which (in an unknown function, "??")
calls sem_init().

Now, of course it wants to get libpthread's sem_init(), since
it's linked against libpthread. However, the dynamic linker
looks up that symbol against the binary first, and gets the
kernel's sem_init().

Hajime Tazaki noted that "objcopy -L" can localize a symbol,
so the dynamic linker wouldn't do the lookup this way. I tried,
but for some reason that didn't seem to work.

Doing the same thing in the linker script instead does seem to
work, though I cannot entirely explain - it *also* works if I
just add "VERSION { { global: *; }; }" instead, indicating that
something else is happening that I don't really understand. It
may be that explicitly doing that marks them with some kind of
empty version, and that's different from the default.

Explicitly marking them with a version breaks kallsyms, so that
doesn't seem to be possible.

Marking all the symbols as local seems correct, and does seem
to address the issue, so do that. Also do it for static link,
nsswitch libraries could still be loaded there.

[1] https://bugs.debian.org/983379

Reported-by: Ritesh Raj Sarraf <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
Acked-By: Anton Ivanov <[email protected]>
Tested-By: Ritesh Raj Sarraf <[email protected]>
Signed-off-by: Richard Weinberger <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Jul 20, 2021
[ Upstream commit f86625a ]

Make sure mt7921_pm_wake_work wouldn't be scheduled after the driver is
in suspend mode to fix the following the kernel crash.

[ 3515.390012] mt7921e 0000:01:00.0: calling pci_pm_suspend+0x0/0x22c @ 2869, parent: 0000:00:00.0
[ 3515.390015] mt7921e 0000:01:00.0: mt7921_pci_suspend +
[ 3515.396395] anx7625 3-0058: anx7625_suspend+0x0/0x6c returned 0 after 0 usecs
[ 3515.405965] mt7921e 0000:01:00.0: mt7921_pci_suspend -
[ 3515.411336] usb 1-1.4: usb_dev_suspend+0x0/0x2c returned 0 after 1 usecs
[ 3515.411513] SError Interrupt on CPU7, code 0xbe000011 -- SError
[ 3515.411515] CPU: 7 PID: 2849 Comm: kworker/u16:27 Not tainted 5.4.114 #44
[ 3515.411516] Hardware name: MediaTek Asurada rev1 board (DT)
[ 3515.411517] Workqueue: mt76 mt7921_pm_wake_work [mt7921e]
[ 3515.411518] pstate: 80c00009 (Nzcv daif +PAN +UAO)
[ 3515.411519] pc : mt76_mmio_rr+0x30/0xf0 [mt76]
[ 3515.411520] lr : mt7921_rr+0x38/0x44 [mt7921e]
[ 3515.411520] sp : ffffffc015813c50
[ 3515.411521] x29: ffffffc015813c50 x28: 0000000000000402
[ 3515.411522] x27: ffffffe5a2012138 x26: ffffffe5a1eea018
[ 3515.411524] x25: 00000000328be505 x24: 00000000000a0002
[ 3515.411525] x23: 0000000000000006 x22: ffffffbd29b7a300
[ 3515.411527] x21: ffffffbd29b7a300 x20: 00000000000e0010
[ 3515.411528] x19: 00000000eac08f43 x18: 0000000000000000
[ 3515.411529] x17: 0000000000000000 x16: ffffffe5a16b2914
[ 3515.411531] x15: 0000000000000010 x14: 0000000000000010
[ 3515.411532] x13: 00000000003dd3a2 x12: 0000000000010000
[ 3515.411533] x11: ffffffe597abec14 x10: 0000000000000010
[ 3515.411535] x9 : ffffffe597abeba8 x8 : ffffffc013ce0010
[ 3515.411536] x7 : 000000b2b5593519 x6 : 0000000000300000
[ 3515.411537] x5 : 0000000000000000 x4 : 0000000000000032
[ 3515.411539] x3 : 0000000000000000 x2 : 0000000000000004
[ 3515.411540] x1 : 00000000000e0010 x0 : ffffffbd29b7a300
[ 3515.411542] Kernel panic - not syncing: Asynchronous SError Interrupt
[ 3515.411543] CPU: 7 PID: 2849 Comm: kworker/u16:27 Not tainted 5.4.114 #44
[ 3515.411544] Hardware name: MediaTek Asurada rev1 board (DT)
[ 3515.411544] Workqueue: mt76 mt7921_pm_wake_work [mt7921e]
[ 3515.411545] Call trace:
[ 3515.411546]  dump_backtrace+0x0/0x14c
[ 3515.411546]  show_stack+0x20/0x2c
[ 3515.411547]  dump_stack+0xa0/0xfc
[ 3515.411548]  panic+0x154/0x350
[ 3515.411548]  panic+0x0/0x350
[ 3515.411549]  arm64_serror_panic+0x78/0x84
[ 3515.411550]  do_serror+0x0/0x118
[ 3515.411550]  do_serror+0xa4/0x118
[ 3515.411551]  el1_error+0x84/0xf8
[ 3515.411552]  mt76_mmio_rr+0x30/0xf0 [mt76]
[ 3515.411552]  mt7921_rr+0x38/0x44 [mt7921e]
[ 3515.411553]  __mt76_poll_msec+0x5c/0x9c [mt76]
[ 3515.411554]  __mt7921_mcu_drv_pmctrl+0x50/0x94 [mt7921e]
[ 3515.411555]  mt7921_mcu_drv_pmctrl+0x38/0xb0 [mt7921e]
[ 3515.411555]  mt7921_pm_wake_work+0x34/0xd4 [mt7921e]
[ 3515.411556]  process_one_work+0x208/0x3c8
[ 3515.411557]  worker_thread+0x23c/0x3e8
[ 3515.411557]  kthread+0x144/0x178
[ 3515.411558]  ret_from_fork+0x10/0x18
[ 3515.418831] SMP: stopping secondary CPUs
[ 3515.418832] Kernel Offset: 0x2590c00000 from 0xffffffc010000000
[ 3515.418832] PHYS_OFFSET: 0xffffffc400000000
[ 3515.418833] CPU features: 0x080026,2a80aa18
[ 3515.418834] Memory Limit: none
[DL] 00000000 00000000 010701

Fixes: 1d8efc7 ("mt76: mt7921: introduce Runtime PM support")
Co-developed-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: Sean Wang <[email protected]>
Signed-off-by: Felix Fietkau <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
kitakar5525 pushed a commit to kitakar5525/linux-kernel that referenced this pull request Jul 26, 2021
These two types of XDP progs (BPF_XDP_DEVMAP, BPF_XDP_CPUMAP) will not be
executed directly in the driver, therefore we should also not directly
run them from here. To run in these two situations, there must be further
preparations done, otherwise these may cause a kernel panic.

For more details, see also dev_xdp_attach().

  [   46.982479] BUG: kernel NULL pointer dereference, address: 0000000000000000
  [   46.984295] #PF: supervisor read access in kernel mode
  [   46.985777] #PF: error_code(0x0000) - not-present page
  [   46.987227] PGD 800000010dca4067 P4D 800000010dca4067 PUD 10dca6067 PMD 0
  [   46.989201] Oops: 0000 [#1] SMP PTI
  [   46.990304] CPU: 7 PID: 562 Comm: a.out Not tainted 5.13.0+ linux-surface#44
  [   46.992001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/24
  [   46.995113] RIP: 0010:___bpf_prog_run+0x17b/0x1710
  [   46.996586] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02
  [   47.001562] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246
  [   47.003115] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000
  [   47.005163] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98
  [   47.007135] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff
  [   47.009171] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98
  [   47.011172] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8
  [   47.013244] FS:  00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000
  [   47.015705] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   47.017475] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0
  [   47.019558] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [   47.021595] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [   47.023574] PKRU: 55555554
  [   47.024571] Call Trace:
  [   47.025424]  __bpf_prog_run32+0x32/0x50
  [   47.026296]  ? printk+0x53/0x6a
  [   47.027066]  ? ktime_get+0x39/0x90
  [   47.027895]  bpf_test_run.cold.28+0x23/0x123
  [   47.028866]  ? printk+0x53/0x6a
  [   47.029630]  bpf_prog_test_run_xdp+0x149/0x1d0
  [   47.030649]  __sys_bpf+0x1305/0x23d0
  [   47.031482]  __x64_sys_bpf+0x17/0x20
  [   47.032316]  do_syscall_64+0x3a/0x80
  [   47.033165]  entry_SYSCALL_64_after_hwframe+0x44/0xae
  [   47.034254] RIP: 0033:0x7f04a51364dd
  [   47.035133] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 48
  [   47.038768] RSP: 002b:00007fff8f9fc518 EFLAGS: 00000213 ORIG_RAX: 0000000000000141
  [   47.040344] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f04a51364dd
  [   47.041749] RDX: 0000000000000048 RSI: 0000000020002a80 RDI: 000000000000000a
  [   47.043171] RBP: 00007fff8f9fc530 R08: 0000000002049300 R09: 0000000020000100
  [   47.044626] R10: 0000000000000004 R11: 0000000000000213 R12: 0000000000401070
  [   47.046088] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
  [   47.047579] Modules linked in:
  [   47.048318] CR2: 0000000000000000
  [   47.049120] ---[ end trace 7ad34443d5be719a ]---
  [   47.050273] RIP: 0010:___bpf_prog_run+0x17b/0x1710
  [   47.051343] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02
  [   47.054943] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246
  [   47.056068] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000
  [   47.057522] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98
  [   47.058961] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff
  [   47.060390] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98
  [   47.061803] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8
  [   47.063249] FS:  00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000
  [   47.065070] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   47.066307] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0
  [   47.067747] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [   47.069217] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [   47.070652] PKRU: 55555554
  [   47.071318] Kernel panic - not syncing: Fatal exception
  [   47.072854] Kernel Offset: disabled
  [   47.073683] ---[ end Kernel panic - not syncing: Fatal exception ]---

Fixes: 9216477 ("bpf: cpumap: Add the possibility to attach an eBPF program to cpumap")
Fixes: fbee97f ("bpf: Add support to attach bpf program to a devmap entry")
Reported-by: Abaci <[email protected]>
Signed-off-by: Xuan Zhuo <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Dust Li <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Acked-by: David Ahern <[email protected]>
Acked-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
qzed pushed a commit that referenced this pull request Aug 1, 2021
[ Upstream commit 5e21bb4 ]

These two types of XDP progs (BPF_XDP_DEVMAP, BPF_XDP_CPUMAP) will not be
executed directly in the driver, therefore we should also not directly
run them from here. To run in these two situations, there must be further
preparations done, otherwise these may cause a kernel panic.

For more details, see also dev_xdp_attach().

  [   46.982479] BUG: kernel NULL pointer dereference, address: 0000000000000000
  [   46.984295] #PF: supervisor read access in kernel mode
  [   46.985777] #PF: error_code(0x0000) - not-present page
  [   46.987227] PGD 800000010dca4067 P4D 800000010dca4067 PUD 10dca6067 PMD 0
  [   46.989201] Oops: 0000 [#1] SMP PTI
  [   46.990304] CPU: 7 PID: 562 Comm: a.out Not tainted 5.13.0+ #44
  [   46.992001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/24
  [   46.995113] RIP: 0010:___bpf_prog_run+0x17b/0x1710
  [   46.996586] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02
  [   47.001562] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246
  [   47.003115] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000
  [   47.005163] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98
  [   47.007135] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff
  [   47.009171] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98
  [   47.011172] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8
  [   47.013244] FS:  00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000
  [   47.015705] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   47.017475] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0
  [   47.019558] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [   47.021595] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [   47.023574] PKRU: 55555554
  [   47.024571] Call Trace:
  [   47.025424]  __bpf_prog_run32+0x32/0x50
  [   47.026296]  ? printk+0x53/0x6a
  [   47.027066]  ? ktime_get+0x39/0x90
  [   47.027895]  bpf_test_run.cold.28+0x23/0x123
  [   47.028866]  ? printk+0x53/0x6a
  [   47.029630]  bpf_prog_test_run_xdp+0x149/0x1d0
  [   47.030649]  __sys_bpf+0x1305/0x23d0
  [   47.031482]  __x64_sys_bpf+0x17/0x20
  [   47.032316]  do_syscall_64+0x3a/0x80
  [   47.033165]  entry_SYSCALL_64_after_hwframe+0x44/0xae
  [   47.034254] RIP: 0033:0x7f04a51364dd
  [   47.035133] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 48
  [   47.038768] RSP: 002b:00007fff8f9fc518 EFLAGS: 00000213 ORIG_RAX: 0000000000000141
  [   47.040344] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f04a51364dd
  [   47.041749] RDX: 0000000000000048 RSI: 0000000020002a80 RDI: 000000000000000a
  [   47.043171] RBP: 00007fff8f9fc530 R08: 0000000002049300 R09: 0000000020000100
  [   47.044626] R10: 0000000000000004 R11: 0000000000000213 R12: 0000000000401070
  [   47.046088] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
  [   47.047579] Modules linked in:
  [   47.048318] CR2: 0000000000000000
  [   47.049120] ---[ end trace 7ad34443d5be719a ]---
  [   47.050273] RIP: 0010:___bpf_prog_run+0x17b/0x1710
  [   47.051343] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02
  [   47.054943] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246
  [   47.056068] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000
  [   47.057522] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98
  [   47.058961] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff
  [   47.060390] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98
  [   47.061803] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8
  [   47.063249] FS:  00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000
  [   47.065070] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   47.066307] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0
  [   47.067747] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [   47.069217] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [   47.070652] PKRU: 55555554
  [   47.071318] Kernel panic - not syncing: Fatal exception
  [   47.072854] Kernel Offset: disabled
  [   47.073683] ---[ end Kernel panic - not syncing: Fatal exception ]---

Fixes: 9216477 ("bpf: cpumap: Add the possibility to attach an eBPF program to cpumap")
Fixes: fbee97f ("bpf: Add support to attach bpf program to a devmap entry")
Reported-by: Abaci <[email protected]>
Signed-off-by: Xuan Zhuo <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Dust Li <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Acked-by: David Ahern <[email protected]>
Acked-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Aug 1, 2021
[ Upstream commit 5e21bb4 ]

These two types of XDP progs (BPF_XDP_DEVMAP, BPF_XDP_CPUMAP) will not be
executed directly in the driver, therefore we should also not directly
run them from here. To run in these two situations, there must be further
preparations done, otherwise these may cause a kernel panic.

For more details, see also dev_xdp_attach().

  [   46.982479] BUG: kernel NULL pointer dereference, address: 0000000000000000
  [   46.984295] #PF: supervisor read access in kernel mode
  [   46.985777] #PF: error_code(0x0000) - not-present page
  [   46.987227] PGD 800000010dca4067 P4D 800000010dca4067 PUD 10dca6067 PMD 0
  [   46.989201] Oops: 0000 [#1] SMP PTI
  [   46.990304] CPU: 7 PID: 562 Comm: a.out Not tainted 5.13.0+ #44
  [   46.992001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/24
  [   46.995113] RIP: 0010:___bpf_prog_run+0x17b/0x1710
  [   46.996586] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02
  [   47.001562] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246
  [   47.003115] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000
  [   47.005163] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98
  [   47.007135] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff
  [   47.009171] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98
  [   47.011172] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8
  [   47.013244] FS:  00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000
  [   47.015705] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   47.017475] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0
  [   47.019558] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [   47.021595] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [   47.023574] PKRU: 55555554
  [   47.024571] Call Trace:
  [   47.025424]  __bpf_prog_run32+0x32/0x50
  [   47.026296]  ? printk+0x53/0x6a
  [   47.027066]  ? ktime_get+0x39/0x90
  [   47.027895]  bpf_test_run.cold.28+0x23/0x123
  [   47.028866]  ? printk+0x53/0x6a
  [   47.029630]  bpf_prog_test_run_xdp+0x149/0x1d0
  [   47.030649]  __sys_bpf+0x1305/0x23d0
  [   47.031482]  __x64_sys_bpf+0x17/0x20
  [   47.032316]  do_syscall_64+0x3a/0x80
  [   47.033165]  entry_SYSCALL_64_after_hwframe+0x44/0xae
  [   47.034254] RIP: 0033:0x7f04a51364dd
  [   47.035133] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 48
  [   47.038768] RSP: 002b:00007fff8f9fc518 EFLAGS: 00000213 ORIG_RAX: 0000000000000141
  [   47.040344] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f04a51364dd
  [   47.041749] RDX: 0000000000000048 RSI: 0000000020002a80 RDI: 000000000000000a
  [   47.043171] RBP: 00007fff8f9fc530 R08: 0000000002049300 R09: 0000000020000100
  [   47.044626] R10: 0000000000000004 R11: 0000000000000213 R12: 0000000000401070
  [   47.046088] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
  [   47.047579] Modules linked in:
  [   47.048318] CR2: 0000000000000000
  [   47.049120] ---[ end trace 7ad34443d5be719a ]---
  [   47.050273] RIP: 0010:___bpf_prog_run+0x17b/0x1710
  [   47.051343] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02
  [   47.054943] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246
  [   47.056068] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000
  [   47.057522] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98
  [   47.058961] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff
  [   47.060390] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98
  [   47.061803] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8
  [   47.063249] FS:  00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000
  [   47.065070] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   47.066307] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0
  [   47.067747] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [   47.069217] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [   47.070652] PKRU: 55555554
  [   47.071318] Kernel panic - not syncing: Fatal exception
  [   47.072854] Kernel Offset: disabled
  [   47.073683] ---[ end Kernel panic - not syncing: Fatal exception ]---

Fixes: 9216477 ("bpf: cpumap: Add the possibility to attach an eBPF program to cpumap")
Fixes: fbee97f ("bpf: Add support to attach bpf program to a devmap entry")
Reported-by: Abaci <[email protected]>
Signed-off-by: Xuan Zhuo <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Dust Li <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Acked-by: David Ahern <[email protected]>
Acked-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Feb 3, 2022
[ Upstream commit c1020d3 ]

On an arm64 platform with the Spectrum ASIC, after loading and executing
a new kernel via kexec, the following trace [1] is observed. This seems
to be caused by the fact that the device is not properly shutdown before
executing the new kernel.

Fix this by implementing a shutdown method which mirrors the remove
method, as recommended by the kexec maintainer [2][3].

[1]
BUG: Bad page state in process devlink pfn:22f73d
page:fffffe00089dcf40 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0x2ffff00000000000()
raw: 2ffff00000000000 0000000000000000 ffffffff089d0201 0000000000000000
raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
page dumped because: nonzero _refcount
Modules linked in:
CPU: 1 PID: 16346 Comm: devlink Tainted: G B 5.8.0-rc6-custom-273020-gac6b365b1bf5 #44
Hardware name: Marvell Armada 7040 TX4810M (DT)
Call trace:
 dump_backtrace+0x0/0x1d0
 show_stack+0x1c/0x28
 dump_stack+0xbc/0x118
 bad_page+0xcc/0xf8
 check_free_page_bad+0x80/0x88
 __free_pages_ok+0x3f8/0x418
 __free_pages+0x38/0x60
 kmem_freepages+0x200/0x2a8
 slab_destroy+0x28/0x68
 slabs_destroy+0x60/0x90
 ___cache_free+0x1b4/0x358
 kfree+0xc0/0x1d0
 skb_free_head+0x2c/0x38
 skb_release_data+0x110/0x1a0
 skb_release_all+0x2c/0x38
 consume_skb+0x38/0x130
 __dev_kfree_skb_any+0x44/0x50
 mlxsw_pci_rdq_fini+0x8c/0xb0
 mlxsw_pci_queue_fini.isra.0+0x28/0x58
 mlxsw_pci_queue_group_fini+0x58/0x88
 mlxsw_pci_aqs_fini+0x2c/0x60
 mlxsw_pci_fini+0x34/0x50
 mlxsw_core_bus_device_unregister+0x104/0x1d0
 mlxsw_devlink_core_bus_device_reload_down+0x2c/0x48
 devlink_reload+0x44/0x158
 devlink_nl_cmd_reload+0x270/0x290
 genl_rcv_msg+0x188/0x2f0
 netlink_rcv_skb+0x5c/0x118
 genl_rcv+0x3c/0x50
 netlink_unicast+0x1bc/0x278
 netlink_sendmsg+0x194/0x390
 __sys_sendto+0xe0/0x158
 __arm64_sys_sendto+0x2c/0x38
 el0_svc_common.constprop.0+0x70/0x168
 do_el0_svc+0x28/0x88
 el0_sync_handler+0x88/0x190
 el0_sync+0x140/0x180

[2]
https://www.mail-archive.com/[email protected]/msg1195432.html

[3]
https://patchwork.kernel.org/project/linux-scsi/patch/[email protected]/#20116693

Cc: Eric Biederman <[email protected]>
Signed-off-by: Danielle Ratson <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Jul 21, 2022
The return thunk call makes the fastop functions larger, just like IBT
does. Consider a 16-byte FASTOP_SIZE when CONFIG_RETHUNK is enabled.

Otherwise, functions will be incorrectly aligned and when computing their
position for differently sized operators, they will executed in the middle
or end of a function, which may as well be an int3, leading to a crash
like:

[   36.091116] int3: 0000 [#1] SMP NOPTI
[   36.091119] CPU: 3 PID: 1371 Comm: qemu-system-x86 Not tainted 5.15.0-41-generic #44
[   36.091120] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[   36.091121] RIP: 0010:xaddw_ax_dx+0x9/0x10 [kvm]
[   36.091185] Code: 00 0f bb d0 c3 cc cc cc cc 48 0f bb d0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 0f c0 d0 c3 cc cc cc cc 66 0f c1 d0 c3 cc cc cc cc <0f> 1f 80 00 00 00 00 0f c1 d0 c3 cc cc cc cc 48 0f c1 d0 c3 cc cc
[   36.091186] RSP: 0018:ffffb1f541143c98 EFLAGS: 00000202
[   36.091188] RAX: 0000000089abcdef RBX: 0000000000000001 RCX: 0000000000000000
[   36.091188] RDX: 0000000076543210 RSI: ffffffffc073c6d0 RDI: 0000000000000200
[   36.091189] RBP: ffffb1f541143ca0 R08: ffff9f1803350a70 R09: 0000000000000002
[   36.091190] R10: ffff9f1803350a70 R11: 0000000000000000 R12: ffff9f1803350a70
[   36.091190] R13: ffffffffc077fee0 R14: 0000000000000000 R15: 0000000000000000
[   36.091191] FS:  00007efdfce8d640(0000) GS:ffff9f187dd80000(0000) knlGS:0000000000000000
[   36.091192] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.091192] CR2: 0000000000000000 CR3: 0000000009b62002 CR4: 0000000000772ee0
[   36.091195] PKRU: 55555554
[   36.091195] Call Trace:
[   36.091197]  <TASK>
[   36.091198]  ? fastop+0x5a/0xa0 [kvm]
[   36.091222]  x86_emulate_insn+0x7b8/0xe90 [kvm]
[   36.091244]  x86_emulate_instruction+0x2f4/0x630 [kvm]
[   36.091263]  ? kvm_arch_vcpu_load+0x7c/0x230 [kvm]
[   36.091283]  ? vmx_prepare_switch_to_host+0xf7/0x190 [kvm_intel]
[   36.091290]  complete_emulated_mmio+0x297/0x320 [kvm]
[   36.091310]  kvm_arch_vcpu_ioctl_run+0x32f/0x550 [kvm]
[   36.091330]  kvm_vcpu_ioctl+0x29e/0x6d0 [kvm]
[   36.091344]  ? kvm_vcpu_ioctl+0x120/0x6d0 [kvm]
[   36.091357]  ? __fget_files+0x86/0xc0
[   36.091362]  ? __fget_files+0x86/0xc0
[   36.091363]  __x64_sys_ioctl+0x92/0xd0
[   36.091366]  do_syscall_64+0x59/0xc0
[   36.091369]  ? syscall_exit_to_user_mode+0x27/0x50
[   36.091370]  ? do_syscall_64+0x69/0xc0
[   36.091371]  ? syscall_exit_to_user_mode+0x27/0x50
[   36.091372]  ? __x64_sys_writev+0x1c/0x30
[   36.091374]  ? do_syscall_64+0x69/0xc0
[   36.091374]  ? exit_to_user_mode_prepare+0x37/0xb0
[   36.091378]  ? syscall_exit_to_user_mode+0x27/0x50
[   36.091379]  ? do_syscall_64+0x69/0xc0
[   36.091379]  ? do_syscall_64+0x69/0xc0
[   36.091380]  ? do_syscall_64+0x69/0xc0
[   36.091381]  ? do_syscall_64+0x69/0xc0
[   36.091381]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[   36.091384] RIP: 0033:0x7efdfe6d1aff
[   36.091390] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
[   36.091391] RSP: 002b:00007efdfce8c460 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   36.091393] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007efdfe6d1aff
[   36.091393] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000c
[   36.091394] RBP: 0000558f1609e220 R08: 0000558f13fb8190 R09: 00000000ffffffff
[   36.091394] R10: 0000558f16b5e950 R11: 0000000000000246 R12: 0000000000000000
[   36.091394] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
[   36.091396]  </TASK>
[   36.091397] Modules linked in: isofs nls_iso8859_1 kvm_intel joydev kvm input_leds serio_raw sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_devintf ipmi_msghandler drm msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net net_failover crypto_simd ahci xhci_pci cryptd psmouse virtio_blk libahci xhci_pci_renesas failover
[   36.123271] ---[ end trace db3c0ab5a48fabcc ]---
[   36.123272] RIP: 0010:xaddw_ax_dx+0x9/0x10 [kvm]
[   36.123319] Code: 00 0f bb d0 c3 cc cc cc cc 48 0f bb d0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 0f c0 d0 c3 cc cc cc cc 66 0f c1 d0 c3 cc cc cc cc <0f> 1f 80 00 00 00 00 0f c1 d0 c3 cc cc cc cc 48 0f c1 d0 c3 cc cc
[   36.123320] RSP: 0018:ffffb1f541143c98 EFLAGS: 00000202
[   36.123321] RAX: 0000000089abcdef RBX: 0000000000000001 RCX: 0000000000000000
[   36.123321] RDX: 0000000076543210 RSI: ffffffffc073c6d0 RDI: 0000000000000200
[   36.123322] RBP: ffffb1f541143ca0 R08: ffff9f1803350a70 R09: 0000000000000002
[   36.123322] R10: ffff9f1803350a70 R11: 0000000000000000 R12: ffff9f1803350a70
[   36.123323] R13: ffffffffc077fee0 R14: 0000000000000000 R15: 0000000000000000
[   36.123323] FS:  00007efdfce8d640(0000) GS:ffff9f187dd80000(0000) knlGS:0000000000000000
[   36.123324] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.123325] CR2: 0000000000000000 CR3: 0000000009b62002 CR4: 0000000000772ee0
[   36.123327] PKRU: 55555554
[   36.123328] Kernel panic - not syncing: Fatal exception in interrupt
[   36.123410] Kernel Offset: 0x1400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   36.135305] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Fixes: aa3d480 ("x86: Use return-thunk in asm code")
Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Co-developed-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Reported-by: Linux Kernel Functional Testing <[email protected]>
Message-Id: <[email protected]>
Tested-by: Jack Wang <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
qzed pushed a commit that referenced this pull request Aug 4, 2022
commit 84e7051 upstream.

The return thunk call makes the fastop functions larger, just like IBT
does. Consider a 16-byte FASTOP_SIZE when CONFIG_RETHUNK is enabled.

Otherwise, functions will be incorrectly aligned and when computing their
position for differently sized operators, they will executed in the middle
or end of a function, which may as well be an int3, leading to a crash
like:

[   36.091116] int3: 0000 [#1] SMP NOPTI
[   36.091119] CPU: 3 PID: 1371 Comm: qemu-system-x86 Not tainted 5.15.0-41-generic #44
[   36.091120] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[   36.091121] RIP: 0010:xaddw_ax_dx+0x9/0x10 [kvm]
[   36.091185] Code: 00 0f bb d0 c3 cc cc cc cc 48 0f bb d0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 0f c0 d0 c3 cc cc cc cc 66 0f c1 d0 c3 cc cc cc cc <0f> 1f 80 00 00 00 00 0f c1 d0 c3 cc cc cc cc 48 0f c1 d0 c3 cc cc
[   36.091186] RSP: 0018:ffffb1f541143c98 EFLAGS: 00000202
[   36.091188] RAX: 0000000089abcdef RBX: 0000000000000001 RCX: 0000000000000000
[   36.091188] RDX: 0000000076543210 RSI: ffffffffc073c6d0 RDI: 0000000000000200
[   36.091189] RBP: ffffb1f541143ca0 R08: ffff9f1803350a70 R09: 0000000000000002
[   36.091190] R10: ffff9f1803350a70 R11: 0000000000000000 R12: ffff9f1803350a70
[   36.091190] R13: ffffffffc077fee0 R14: 0000000000000000 R15: 0000000000000000
[   36.091191] FS:  00007efdfce8d640(0000) GS:ffff9f187dd80000(0000) knlGS:0000000000000000
[   36.091192] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.091192] CR2: 0000000000000000 CR3: 0000000009b62002 CR4: 0000000000772ee0
[   36.091195] PKRU: 55555554
[   36.091195] Call Trace:
[   36.091197]  <TASK>
[   36.091198]  ? fastop+0x5a/0xa0 [kvm]
[   36.091222]  x86_emulate_insn+0x7b8/0xe90 [kvm]
[   36.091244]  x86_emulate_instruction+0x2f4/0x630 [kvm]
[   36.091263]  ? kvm_arch_vcpu_load+0x7c/0x230 [kvm]
[   36.091283]  ? vmx_prepare_switch_to_host+0xf7/0x190 [kvm_intel]
[   36.091290]  complete_emulated_mmio+0x297/0x320 [kvm]
[   36.091310]  kvm_arch_vcpu_ioctl_run+0x32f/0x550 [kvm]
[   36.091330]  kvm_vcpu_ioctl+0x29e/0x6d0 [kvm]
[   36.091344]  ? kvm_vcpu_ioctl+0x120/0x6d0 [kvm]
[   36.091357]  ? __fget_files+0x86/0xc0
[   36.091362]  ? __fget_files+0x86/0xc0
[   36.091363]  __x64_sys_ioctl+0x92/0xd0
[   36.091366]  do_syscall_64+0x59/0xc0
[   36.091369]  ? syscall_exit_to_user_mode+0x27/0x50
[   36.091370]  ? do_syscall_64+0x69/0xc0
[   36.091371]  ? syscall_exit_to_user_mode+0x27/0x50
[   36.091372]  ? __x64_sys_writev+0x1c/0x30
[   36.091374]  ? do_syscall_64+0x69/0xc0
[   36.091374]  ? exit_to_user_mode_prepare+0x37/0xb0
[   36.091378]  ? syscall_exit_to_user_mode+0x27/0x50
[   36.091379]  ? do_syscall_64+0x69/0xc0
[   36.091379]  ? do_syscall_64+0x69/0xc0
[   36.091380]  ? do_syscall_64+0x69/0xc0
[   36.091381]  ? do_syscall_64+0x69/0xc0
[   36.091381]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[   36.091384] RIP: 0033:0x7efdfe6d1aff
[   36.091390] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
[   36.091391] RSP: 002b:00007efdfce8c460 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   36.091393] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007efdfe6d1aff
[   36.091393] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000c
[   36.091394] RBP: 0000558f1609e220 R08: 0000558f13fb8190 R09: 00000000ffffffff
[   36.091394] R10: 0000558f16b5e950 R11: 0000000000000246 R12: 0000000000000000
[   36.091394] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
[   36.091396]  </TASK>
[   36.091397] Modules linked in: isofs nls_iso8859_1 kvm_intel joydev kvm input_leds serio_raw sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_devintf ipmi_msghandler drm msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net net_failover crypto_simd ahci xhci_pci cryptd psmouse virtio_blk libahci xhci_pci_renesas failover
[   36.123271] ---[ end trace db3c0ab5a48fabcc ]---
[   36.123272] RIP: 0010:xaddw_ax_dx+0x9/0x10 [kvm]
[   36.123319] Code: 00 0f bb d0 c3 cc cc cc cc 48 0f bb d0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 0f c0 d0 c3 cc cc cc cc 66 0f c1 d0 c3 cc cc cc cc <0f> 1f 80 00 00 00 00 0f c1 d0 c3 cc cc cc cc 48 0f c1 d0 c3 cc cc
[   36.123320] RSP: 0018:ffffb1f541143c98 EFLAGS: 00000202
[   36.123321] RAX: 0000000089abcdef RBX: 0000000000000001 RCX: 0000000000000000
[   36.123321] RDX: 0000000076543210 RSI: ffffffffc073c6d0 RDI: 0000000000000200
[   36.123322] RBP: ffffb1f541143ca0 R08: ffff9f1803350a70 R09: 0000000000000002
[   36.123322] R10: ffff9f1803350a70 R11: 0000000000000000 R12: ffff9f1803350a70
[   36.123323] R13: ffffffffc077fee0 R14: 0000000000000000 R15: 0000000000000000
[   36.123323] FS:  00007efdfce8d640(0000) GS:ffff9f187dd80000(0000) knlGS:0000000000000000
[   36.123324] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.123325] CR2: 0000000000000000 CR3: 0000000009b62002 CR4: 0000000000772ee0
[   36.123327] PKRU: 55555554
[   36.123328] Kernel panic - not syncing: Fatal exception in interrupt
[   36.123410] Kernel Offset: 0x1400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   36.135305] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Fixes: aa3d480 ("x86: Use return-thunk in asm code")
Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Co-developed-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Reported-by: Linux Kernel Functional Testing <[email protected]>
Message-Id: <[email protected]>
Tested-by: Jack Wang <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
qzed pushed a commit that referenced this pull request Oct 14, 2022
[ Upstream commit 37fafe6 ]

If a request fails at virtio_queue_rqs(), it is inserted to requeue_list
and passed to virtio_queue_rq(). Then blk_mq_start_request() can be called
again at virtio_queue_rq() and trigger WARN_ON_ONCE like below trace because
request state was already set to MQ_RQ_IN_FLIGHT in virtio_queue_rqs()
despite the failure.

[    1.890468] ------------[ cut here ]------------
[    1.890776] WARNING: CPU: 2 PID: 122 at block/blk-mq.c:1143
blk_mq_start_request+0x8a/0xe0
[    1.891045] Modules linked in:
[    1.891250] CPU: 2 PID: 122 Comm: journal-offline Not tainted 5.19.0+ #44
[    1.891504] Hardware name: ChromiumOS crosvm, BIOS 0
[    1.891739] RIP: 0010:blk_mq_start_request+0x8a/0xe0
[    1.891961] Code: 12 80 74 22 48 8b 4b 10 8b 89 64 01 00 00 8b 53
20 83 fa ff 75 08 ba 00 00 00 80 0b 53 24 c1 e1 10 09 d1 89 48 34 5b
41 5e c3 <0f> 0b eb b8 65 8b 05 2b 39 b6 7e 89 c0 48 0f a3 05 39 77 5b
01 0f
[    1.892443] RSP: 0018:ffffc900002777b0 EFLAGS: 00010202
[    1.892673] RAX: 0000000000000000 RBX: ffff888004bc0000 RCX: 0000000000000000
[    1.892952] RDX: 0000000000000000 RSI: ffff888003d7c200 RDI: ffff888004bc0000
[    1.893228] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff888004bc0100
[    1.893506] R10: ffffffffffffffff R11: ffffffff8185ca10 R12: ffff888004bc0000
[    1.893797] R13: ffffc90000277900 R14: ffff888004ab2340 R15: ffff888003d86e00
[    1.894060] FS:  00007ffa143a4640(0000) GS:ffff88807dd00000(0000)
knlGS:0000000000000000
[    1.894412] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.894682] CR2: 00005648577d9088 CR3: 00000000053da004 CR4: 0000000000170ee0
[    1.894953] Call Trace:
[    1.895139]  <TASK>
[    1.895303]  virtblk_prep_rq+0x1e5/0x280
[    1.895509]  virtio_queue_rq+0x5c/0x310
[    1.895710]  ? virtqueue_add_sgs+0x95/0xb0
[    1.895905]  ? _raw_spin_unlock_irqrestore+0x16/0x30
[    1.896133]  ? virtio_queue_rqs+0x340/0x390
[    1.896453]  ? sbitmap_get+0xfa/0x220
[    1.896678]  __blk_mq_issue_directly+0x41/0x180
[    1.896906]  blk_mq_plug_issue_direct+0xd8/0x2c0
[    1.897115]  blk_mq_flush_plug_list+0x115/0x180
[    1.897342]  blk_add_rq_to_plug+0x51/0x130
[    1.897543]  blk_mq_submit_bio+0x3a1/0x570
[    1.897750]  submit_bio_noacct_nocheck+0x418/0x520
[    1.897985]  ? submit_bio_noacct+0x1e/0x260
[    1.897989]  ext4_bio_write_page+0x222/0x420
[    1.898000]  mpage_process_page_bufs+0x178/0x1c0
[    1.899451]  mpage_prepare_extent_to_map+0x2d2/0x440
[    1.899603]  ext4_writepages+0x495/0x1020
[    1.899733]  do_writepages+0xcb/0x220
[    1.899871]  ? __seccomp_filter+0x171/0x7e0
[    1.900006]  file_write_and_wait_range+0xcd/0xf0
[    1.900167]  ext4_sync_file+0x72/0x320
[    1.900308]  __x64_sys_fsync+0x66/0xa0
[    1.900449]  do_syscall_64+0x31/0x50
[    1.900595]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[    1.900747] RIP: 0033:0x7ffa16ec96ea
[    1.900883] Code: b8 4a 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3
48 83 ec 18 89 7c 24 0c e8 e3 02 f8 ff 8b 7c 24 0c 89 c2 b8 4a 00 00
00 0f 05 <48> 3d 00 f0 ff ff 77 36 89 d7 89 44 24 0c e8 43 03 f8 ff 8b
44 24
[    1.901302] RSP: 002b:00007ffa143a3ac0 EFLAGS: 00000293 ORIG_RAX:
000000000000004a
[    1.901499] RAX: ffffffffffffffda RBX: 0000560277ec6fe0 RCX: 00007ffa16ec96ea
[    1.901696] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000016
[    1.901884] RBP: 0000560277ec5910 R08: 0000000000000000 R09: 00007ffa143a4640
[    1.902082] R10: 00007ffa16e4d39e R11: 0000000000000293 R12: 00005602773f59e0
[    1.902459] R13: 0000000000000000 R14: 00007fffbfc007ff R15: 00007ffa13ba4000
[    1.902763]  </TASK>
[    1.902877] ---[ end trace 0000000000000000 ]---

To avoid calling blk_mq_start_request() twice, This patch moves the
execution of blk_mq_start_request() to the end of virtblk_prep_rq().
And instead of requeuing failed request to plug list in the error path of
virtblk_add_req_batch(), it uses blk_mq_requeue_request() to change failed
request state to MQ_RQ_IDLE. Then virtblk can safely handle the request
on the next trial.

Fixes: 0e9911f ("virtio-blk: support mq_ops->queue_rqs()")
Reported-by: Alexandre Courbot <[email protected]>
Tested-by: Alexandre Courbot <[email protected]>
Signed-off-by: Suwan Kim <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Stefan Hajnoczi <[email protected]>
Reviewed-by: Pankaj Raghav <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Jan 2, 2023
[ Upstream commit ea60a4a ]

When the dev init failed, should cleanup the sysfs, otherwise, the
module will never be loaded since can not create duplicate sysfs
directory:

  sysfs: cannot create duplicate filename '/fs/orangefs'

  CPU: 1 PID: 6549 Comm: insmod Tainted: G        W          6.0.0+ #44
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x34/0x44
   sysfs_warn_dup.cold+0x17/0x24
   sysfs_create_dir_ns+0x16d/0x180
   kobject_add_internal+0x156/0x3a0
   kobject_init_and_add+0xcf/0x120
   orangefs_sysfs_init+0x7e/0x3a0 [orangefs]
   orangefs_init+0xfe/0x1000 [orangefs]
   do_one_initcall+0x87/0x2a0
   do_init_module+0xdf/0x320
   load_module+0x2f98/0x3330
   __do_sys_finit_module+0x113/0x1b0
   do_syscall_64+0x35/0x80
   entry_SYSCALL_64_after_hwframe+0x46/0xb0

  kobject_add_internal failed for orangefs with -EEXIST, don't try to register things with the same name in the same directory.

Fixes: 2f83ace ("orangefs: put register_chrdev immediately before register_filesystem")
Signed-off-by: Zhang Xiaoxu <[email protected]>
Signed-off-by: Mike Marshall <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Jan 2, 2023
[ Upstream commit ea60a4a ]

When the dev init failed, should cleanup the sysfs, otherwise, the
module will never be loaded since can not create duplicate sysfs
directory:

  sysfs: cannot create duplicate filename '/fs/orangefs'

  CPU: 1 PID: 6549 Comm: insmod Tainted: G        W          6.0.0+ #44
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x34/0x44
   sysfs_warn_dup.cold+0x17/0x24
   sysfs_create_dir_ns+0x16d/0x180
   kobject_add_internal+0x156/0x3a0
   kobject_init_and_add+0xcf/0x120
   orangefs_sysfs_init+0x7e/0x3a0 [orangefs]
   orangefs_init+0xfe/0x1000 [orangefs]
   do_one_initcall+0x87/0x2a0
   do_init_module+0xdf/0x320
   load_module+0x2f98/0x3330
   __do_sys_finit_module+0x113/0x1b0
   do_syscall_64+0x35/0x80
   entry_SYSCALL_64_after_hwframe+0x46/0xb0

  kobject_add_internal failed for orangefs with -EEXIST, don't try to register things with the same name in the same directory.

Fixes: 2f83ace ("orangefs: put register_chrdev immediately before register_filesystem")
Signed-off-by: Zhang Xiaoxu <[email protected]>
Signed-off-by: Mike Marshall <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Jun 14, 2023
[ Upstream commit 71c6aa0 ]

We encountered a crash when using SMCRv2. It is caused by a logical
error in smc_llc_fill_ext_v2().

 BUG: kernel NULL pointer dereference, address: 0000000000000014
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP PTI
 CPU: 7 PID: 453 Comm: kworker/7:4 Kdump: loaded Tainted: G        W   E      6.4.0-rc3+ #44
 Workqueue: events smc_llc_add_link_work [smc]
 RIP: 0010:smc_llc_fill_ext_v2+0x117/0x280 [smc]
 RSP: 0018:ffffacb5c064bd88 EFLAGS: 00010282
 RAX: ffff9a6bc1c3c02c RBX: ffff9a6be3558000 RCX: 0000000000000000
 RDX: 0000000000000002 RSI: 0000000000000002 RDI: 000000000000000a
 RBP: ffffacb5c064bdb8 R08: 0000000000000040 R09: 000000000000000c
 R10: ffff9a6bc0910300 R11: 0000000000000002 R12: 0000000000000000
 R13: 0000000000000002 R14: ffff9a6bc1c3c02c R15: ffff9a6be3558250
 FS:  0000000000000000(0000) GS:ffff9a6eefdc0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000014 CR3: 000000010b078003 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <TASK>
  smc_llc_send_add_link+0x1ae/0x2f0 [smc]
  smc_llc_srv_add_link+0x2c9/0x5a0 [smc]
  ? cc_mkenc+0x40/0x60
  smc_llc_add_link_work+0xb8/0x140 [smc]
  process_one_work+0x1e5/0x3f0
  worker_thread+0x4d/0x2f0
  ? __pfx_worker_thread+0x10/0x10
  kthread+0xe5/0x120
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x2c/0x50
  </TASK>

When an alernate RNIC is available in system, SMC will try to add a new
link based on the RNIC for resilience. All the RMBs in use will be mapped
to the new link. Then the RMBs' MRs corresponding to the new link will be
filled into SMCRv2 LLC ADD LINK messages.

However, smc_llc_fill_ext_v2() mistakenly accesses to unused RMBs which
haven't been mapped to the new link and have no valid MRs, thus causing
a crash. So this patch fixes the logic.

Fixes: b4ba465 ("net/smc: extend LLC layer for SMC-Rv2")
Signed-off-by: Wen Gu <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Jun 14, 2023
[ Upstream commit 71c6aa0 ]

We encountered a crash when using SMCRv2. It is caused by a logical
error in smc_llc_fill_ext_v2().

 BUG: kernel NULL pointer dereference, address: 0000000000000014
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP PTI
 CPU: 7 PID: 453 Comm: kworker/7:4 Kdump: loaded Tainted: G        W   E      6.4.0-rc3+ #44
 Workqueue: events smc_llc_add_link_work [smc]
 RIP: 0010:smc_llc_fill_ext_v2+0x117/0x280 [smc]
 RSP: 0018:ffffacb5c064bd88 EFLAGS: 00010282
 RAX: ffff9a6bc1c3c02c RBX: ffff9a6be3558000 RCX: 0000000000000000
 RDX: 0000000000000002 RSI: 0000000000000002 RDI: 000000000000000a
 RBP: ffffacb5c064bdb8 R08: 0000000000000040 R09: 000000000000000c
 R10: ffff9a6bc0910300 R11: 0000000000000002 R12: 0000000000000000
 R13: 0000000000000002 R14: ffff9a6bc1c3c02c R15: ffff9a6be3558250
 FS:  0000000000000000(0000) GS:ffff9a6eefdc0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000014 CR3: 000000010b078003 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <TASK>
  smc_llc_send_add_link+0x1ae/0x2f0 [smc]
  smc_llc_srv_add_link+0x2c9/0x5a0 [smc]
  ? cc_mkenc+0x40/0x60
  smc_llc_add_link_work+0xb8/0x140 [smc]
  process_one_work+0x1e5/0x3f0
  worker_thread+0x4d/0x2f0
  ? __pfx_worker_thread+0x10/0x10
  kthread+0xe5/0x120
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x2c/0x50
  </TASK>

When an alernate RNIC is available in system, SMC will try to add a new
link based on the RNIC for resilience. All the RMBs in use will be mapped
to the new link. Then the RMBs' MRs corresponding to the new link will be
filled into SMCRv2 LLC ADD LINK messages.

However, smc_llc_fill_ext_v2() mistakenly accesses to unused RMBs which
haven't been mapped to the new link and have no valid MRs, thus causing
a crash. So this patch fixes the logic.

Fixes: b4ba465 ("net/smc: extend LLC layer for SMC-Rv2")
Signed-off-by: Wen Gu <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
StollD pushed a commit that referenced this pull request Mar 9, 2024
[ Upstream commit 4849775 ]

If the net_cls subsystem is already mounted, attempting to mount it again
in setup_classid_environment() will result in a failure with the error code
EBUSY. Despite this, tmpfs will have been successfully mounted at
/sys/fs/cgroup/net_cls. Consequently, the /sys/fs/cgroup/net_cls directory
will be empty, causing subsequent setup operations to fail.

Here's an error log excerpt illustrating the issue when net_cls has already
been mounted at /sys/fs/cgroup/net_cls prior to running
setup_classid_environment():

- Before that change

  $ tools/testing/selftests/bpf/test_progs --name=cgroup_v1v2
  test_cgroup_v1v2:PASS:server_fd 0 nsec
  test_cgroup_v1v2:PASS:client_fd 0 nsec
  test_cgroup_v1v2:PASS:cgroup_fd 0 nsec
  test_cgroup_v1v2:PASS:server_fd 0 nsec
  run_test:PASS:skel_open 0 nsec
  run_test:PASS:prog_attach 0 nsec
  test_cgroup_v1v2:PASS:cgroup-v2-only 0 nsec
  (cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup.procs
  (cgroup_helpers.c:540: errno: No such file or directory) Opening cgroup classid: /sys/fs/cgroup/net_cls/cgroup-test-work-dir/net_cls.classid
  run_test:PASS:skel_open 0 nsec
  run_test:PASS:prog_attach 0 nsec
  (cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup-test-work-dir/cgroup.procs
  run_test:FAIL:join_classid unexpected error: 1 (errno 2)
  test_cgroup_v1v2:FAIL:cgroup-v1v2 unexpected error: -1 (errno 2)
  (cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup.procs
  #44      cgroup_v1v2:FAIL
  Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

- After that change
  $ tools/testing/selftests/bpf/test_progs --name=cgroup_v1v2
  #44      cgroup_v1v2:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Yafang Shao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
qzed pushed a commit that referenced this pull request Jul 18, 2024
[ Upstream commit 5d92c7c ]

If the ata_port_alloc() call in ata_host_alloc() fails,
ata_host_release() will get called.

However, the code in ata_host_release() tries to free ata_port struct
members unconditionally, which can lead to the following:

BUG: unable to handle page fault for address: 0000000000003990
PGD 0 P4D 0
Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 10 PID: 594 Comm: (udev-worker) Not tainted 6.10.0-rc5 #44
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014
RIP: 0010:ata_host_release.cold+0x2f/0x6e [libata]
Code: e4 4d 63 f4 44 89 e2 48 c7 c6 90 ad 32 c0 48 c7 c7 d0 70 33 c0 49 83 c6 0e 41
RSP: 0018:ffffc90000ebb968 EFLAGS: 00010246
RAX: 0000000000000041 RBX: ffff88810fb52e78 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88813b3218c0 RDI: ffff88813b3218c0
RBP: ffff88810fb52e40 R08: 0000000000000000 R09: 6c65725f74736f68
R10: ffffc90000ebb738 R11: 73692033203a746e R12: 0000000000000004
R13: 0000000000000000 R14: 0000000000000011 R15: 0000000000000006
FS:  00007f6cc55b9980(0000) GS:ffff88813b300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000003990 CR3: 00000001122a2000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
 <TASK>
 ? __die_body.cold+0x19/0x27
 ? page_fault_oops+0x15a/0x2f0
 ? exc_page_fault+0x7e/0x180
 ? asm_exc_page_fault+0x26/0x30
 ? ata_host_release.cold+0x2f/0x6e [libata]
 ? ata_host_release.cold+0x2f/0x6e [libata]
 release_nodes+0x35/0xb0
 devres_release_group+0x113/0x140
 ata_host_alloc+0xed/0x120 [libata]
 ata_host_alloc_pinfo+0x14/0xa0 [libata]
 ahci_init_one+0x6c9/0xd20 [ahci]

Do not access ata_port struct members unconditionally.

Fixes: 633273a ("libata-pmp: hook PMP support and enable it")
Cc: [email protected]
Reviewed-by: Damien Le Moal <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Reviewed-by: John Garry <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Niklas Cassel <[email protected]>
Stable-dep-of: f6549f5 ("ata,scsi: libata-core: Do not leak memory for ata_port struct members")
Signed-off-by: Sasha Levin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants