Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to LTS v5.15.120 #5

Conversation

tarmeh2r
Copy link

Vudentz and others added 30 commits July 11, 2023 17:19
commit c5d2b6fa26b5b8386a9cc902cdece3a46bef2bd2 upstream.

Similar to commit 0f7d9b31ce7a ("netfilter: nf_tables: fix use-after-free
in nft_set_catchall_destroy()"). We can not access k after kfree_rcu()
call.

Cc: [email protected]
Signed-off-by: Min Li <[email protected]>
Signed-off-by: Luiz Augusto von Dentz <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 47c5d829a3e326b7395352a10fc8a6effe7afa15 upstream.

Since commit 3e4be65 ("Bluetooth: hci_qca: Add poweroff support
during hci down for wcn3990"), the setup callback which registers the
debugfs interface can be called multiple times.

This specifically leads to the following error when powering on the
controller:

	debugfs: Directory 'ibs' with parent 'hci0' already present!

Add a driver flag to avoid trying to register the debugfs interface more
than once.

Fixes: 3e4be65 ("Bluetooth: hci_qca: Add poweroff support during hci down for wcn3990")
Cc: [email protected]	# 4.20
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Luiz Augusto von Dentz <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 436eeae0411acdfc54521ddea80ee76d4ae8a7ea upstream.

After TEE has completed processing of TEE_CMD_ID_LOAD_TA, set proper
value in 'return_origin' argument passed by open_session() call. To do
so, add 'return_origin' field to the structure tee_cmd_load_ta. The
Trusted OS shall update return_origin as part of TEE processing.

This change to 'struct tee_cmd_load_ta' interface requires a similar update
in AMD-TEE Trusted OS's TEE_CMD_ID_LOAD_TA interface.

This patch has been verified on Phoenix Birman setup. On older APUs,
return_origin value will be 0.

Cc: [email protected]
Fixes: 757cc3e ("tee: add AMD-TEE driver")
Tested-by: Sourabh Das <[email protected]>
Signed-off-by: Rijo Thomas <[email protected]>
Acked-by: Sumit Garg <[email protected]>
Signed-off-by: Jens Wiklander <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 09fe05c57b5aaf23e2c35036c98ea9f282b19a77 upstream.

Move RBD_OBJ_FLAG_COPYUP_ENABLED flag setting into the object request
state machine to allow for the snapshot context to be captured in the
image request state machine rather than in rbd_queue_workfn().

Cc: [email protected]
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Dongsheng Yang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 870611e4877eff1e8413c3fb92a585e45d5291f6 upstream.

Move capturing the snapshot context into the image request state
machine, after exclusive lock is ensured to be held for the duration of
dealing with the image request.  This is needed to ensure correctness
of fast-diff states (OBJECT_EXISTS vs OBJECT_EXISTS_CLEAN) and object
deltas computed based off of them.  Otherwise the object map that is
forked for the snapshot isn't guaranteed to accurately reflect the
contents of the snapshot when the snapshot is taken under I/O.  This
breaks differential backup and snapshot-based mirroring use cases with
fast-diff enabled: since some object deltas may be incomplete, the
destination image may get corrupted.

Cc: [email protected]
Link: https://tracker.ceph.com/issues/61472
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Dongsheng Yang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 5b10ff013e8a57f8845615ac2cc37edf7f6eef05 upstream.

Without this, the gpio cannot be explicitly mux'ed to its gpio function.

Fixes: 83c5668 ("pinctrl: meson-axg: Add new pinctrl driver for Meson AXG SoC")
Cc: [email protected]
Signed-off-by: Martin Hundebøll <[email protected]>
Reviewed-by: Neil Armstrong <[email protected]>
Reviewed-by: Dmitry Rokosov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 0143d148d1e882fb1538dc9974c94d63961719b9 upstream.

The current implementation of usbdev_mmap uses usb_alloc_coherent to
allocate memory pages that will later be mapped into the user space.
Meanwhile, usb_alloc_coherent employs three different methods to
allocate memory, as outlined below:
 * If hcd->localmem_pool is non-null, it uses gen_pool_dma_alloc to
   allocate memory;
 * If DMA is not available, it uses kmalloc to allocate memory;
 * Otherwise, it uses dma_alloc_coherent.

However, it should be noted that gen_pool_dma_alloc does not guarantee
that the resulting memory will be page-aligned. Furthermore, trying to
map slab pages (i.e., memory allocated by kmalloc) into the user space
is not resonable and can lead to problems, such as a type confusion bug
when PAGE_TABLE_CHECK=y [1].

To address these issues, this patch introduces hcd_alloc_coherent_pages,
which addresses the above two problems. Specifically,
hcd_alloc_coherent_pages uses gen_pool_dma_alloc_align instead of
gen_pool_dma_alloc to ensure that the memory is page-aligned. To replace
kmalloc, hcd_alloc_coherent_pages directly allocates pages by calling
__get_free_pages.

Reported-by: [email protected]
Closes: https://lore.kernel.org/lkml/[email protected]/ [1]
Fixes: f7d34b4 ("USB: Add support for usbfs zerocopy.")
Fixes: ff2437b ("usb: host: Fix excessive alignment restriction for local memory allocations")
Cc: [email protected]
Signed-off-by: Ruihan Li <[email protected]>
Acked-by: Alan Stern <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit d0b861653f8c16839c3035875b556afc4472f941 upstream.

When hcd->localmem_pool is non-null, localmem_pool is used to allocate
DMA memory. In this case, the dma address will be properly returned (in
dma_handle), and dma_mmap_coherent should be used to map this memory
into the user space. However, the current implementation uses
pfn_remap_range, which is supposed to map normal pages.

Instead of repeating the logic in the memory allocation function, this
patch introduces a more robust solution. Here, the type of allocated
memory is checked by testing whether dma_handle is properly set. If
dma_handle is properly returned, it means some DMA pages are allocated
and dma_mmap_coherent should be used to map them. Otherwise, normal
pages are allocated and pfn_remap_range should be called. This ensures
that the correct mmap functions are used consistently, independently
with logic details that determine which type of memory gets allocated.

Fixes: a0e710a ("USB: usbfs: fix mmap dma mismatch")
Cc: [email protected]
Signed-off-by: Ruihan Li <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
[ Upstream commit 6b0db163ff9200a55dc77a652dad1d4b0a853f63 ]

There is no atmel,shdwc-debouncer property for SHDWC. The right DT property
is debounce-delay-us. Use it.

Fixes: 16b161b ("ARM: dts: at91: sama7g5: add shdwc node")
Signed-off-by: Claudiu Beznea <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 6e7a6d4797ef521c0762914610ed682e102b9d36 ]

regmap-sdw does not support multi register writes, so there is
no point in setting this flag. This also leads to incorrect
programming of WSA codecs with regmap_multi_reg_write() call.

This invalid configuration should have been rejected by regmap-sdw.

Fixes: a0aab9e ("ASoC: codecs: add wsa881x amplifier support")
Signed-off-by: Srinivas Kandagatla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
…0-lite boards

[ Upstream commit 3a735530c159b75e1402c08abe1ba4eb99a1f7a3 ]

In general, the three SKUs of sc7180 (lite, normal, and pro) are
handled dynamically.

The cpufreq table in sc7180.dtsi includes the superset of all CPU
frequencies. The "qcom-cpufreq-hw" driver in Linux shows that we can
dynamically detect which frequencies are actually available on the
currently running CPU and then we can just enable those ones.

The GPU is similarly dynamic. The nvmem has a fuse in it (see
"gpu_speed_bin" in sc7180.dtsi) that the GPU driver can use to figure
out which frequencies to enable.

There is one part, however, that is not so dynamic. The way SDRAM
frequency works in sc7180 is that it's tied to cpufreq. At the busiest
cpufreq operating points we'll pick the top supported SDRAM frequency.
They ramp down together.

For the "pro" SKU of sc7180, we only enable one extra cpufreq step.
That extra cpufreq step runs SDRAM at the same speed as the step
below. Thus, for normal and pro things are OK. There is no sc7180-pro
device tree snippet.

For the "lite" SKU if sc7180, however, things aren't so easy. The
"lite" SKU drops 3 cpufreq entries but can still run SDRAM at max
frequency. That messed things up with the whole scheme. This is why we
added the "sc7180-lite" fragment in commit 8fd01e0 ("arm64: dts:
qcom: sc7180-lite: Tweak DDR/L3 scaling on SC7180-lite").

When the lite scheme came about, it was agreed that the WiFi SKUs of
lazor would _always_ be "lite" and would, in fact, be the only "lite"
devices. Unfortunately, this decision changed and folks didn't realize
that it would be a problem. Specifically, some later lazor WiFi-only
devices were built with "pro" CPUs.

Building WiFi-only lazor with "pro" CPUs isn't the end of the world.
The SDRAM will ramp up a little sooner than it otherwise would, but
aside from a small power hit things work OK. One problem, though, is
that the SDRAM scaling becomes a bit quirky. Specifically, with the
current tables we'll max out SDRAM frequency at 2.1GHz but then
_lower_ it at 2.2GHz / 2.3GHz only to raise it back to max for 2.4GHz
and 2.55GHz.

Let's at least fix this so that the SDRAM frequency doesn't go down in
that quirky way. On true "lite" SKUs this change will be a no-op
because the operating points we're touching are disabled. This change
is only useful when a board that thinks it has a "lite" CPU actually
has a "normal" or "pro" one stuffed.

Fixes: 8fd01e0 ("arm64: dts: qcom: sc7180-lite: Tweak DDR/L3 scaling on SC7180-lite")
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Konrad Dybcio <[email protected]>
Signed-off-by: Bjorn Andersson <[email protected]>
Link: https://lore.kernel.org/r/20230515171929.1.Ic8dee2cb79ce39ffc04eab2a344dde47b2f9459f@changeid
Signed-off-by: Sasha Levin <[email protected]>
…eturning void

[ Upstream commit 6461fee68064ba970e3ba90241fe5f5e038aa9d4 ]

The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is (mostly) ignored
and this typically results in resource leaks. To improve here there is a
quest to make the remove callback return void. In the first step of this
quest all drivers are converted to .remove_new() which already returns
void.

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
Acked-by: Takashi Iwai <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Stable-dep-of: dc93f0dcb436 ("ASoC: mediatek: mt8195: fix use-after-free in driver remove path")
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit dc93f0dcb436dfd24a06c5b3c0f4c5cd9296e8e5 ]

During mt8195_afe_init_clock(), mt8195_audsys_clk_register() was called
followed by several other devm functions. At mt8195_afe_deinit_clock()
located at mt8195_afe_pcm_dev_remove(), mt8195_audsys_clk_unregister()
was called.

However, there was an issue with the order in which these functions were
called. Specifically, the remove callback of platform_driver was called
before devres released the resource, resulting in a use-after-free issue
during remove time.

At probe time, the order of calls was:
1. mt8195_audsys_clk_register
2. afe_priv->clk = devm_kcalloc
3. afe_priv->clk[i] = devm_clk_get

At remove time, the order of calls was:
1. mt8195_audsys_clk_unregister
3. free afe_priv->clk[i]
2. free afe_priv->clk

To resolve the problem, we can utilize devm_add_action_or_reset() in
mt8195_audsys_clk_register() so that the remove order can be changed to
3->2->1.

Fixes: 6746cc8 ("ASoC: mediatek: mt8195: add platform driver")
Signed-off-by: Trevor Wu <[email protected]>
Reviewed-by: Douglas Anderson <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 9bf2e534313fcf420367668cc1f30e10469901dc ]

The final production baseboard had a different chip select than
earlier prototype boards.  When the newer board was released,
the SPI stopped working because the wrong pin was used in the device
tree and conflicted with the UART RTS. Fix the pinmux for
production boards.

Fixes: 36ca3c8 ("arm64: dts: imx: Add Beacon i.MX8M Nano development kit")
Signed-off-by: Adam Ford <[email protected]>
Signed-off-by: Shawn Guo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 5578d0a79b6430fa1543640dd6f2d397d0886ce7 ]

There seems to be a bug within the mv64xxx I2C controller, wherein the
status register may not necessarily contain valid value immediately
after the IFLG flag is set in the control register.

My theory is that the controller:
- first sets the IFLG in control register
- then updates the status register
- then raises an interrupt

This may sometime cause weird bugs when in atomic mode, since in this
mode we do not wait for an interrupt, but instead we poll the control
register for IFLG and read status register immediately after.

I encountered -ENXIO from mv64xxx_i2c_fsm() due to this issue when using
this driver in atomic mode.

Note that I've only seen this issue on Armada 385, I don't know whether
other SOCs with this controller are also affected. Also note that this
fix has been in U-Boot for over 4 years [1] without anybody complaining,
so it should not cause regressions.

[1] https://source.denx.de/u-boot/u-boot/-/commit/d50e29662f78

Fixes: 544a8d75f3d6 ("i2c: mv64xxx: Add atomic_xfer method to driver")
Signed-off-by: Marek Behún <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 3aa0519a4780f1b8e11966bd879d4a2934ba455f ]

As described in the commit 111a833dc5cb ("firmware: arm_ffa: Set
reserved/MBZ fields to zero in the memory descriptors") some fields in
the memory descriptor have to be zeroed explicitly. The handle field is
one of these, but it was left out from that change, fix this now.

Fixes: 111a833dc5cb ("firmware: arm_ffa: Set reserved/MBZ fields to zero in the memory descriptors")
Reported-by: Imre Kis <[email protected]>
Signed-off-by: Balint Dobszay <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sudeep Holla <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit ca0aa17f2db3468fd017038d23a78e17388e2f67 ]

If pm runtime resume fails the .remove callback used to exit early. This
resulted in an error message by the driver core but the device gets
removed anyhow. This lets the registered i2c adapter stay around with an
unbound parent device.

So only skip clk disabling if resume failed, but do delete the adapter.

Fixes: 8b9ec07 ("i2c: Add Spreadtrum I2C controller driver")
Signed-off-by: Uwe Kleine-König <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 7f3c782b3914e510b646a77aedc3adeac2e4a63b ]

Selecting only REGMAP_I2C can leave REGMAP unset, causing build errors,
so also select REGMAP to prevent the build errors.

../drivers/misc/eeprom/at24.c:540:42: warning: 'struct regmap_config' declared inside parameter list will not be visible outside of this definition or declaration
  540 |                                   struct regmap_config *regmap_config)
../drivers/misc/eeprom/at24.c: In function 'at24_make_dummy_client':
../drivers/misc/eeprom/at24.c:552:18: error: implicit declaration of function 'devm_regmap_init_i2c' [-Werror=implicit-function-declaration]
  552 |         regmap = devm_regmap_init_i2c(dummy_client, regmap_config);
../drivers/misc/eeprom/at24.c:552:16: warning: assignment to 'struct regmap *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
  552 |         regmap = devm_regmap_init_i2c(dummy_client, regmap_config);
../drivers/misc/eeprom/at24.c: In function 'at24_probe':
../drivers/misc/eeprom/at24.c:586:16: error: variable 'regmap_config' has initializer but incomplete type
  586 |         struct regmap_config regmap_config = { };
../drivers/misc/eeprom/at24.c:586:30: error: storage size of 'regmap_config' isn't known
  586 |         struct regmap_config regmap_config = { };
../drivers/misc/eeprom/at24.c:586:30: warning: unused variable 'regmap_config' [-Wunused-variable]

Fixes: 5c01525 ("eeprom: at24: add basic regmap_i2c support")
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 99a670b2069c725a7b50318aa681d9cae8f89325 ]

On riscv qemu platform, when add kprobe event on do_sys_open() to show
filename string arg, it just print fault as follow:

echo 'p:myprobe do_sys_open dfd=$arg1 filename=+0($arg2):string flags=$arg3
mode=$arg4' > kprobe_events

bash-166     [000] ...1.   360.195367: myprobe: (do_sys_open+0x0/0x84)
dfd=0xffffffffffffff9c filename=(fault) flags=0x8241 mode=0x1b6

bash-166     [000] ...1.   360.219369: myprobe: (do_sys_open+0x0/0x84)
dfd=0xffffffffffffff9c filename=(fault) flags=0x8241 mode=0x1b6

bash-191     [000] ...1.   360.378827: myprobe: (do_sys_open+0x0/0x84)
dfd=0xffffffffffffff9c filename=(fault) flags=0x98800 mode=0x0

As riscv do not select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE,
the +0($arg2) addr is processed as a kernel address though it is a
userspace address, cause the above filename=(fault) print. So select
ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE to avoid the issue, after that the
kprobe trace is ok as below:

bash-166     [000] ...1.    96.767641: myprobe: (do_sys_open+0x0/0x84)
dfd=0xffffffffffffff9c filename="/dev/null" flags=0x8241 mode=0x1b6

bash-166     [000] ...1.    96.793751: myprobe: (do_sys_open+0x0/0x84)
dfd=0xffffffffffffff9c filename="/dev/null" flags=0x8241 mode=0x1b6

bash-177     [000] ...1.    96.962354: myprobe: (do_sys_open+0x0/0x84)
dfd=0xffffffffffffff9c filename="/sys/kernel/debug/tracing/events/kprobes/"
flags=0x98800 mode=0x0

Signed-off-by: Ruan Jinjie <[email protected]>
Acked-by: Björn Töpel <[email protected]>
Fixes: 0ebeea8 ("bpf: Restrict bpf_probe_read{, str}() only to archs where they work")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit a90e8608eb0ed93d31ac0feb055f77ce59512542 ]

Syzkaller hits a kernel WARN when the first character of the dev name
provided is NULL. Solution is to add a NULL check before calling
cdev_device_add() in vduse_create_dev().

kobject: (0000000072042169): attempted to be registered with empty name!
WARNING: CPU: 0 PID: 112695 at lib/kobject.c:236
Call Trace:
 kobject_add_varg linux/src/lib/kobject.c:390 [inline]
 kobject_add+0xf6/0x150 linux/src/lib/kobject.c:442
 device_add+0x28f/0xc20 linux/src/drivers/base/core.c:2167
 cdev_device_add+0x83/0xc0 linux/src/fs/char_dev.c:546
 vduse_create_dev linux/src/drivers/vdpa/vdpa_user/vduse_dev.c:2254 [inline]
 vduse_ioctl+0x7b5/0xf30 linux/src/drivers/vdpa/vdpa_user/vduse_dev.c:2316
 vfs_ioctl linux/src/fs/ioctl.c:47 [inline]
 file_ioctl linux/src/fs/ioctl.c:510 [inline]
 do_vfs_ioctl+0x14b/0xa80 linux/src/fs/ioctl.c:697
 ksys_ioctl+0x7c/0xa0 linux/src/fs/ioctl.c:714
 __do_sys_ioctl linux/src/fs/ioctl.c:721 [inline]
 __se_sys_ioctl linux/src/fs/ioctl.c:719 [inline]
 __x64_sys_ioctl+0x42/0x50 linux/src/fs/ioctl.c:719
 do_syscall_64+0x94/0x330 linux/src/arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: c8a6153 ("vduse: Introduce VDUSE - vDPA Device in Userspace")
Cc: "Xie Yongji" <[email protected]>
Reported-by: Xianjun Zeng <[email protected]>
Signed-off-by: Sheng Zhao <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Jason Wang <[email protected]>
Reviewed-by: Xie Yongji <[email protected]>
Cc: "Michael S. Tsirkin"<[email protected]>, "Jason Wang"<[email protected]>,
Reviewed-by: Xie Yongji <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 55d8122f5cd62d5aaa225d7167dcd14a44c850b9 ]

Use the right structs for PACKED or split vqs when setting and
getting the vring base.

Fixes: 4c8cf31 ("vhost: introduce vDPA-based backend")
Signed-off-by: Shannon Nelson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Jason Wang <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit beee7fdb5b56a46415a4992d28dd4c2d06eb52df ]

Use the right structs for PACKED or split vqs when setting and
getting the vring base.

Fixes: 4c8cf31 ("vhost: introduce vDPA-based backend")
Signed-off-by: Shannon Nelson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Jason Wang <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
…s re-enabled"

commit 1b29243933098cdbc31b579b5616e183b4275e2f upstream.

This reverts commit a44be64bbecb15a452496f60db6eacfee2b59c79.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit dea9d8f7643fab07bf89a1155f1f94f37d096a5e upstream.

ext4_xattr_block_set() relies on its caller to call dquot_initialize()
on the inode.  To assure that this has happened there are WARN_ON
checks.  Unfortunately, this is subject to false positives if there is
an antagonist thread which is flipping the file system at high rates
between r/o and rw.  So only do the check if EXT4_XATTR_DEBUG is
enabled.

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
This reverts commit 503e554 which is
commit 0af462f19e635ad522f28981238334620881badc upstream.

Guenter reports problems with it, and it's not quite obvious why, so
revert it for now.

Reported-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: Ido Schimmel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 22ed903eee23a5b174e240f1cdfa9acf393a5210 upstream.

syzbot detected a crash during log recovery:

XFS (loop0): Mounting V5 Filesystem bfdc47fc-10d8-4eed-a562-11a831b3f791
XFS (loop0): Torn write (CRC failure) detected at log block 0x180. Truncating head block from 0x200.
XFS (loop0): Starting recovery (logdev: internal)
==================================================================
BUG: KASAN: slab-out-of-bounds in xfs_btree_lookup_get_block+0x15c/0x6d0 fs/xfs/libxfs/xfs_btree.c:1813
Read of size 8 at addr ffff88807e89f258 by task syz-executor132/5074

CPU: 0 PID: 5074 Comm: syz-executor132 Not tainted 6.2.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x1b1/0x290 lib/dump_stack.c:106
 print_address_description+0x74/0x340 mm/kasan/report.c:306
 print_report+0x107/0x1f0 mm/kasan/report.c:417
 kasan_report+0xcd/0x100 mm/kasan/report.c:517
 xfs_btree_lookup_get_block+0x15c/0x6d0 fs/xfs/libxfs/xfs_btree.c:1813
 xfs_btree_lookup+0x346/0x12c0 fs/xfs/libxfs/xfs_btree.c:1913
 xfs_btree_simple_query_range+0xde/0x6a0 fs/xfs/libxfs/xfs_btree.c:4713
 xfs_btree_query_range+0x2db/0x380 fs/xfs/libxfs/xfs_btree.c:4953
 xfs_refcount_recover_cow_leftovers+0x2d1/0xa60 fs/xfs/libxfs/xfs_refcount.c:1946
 xfs_reflink_recover_cow+0xab/0x1b0 fs/xfs/xfs_reflink.c:930
 xlog_recover_finish+0x824/0x920 fs/xfs/xfs_log_recover.c:3493
 xfs_log_mount_finish+0x1ec/0x3d0 fs/xfs/xfs_log.c:829
 xfs_mountfs+0x146a/0x1ef0 fs/xfs/xfs_mount.c:933
 xfs_fs_fill_super+0xf95/0x11f0 fs/xfs/xfs_super.c:1666
 get_tree_bdev+0x400/0x620 fs/super.c:1282
 vfs_get_tree+0x88/0x270 fs/super.c:1489
 do_new_mount+0x289/0xad0 fs/namespace.c:3145
 do_mount fs/namespace.c:3488 [inline]
 __do_sys_mount fs/namespace.c:3697 [inline]
 __se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f89fa3f4aca
Code: 83 c4 08 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fffd5fb5ef8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00646975756f6e2c RCX: 00007f89fa3f4aca
RDX: 0000000020000100 RSI: 0000000020009640 RDI: 00007fffd5fb5f10
RBP: 00007fffd5fb5f10 R08: 00007fffd5fb5f50 R09: 000000000000970d
R10: 0000000000200800 R11: 0000000000000206 R12: 0000000000000004
R13: 0000555556c6b2c0 R14: 0000000000200800 R15: 00007fffd5fb5f50
 </TASK>

The fuzzed image contains an AGF with an obviously garbage
agf_refcount_level value of 32, and a dirty log with a buffer log item
for that AGF.  The ondisk AGF has a higher LSN than the recovered log
item.  xlog_recover_buf_commit_pass2 reads the buffer, compares the
LSNs, and decides to skip replay because the ondisk buffer appears to be
newer.

Unfortunately, the ondisk buffer is corrupt, but recovery just read the
buffer with no buffer ops specified:

	error = xfs_buf_read(mp->m_ddev_targp, buf_f->blf_blkno,
			buf_f->blf_len, buf_flags, &bp, NULL);

Skipping the buffer leaves its contents in memory unverified.  This sets
us up for a kernel crash because xfs_refcount_recover_cow_leftovers
reads the buffer (which is still around in XBF_DONE state, so no read
verification) and creates a refcountbt cursor of height 32.  This is
impossible so we run off the end of the cursor object and crash.

Fix this by invoking the verifier on all skipped buffers and aborting
log recovery if the ondisk buffer is corrupt.  It might be smarter to
force replay the log item atop the buffer and then see if it'll pass the
write verifier (like ext4 does) but for now let's go with the
conservative option where we stop immediately.

Link: https://syzkaller.appspot.com/bug?extid=7e9494b8b399902e994e
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
Signed-off-by: Leah Rumancik <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
This reverts commit a97e60b which is
commit fda2093860df4812d69052a8cf4997e53853a340 upstream.

Ben reports that this should not have been backported to the older
kernels as the rest of the macro is not empty.  It was a clean-up patch
in 6.4-rc1 only, it did not add new device ids.

Reported-by: Ben Hutchings <[email protected]>
Cc: Philipp Hortmann <[email protected]>
Cc: Sasha Levin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Chris Paterson (CIP) <[email protected]>
Tested-by: Shuah Khan <[email protected]>
Tested-by: Bagas Sanjaya <[email protected]>
Tested-by: Linux Kernel Functional Testing <[email protected]>
Tested-by: Jon Hunter <[email protected]>
Tested-by: Harshit Mogalapalli <[email protected]>
Tested-by: Sudip Mukherjee <[email protected]>
Tested-by: Allen Pais <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Tested-by: Ron Economos <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
[ Upstream commit f7d85515bd21902b218370a1a6301f76e4e636ff ]

strtobool() is the same as kstrtobool().
However, the latter is more used within the kernel.

In order to remove strtobool() and slightly simplify kstrtox.h, switch to
the other function name.

While at it, include the corresponding header file (<linux/kstrtox.h>)

Signed-off-by: Christophe JAILLET <[email protected]>
Acked-by: Luis Chamberlain <[email protected]>
Link: https://lore.kernel.org/r/34f04735d20e0138695dd4070651bd860a36b81c.1673688120.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Stable-dep-of: 4acfe3dfde68 ("test_firmware: prevent race conditions by a correct implementation of locking")
Signed-off-by: Sasha Levin <[email protected]>
… locking

[ Upstream commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 ]

Dan Carpenter spotted a race condition in a couple of situations like
these in the test_firmware driver:

static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
        u8 val;
        int ret;

        ret = kstrtou8(buf, 10, &val);
        if (ret)
                return ret;

        mutex_lock(&test_fw_mutex);
        *(u8 *)cfg = val;
        mutex_unlock(&test_fw_mutex);

        /* Always return full write size even if we didn't consume all */
        return size;
}

static ssize_t config_num_requests_store(struct device *dev,
                                         struct device_attribute *attr,
                                         const char *buf, size_t count)
{
        int rc;

        mutex_lock(&test_fw_mutex);
        if (test_fw_config->reqs) {
                pr_err("Must call release_all_firmware prior to changing config\n");
                rc = -EINVAL;
                mutex_unlock(&test_fw_mutex);
                goto out;
        }
        mutex_unlock(&test_fw_mutex);

        rc = test_dev_config_update_u8(buf, count,
                                       &test_fw_config->num_requests);

out:
        return rc;
}

static ssize_t config_read_fw_idx_store(struct device *dev,
                                        struct device_attribute *attr,
                                        const char *buf, size_t count)
{
        return test_dev_config_update_u8(buf, count,
                                         &test_fw_config->read_fw_idx);
}

The function test_dev_config_update_u8() is called from both the locked
and the unlocked context, function config_num_requests_store() and
config_read_fw_idx_store() which can both be called asynchronously as
they are driver's methods, while test_dev_config_update_u8() and siblings
change their argument pointed to by u8 *cfg or similar pointer.

To avoid deadlock on test_fw_mutex, the lock is dropped before calling
test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8()
itself, but alas this creates a race condition.

Having two locks wouldn't assure a race-proof mutual exclusion.

This situation is best avoided by the introduction of a new, unlocked
function __test_dev_config_update_u8() which can be called from the locked
context and reducing test_dev_config_update_u8() to:

static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg)
{
        int ret;

        mutex_lock(&test_fw_mutex);
        ret = __test_dev_config_update_u8(buf, size, cfg);
        mutex_unlock(&test_fw_mutex);

        return ret;
}

doing the locking and calling the unlocked primitive, which enables both
locked and unlocked versions without duplication of code.

The similar approach was applied to all functions called from the locked
and the unlocked context, which safely mitigates both deadlocks and race
conditions in the driver.

__test_dev_config_update_bool(), __test_dev_config_update_u8() and
__test_dev_config_update_size_t() unlocked versions of the functions
were introduced to be called from the locked contexts as a workaround
without releasing the main driver's lock and thereof causing a race
condition.

The test_dev_config_update_bool(), test_dev_config_update_u8() and
test_dev_config_update_size_t() locked versions of the functions
are being called from driver methods without the unnecessary multiplying
of the locking and unlocking code for each method, and complicating
the code with saving of the return value across lock.

Fixes: 7feebfa ("test_firmware: add support for request_firmware_into_buf")
Cc: Luis Chamberlain <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Russ Weight <[email protected]>
Cc: Takashi Iwai <[email protected]>
Cc: Tianfei Zhang <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: [email protected]
Cc: [email protected] # v5.4
Suggested-by: Dan Carpenter <[email protected]>
Signed-off-by: Mirsad Goran Todorovac <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
glandium and others added 8 commits July 11, 2023 17:19
commit 5fe251112646d8626818ea90f7af325bab243efa upstream.

commit 498ba2069035 ("HID: logitech-hidpp: Don't restart communication if
not necessary") put restarting communication behind that flag, and this
was apparently necessary on the T651, but the flag was not set for it.

Fixes: 498ba2069035 ("HID: logitech-hidpp: Don't restart communication if not necessary")
Cc: [email protected]
Signed-off-by: Mike Hommey <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
… leak in mtk_thermal_probe"

commit 86edac7d3888c715fe3a81bd61f3617ecfe2e1dd upstream.

This reverts commit f05c7b7d9ea9477fcc388476c6f4ade8c66d2d26.

That change was causing a regression in the generic-adc-thermal-probed
bootrr test as reported in the kernelci-results list [1].
A proper rework will take longer, so revert it for now.

[1] https://groups.io/g/kernelci-results/message/42660

Fixes: f05c7b7d9ea9 ("thermal/drivers/mediatek: Use devm_of_iomap to avoid resource leak in mtk_thermal_probe")
Signed-off-by: Ricardo Cañuelo <[email protected]>
Suggested-by: AngeloGioacchino Del Regno <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
Signed-off-by: Daniel Lezcano <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
…match stext

commit 1c249565426e3a9940102c0ba9f63914f7cda73d upstream.

This problem was encountered on an arm64 system with a lot of memory.
Without kernel debug symbols installed, and with both kcore and kallsyms
available, perf managed to get confused and returned "unknown" for all
of the kernel symbols that it tried to look up.

On this system, stext fell within the vmalloc segment.  The kcore symbol
matching code tries to find the first segment that contains stext and
uses that to replace the segment generated from just the kallsyms
information.  In this case, however, there were two: a very large
vmalloc segment, and the text segment.  This caused perf to get confused
because multiple overlapping segments were inserted into the RB tree
that holds the discovered segments.  However, that alone wasn't
sufficient to cause the problem. Even when we could find the segment,
the offsets were adjusted in such a way that the newly generated symbols
didn't line up with the instruction addresses in the trace.  The most
obvious solution would be to consult which segment type is text from
kcore, but this information is not exposed to users.

Instead, select the smallest matching segment that contains stext
instead of the first matching segment.  This allows us to match the text
segment instead of vmalloc, if one is contained within the other.

Reviewed-by: Adrian Hunter <[email protected]>
Signed-off-by: Krister Johansen <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: David Reaver <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Krister Johansen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit e1b37563caffc410bb4b55f153ccb14dede66815 upstream.

gtags considers any file outside of its current working directory
"outside the source tree" and refuses to index it. For O= kernel builds,
or when "make" is invoked from a directory other then the kernel source
tree, gtags ignores the entire kernel source and generates an empty
index.

Force-set gtags current working directory to the kernel source tree.

Due to commit 9da0763 ("kbuild: Use relative path when building in
a subdir of the source tree"), if the kernel build is done in a
sub-directory of the kernel source tree, the kernel Makefile will set
the kernel's $srctree to ".." for shorter compile-time and run-time
warnings. Consequently, the list of files to be indexed will be in the
"../*" form, rendering all such paths invalid once gtags switches to the
kernel source tree as its current working directory.

If gtags indexing is requested and the build directory is not the kernel
source tree, index all files in absolute-path form.

Note, indexing in absolute-path form will not affect the generated
index, as paths in gtags indices are always relative to the gtags "root
directory" anyway (as evidenced by "gtags --dump").

Signed-off-by: Ahmed S. Darwish <[email protected]>
Cc: <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit a2b308044dcaca8d3e580959a4f867a1d5c37fac upstream.

None have been defined yet, so reject anybody setting any. Mesa sets
it to 0 anyway.

Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit b5b2a02bcaac7c287694aa0db4837a07bf178626 upstream.

We define sp and ipsw in <asm/asmregs.h> using ".reg", and when using
current binutils (snapshot 2.40.50.20230611) the definitions in
<asm/assembly.h> using "=" conflict with those:

arch/parisc/include/asm/assembly.h: Assembler messages:
arch/parisc/include/asm/assembly.h:93: Error: symbol `sp' is already defined
arch/parisc/include/asm/assembly.h:95: Error: symbol `ipsw' is already defined

Delete the duplicate definitions in <asm/assembly.h>.

Also delete the definition of gp, which isn't used anywhere.

Signed-off-by: Ben Hutchings <[email protected]>
Cc: [email protected] # v6.0+
Signed-off-by: Helge Deller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 0e96647cff9224db564a1cee6efccb13dbe11ee2 upstream.

The conversion to proc_create_single_data() introduced a regression
whereby reading a file in /proc/bus/nubus results in a seg fault:

    # grep -r . /proc/bus/nubus/e/
    Data read fault at 0x00000020 in Super Data (pc=0x1074c2)
    BAD KERNEL BUSERR
    Oops: 00000000
    Modules linked in:
    PC: [<001074c2>] PDE_DATA+0xc/0x16
    SR: 2010  SP: 38284958  a2: 01152370
    d0: 00000001    d1: 01013000    d2: 01002790    d3: 00000000
    d4: 00000001    d5: 0008ce2e    a0: 00000000    a1: 00222a40
    Process grep (pid: 45, task=142f8727)
    Frame format=B ssw=074d isc=2008 isb=4e5e daddr=00000020 dobuf=01199e70
    baddr=001074c8 dibuf=ffffffff ver=f
    Stack from 01199e48:
	    01199e70 00222a58 01002790 00000000 011a3000 01199eb0 015000c0 00000000
	    00000000 01199ec0 01199ec0 000d551a 011a3000 00000001 00000000 00018000
	    d003f000 00000003 00000001 0002800d 01052840 01199fa8 c01f8000 00000000
	    00000029 0b532b80 00000000 00000000 00000029 0b532b80 01199ee4 00103640
	    011198c0 d003f000 00018000 01199fa8 00000000 011198c0 00000000 01199f4c
	    000b3344 011198c0 d003f000 00018000 01199fa8 00000000 00018000 011198c0
    Call Trace: [<00222a58>] nubus_proc_rsrc_show+0x18/0xa0
     [<000d551a>] seq_read+0xc4/0x510
     [<00018000>] fp_fcos+0x2/0x82
     [<0002800d>] __sys_setreuid+0x115/0x1c6
     [<00103640>] proc_reg_read+0x5c/0xb0
     [<00018000>] fp_fcos+0x2/0x82
     [<000b3344>] __vfs_read+0x2c/0x13c
     [<00018000>] fp_fcos+0x2/0x82
     [<00018000>] fp_fcos+0x2/0x82
     [<000b8aa2>] sys_statx+0x60/0x7e
     [<000b34b6>] vfs_read+0x62/0x12a
     [<00018000>] fp_fcos+0x2/0x82
     [<00018000>] fp_fcos+0x2/0x82
     [<000b39c2>] ksys_read+0x48/0xbe
     [<00018000>] fp_fcos+0x2/0x82
     [<000b3a4e>] sys_read+0x16/0x1a
     [<00018000>] fp_fcos+0x2/0x82
     [<00002b84>] syscall+0x8/0xc
     [<00018000>] fp_fcos+0x2/0x82
     [<0000c016>] not_ext+0xa/0x18
    Code: 4e5e 4e75 4e56 0000 206e 0008 2068 ffe8 <2068> 0020 2008 4e5e 4e75 4e56 0000 2f0b 206e 0008 2068 0004 2668 0020 206b ffe8
    Disabling lock debugging due to kernel taint

    Segmentation fault

The proc_create_single_data() conversion does not work because
single_open(file, nubus_proc_rsrc_show, PDE_DATA(inode)) is not
equivalent to the original code.

Fixes: 3f3942a ("proc: introduce proc_create_single{,_data}")
Cc: Christoph Hellwig <[email protected]>
Cc: [email protected] # 5.6+
Signed-off-by: Finn Thain <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Link: https://lore.kernel.org/r/d4e2a586e793cc8d9442595684ab8a077c0fe726.1678783919.git.fthain@linux-m68k.org
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Ron Economos <[email protected]>
Tested-by: Harshit Mogalapalli <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Tested-by: Linux Kernel Functional Testing <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
@ahsanhussain
Copy link

@tarmeh2r although its a second update for this kernel, but it's still good a have a general comment about the nature of conflict resolution. This is so we have a record of what changed that we need to test more thoroughly. We also post the boot log to show that the kernel runs fine on the target. See this example #4

Copy link

@ahsanhussain ahsanhussain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebase itself looks good to me. Thanks

@tarmeh2r
Copy link
Author

@tarmeh2r although its a second update for this kernel, but it's still good a have a general comment about the nature of conflict resolution. This is so we have a record of what changed that we need to test more thoroughly. We also post the boot log to show that the kernel runs fine on the target. See this example #4

I posted the boot logs on the ticket and added a comment here on the PR as I thought posting them here would be redundant, is this fine or I should post the boot logs on PR as well? I also mentioned the details of how I resolved all the conflicts on the ticket. I will add a general comment similar to the example you've shared. Just a quick question, do we need to go through all the commit titles or only the ones where we got a conflict to add a general comment about the components which should be tested?

@ahsanhussain
Copy link

Going through all the details of the conflict resolution would be counterproductive so just the general nature of those conflicts e.g. USB driver, Ethernet driver etc. is what we've been doing

@Noor-Ahsan Noor-Ahsan merged commit c661a2d into MentorEmbedded:ginkgo-imx-5.15.71_2.2.0-5.15.71 Jul 13, 2023
Noor-Ahsan pushed a commit that referenced this pull request Jul 13, 2023
[ Upstream commit 05bb0167c80b8f93c6a4e0451b7da9b96db990c2 ]

ACPICA commit 770653e3ba67c30a629ca7d12e352d83c2541b1e

Before this change we see the following UBSAN stack trace in Fuchsia:

  #0    0x000021e4213b3302 in acpi_ds_init_aml_walk(struct acpi_walk_state*, union acpi_parse_object*, struct acpi_namespace_node*, u8*, u32, struct acpi_evaluate_info*, u8) ../../third_party/acpica/source/components/dispatcher/dswstate.c:682 <platform-bus-x86.so>+0x233302
  #1.2  0x000020d0f660777f in ubsan_get_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:41 <libclang_rt.asan.so>+0x3d77f
  #1.1  0x000020d0f660777f in maybe_print_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:51 <libclang_rt.asan.so>+0x3d77f
  #1    0x000020d0f660777f in ~scoped_report() compiler-rt/lib/ubsan/ubsan_diag.cpp:387 <libclang_rt.asan.so>+0x3d77f
  #2    0x000020d0f660b96d in handlepointer_overflow_impl() compiler-rt/lib/ubsan/ubsan_handlers.cpp:809 <libclang_rt.asan.so>+0x4196d
  #3    0x000020d0f660b50d in compiler-rt/lib/ubsan/ubsan_handlers.cpp:815 <libclang_rt.asan.so>+0x4150d
  #4    0x000021e4213b3302 in acpi_ds_init_aml_walk(struct acpi_walk_state*, union acpi_parse_object*, struct acpi_namespace_node*, u8*, u32, struct acpi_evaluate_info*, u8) ../../third_party/acpica/source/components/dispatcher/dswstate.c:682 <platform-bus-x86.so>+0x233302
  #5    0x000021e4213e2369 in acpi_ds_call_control_method(struct acpi_thread_state*, struct acpi_walk_state*, union acpi_parse_object*) ../../third_party/acpica/source/components/dispatcher/dsmethod.c:605 <platform-bus-x86.so>+0x262369
  #6    0x000021e421437fac in acpi_ps_parse_aml(struct acpi_walk_state*) ../../third_party/acpica/source/components/parser/psparse.c:550 <platform-bus-x86.so>+0x2b7fac
  #7    0x000021e4214464d2 in acpi_ps_execute_method(struct acpi_evaluate_info*) ../../third_party/acpica/source/components/parser/psxface.c:244 <platform-bus-x86.so>+0x2c64d2
  #8    0x000021e4213aa052 in acpi_ns_evaluate(struct acpi_evaluate_info*) ../../third_party/acpica/source/components/namespace/nseval.c:250 <platform-bus-x86.so>+0x22a052
  #9    0x000021e421413dd8 in acpi_ns_init_one_device(acpi_handle, u32, void*, void**) ../../third_party/acpica/source/components/namespace/nsinit.c:735 <platform-bus-x86.so>+0x293dd8
  #10   0x000021e421429e98 in acpi_ns_walk_namespace(acpi_object_type, acpi_handle, u32, u32, acpi_walk_callback, acpi_walk_callback, void*, void**) ../../third_party/acpica/source/components/namespace/nswalk.c:298 <platform-bus-x86.so>+0x2a9e98
  #11   0x000021e4214131ac in acpi_ns_initialize_devices(u32) ../../third_party/acpica/source/components/namespace/nsinit.c:268 <platform-bus-x86.so>+0x2931ac
  #12   0x000021e42147c40d in acpi_initialize_objects(u32) ../../third_party/acpica/source/components/utilities/utxfinit.c:304 <platform-bus-x86.so>+0x2fc40d
  #13   0x000021e42126d603 in acpi::acpi_impl::initialize_acpi(acpi::acpi_impl*) ../../src/devices/board/lib/acpi/acpi-impl.cc:224 <platform-bus-x86.so>+0xed603

Add a simple check that avoids incrementing a pointer by zero, but
otherwise behaves as before. Note that our findings are against ACPICA
20221020, but the same code exists on master.

Link: acpica/acpica@770653e3
Signed-off-by: Bob Moore <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Noor-Ahsan pushed a commit that referenced this pull request Jul 13, 2023
[ Upstream commit 63f8793ee60513a09f110ea460a6ff2c33811cdb ]

Make sure to check device queue mode in the null_validate_conf() and
return error for NULL_Q_RQ as we don't allow legacy I/O path, without
this patch we get OOPs when queue mode is set to 1 from configfs,
following are repro steps :-

modprobe null_blk nr_devices=0
mkdir config/nullb/nullb0
echo 1 > config/nullb/nullb0/memory_backed
echo 4096 > config/nullb/nullb0/blocksize
echo 20480 > config/nullb/nullb0/size
echo 1 > config/nullb/nullb0/queue_mode
echo 1 > config/nullb/nullb0/power

Entering kdb (current=0xffff88810acdd080, pid 2372) on processor 42 Oops: (null)
due to oops @ 0xffffffffc041c329
CPU: 42 PID: 2372 Comm: sh Tainted: G           O     N 6.3.0-rc5lblk+ #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
RIP: 0010:null_add_dev.part.0+0xd9/0x720 [null_blk]
Code: 01 00 00 85 d2 0f 85 a1 03 00 00 48 83 bb 08 01 00 00 00 0f 85 f7 03 00 00 80 bb 62 01 00 00 00 48 8b 75 20 0f 85 6d 02 00 00 <48> 89 6e 60 48 8b 75 20 bf 06 00 00 00 e8 f5 37 2c c1 48 8b 75 20
RSP: 0018:ffffc900052cbde0 EFLAGS: 00010246
RAX: 0000000000000001 RBX: ffff88811084d800 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff888100042e00
RBP: ffff8881053d8200 R08: ffffc900052cbd68 R09: ffff888105db2000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000002
R13: ffff888104765200 R14: ffff88810eec1748 R15: ffff88810eec1740
FS:  00007fd445fd1740(0000) GS:ffff8897dfc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000060 CR3: 0000000166a00000 CR4: 0000000000350ee0
DR0: ffffffff8437a488 DR1: ffffffff8437a489 DR2: ffffffff8437a48a
DR3: ffffffff8437a48b DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 nullb_device_power_store+0xd1/0x120 [null_blk]
 configfs_write_iter+0xb4/0x120
 vfs_write+0x2ba/0x3c0
 ksys_write+0x5f/0xe0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7fd4460c57a7
Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
RSP: 002b:00007ffd3792a4a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fd4460c57a7
RDX: 0000000000000002 RSI: 000055b43c02e4c0 RDI: 0000000000000001
RBP: 000055b43c02e4c0 R08: 000000000000000a R09: 00007fd44615b4e0
R10: 00007fd44615b3e0 R11: 0000000000000246 R12: 0000000000000002
R13: 00007fd446198520 R14: 0000000000000002 R15: 00007fd446198700
 </TASK>

Signed-off-by: Chaitanya Kulkarni <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Reviewed-by: Nitesh Shetty <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Noor-Ahsan pushed a commit that referenced this pull request Jul 13, 2023
[ Upstream commit 98bea253aa28ad8be2ce565a9ca21beb4a9419e5 ]

Log load and replay is part of the metadata handle flow during mount
operation. The $MFT record will be loaded and used while replaying logs.
However, a malformed $MFT record, say, has RECORD_FLAG_DIR flag set and
contains an ATTR_ROOT attribute will misguide kernel to treat it as a
directory, and try to free the allocated resources when the
corresponding inode is freed, which will cause an invalid kfree because
the memory hasn't actually been allocated.

[  101.368647] BUG: KASAN: invalid-free in kvfree+0x2c/0x40
[  101.369457]
[  101.369986] CPU: 0 PID: 198 Comm: mount Not tainted 6.0.0-rc7+ #5
[  101.370529] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[  101.371362] Call Trace:
[  101.371795]  <TASK>
[  101.372157]  dump_stack_lvl+0x49/0x63
[  101.372658]  print_report.cold+0xf5/0x689
[  101.373022]  ? ni_write_inode+0x754/0xd90
[  101.373378]  ? kvfree+0x2c/0x40
[  101.373698]  kasan_report_invalid_free+0x77/0xf0
[  101.374058]  ? kvfree+0x2c/0x40
[  101.374352]  ? kvfree+0x2c/0x40
[  101.374668]  __kasan_slab_free+0x189/0x1b0
[  101.374992]  ? kvfree+0x2c/0x40
[  101.375271]  kfree+0x168/0x3b0
[  101.375717]  kvfree+0x2c/0x40
[  101.376002]  indx_clear+0x26/0x60
[  101.376316]  ni_clear+0xc5/0x290
[  101.376661]  ntfs_evict_inode+0x45/0x70
[  101.377001]  evict+0x199/0x280
[  101.377432]  iput.part.0+0x286/0x320
[  101.377819]  iput+0x32/0x50
[  101.378166]  ntfs_loadlog_and_replay+0x143/0x320
[  101.378656]  ? ntfs_bio_fill_1+0x510/0x510
[  101.378968]  ? iput.part.0+0x286/0x320
[  101.379367]  ntfs_fill_super+0xecb/0x1ba0
[  101.379729]  ? put_ntfs+0x1d0/0x1d0
[  101.380046]  ? vsprintf+0x20/0x20
[  101.380542]  ? mutex_unlock+0x81/0xd0
[  101.380914]  ? set_blocksize+0x95/0x150
[  101.381597]  get_tree_bdev+0x232/0x370
[  101.382254]  ? put_ntfs+0x1d0/0x1d0
[  101.382699]  ntfs_fs_get_tree+0x15/0x20
[  101.383094]  vfs_get_tree+0x4c/0x130
[  101.383675]  path_mount+0x654/0xfe0
[  101.384203]  ? putname+0x80/0xa0
[  101.384540]  ? finish_automount+0x2e0/0x2e0
[  101.384943]  ? putname+0x80/0xa0
[  101.385362]  ? kmem_cache_free+0x1c4/0x440
[  101.385968]  ? putname+0x80/0xa0
[  101.386666]  do_mount+0xd6/0xf0
[  101.387228]  ? path_mount+0xfe0/0xfe0
[  101.387585]  ? __kasan_check_write+0x14/0x20
[  101.387979]  __x64_sys_mount+0xca/0x110
[  101.388436]  do_syscall_64+0x3b/0x90
[  101.388757]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  101.389289] RIP: 0033:0x7fa0f70e948a
[  101.390048] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[  101.391297] RSP: 002b:00007ffc24fdecc8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[  101.391988] RAX: ffffffffffffffda RBX: 000055932c183060 RCX: 00007fa0f70e948a
[  101.392494] RDX: 000055932c183260 RSI: 000055932c1832e0 RDI: 000055932c18bce0
[  101.393053] RBP: 0000000000000000 R08: 000055932c183280 R09: 0000000000000020
[  101.393577] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 000055932c18bce0
[  101.394044] R13: 000055932c183260 R14: 0000000000000000 R15: 00000000ffffffff
[  101.394747]  </TASK>
[  101.395402]
[  101.396047] Allocated by task 198:
[  101.396724]  kasan_save_stack+0x26/0x50
[  101.397400]  __kasan_slab_alloc+0x6d/0x90
[  101.397974]  kmem_cache_alloc_lru+0x192/0x5a0
[  101.398524]  ntfs_alloc_inode+0x23/0x70
[  101.399137]  alloc_inode+0x3b/0xf0
[  101.399534]  iget5_locked+0x54/0xa0
[  101.400026]  ntfs_iget5+0xaf/0x1780
[  101.400414]  ntfs_loadlog_and_replay+0xe5/0x320
[  101.400883]  ntfs_fill_super+0xecb/0x1ba0
[  101.401313]  get_tree_bdev+0x232/0x370
[  101.401774]  ntfs_fs_get_tree+0x15/0x20
[  101.402224]  vfs_get_tree+0x4c/0x130
[  101.402673]  path_mount+0x654/0xfe0
[  101.403160]  do_mount+0xd6/0xf0
[  101.403537]  __x64_sys_mount+0xca/0x110
[  101.404058]  do_syscall_64+0x3b/0x90
[  101.404333]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  101.404816]
[  101.405067] The buggy address belongs to the object at ffff888008cc9ea0
[  101.405067]  which belongs to the cache ntfs_inode_cache of size 992
[  101.406171] The buggy address is located 232 bytes inside of
[  101.406171]  992-byte region [ffff888008cc9ea0, ffff888008cca280)
[  101.406995]
[  101.408559] The buggy address belongs to the physical page:
[  101.409320] page:00000000dccf19dd refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cc8
[  101.410654] head:00000000dccf19dd order:2 compound_mapcount:0 compound_pincount:0
[  101.411533] flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
[  101.412665] raw: 000fffffc0010200 0000000000000000 dead000000000122 ffff888003695140
[  101.413209] raw: 0000000000000000 00000000800e000e 00000001ffffffff 0000000000000000
[  101.413799] page dumped because: kasan: bad access detected
[  101.414213]
[  101.414427] Memory state around the buggy address:
[  101.414991]  ffff888008cc9e80: fc fc fc fc 00 00 00 00 00 00 00 00 00 00 00 00
[  101.415785]  ffff888008cc9f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  101.416933] >ffff888008cc9f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  101.417857]                       ^
[  101.418566]  ffff888008cca000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  101.419704]  ffff888008cca080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Signed-off-by: Edward Lo <[email protected]>
Signed-off-by: Konstantin Komarov <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Noor-Ahsan pushed a commit that referenced this pull request Jul 13, 2023
[ Upstream commit bdc1c5fac982845a58d28690cdb56db8c88a530d ]

In binder_transaction_buffer_release() the 'failed_at' offset indicates
the number of objects to clean up. However, this function was changed by
commit 44d8047 ("binder: use standard functions to allocate fds"),
to release all the objects in the buffer when 'failed_at' is zero.

This introduced an issue when a transaction buffer is released without
any objects having been processed so far. In this case, 'failed_at' is
indeed zero yet it is misinterpreted as releasing the entire buffer.

This leads to use-after-free errors where nodes are incorrectly freed
and subsequently accessed. Such is the case in the following KASAN
report:

  ==================================================================
  BUG: KASAN: slab-use-after-free in binder_thread_read+0xc40/0x1f30
  Read of size 8 at addr ffff4faf037cfc58 by task poc/474

  CPU: 6 PID: 474 Comm: poc Not tainted 6.3.0-12570-g7df047b3f0aa #5
  Hardware name: linux,dummy-virt (DT)
  Call trace:
   dump_backtrace+0x94/0xec
   show_stack+0x18/0x24
   dump_stack_lvl+0x48/0x60
   print_report+0xf8/0x5b8
   kasan_report+0xb8/0xfc
   __asan_load8+0x9c/0xb8
   binder_thread_read+0xc40/0x1f30
   binder_ioctl+0xd9c/0x1768
   __arm64_sys_ioctl+0xd4/0x118
   invoke_syscall+0x60/0x188
  [...]

  Allocated by task 474:
   kasan_save_stack+0x3c/0x64
   kasan_set_track+0x2c/0x40
   kasan_save_alloc_info+0x24/0x34
   __kasan_kmalloc+0xb8/0xbc
   kmalloc_trace+0x48/0x5c
   binder_new_node+0x3c/0x3a4
   binder_transaction+0x2b58/0x36f0
   binder_thread_write+0x8e0/0x1b78
   binder_ioctl+0x14a0/0x1768
   __arm64_sys_ioctl+0xd4/0x118
   invoke_syscall+0x60/0x188
  [...]

  Freed by task 475:
   kasan_save_stack+0x3c/0x64
   kasan_set_track+0x2c/0x40
   kasan_save_free_info+0x38/0x5c
   __kasan_slab_free+0xe8/0x154
   __kmem_cache_free+0x128/0x2bc
   kfree+0x58/0x70
   binder_dec_node_tmpref+0x178/0x1fc
   binder_transaction_buffer_release+0x430/0x628
   binder_transaction+0x1954/0x36f0
   binder_thread_write+0x8e0/0x1b78
   binder_ioctl+0x14a0/0x1768
   __arm64_sys_ioctl+0xd4/0x118
   invoke_syscall+0x60/0x188
  [...]
  ==================================================================

In order to avoid these issues, let's always calculate the intended
'failed_at' offset beforehand. This is renamed and wrapped in a helper
function to make it clear and convenient.

Fixes: 32e9f56a96d8 ("binder: don't detect sender/target during buffer cleanup")
Reported-by: Zi Fan Tan <[email protected]>
Cc: [email protected]
Signed-off-by: Carlos Llamas <[email protected]>
Acked-by: Todd Kjos <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
tarmeh2r pushed a commit to tarmeh2r/linux-flex-imx that referenced this pull request Jul 31, 2023
[ Upstream commit b684c09f09e7a6af3794d4233ef785819e72db79 ]

ppc_save_regs() skips one stack frame while saving the CPU register states.
Instead of saving current R1, it pulls the previous stack frame pointer.

When vmcores caused by direct panic call (such as `echo c >
/proc/sysrq-trigger`), are debugged with gdb, gdb fails to show the
backtrace correctly. On further analysis, it was found that it was because
of mismatch between r1 and NIP.

GDB uses NIP to get current function symbol and uses corresponding debug
info of that function to unwind previous frames, but due to the
mismatching r1 and NIP, the unwinding does not work, and it fails to
unwind to the 2nd frame and hence does not show the backtrace.

GDB backtrace with vmcore of kernel without this patch:

---------
(gdb) bt
 #0  0xc0000000002a53e8 in crash_setup_regs (oldregs=<optimized out>,
    newregs=0xc000000004f8f8d8) at ./arch/powerpc/include/asm/kexec.h:69
 MentorEmbedded#1  __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:974
 MentorEmbedded#2  0x0000000000000063 in ?? ()
 MentorEmbedded#3  0xc000000003579320 in ?? ()
---------

Further analysis revealed that the mismatch occurred because
"ppc_save_regs" was saving the previous stack's SP instead of the current
r1. This patch fixes this by storing current r1 in the saved pt_regs.

GDB backtrace with vmcore of patched kernel:

--------
(gdb) bt
 #0  0xc0000000002a53e8 in crash_setup_regs (oldregs=0x0, newregs=0xc00000000670b8d8)
    at ./arch/powerpc/include/asm/kexec.h:69
 MentorEmbedded#1  __crash_kexec (regs=regs@entry=0x0) at kernel/kexec_core.c:974
 MentorEmbedded#2  0xc000000000168918 in panic (fmt=fmt@entry=0xc000000001654a60 "sysrq triggered crash\n")
    at kernel/panic.c:358
 MentorEmbedded#3  0xc000000000b735f8 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:155
 MentorEmbedded#4  0xc000000000b742cc in __handle_sysrq (key=key@entry=99, check_mask=check_mask@entry=false)
    at drivers/tty/sysrq.c:602
 MentorEmbedded#5  0xc000000000b7506c in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>,
    count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1163
 MentorEmbedded#6  0xc00000000069a7bc in pde_write (ppos=<optimized out>, count=<optimized out>,
    buf=<optimized out>, file=<optimized out>, pde=0xc00000000362cb40) at fs/proc/inode.c:340
 MentorEmbedded#7  proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>,
    ppos=<optimized out>) at fs/proc/inode.c:352
 MentorEmbedded#8  0xc0000000005b3bbc in vfs_write (file=file@entry=0xc000000006aa6b00,
    buf=buf@entry=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>,
    count=count@entry=2, pos=pos@entry=0xc00000000670bda0) at fs/read_write.c:582
 MentorEmbedded#9  0xc0000000005b4264 in ksys_write (fd=<optimized out>,
    buf=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>, count=2)
    at fs/read_write.c:637
 MentorEmbedded#10 0xc00000000002ea2c in system_call_exception (regs=0xc00000000670be80, r0=<optimized out>)
    at arch/powerpc/kernel/syscall.c:171
 #11 0xc00000000000c270 in system_call_vectored_common ()
    at arch/powerpc/kernel/interrupt_64.S:192
--------

Nick adds:
  So this now saves regs as though it was an interrupt taken in the
  caller, at the instruction after the call to ppc_save_regs, whereas
  previously the NIP was there, but R1 came from the caller's caller and
  that mismatch is what causes gdb's dwarf unwinder to go haywire.

Signed-off-by: Aditya Gupta <[email protected]>
Fixes: d16a58f ("powerpc: Improve ppc_save_regs()")
Reivewed-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
pragow0k pushed a commit to pragow0k/linux-flex-imx that referenced this pull request Mar 25, 2024
[ Upstream commit e3e82fcb79eeb3f1a88a89f676831773caff514a ]

When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a
cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when
removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be
dereferenced as wrong struct in irdma_free_pending_cqp_request().

  PID: 3669   TASK: ffff88aef892c000  CPU: 28  COMMAND: "kworker/28:0"
   #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34
   MentorEmbedded#1 [fffffe0000549e40] nmi_handle at ffffffff810788b2
   MentorEmbedded#2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f
   MentorEmbedded#3 [fffffe0000549eb8] do_nmi at ffffffff81079582
   MentorEmbedded#4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4
      [exception RIP: native_queued_spin_lock_slowpath+1291]
      RIP: ffffffff8127e72b  RSP: ffff88aa841ef778  RFLAGS: 00000046
      RAX: 0000000000000000  RBX: ffff88b01f849700  RCX: ffffffff8127e47e
      RDX: 0000000000000000  RSI: 0000000000000004  RDI: ffffffff83857ec0
      RBP: ffff88afe3e4efc8   R8: ffffed15fc7c9dfa   R9: ffffed15fc7c9dfa
      R10: 0000000000000001  R11: ffffed15fc7c9df9  R12: 0000000000740000
      R13: ffff88b01f849708  R14: 0000000000000003  R15: ffffed1603f092e1
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
  -- <NMI exception stack> --
   MentorEmbedded#5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b
   MentorEmbedded#6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4
   MentorEmbedded#7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363
   MentorEmbedded#8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma]
   MentorEmbedded#9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma]
   MentorEmbedded#10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma]
   #11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma]
   #12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb
   #13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6
   #14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278
   #15 [ffff88aa841efb88] device_del at ffffffff82179d23
   #16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice]
   #17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice]
   #18 [ffff88aa841efde8] process_one_work at ffffffff811c589a
   #19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff
   #20 [ffff88aa841eff10] kthread at ffffffff811d87a0
   #21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f

Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions")
Link: https://lore.kernel.org/r/[email protected]
Suggested-by: "Ismail, Mustafa" <[email protected]>
Signed-off-by: Shifeng Li <[email protected]>
Reviewed-by: Shiraz Saleem <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
pragow0k pushed a commit to pragow0k/linux-flex-imx that referenced this pull request Mar 25, 2024
[ Upstream commit 14694179e561b5f2f7e56a0f590e2cb49a9cc7ab ]

Trying to suspend to RAM on SAMA5D27 EVK leads to the following lockdep
warning:

 ============================================
 WARNING: possible recursive locking detected
 6.7.0-rc5-wt+ #532 Not tainted
 --------------------------------------------
 sh/92 is trying to acquire lock:
 c3cf306c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100

 but task is already holding lock:
 c3d7c46c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&irq_desc_lock_class);
   lock(&irq_desc_lock_class);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 6 locks held by sh/92:
  #0: c3aa0258 (sb_writers#6){.+.+}-{0:0}, at: ksys_write+0xd8/0x178
  MentorEmbedded#1: c4c2df44 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x138/0x284
  MentorEmbedded#2: c32684a0 (kn->active){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x148/0x284
  MentorEmbedded#3: c232b6d4 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend+0x13c/0x4e8
  MentorEmbedded#4: c387b088 (&dev->mutex){....}-{3:3}, at: __device_suspend+0x1e8/0x91c
  MentorEmbedded#5: c3d7c46c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100

 stack backtrace:
 CPU: 0 PID: 92 Comm: sh Not tainted 6.7.0-rc5-wt+ #532
 Hardware name: Atmel SAMA5
  unwind_backtrace from show_stack+0x18/0x1c
  show_stack from dump_stack_lvl+0x34/0x48
  dump_stack_lvl from __lock_acquire+0x19ec/0x3a0c
  __lock_acquire from lock_acquire.part.0+0x124/0x2d0
  lock_acquire.part.0 from _raw_spin_lock_irqsave+0x5c/0x78
  _raw_spin_lock_irqsave from __irq_get_desc_lock+0xe8/0x100
  __irq_get_desc_lock from irq_set_irq_wake+0xa8/0x204
  irq_set_irq_wake from atmel_gpio_irq_set_wake+0x58/0xb4
  atmel_gpio_irq_set_wake from irq_set_irq_wake+0x100/0x204
  irq_set_irq_wake from gpio_keys_suspend+0xec/0x2b8
  gpio_keys_suspend from dpm_run_callback+0xe4/0x248
  dpm_run_callback from __device_suspend+0x234/0x91c
  __device_suspend from dpm_suspend+0x224/0x43c
  dpm_suspend from dpm_suspend_start+0x9c/0xa8
  dpm_suspend_start from suspend_devices_and_enter+0x1e0/0xa84
  suspend_devices_and_enter from pm_suspend+0x460/0x4e8
  pm_suspend from state_store+0x78/0xe4
  state_store from kernfs_fop_write_iter+0x1a0/0x284
  kernfs_fop_write_iter from vfs_write+0x38c/0x6f4
  vfs_write from ksys_write+0xd8/0x178
  ksys_write from ret_fast_syscall+0x0/0x1c
 Exception stack(0xc52b3fa8 to 0xc52b3ff0)
 3fa0:                   00000004 005a0ae8 00000001 005a0ae8 00000004 00000001
 3fc0: 00000004 005a0ae8 00000001 00000004 00000004 b6c616c0 00000020 0059d190
 3fe0: 00000004 b6c61678 aec5a041 aebf1a26

This warning is raised because pinctrl-at91-pio4 uses chained IRQ. Whenever
a wake up source configures an IRQ through irq_set_irq_wake, it will
lock the corresponding IRQ desc, and then call irq_set_irq_wake on "parent"
IRQ which will do the same on its own IRQ desc, but since those two locks
share the same class, lockdep reports this as an issue.

Fix lockdep false positive by setting a different class for parent and
children IRQ

Fixes: 7761808 ("pinctrl: introduce driver for Atmel PIO4 controller")
Signed-off-by: Alexis Lothoré <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
pragow0k pushed a commit to pragow0k/linux-flex-imx that referenced this pull request Mar 25, 2024
[ Upstream commit 33eae65c6f49770fec7a662935d4eb4a6406d24b ]

A small CIFS buffer (448 bytes) isn't big enough to hold
SMB2_QUERY_INFO request along with user's input data from
CIFS_QUERY_INFO ioctl.  That is, if the user passed an input buffer >
344 bytes, the client will memcpy() off the end of @req->Buffer in
SMB2_query_info_init() thus causing the following KASAN splat:

  BUG: KASAN: slab-out-of-bounds in SMB2_query_info_init+0x242/0x250 [cifs]
  Write of size 1023 at addr ffff88801308c5a8 by task a.out/1240

  CPU: 1 PID: 1240 Comm: a.out Not tainted 6.7.0-rc4 MentorEmbedded#5
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
  rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x4a/0x80
   print_report+0xcf/0x650
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __phys_addr+0x46/0x90
   kasan_report+0xd8/0x110
   ? SMB2_query_info_init+0x242/0x250 [cifs]
   ? SMB2_query_info_init+0x242/0x250 [cifs]
   kasan_check_range+0x105/0x1b0
   __asan_memcpy+0x3c/0x60
   SMB2_query_info_init+0x242/0x250 [cifs]
   ? __pfx_SMB2_query_info_init+0x10/0x10 [cifs]
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? smb_rqst_len+0xa6/0xc0 [cifs]
   smb2_ioctl_query_info+0x4f4/0x9a0 [cifs]
   ? __pfx_smb2_ioctl_query_info+0x10/0x10 [cifs]
   ? __pfx_cifsConvertToUTF16+0x10/0x10 [cifs]
   ? kasan_set_track+0x25/0x30
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __kasan_kmalloc+0x8f/0xa0
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? cifs_strndup_to_utf16+0x12d/0x1a0 [cifs]
   ? __build_path_from_dentry_optional_prefix+0x19d/0x2d0 [cifs]
   ? __pfx_smb2_ioctl_query_info+0x10/0x10 [cifs]
   cifs_ioctl+0x11c7/0x1de0 [cifs]
   ? __pfx_cifs_ioctl+0x10/0x10 [cifs]
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? rcu_is_watching+0x23/0x50
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __rseq_handle_notify_resume+0x6cd/0x850
   ? __pfx___schedule+0x10/0x10
   ? blkcg_iostat_update+0x250/0x290
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? ksys_write+0xe9/0x170
   __x64_sys_ioctl+0xc9/0x100
   do_syscall_64+0x47/0xf0
   entry_SYSCALL_64_after_hwframe+0x6f/0x77
  RIP: 0033:0x7f893dde49cf
  Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48
  89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89>
  c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
  RSP: 002b:00007ffc03ff4160 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  RAX: ffffffffffffffda RBX: 00007ffc03ff4378 RCX: 00007f893dde49cf
  RDX: 00007ffc03ff41d0 RSI: 00000000c018cf07 RDI: 0000000000000003
  RBP: 00007ffc03ff4260 R08: 0000000000000410 R09: 0000000000000001
  R10: 00007f893dce7300 R11: 0000000000000246 R12: 0000000000000000
  R13: 00007ffc03ff4388 R14: 00007f893df15000 R15: 0000000000406de0
   </TASK>

Fix this by increasing size of SMB2_QUERY_INFO request buffers and
validating input length to prevent other callers from overflowing @Req
in SMB2_query_info_init() as well.

Fixes: f5b05d6 ("cifs: add IOCTL for QUERY_INFO passthrough to userspace")
Cc: [email protected]
Reported-by: Robert Morris <[email protected]>
Signed-off-by: Paulo Alcantara <[email protected]>
Signed-off-by: Steve French <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
pragow0k pushed a commit to pragow0k/linux-flex-imx that referenced this pull request Mar 25, 2024
[ Upstream commit fc3a5534e2a8855427403113cbeb54af5837bbe0 ]

An issue occurred while reading an ELF file in libbpf.c during fuzzing:

	Program received signal SIGSEGV, Segmentation fault.
	0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206
	4206 in libbpf.c
	(gdb) bt
	#0 0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206
	MentorEmbedded#1 0x000000000094f9d6 in bpf_object.collect_relos () at libbpf.c:6706
	MentorEmbedded#2 0x000000000092bef3 in bpf_object_open () at libbpf.c:7437
	MentorEmbedded#3 0x000000000092c046 in bpf_object.open_mem () at libbpf.c:7497
	MentorEmbedded#4 0x0000000000924afa in LLVMFuzzerTestOneInput () at fuzz/bpf-object-fuzzer.c:16
	MentorEmbedded#5 0x000000000060be11 in testblitz_engine::fuzzer::Fuzzer::run_one ()
	MentorEmbedded#6 0x000000000087ad92 in tracing::span::Span::in_scope ()
	MentorEmbedded#7 0x00000000006078aa in testblitz_engine::fuzzer::util::walkdir ()
	MentorEmbedded#8 0x00000000005f3217 in testblitz_engine::entrypoint::main::{{closure}} ()
	MentorEmbedded#9 0x00000000005f2601 in main ()
	(gdb)

scn_data was null at this code(tools/lib/bpf/src/libbpf.c):

	if (rel->r_offset % BPF_INSN_SZ || rel->r_offset >= scn_data->d_size) {

The scn_data is derived from the code above:

	scn = elf_sec_by_idx(obj, sec_idx);
	scn_data = elf_sec_data(obj, scn);

	relo_sec_name = elf_sec_str(obj, shdr->sh_name);
	sec_name = elf_sec_name(obj, scn);
	if (!relo_sec_name || !sec_name)// don't check whether scn_data is NULL
		return -EINVAL;

In certain special scenarios, such as reading a malformed ELF file,
it is possible that scn_data may be a null pointer

Signed-off-by: Mingyi Zhang <[email protected]>
Signed-off-by: Xin Liu <[email protected]>
Signed-off-by: Changye Wu <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
pragow0k pushed a commit to pragow0k/linux-flex-imx that referenced this pull request Mar 25, 2024
commit 817349b6d26aadd8b38283a05ce0bab106b4c765 upstream.

Upstream commit bac1ec5 ("usb: xhci: Set quirk for
XHCI_SG_TRB_CACHE_SIZE_QUIRK") introduced a new quirk in XHCI
which fixes XHC timeout, which was seen on synopsys XHCs while
using SG buffers. But the support for this quirk isn't present
in the DWC3 layer.

We will encounter this XHCI timeout/hung issue if we run iperf
loopback tests using RTL8156 ethernet adaptor on DWC3 targets
with scatter-gather enabled. This gets resolved after enabling
the XHCI_SG_TRB_CACHE_SIZE_QUIRK. This patch enables it using
the xhci device property since its needed for DWC3 controller.

In Synopsys DWC3 databook,
Table 9-3: xHCI Debug Capability Limitations
Chained TRBs greater than TRB cache size: The debug capability
driver must not create a multi-TRB TD that describes smaller
than a 1K packet that spreads across 8 or more TRBs on either
the IN TR or the OUT TR.

Cc: [email protected] MentorEmbedded#5.11
Signed-off-by: Prashanth K <[email protected]>
Acked-by: Thinh Nguyen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
pragow0k pushed a commit to pragow0k/linux-flex-imx that referenced this pull request Sep 11, 2024
commit c2274b908db05529980ec056359fae916939fdaa upstream.

The reader code in rb_get_reader_page() swaps a new reader page into the
ring buffer by doing cmpxchg on old->list.prev->next to point it to the
new page. Following that, if the operation is successful,
old->list.next->prev gets updated too. This means the underlying
doubly-linked list is temporarily inconsistent, page->prev->next or
page->next->prev might not be equal back to page for some page in the
ring buffer.

The resize operation in ring_buffer_resize() can be invoked in parallel.
It calls rb_check_pages() which can detect the described inconsistency
and stop further tracing:

[  190.271762] ------------[ cut here ]------------
[  190.271771] WARNING: CPU: 1 PID: 6186 at kernel/trace/ring_buffer.c:1467 rb_check_pages.isra.0+0x6a/0xa0
[  190.271789] Modules linked in: [...]
[  190.271991] Unloaded tainted modules: intel_uncore_frequency(E):1 skx_edac(E):1
[  190.272002] CPU: 1 PID: 6186 Comm: cmd.sh Kdump: loaded Tainted: G            E      6.9.0-rc6-default MentorEmbedded#5 158d3e1e6d0b091c34c3b96bfd99a1c58306d79f
[  190.272011] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552c-rebuilt.opensuse.org 04/01/2014
[  190.272015] RIP: 0010:rb_check_pages.isra.0+0x6a/0xa0
[  190.272023] Code: [...]
[  190.272028] RSP: 0018:ffff9c37463abb70 EFLAGS: 00010206
[  190.272034] RAX: ffff8eba04b6cb80 RBX: 0000000000000007 RCX: ffff8eba01f13d80
[  190.272038] RDX: ffff8eba01f130c0 RSI: ffff8eba04b6cd00 RDI: ffff8eba0004c700
[  190.272042] RBP: ffff8eba0004c700 R08: 0000000000010002 R09: 0000000000000000
[  190.272045] R10: 00000000ffff7f52 R11: ffff8eba7f600000 R12: ffff8eba0004c720
[  190.272049] R13: ffff8eba00223a00 R14: 0000000000000008 R15: ffff8eba067a8000
[  190.272053] FS:  00007f1bd64752c0(0000) GS:ffff8eba7f680000(0000) knlGS:0000000000000000
[  190.272057] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  190.272061] CR2: 00007f1bd6662590 CR3: 000000010291e001 CR4: 0000000000370ef0
[  190.272070] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  190.272073] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  190.272077] Call Trace:
[  190.272098]  <TASK>
[  190.272189]  ring_buffer_resize+0x2ab/0x460
[  190.272199]  __tracing_resize_ring_buffer.part.0+0x23/0xa0
[  190.272206]  tracing_resize_ring_buffer+0x65/0x90
[  190.272216]  tracing_entries_write+0x74/0xc0
[  190.272225]  vfs_write+0xf5/0x420
[  190.272248]  ksys_write+0x67/0xe0
[  190.272256]  do_syscall_64+0x82/0x170
[  190.272363]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  190.272373] RIP: 0033:0x7f1bd657d263
[  190.272381] Code: [...]
[  190.272385] RSP: 002b:00007ffe72b643f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  190.272391] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f1bd657d263
[  190.272395] RDX: 0000000000000002 RSI: 0000555a6eb538e0 RDI: 0000000000000001
[  190.272398] RBP: 0000555a6eb538e0 R08: 000000000000000a R09: 0000000000000000
[  190.272401] R10: 0000555a6eb55190 R11: 0000000000000246 R12: 00007f1bd6662500
[  190.272404] R13: 0000000000000002 R14: 00007f1bd6667c00 R15: 0000000000000002
[  190.272412]  </TASK>
[  190.272414] ---[ end trace 0000000000000000 ]---

Note that ring_buffer_resize() calls rb_check_pages() only if the parent
trace_buffer has recording disabled. Recent commit d78ab792705c
("tracing: Stop current tracer when resizing buffer") causes that it is
now always the case which makes it more likely to experience this issue.

The window to hit this race is nonetheless very small. To help
reproducing it, one can add a delay loop in rb_get_reader_page():

 ret = rb_head_page_replace(reader, cpu_buffer->reader_page);
 if (!ret)
 	goto spin;
 for (unsigned i = 0; i < 1U << 26; i++)  /* inserted delay loop */
 	__asm__ __volatile__ ("" : : : "memory");
 rb_list_head(reader->list.next)->prev = &cpu_buffer->reader_page->list;

.. and then run the following commands on the target system:

 echo 1 > /sys/kernel/tracing/events/sched/sched_switch/enable
 while true; do
 	echo 16 > /sys/kernel/tracing/buffer_size_kb; sleep 0.1
 	echo 8 > /sys/kernel/tracing/buffer_size_kb; sleep 0.1
 done &
 while true; do
 	for i in /sys/kernel/tracing/per_cpu/*; do
 		timeout 0.1 cat $i/trace_pipe; sleep 0.2
 	done
 done

To fix the problem, make sure ring_buffer_resize() doesn't invoke
rb_check_pages() concurrently with a reader operating on the same
ring_buffer_per_cpu by taking its cpu_buffer->reader_lock.

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]

Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Fixes: 659f451 ("ring-buffer: Add integrity check at end of iter read")
Signed-off-by: Petr Pavlu <[email protected]>
[ Fixed whitespace ]
Signed-off-by: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
pragow0k pushed a commit to pragow0k/linux-flex-imx that referenced this pull request Sep 11, 2024
commit 22f00812862564b314784167a89f27b444f82a46 upstream.

The syzbot fuzzer found that the interrupt-URB completion callback in
the cdc-wdm driver was taking too long, and the driver's immediate
resubmission of interrupt URBs with -EPROTO status combined with the
dummy-hcd emulation to cause a CPU lockup:

cdc_wdm 1-1:1.0: nonzero urb status received: -71
cdc_wdm 1-1:1.0: wdm_int_callback - 0 bytes
watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [syz-executor782:6625]
CPU#0 Utilization every 4s during lockup:
	MentorEmbedded#1:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
	MentorEmbedded#2:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
	MentorEmbedded#3:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
	MentorEmbedded#4:  98% system,	  0% softirq,	  3% hardirq,	  0% idle
	MentorEmbedded#5:  98% system,	  1% softirq,	  3% hardirq,	  0% idle
Modules linked in:
irq event stamp: 73096
hardirqs last  enabled at (73095): [<ffff80008037bc00>] console_emit_next_record kernel/printk/printk.c:2935 [inline]
hardirqs last  enabled at (73095): [<ffff80008037bc00>] console_flush_all+0x650/0xb74 kernel/printk/printk.c:2994
hardirqs last disabled at (73096): [<ffff80008af10b00>] __el1_irq arch/arm64/kernel/entry-common.c:533 [inline]
hardirqs last disabled at (73096): [<ffff80008af10b00>] el1_interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:551
softirqs last  enabled at (73048): [<ffff8000801ea530>] softirq_handle_end kernel/softirq.c:400 [inline]
softirqs last  enabled at (73048): [<ffff8000801ea530>] handle_softirqs+0xa60/0xc34 kernel/softirq.c:582
softirqs last disabled at (73043): [<ffff800080020de8>] __do_softirq+0x14/0x20 kernel/softirq.c:588
CPU: 0 PID: 6625 Comm: syz-executor782 Tainted: G        W          6.10.0-rc2-syzkaller-g8867bbd4a056 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024

Testing showed that the problem did not occur if the two error
messages -- the first two lines above -- were removed; apparently adding
material to the kernel log takes a surprisingly large amount of time.

In any case, the best approach for preventing these lockups and to
avoid spamming the log with thousands of error messages per second is
to ratelimit the two dev_err() calls.  Therefore we replace them with
dev_err_ratelimited().

Signed-off-by: Alan Stern <[email protected]>
Suggested-by: Greg KH <[email protected]>
Reported-and-tested-by: [email protected]
Closes: https://lore.kernel.org/linux-usb/[email protected]/
Reported-and-tested-by: [email protected]
Closes: https://lore.kernel.org/linux-usb/[email protected]/
Fixes: 9908a32 ("USB: remove err() macro from usb class drivers")
Link: https://lore.kernel.org/linux-usb/[email protected]/
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
pragow0k pushed a commit to pragow0k/linux-flex-imx that referenced this pull request Sep 11, 2024
commit be346c1a6eeb49d8fda827d2a9522124c2f72f36 upstream.

The code in ocfs2_dio_end_io_write() estimates number of necessary
transaction credits using ocfs2_calc_extend_credits().  This however does
not take into account that the IO could be arbitrarily large and can
contain arbitrary number of extents.

Extent tree manipulations do often extend the current transaction but not
in all of the cases.  For example if we have only single block extents in
the tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() all the time and we will never extend the
current transaction and eventually exhaust all the transaction credits if
the IO contains many single block extents.  Once that happens a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to
this error.  This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.

To fix the issue make sure the transaction always has enough credits for
one extent insert before each call of ocfs2_mark_extent_written().

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  MentorEmbedded#1 __crash_kexec at ffffffff8c1338fa
  MentorEmbedded#2 panic at ffffffff8c1d69b9
  MentorEmbedded#3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  MentorEmbedded#4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  MentorEmbedded#5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  MentorEmbedded#6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  MentorEmbedded#7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  MentorEmbedded#8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  MentorEmbedded#9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
MentorEmbedded#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
#11 dio_complete at ffffffff8c2b9fa7
#12 do_blockdev_direct_IO at ffffffff8c2bc09f
#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
#14 generic_file_direct_write at ffffffff8c1dcf14
#15 __generic_file_write_iter at ffffffff8c1dd07b
#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
#17 aio_write at ffffffff8c2cc72e
#18 kmem_cache_alloc at ffffffff8c248dde
#19 do_io_submit at ffffffff8c2ccada
#20 do_syscall_64 at ffffffff8c004984
#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Joseph Qi <[email protected]>
Reviewed-by: Heming Zhao <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
pragow0k pushed a commit to pragow0k/linux-flex-imx that referenced this pull request Sep 11, 2024
[ Upstream commit 86a41ea9fd79ddb6145cb8ebf5aeafceabca6f7d ]

When l2tp tunnels use a socket provided by userspace, we can hit
lockdep splats like the below when data is transmitted through another
(unrelated) userspace socket which then gets routed over l2tp.

This issue was previously discussed here:
https://lore.kernel.org/netdev/[email protected]/

The solution is to have lockdep treat socket locks of l2tp tunnel
sockets separately than those of standard INET sockets. To do so, use
a different lockdep subclass where lock nesting is possible.

  ============================================
  WARNING: possible recursive locking detected
  6.10.0+ #34 Not tainted
  --------------------------------------------
  iperf3/771 is trying to acquire lock:
  ffff8881027601d8 (slock-AF_INET/1){+.-.}-{2:2}, at: l2tp_xmit_skb+0x243/0x9d0

  but task is already holding lock:
  ffff888102650d98 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1848/0x1e10

  other info that might help us debug this:
   Possible unsafe locking scenario:

         CPU0
         ----
    lock(slock-AF_INET/1);
    lock(slock-AF_INET/1);

   *** DEADLOCK ***

   May be due to missing lock nesting notation

  10 locks held by iperf3/771:
   #0: ffff888102650258 (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendmsg+0x1a/0x40
   MentorEmbedded#1: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x4b/0xbc0
   MentorEmbedded#2: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x17a/0x1130
   MentorEmbedded#3: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: process_backlog+0x28b/0x9f0
   MentorEmbedded#4: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0xf9/0x260
   MentorEmbedded#5: ffff888102650d98 (slock-AF_INET/1){+.-.}-{2:2}, at: tcp_v4_rcv+0x1848/0x1e10
   MentorEmbedded#6: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x4b/0xbc0
   MentorEmbedded#7: ffffffff822ac220 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x17a/0x1130
   MentorEmbedded#8: ffffffff822ac1e0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0xcc/0x1450
   MentorEmbedded#9: ffff888101f33258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock#2){+...}-{2:2}, at: __dev_queue_xmit+0x513/0x1450

  stack backtrace:
  CPU: 2 UID: 0 PID: 771 Comm: iperf3 Not tainted 6.10.0+ #34
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
  Call Trace:
   <IRQ>
   dump_stack_lvl+0x69/0xa0
   dump_stack+0xc/0x20
   __lock_acquire+0x135d/0x2600
   ? srso_alias_return_thunk+0x5/0xfbef5
   lock_acquire+0xc4/0x2a0
   ? l2tp_xmit_skb+0x243/0x9d0
   ? __skb_checksum+0xa3/0x540
   _raw_spin_lock_nested+0x35/0x50
   ? l2tp_xmit_skb+0x243/0x9d0
   l2tp_xmit_skb+0x243/0x9d0
   l2tp_eth_dev_xmit+0x3c/0xc0
   dev_hard_start_xmit+0x11e/0x420
   sch_direct_xmit+0xc3/0x640
   __dev_queue_xmit+0x61c/0x1450
   ? ip_finish_output2+0xf4c/0x1130
   ip_finish_output2+0x6b6/0x1130
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __ip_finish_output+0x217/0x380
   ? srso_alias_return_thunk+0x5/0xfbef5
   __ip_finish_output+0x217/0x380
   ip_output+0x99/0x120
   __ip_queue_xmit+0xae4/0xbc0
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? tcp_options_write.constprop.0+0xcb/0x3e0
   ip_queue_xmit+0x34/0x40
   __tcp_transmit_skb+0x1625/0x1890
   __tcp_send_ack+0x1b8/0x340
   tcp_send_ack+0x23/0x30
   __tcp_ack_snd_check+0xa8/0x530
   ? srso_alias_return_thunk+0x5/0xfbef5
   tcp_rcv_established+0x412/0xd70
   tcp_v4_do_rcv+0x299/0x420
   tcp_v4_rcv+0x1991/0x1e10
   ip_protocol_deliver_rcu+0x50/0x220
   ip_local_deliver_finish+0x158/0x260
   ip_local_deliver+0xc8/0xe0
   ip_rcv+0xe5/0x1d0
   ? __pfx_ip_rcv+0x10/0x10
   __netif_receive_skb_one_core+0xce/0xe0
   ? process_backlog+0x28b/0x9f0
   __netif_receive_skb+0x34/0xd0
   ? process_backlog+0x28b/0x9f0
   process_backlog+0x2cb/0x9f0
   __napi_poll.constprop.0+0x61/0x280
   net_rx_action+0x332/0x670
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? find_held_lock+0x2b/0x80
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   handle_softirqs+0xda/0x480
   ? __dev_queue_xmit+0xa2c/0x1450
   do_softirq+0xa1/0xd0
   </IRQ>
   <TASK>
   __local_bh_enable_ip+0xc8/0xe0
   ? __dev_queue_xmit+0xa2c/0x1450
   __dev_queue_xmit+0xa48/0x1450
   ? ip_finish_output2+0xf4c/0x1130
   ip_finish_output2+0x6b6/0x1130
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __ip_finish_output+0x217/0x380
   ? srso_alias_return_thunk+0x5/0xfbef5
   __ip_finish_output+0x217/0x380
   ip_output+0x99/0x120
   __ip_queue_xmit+0xae4/0xbc0
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? tcp_options_write.constprop.0+0xcb/0x3e0
   ip_queue_xmit+0x34/0x40
   __tcp_transmit_skb+0x1625/0x1890
   tcp_write_xmit+0x766/0x2fb0
   ? __entry_text_end+0x102ba9/0x102bad
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? __might_fault+0x74/0xc0
   ? srso_alias_return_thunk+0x5/0xfbef5
   __tcp_push_pending_frames+0x56/0x190
   tcp_push+0x117/0x310
   tcp_sendmsg_locked+0x14c1/0x1740
   tcp_sendmsg+0x28/0x40
   inet_sendmsg+0x5d/0x90
   sock_write_iter+0x242/0x2b0
   vfs_write+0x68d/0x800
   ? __pfx_sock_write_iter+0x10/0x10
   ksys_write+0xc8/0xf0
   __x64_sys_write+0x3d/0x50
   x64_sys_call+0xfaf/0x1f50
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
  RIP: 0033:0x7f4d143af992
  Code: c3 8b 07 85 c0 75 24 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> e9 01 cc ff ff 41 54 b8 02 00 00 0
  RSP: 002b:00007ffd65032058 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
  RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f4d143af992
  RDX: 0000000000000025 RSI: 00007f4d143f3bcc RDI: 0000000000000005
  RBP: 00007f4d143f2b28 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000246 R12: 00007f4d143f3bcc
  R13: 0000000000000005 R14: 0000000000000000 R15: 00007ffd650323f0
   </TASK>

Fixes: 0b2c59720e65 ("l2tp: close all race conditions in l2tp_tunnel_register()")
Suggested-by: Eric Dumazet <[email protected]>
Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=6acef9e0a4d1f46c83d4
CC: [email protected]
CC: [email protected]
Signed-off-by: James Chapman <[email protected]>
Signed-off-by: Tom Parkin <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.