-
Notifications
You must be signed in to change notification settings - Fork 431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for registering misc devices #44
Conversation
Can you describe the design decisions you went for here in a bit more detail? (e.g. exposing Pin in the public API) |
Yes, absolutely. When misc_register succeeds, the miscdevice that was passed in is added to an intrusive linked list. This means that it is not safe to move the miscdevice we used when calling misc_register, otherwise we will create dangling pointers (in the list managed by the kernel). I can see at least two options:
Option 1 has the drawbacks that it requires an extra allocation per registration, and the user may not be aware that they cannot move it. I chose option 2 because it makes it clear to the user that they are not supposed to move it. Also, when we create devices, we usually create additional state beyond the registration. Option 2 allows for zero-cost abstraction in this case, i.e., we can allocate a single struct containing the registration plus any additional fields we need. Option 1 would require us to allocate the registration separately and store a pointer to it in the struct holding additional state for the device. Additionally, option 2 does not preclude option 1: we can always have an API that wraps a registration in a box and returns that if the caller does not need zero-cost. One aspect of this implementation that I'm not satisfied with is the presence of the "registered" field. At the moment I have it because I need to ensure that we don't try to unregister a registration that wasn't ever registered or whose registration attempt failed. I tried using MaybeUninit (which would prevent drop from being called on a registration that isn't completed yet), but I couldn't find a good combination of MaybUninit and Pin that gave me the behaviour that I needed. If you all have suggestions for improving this, I'm all ears. Thanks! |
drivers/char/rust_example/src/lib.rs
Outdated
@@ -36,8 +51,14 @@ impl KernelModule for RustExample { | |||
println!("Parameters:"); | |||
println!(" my_bool: {}", my_bool.read()); | |||
println!(" my_i32: {}", my_i32.read()); | |||
|
|||
let mut d = kernel::try_alloc_pinned(miscdev::Registration::new())?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason not to have Registration::new()
do the pin? is there ever a use case for a non-pinned value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cannot pin in new
because we do not have a location that we know can be pinned.
We could create one, which is option 2 I mention in the other message, but it entails an extra allocation (i.e., makes the abstraction not zero-cost).
Another option is to pass an uninitialised pinned Registration, which is then initialised in new(). But I couldn't find a clean way to do this when the registration is embedded in another struct because the documentation says that we cannot initialise individual fields of structs using MaybeUninit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm clearly thinking slowly, but the implementation right now already involves a heap alloc, is there a way to use this API that doesn't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, we can allocate the registration on the stack, pin it, register, use the device, sleep/wait, then return (the implicit drop would unregister before the memory is freed). The pin_mut macro allows us to do this without any unsafe code.
But this isn't the most common usage. The most common usage would be something like this:
struct DeviceContext {
x: SomeType,
y: SomeType,
z: SomeType,
reg: miscdev::Registration,
}
This needs a single allocation. The user would allocate a single block of memory for all fields in the struct then pin the whole thing. To call Registration::register, one would use pinned project (there are crates to do this without any unsafe code, for example this).
If we were to allocate inside Registration::new
, the example DeviceContext above would require two allocations instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'm following now. Can we do Registration::new_boxed()
and Registration::new()
then, to let folks who don't mind the extra alloc have a convenient API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'll add new_boxed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually added new_pinned
-- since the constructor will have a pinned memory location, I also call register from the it so the returned Registration is ready in one shot on success.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, took me a few read throughts, but now I get everything that's going on here and I've got an actual review.
Thanks for being patient with me!
drivers/char/rust_example/src/lib.rs
Outdated
@@ -36,8 +51,14 @@ impl KernelModule for RustExample { | |||
println!("Parameters:"); | |||
println!(" my_bool: {}", my_bool.read()); | |||
println!(" my_i32: {}", my_i32.read()); | |||
|
|||
let mut d = kernel::try_alloc_pinned(miscdev::Registration::new())?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'm following now. Can we do Registration::new_boxed()
and Registration::new()
then, to let folks who don't mind the extra alloc have a convenient API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few trivial comments from a very quick read.
Signed-off-by: Wedson Almeida Filho <[email protected]>
These two types of XDP progs (BPF_XDP_DEVMAP, BPF_XDP_CPUMAP) will not be executed directly in the driver, therefore we should also not directly run them from here. To run in these two situations, there must be further preparations done, otherwise these may cause a kernel panic. For more details, see also dev_xdp_attach(). [ 46.982479] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 46.984295] #PF: supervisor read access in kernel mode [ 46.985777] #PF: error_code(0x0000) - not-present page [ 46.987227] PGD 800000010dca4067 P4D 800000010dca4067 PUD 10dca6067 PMD 0 [ 46.989201] Oops: 0000 [#1] SMP PTI [ 46.990304] CPU: 7 PID: 562 Comm: a.out Not tainted 5.13.0+ #44 [ 46.992001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/24 [ 46.995113] RIP: 0010:___bpf_prog_run+0x17b/0x1710 [ 46.996586] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02 [ 47.001562] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246 [ 47.003115] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000 [ 47.005163] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98 [ 47.007135] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff [ 47.009171] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98 [ 47.011172] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8 [ 47.013244] FS: 00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000 [ 47.015705] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.017475] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0 [ 47.019558] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 47.021595] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 47.023574] PKRU: 55555554 [ 47.024571] Call Trace: [ 47.025424] __bpf_prog_run32+0x32/0x50 [ 47.026296] ? printk+0x53/0x6a [ 47.027066] ? ktime_get+0x39/0x90 [ 47.027895] bpf_test_run.cold.28+0x23/0x123 [ 47.028866] ? printk+0x53/0x6a [ 47.029630] bpf_prog_test_run_xdp+0x149/0x1d0 [ 47.030649] __sys_bpf+0x1305/0x23d0 [ 47.031482] __x64_sys_bpf+0x17/0x20 [ 47.032316] do_syscall_64+0x3a/0x80 [ 47.033165] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 47.034254] RIP: 0033:0x7f04a51364dd [ 47.035133] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 48 [ 47.038768] RSP: 002b:00007fff8f9fc518 EFLAGS: 00000213 ORIG_RAX: 0000000000000141 [ 47.040344] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f04a51364dd [ 47.041749] RDX: 0000000000000048 RSI: 0000000020002a80 RDI: 000000000000000a [ 47.043171] RBP: 00007fff8f9fc530 R08: 0000000002049300 R09: 0000000020000100 [ 47.044626] R10: 0000000000000004 R11: 0000000000000213 R12: 0000000000401070 [ 47.046088] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 47.047579] Modules linked in: [ 47.048318] CR2: 0000000000000000 [ 47.049120] ---[ end trace 7ad34443d5be719a ]--- [ 47.050273] RIP: 0010:___bpf_prog_run+0x17b/0x1710 [ 47.051343] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02 [ 47.054943] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246 [ 47.056068] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000 [ 47.057522] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98 [ 47.058961] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff [ 47.060390] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98 [ 47.061803] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8 [ 47.063249] FS: 00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000 [ 47.065070] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.066307] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0 [ 47.067747] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 47.069217] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 47.070652] PKRU: 55555554 [ 47.071318] Kernel panic - not syncing: Fatal exception [ 47.072854] Kernel Offset: disabled [ 47.073683] ---[ end Kernel panic - not syncing: Fatal exception ]--- Fixes: 9216477 ("bpf: cpumap: Add the possibility to attach an eBPF program to cpumap") Fixes: fbee97f ("bpf: Add support to attach bpf program to a devmap entry") Reported-by: Abaci <[email protected]> Signed-off-by: Xuan Zhuo <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Dust Li <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Acked-by: David Ahern <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
The return thunk call makes the fastop functions larger, just like IBT does. Consider a 16-byte FASTOP_SIZE when CONFIG_RETHUNK is enabled. Otherwise, functions will be incorrectly aligned and when computing their position for differently sized operators, they will executed in the middle or end of a function, which may as well be an int3, leading to a crash like: [ 36.091116] int3: 0000 [#1] SMP NOPTI [ 36.091119] CPU: 3 PID: 1371 Comm: qemu-system-x86 Not tainted 5.15.0-41-generic #44 [ 36.091120] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 [ 36.091121] RIP: 0010:xaddw_ax_dx+0x9/0x10 [kvm] [ 36.091185] Code: 00 0f bb d0 c3 cc cc cc cc 48 0f bb d0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 0f c0 d0 c3 cc cc cc cc 66 0f c1 d0 c3 cc cc cc cc <0f> 1f 80 00 00 00 00 0f c1 d0 c3 cc cc cc cc 48 0f c1 d0 c3 cc cc [ 36.091186] RSP: 0018:ffffb1f541143c98 EFLAGS: 00000202 [ 36.091188] RAX: 0000000089abcdef RBX: 0000000000000001 RCX: 0000000000000000 [ 36.091188] RDX: 0000000076543210 RSI: ffffffffc073c6d0 RDI: 0000000000000200 [ 36.091189] RBP: ffffb1f541143ca0 R08: ffff9f1803350a70 R09: 0000000000000002 [ 36.091190] R10: ffff9f1803350a70 R11: 0000000000000000 R12: ffff9f1803350a70 [ 36.091190] R13: ffffffffc077fee0 R14: 0000000000000000 R15: 0000000000000000 [ 36.091191] FS: 00007efdfce8d640(0000) GS:ffff9f187dd80000(0000) knlGS:0000000000000000 [ 36.091192] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 36.091192] CR2: 0000000000000000 CR3: 0000000009b62002 CR4: 0000000000772ee0 [ 36.091195] PKRU: 55555554 [ 36.091195] Call Trace: [ 36.091197] <TASK> [ 36.091198] ? fastop+0x5a/0xa0 [kvm] [ 36.091222] x86_emulate_insn+0x7b8/0xe90 [kvm] [ 36.091244] x86_emulate_instruction+0x2f4/0x630 [kvm] [ 36.091263] ? kvm_arch_vcpu_load+0x7c/0x230 [kvm] [ 36.091283] ? vmx_prepare_switch_to_host+0xf7/0x190 [kvm_intel] [ 36.091290] complete_emulated_mmio+0x297/0x320 [kvm] [ 36.091310] kvm_arch_vcpu_ioctl_run+0x32f/0x550 [kvm] [ 36.091330] kvm_vcpu_ioctl+0x29e/0x6d0 [kvm] [ 36.091344] ? kvm_vcpu_ioctl+0x120/0x6d0 [kvm] [ 36.091357] ? __fget_files+0x86/0xc0 [ 36.091362] ? __fget_files+0x86/0xc0 [ 36.091363] __x64_sys_ioctl+0x92/0xd0 [ 36.091366] do_syscall_64+0x59/0xc0 [ 36.091369] ? syscall_exit_to_user_mode+0x27/0x50 [ 36.091370] ? do_syscall_64+0x69/0xc0 [ 36.091371] ? syscall_exit_to_user_mode+0x27/0x50 [ 36.091372] ? __x64_sys_writev+0x1c/0x30 [ 36.091374] ? do_syscall_64+0x69/0xc0 [ 36.091374] ? exit_to_user_mode_prepare+0x37/0xb0 [ 36.091378] ? syscall_exit_to_user_mode+0x27/0x50 [ 36.091379] ? do_syscall_64+0x69/0xc0 [ 36.091379] ? do_syscall_64+0x69/0xc0 [ 36.091380] ? do_syscall_64+0x69/0xc0 [ 36.091381] ? do_syscall_64+0x69/0xc0 [ 36.091381] entry_SYSCALL_64_after_hwframe+0x61/0xcb [ 36.091384] RIP: 0033:0x7efdfe6d1aff [ 36.091390] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00 [ 36.091391] RSP: 002b:00007efdfce8c460 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 36.091393] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007efdfe6d1aff [ 36.091393] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000c [ 36.091394] RBP: 0000558f1609e220 R08: 0000558f13fb8190 R09: 00000000ffffffff [ 36.091394] R10: 0000558f16b5e950 R11: 0000000000000246 R12: 0000000000000000 [ 36.091394] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000 [ 36.091396] </TASK> [ 36.091397] Modules linked in: isofs nls_iso8859_1 kvm_intel joydev kvm input_leds serio_raw sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_devintf ipmi_msghandler drm msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net net_failover crypto_simd ahci xhci_pci cryptd psmouse virtio_blk libahci xhci_pci_renesas failover [ 36.123271] ---[ end trace db3c0ab5a48fabcc ]--- [ 36.123272] RIP: 0010:xaddw_ax_dx+0x9/0x10 [kvm] [ 36.123319] Code: 00 0f bb d0 c3 cc cc cc cc 48 0f bb d0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 0f c0 d0 c3 cc cc cc cc 66 0f c1 d0 c3 cc cc cc cc <0f> 1f 80 00 00 00 00 0f c1 d0 c3 cc cc cc cc 48 0f c1 d0 c3 cc cc [ 36.123320] RSP: 0018:ffffb1f541143c98 EFLAGS: 00000202 [ 36.123321] RAX: 0000000089abcdef RBX: 0000000000000001 RCX: 0000000000000000 [ 36.123321] RDX: 0000000076543210 RSI: ffffffffc073c6d0 RDI: 0000000000000200 [ 36.123322] RBP: ffffb1f541143ca0 R08: ffff9f1803350a70 R09: 0000000000000002 [ 36.123322] R10: ffff9f1803350a70 R11: 0000000000000000 R12: ffff9f1803350a70 [ 36.123323] R13: ffffffffc077fee0 R14: 0000000000000000 R15: 0000000000000000 [ 36.123323] FS: 00007efdfce8d640(0000) GS:ffff9f187dd80000(0000) knlGS:0000000000000000 [ 36.123324] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 36.123325] CR2: 0000000000000000 CR3: 0000000009b62002 CR4: 0000000000772ee0 [ 36.123327] PKRU: 55555554 [ 36.123328] Kernel panic - not syncing: Fatal exception in interrupt [ 36.123410] Kernel Offset: 0x1400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 36.135305] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]--- Fixes: aa3d480 ("x86: Use return-thunk in asm code") Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]> Co-developed-by: Peter Zijlstra (Intel) <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Paolo Bonzini <[email protected]> Reported-by: Linux Kernel Functional Testing <[email protected]> Message-Id: <[email protected]> Tested-by: Jack Wang <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
KCSAN has discovered a data race in kernel/workqueue.c:2598: [ 1863.554079] ================================================================== [ 1863.554118] BUG: KCSAN: data-race in process_one_work / process_one_work [ 1863.554142] write to 0xffff963d99d79998 of 8 bytes by task 5394 on cpu 27: [ 1863.554154] process_one_work (kernel/workqueue.c:2598) [ 1863.554166] worker_thread (./include/linux/list.h:292 kernel/workqueue.c:2752) [ 1863.554177] kthread (kernel/kthread.c:389) [ 1863.554186] ret_from_fork (arch/x86/kernel/process.c:145) [ 1863.554197] ret_from_fork_asm (arch/x86/entry/entry_64.S:312) [ 1863.554213] read to 0xffff963d99d79998 of 8 bytes by task 5450 on cpu 12: [ 1863.554224] process_one_work (kernel/workqueue.c:2598) [ 1863.554235] worker_thread (./include/linux/list.h:292 kernel/workqueue.c:2752) [ 1863.554247] kthread (kernel/kthread.c:389) [ 1863.554255] ret_from_fork (arch/x86/kernel/process.c:145) [ 1863.554266] ret_from_fork_asm (arch/x86/entry/entry_64.S:312) [ 1863.554280] value changed: 0x0000000000001766 -> 0x000000000000176a [ 1863.554295] Reported by Kernel Concurrency Sanitizer on: [ 1863.554303] CPU: 12 PID: 5450 Comm: kworker/u64:1 Tainted: G L 6.5.0-rc6+ #44 [ 1863.554314] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023 [ 1863.554322] Workqueue: btrfs-endio btrfs_end_bio_work [btrfs] [ 1863.554941] ================================================================== lockdep_invariant_state(true); → pwq->stats[PWQ_STAT_STARTED]++; trace_workqueue_execute_start(work); worker->current_func(work); Moving pwq->stats[PWQ_STAT_STARTED]++; before the line raw_spin_unlock_irq(&pool->lock); resolves the data race without performance penalty. KCSAN detected at least one additional data race: [ 157.834751] ================================================================== [ 157.834770] BUG: KCSAN: data-race in process_one_work / process_one_work [ 157.834793] write to 0xffff9934453f77a0 of 8 bytes by task 468 on cpu 29: [ 157.834804] process_one_work (/home/marvin/linux/kernel/linux_torvalds/kernel/workqueue.c:2606) [ 157.834815] worker_thread (/home/marvin/linux/kernel/linux_torvalds/./include/linux/list.h:292 /home/marvin/linux/kernel/linux_torvalds/kernel/workqueue.c:2752) [ 157.834826] kthread (/home/marvin/linux/kernel/linux_torvalds/kernel/kthread.c:389) [ 157.834834] ret_from_fork (/home/marvin/linux/kernel/linux_torvalds/arch/x86/kernel/process.c:145) [ 157.834845] ret_from_fork_asm (/home/marvin/linux/kernel/linux_torvalds/arch/x86/entry/entry_64.S:312) [ 157.834859] read to 0xffff9934453f77a0 of 8 bytes by task 214 on cpu 7: [ 157.834868] process_one_work (/home/marvin/linux/kernel/linux_torvalds/kernel/workqueue.c:2606) [ 157.834879] worker_thread (/home/marvin/linux/kernel/linux_torvalds/./include/linux/list.h:292 /home/marvin/linux/kernel/linux_torvalds/kernel/workqueue.c:2752) [ 157.834890] kthread (/home/marvin/linux/kernel/linux_torvalds/kernel/kthread.c:389) [ 157.834897] ret_from_fork (/home/marvin/linux/kernel/linux_torvalds/arch/x86/kernel/process.c:145) [ 157.834907] ret_from_fork_asm (/home/marvin/linux/kernel/linux_torvalds/arch/x86/entry/entry_64.S:312) [ 157.834920] value changed: 0x000000000000052a -> 0x0000000000000532 [ 157.834933] Reported by Kernel Concurrency Sanitizer on: [ 157.834941] CPU: 7 PID: 214 Comm: kworker/u64:2 Tainted: G L 6.5.0-rc7-kcsan-00169-g81eaf55a60fc #4 [ 157.834951] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023 [ 157.834958] Workqueue: btrfs-endio btrfs_end_bio_work [btrfs] [ 157.835567] ================================================================== in code: trace_workqueue_execute_end(work, worker->current_func); → pwq->stats[PWQ_STAT_COMPLETED]++; lock_map_release(&lockdep_map); lock_map_release(&pwq->wq->lockdep_map); which needs to be resolved separately. Fixes: 725e8ec ("workqueue: Add pwq->stats[] and a monitoring script") Cc: Tejun Heo <[email protected]> Suggested-by: Lai Jiangshan <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Mirsad Goran Todorovac <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
If the net_cls subsystem is already mounted, attempting to mount it again in setup_classid_environment() will result in a failure with the error code EBUSY. Despite this, tmpfs will have been successfully mounted at /sys/fs/cgroup/net_cls. Consequently, the /sys/fs/cgroup/net_cls directory will be empty, causing subsequent setup operations to fail. Here's an error log excerpt illustrating the issue when net_cls has already been mounted at /sys/fs/cgroup/net_cls prior to running setup_classid_environment(): - Before that change $ tools/testing/selftests/bpf/test_progs --name=cgroup_v1v2 test_cgroup_v1v2:PASS:server_fd 0 nsec test_cgroup_v1v2:PASS:client_fd 0 nsec test_cgroup_v1v2:PASS:cgroup_fd 0 nsec test_cgroup_v1v2:PASS:server_fd 0 nsec run_test:PASS:skel_open 0 nsec run_test:PASS:prog_attach 0 nsec test_cgroup_v1v2:PASS:cgroup-v2-only 0 nsec (cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup.procs (cgroup_helpers.c:540: errno: No such file or directory) Opening cgroup classid: /sys/fs/cgroup/net_cls/cgroup-test-work-dir/net_cls.classid run_test:PASS:skel_open 0 nsec run_test:PASS:prog_attach 0 nsec (cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup-test-work-dir/cgroup.procs run_test:FAIL:join_classid unexpected error: 1 (errno 2) test_cgroup_v1v2:FAIL:cgroup-v1v2 unexpected error: -1 (errno 2) (cgroup_helpers.c:248: errno: No such file or directory) Opening Cgroup Procs: /sys/fs/cgroup/net_cls/cgroup.procs #44 cgroup_v1v2:FAIL Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED - After that change $ tools/testing/selftests/bpf/test_progs --name=cgroup_v1v2 #44 cgroup_v1v2:OK Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Yafang Shao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
If the ata_port_alloc() call in ata_host_alloc() fails, ata_host_release() will get called. However, the code in ata_host_release() tries to free ata_port struct members unconditionally, which can lead to the following: BUG: unable to handle page fault for address: 0000000000003990 PGD 0 P4D 0 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 10 PID: 594 Comm: (udev-worker) Not tainted 6.10.0-rc5 #44 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:ata_host_release.cold+0x2f/0x6e [libata] Code: e4 4d 63 f4 44 89 e2 48 c7 c6 90 ad 32 c0 48 c7 c7 d0 70 33 c0 49 83 c6 0e 41 RSP: 0018:ffffc90000ebb968 EFLAGS: 00010246 RAX: 0000000000000041 RBX: ffff88810fb52e78 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff88813b3218c0 RDI: ffff88813b3218c0 RBP: ffff88810fb52e40 R08: 0000000000000000 R09: 6c65725f74736f68 R10: ffffc90000ebb738 R11: 73692033203a746e R12: 0000000000000004 R13: 0000000000000000 R14: 0000000000000011 R15: 0000000000000006 FS: 00007f6cc55b9980(0000) GS:ffff88813b300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000003990 CR3: 00000001122a2000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? __die_body.cold+0x19/0x27 ? page_fault_oops+0x15a/0x2f0 ? exc_page_fault+0x7e/0x180 ? asm_exc_page_fault+0x26/0x30 ? ata_host_release.cold+0x2f/0x6e [libata] ? ata_host_release.cold+0x2f/0x6e [libata] release_nodes+0x35/0xb0 devres_release_group+0x113/0x140 ata_host_alloc+0xed/0x120 [libata] ata_host_alloc_pinfo+0x14/0xa0 [libata] ahci_init_one+0x6c9/0xd20 [ahci] Do not access ata_port struct members unconditionally. Fixes: 633273a ("libata-pmp: hook PMP support and enable it") Cc: [email protected] Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Niklas Cassel <[email protected]>
No description provided.