zed aborts after assertion failure in udev_device_get_sysattr_value #16705
Fixes: openzfs#16705

Not all udev devices have parent devices. Calling udev_device_get_ functions yield an assertion error if called with a NULL pointer.

Changes to be committed:
modified: cmd/zed/zed_disk_event.c

Signed-off-by: Sietse <[email protected]>
Found the culprit in dev_event_nvlist(struct udev_device *dev).
In certain cases, such as DM-CRYPT-PLAIN devices, there is no parent. However, my troubleshooting raises a new question. DM_CRYPT_PLAIN devices seem to behave much like multipath devices: first an add is received for the device, followed by a change with the correct information (see the log below). Should this EC_DEV_STATUS be handled as an EC_DEV_ADD, just like multipath devices?
Nov 3 18:04:02 santest zed[2553]: zed_udev_monitor: 0x7fd0500056d0, change, /dev/dm-4, disk
Not all udev devices have parent devices. Calling udev_device_get_ functions yield an assertion error if called with a NULL pointer.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Sietse <[email protected]>
Co-authored-by: Sietse <[email protected]>
Closes openzfs#16705
Closes openzfs#16717
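
The guard this commit message describes can be illustrated with a small, self-contained sketch. This is not the actual change to cmd/zed/zed_disk_event.c. To keep it compilable without libudev, `struct fake_device`, `fake_get_parent()`, `fake_get_sysattr_value()` and `parent_sysattr_or_null()` are hypothetical stand-ins invented for this example. They only model the behavior reported in this issue: the parent lookup may return NULL, and the attribute lookup asserts on a NULL device.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for struct udev_device: a single attribute
 * plus an optional parent. Not part of libudev.
 */
struct fake_device {
	const char *attr_name;
	const char *attr_value;
	struct fake_device *parent;	/* NULL when the device has no parent */
};

/* Stand-in for a parent lookup, which may legitimately return NULL. */
static struct fake_device *
fake_get_parent(struct fake_device *dev)
{
	return (dev != NULL ? dev->parent : NULL);
}

/*
 * Stand-in for an attribute lookup that asserts on a NULL device,
 * modelling the "Assertion 'udev_device' failed" abort in the log.
 */
static const char *
fake_get_sysattr_value(struct fake_device *dev, const char *name)
{
	assert(dev != NULL);
	if (dev->attr_name != NULL && strcmp(dev->attr_name, name) == 0)
		return (dev->attr_value);
	return (NULL);
}

/* Guarded lookup: query the parent only when one exists. */
static const char *
parent_sysattr_or_null(struct fake_device *dev, const char *name)
{
	struct fake_device *parent = fake_get_parent(dev);

	if (parent == NULL)
		return (NULL);	/* e.g. a device with no parent */
	return (fake_get_sysattr_value(parent, name));
}
```

With this shape, a device without a parent yields NULL for the attribute instead of reaching the asserting call, and the caller can skip that attribute.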
System information
Distribution Name | custom linux
Distribution Version | n/a
Kernel Version | 6.11.5
Architecture | x86_64
OpenZFS Version | 2.2.6
zed aborts (SIGABRT) after an assertion failure in libudev:
Oct 29 16:57:07 rdsan01 zed[18154]: Assertion 'udev_device' failed at src/libudev/libudev-device.c:742, function udev_device_get_sysattr_value(). Aborting.
Oct 29 16:57:07 rdsan01 systemd[1]: zfs-zed.service: Main process exited, code=dumped, status=6/ABRT
Oct 29 16:57:07 rdsan01 systemd[1]: zfs-zed.service: Failed with result 'core-dump'.
Oct 29 16:57:07 rdsan01 systemd[1]: zfs-zed.service: Scheduled restart job, restart counter is at 7.
Oct 29 16:57:07 rdsan01 systemd[1]: zfs-zed.service: Start request repeated too quickly.
Oct 29 16:57:07 rdsan01 systemd[1]: zfs-zed.service: Failed with result 'core-dump'.
Describe how to reproduce the problem
This happens during udev triggering (udevadm trigger -s block).
Include any warning/errors/backtraces from the system logs
Process 30394 (zed) of user 0 dumped core.
Module libcap.so.2 without build-id.
Module libresolv.so.2 without build-id.
Module libkeyutils.so.1 without build-id.
Module libkrb5support.so.0 without build-id.
Module libgmp.so.10 without build-id.
Module ld-linux-x86-64.so.2 without build-id.
Module libuuid.so.1 without build-id.
Module libudev.so.1 without build-id.
Module libz.so.1 without build-id.
Module libgcc_s.so.1 without build-id.
Module libc.so.6 without build-id.
Module libunwind.so.8 without build-id.
Module libcom_err.so.2 without build-id.
Module libk5crypto.so.3 without build-id.
Module libkrb5.so.3 without build-id.
Module libgssapi_krb5.so.2 without build-id.
Module libtirpc.so.3 without build-id.
Module libnvpair.so.3 without build-id.
Module libcrypto.so.3 without build-id.
Module libm.so.6 without build-id.
Module libuutil.so.3 without build-id.
Module libblkid.so.1 without build-id.
Module libzfs_core.so.3 without build-id.
Module libzfs.so.4 without build-id.
Module zed without build-id.
Stack trace of thread 31364:
#0 0x00007f17c40e9e7c __pthread_kill_implementation (libc.so.6 + 0x8de7c)
#1 0x00007f17c409b3b2 raise (libc.so.6 + 0x3f3b2)
#2 0x00007f17c40844ad abort (libc.so.6 + 0x284ad)
#3 0x00007f17c3fca995 log_assert_failed.cold (libudev.so.1 + 0x8995)
#4 0x00007f17c3ff0077 log_assert_failed_return (libudev.so.1 + 0x2e077)
#5 0x00007f17c3fcbc9f udev_device_get_sysattr_value (libudev.so.1 + 0x9c9f)
#6 0x0000561ddc78648e zed_udev_monitor (zed + 0xc48e)
#7 0x00007f17c40e81b2 start_thread (libc.so.6 + 0x8c1b2)
#8 0x00007f17c4162288 __clone3 (libc.so.6 + 0x106288)
Stack trace of thread 30394:
#0 0x00007f17c415dfdb ioctl (libc.so.6 + 0x101fdb)
#1 0x00007f17c4b2ca2c zpool_events_next (libzfs.so.4 + 0x45a2c)
#2 0x0000561ddc786e7b zed_event_service (zed + 0xce7b)
#3 0x0000561ddc784bd8 main (zed + 0xabd8)
#4 0x00007f17c4085d7a __libc_start_call_main (libc.so.6 + 0x29d7a)
#5 0x00007f17c4085e35 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x29e35)
#6 0x0000561ddc784561 _start (zed + 0xa561)
Stack trace of thread 31363:
#0 0x00007f17c415dfdb ioctl (libc.so.6 + 0x101fdb)
#1 0x00007f17c4b133dd zpool_refresh_stats (libzfs.so.4 + 0x2c3dd)
#2 0x00007f17c4b26b65 zpool_open_silent (libzfs.so.4 + 0x3fb65)
#3 0x00007f17c4b136d0 zpool_iter (libzfs.so.4 + 0x2c6d0)
#4 0x0000561ddc78d1a1 zfs_slm_event (zed + 0x131a1)
#5 0x0000561ddc78b09b zfs_agent_consumer_thread (zed + 0x1109b)
#6 0x00007f17c40e81b2 start_thread (libc.so.6 + 0x8c1b2)
#7 0x00007f17c4162288 __clone3 (libc.so.6 + 0x106288)
ELF object binary architecture: AMD x86-64
core.zed.0.e9cc196a28654a98a7139ee0d030939f.30394.1730291497000000.zip