crash - maybe dereferenced nullptr - 0.8.0rc4 #8672

Closed
mtippmann opened this issue Apr 26, 2019 · 8 comments
Labels
Status: Stale (No recent activity for issue)
Type: Defect (Incorrect behavior, e.g. crash, hang)

Comments

@mtippmann

System information

Type | Version/Name
--- | ---
Distribution Name | Ubuntu
Distribution Version | 18.04
Linux Kernel | 5.0.7-050007-generic #201904052141 SMP Fri Apr 5 21:43:20 UTC 2019
Architecture | amd64
ZFS Version | 0.8.0-rc3_171_g5090f7274 (one commit before rc4-tag)
SPL Version | 0.8.0-rc3_171_g5090f7274

It's a recent kernel from https://kernel.ubuntu.com/~kernel-ppa/mainline/

The cmdline is also different (basically, the Meltdown/Spectre mitigations are disabled):

cat /proc/cmdline 
BOOT_IMAGE=/ROOT/ubuntu@/boot/vmlinuz-5.0.7-050007-generic root=ZFS=rpool/ROOT/ubuntu ro text console=tty1 console=ttyS1,115200 cgroup_enable=memory swapaccount=1 zfs=rpool/ROOT/ubuntu pti=off spectre_v2=off l1tf=off nospec_store_bypass_disable no_stf_barrier

I have no idea whether this is related to ZFS, an upstream kernel bug, or the disabled mitigations, so I thought I'd leave it here; feel free to close. Could it be a dereferenced null pointer?

Describe the problem you're observing

Complete hang. Load was rising, with hung tasks waiting for I/O; only a hard reset recovered the machine.

Describe how to reproduce the problem

Not yet able to reproduce, but the machine runs a buildbot and has heavy I/O.

Include any warning/errors/backtraces from the system logs

last lines before the crash:

Apr 24 04:46:33 thymoeides kernel: [982845.494034] general protection fault: 0000 [#1] SMP NOPTI
Apr 24 04:46:33 thymoeides kernel: [982845.499692] CPU: 7 PID: 17796 Comm: rsync Tainted: P           OE     5.0.5-050005-generic #201903271212
Apr 24 04:46:33 thymoeides kernel: [982845.509515] Hardware name: Supermicro X7DVL/X7DVL, BIOS 2.1 06/23/2008
Apr 24 04:46:33 thymoeides kernel: [982845.516431] RIP: 0010:zio_change_priority+0x84/0x110 [zfs]
Apr 24 04:46:33 thymoeides kernel: [982845.522168] Code: 00 5c 01 00 48 89 83 e8 03 00 00 48 8b 83 58 01 00 00 49 39 c5 74 48 48 8b 83 58 01 00 00 48 2b 83 50 01 00 00 49 89 c4 74 35 <48> 8b 78 08 48 85 ff 74 2c 48 8b 93 50 01 00 00 49 8b 04 14 49 39
Apr 24 04:46:33 thymoeides kernel: [982845.541537] RSP: 0018:ffffbb5545263a00 EFLAGS: 00010282
Apr 24 04:46:33 thymoeides kernel: [982845.547011] RAX: dead0000000000e0 RBX: ffff9127385a89b0 RCX: 0000000000000000
Apr 24 04:46:33 thymoeides kernel: [982845.554432] RDX: ffff91290622ac00 RSI: 0000000000000000 RDI: ffff9127385a8d70
Apr 24 04:46:33 thymoeides kernel: [982845.561851] RBP: ffffbb5545263a30 R08: e2e87df153977749 R09: 9ae16a3b2f90404f
Apr 24 04:46:33 thymoeides kernel: [982845.569268] R10: ffffbb5545263b08 R11: ffff91274d1ffa90 R12: dead0000000000e0
Apr 24 04:46:33 thymoeides kernel: [982845.576687] R13: ffff9127385a8b08 R14: 0000000000000000 R15: 0000000000000000
Apr 24 04:46:33 thymoeides kernel: [982845.584107] FS:  00007f0f9612fe80(0000) GS:ffff91295fbc0000(0000) knlGS:0000000000000000
Apr 24 04:46:33 thymoeides kernel: [982845.592502] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 24 04:46:33 thymoeides kernel: [982845.598499] CR2: 00007f0f95f82ff8 CR3: 0000000260920000 CR4: 00000000000406e0
Apr 24 04:46:33 thymoeides kernel: [982845.605920] Call Trace:
Apr 24 04:46:33 thymoeides kernel: [982845.608576]  arc_read+0x773/0xff0 [zfs]
Apr 24 04:46:33 thymoeides kernel: [982845.612659]  ? dbuf_rele_and_unlock+0x620/0x620 [zfs]
Apr 24 04:46:33 thymoeides kernel: [982845.617981]  dbuf_read+0x611/0xb70 [zfs]
Apr 24 04:46:33 thymoeides kernel: [982845.622143]  dmu_buf_hold_array_by_dnode+0x10b/0x450 [zfs]
Apr 24 04:46:33 thymoeides kernel: [982845.627914]  dmu_read_uio_dnode+0x49/0x100 [zfs]
Apr 24 04:46:33 thymoeides kernel: [982845.632788]  dmu_read_uio_dbuf+0x49/0x70 [zfs]
Apr 24 04:46:33 thymoeides kernel: [982845.637500]  zfs_read+0x11b/0x470 [zfs]
Apr 24 04:46:33 thymoeides kernel: [982845.641545]  ? path_openat+0x738/0x16d0
Apr 24 04:46:33 thymoeides kernel: [982845.645639]  zpl_read_common_iovec+0x80/0xc0 [zfs]
Apr 24 04:46:33 thymoeides kernel: [982845.650708]  zpl_iter_read+0xd8/0x120 [zfs]
Apr 24 04:46:33 thymoeides kernel: [982845.655110]  __vfs_read+0x145/0x1a0
Apr 24 04:46:33 thymoeides kernel: [982845.658798]  vfs_read+0x99/0x160
Apr 24 04:46:33 thymoeides kernel: [982845.662213]  ksys_read+0x55/0xc0
Apr 24 04:46:33 thymoeides kernel: [982845.665627]  __x64_sys_read+0x1a/0x20
Apr 24 04:46:33 thymoeides kernel: [982845.669493]  do_syscall_64+0x5a/0x110
Apr 24 04:46:33 thymoeides kernel: [982845.673359]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 24 04:46:33 thymoeides kernel: [982845.678646] RIP: 0033:0x7f0f956386d0
Apr 24 04:46:33 thymoeides kernel: [982845.682415] Code: b6 fe ff ff 48 8d 3d 17 be 08 00 48 83 ec 08 e8 06 db 01 00 66 0f 1f 44 00 00 83 3d 39 30 2c 00 00 75 10 b8 00 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 de 9b 01 00 48 89 04 24
Apr 24 04:46:33 thymoeides kernel: [982845.701783] RSP: 002b:00007ffd2b1a4378 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Apr 24 04:46:33 thymoeides kernel: [982845.709650] RAX: ffffffffffffffda RBX: 0000563a3a6762b0 RCX: 00007f0f956386d0
Apr 24 04:46:33 thymoeides kernel: [982845.717069] RDX: 0000000000040000 RSI: 0000563a3a6762f0 RDI: 0000000000000005
Apr 24 04:46:33 thymoeides kernel: [982845.724488] RBP: 0000000000040000 R08: 0000000005aac6fa R09: 00000000006ec6fa
Apr 24 04:46:33 thymoeides kernel: [982845.741624] R10: 0000000047509566 R11: 0000000000000246 R12: 0000000000000000
Apr 24 04:46:33 thymoeides kernel: [982845.758780] R13: 0000000000040000 R14: 0000000000000000 R15: 0000000000000000
Apr 24 04:46:33 thymoeides kernel: [982845.775918] Modules linked in: cpuid unix_diag scsi_transport_iscsi veth ebtable_filter ebtables nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo br_netfilter bridge stp llc overlay sit tunnel4 ip_tunnel radeon ttm drm_kms_helper ip6table_filter ip6table_nat gpio_ich nf_nat_ipv6 ip6_tables drm xt_conntrack ppdev iptable_filter i2c_algo_bit kvm_intel xt_CHECKSUM fb_sys_fops iptable_mangle syscopyarea sysfillrect ipt_MASQUERADE sysimgblt xt_comment xt_nat xt_tcpudp kvm lpc_ich joydev input_leds xt_addrtype irqbypass i5000_edac ipmi_si serio_raw ipmi_devintf ipmi_msghandler parport_pc iptable_nat parport nf_nat_ipv4 mac_hid nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bpfilter binfmt_misc sch_fq_codel tcp_bbr w83793 w83627hf hwmon_vid i5k_amb coretemp ip_tables x_tables autofs4 zfs(POE) zunicode(POE) zavl(POE) icp(POE) zlua(POE) zcommon(POE) znvpair(POE) spl(OE) btrfs xor zstd_compress raid6_pq libcrc32c hid_generic usbhid hid psmouse ahci e1000e libahci
Apr 24 04:46:33 thymoeides kernel: [982845.943894] ---[ end trace d65e46009a432f7b ]---
Apr 24 04:46:33 thymoeides kernel: [982845.959220] RIP: 0010:zio_change_priority+0x84/0x110 [zfs]
Apr 24 04:46:33 thymoeides kernel: [982845.975374] Code: 00 5c 01 00 48 89 83 e8 03 00 00 48 8b 83 58 01 00 00 49 39 c5 74 48 48 8b 83 58 01 00 00 48 2b 83 50 01 00 00 49 89 c4 74 35 <48> 8b 78 08 48 85 ff 74 2c 48 8b 93 50 01 00 00 49 8b 04 14 49 39
Apr 24 04:46:33 thymoeides kernel: [982846.015549] RSP: 0018:ffffbb5545263a00 EFLAGS: 00010282
Apr 24 04:46:33 thymoeides kernel: [982846.031428] RAX: dead0000000000e0 RBX: ffff9127385a89b0 RCX: 0000000000000000
Apr 24 04:46:33 thymoeides kernel: [982846.049265] RDX: ffff91290622ac00 RSI: 0000000000000000 RDI: ffff9127385a8d70
Apr 24 04:46:33 thymoeides kernel: [982846.067104] RBP: ffffbb5545263a30 R08: e2e87df153977749 R09: 9ae16a3b2f90404f
Apr 24 04:46:33 thymoeides kernel: [982846.084949] R10: ffffbb5545263b08 R11: ffff91274d1ffa90 R12: dead0000000000e0
Apr 24 04:46:33 thymoeides kernel: [982846.102797] R13: ffff9127385a8b08 R14: 0000000000000000 R15: 0000000000000000
Apr 24 04:46:33 thymoeides kernel: [982846.120638] FS:  00007f0f9612fe80(0000) GS:ffff91295fbc0000(0000) knlGS:0000000000000000
Apr 24 04:46:33 thymoeides kernel: [982846.139467] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 24 04:46:33 thymoeides kernel: [982846.155899] CR2: 00007f0f95f82ff8 CR3: 0000000260920000 CR4: 00000000000406e0
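
One way to check the null-pointer guess against the dump above is to scan the register lines for values that carry the `0xdead` prefix instead of being zero. The Python sketch below rests on two assumptions that the report does not confirm: that the kernel was built with the x86-64 default `CONFIG_ILLEGAL_POINTER_VALUE` of `0xdead000000000000`, and that `LIST_POISON1` is therefore `0xdead000000000100`. The name `poison_like_registers` is made up for illustration.

```python
import re

# Assumption: x86-64 default CONFIG_ILLEGAL_POINTER_VALUE (0xdead000000000000),
# which makes LIST_POISON1 = 0x100 + that delta. Not confirmed for this kernel.
LIST_POISON1 = 0xDEAD000000000100

# Matches "RAX: dead0000000000e0"-style fields. "RSP: 0018:..." and "CR2: ..."
# are skipped on purpose: they are not plain 16-digit general-purpose values.
REG_RE = re.compile(r"\b(R[A-Z0-9]{2}):\s*([0-9a-f]{16})\b")


def poison_like_registers(oops_text: str) -> dict:
    """Return {register: value} for registers whose top 16 bits are 0xdead."""
    hits = {}
    for name, value in REG_RE.findall(oops_text):
        addr = int(value, 16)
        if addr >> 48 == 0xDEAD:
            hits[name] = addr
    return hits


if __name__ == "__main__":
    sample = (
        "RSP: 0018:ffffbb5545263a00 EFLAGS: 00010282\n"
        "RAX: dead0000000000e0 RBX: ffff9127385a89b0 RCX: 0000000000000000\n"
        "R10: ffffbb5545263b08 R11: ffff91274d1ffa90 R12: dead0000000000e0\n"
    )
    for reg, addr in poison_like_registers(sample).items():
        print(reg, hex(addr), "offset below LIST_POISON1:", hex(LIST_POISON1 - addr))
```

On the first trace this flags RAX and R12, both `dead0000000000e0`, which sits 0x20 below the assumed `LIST_POISON1`. Such an address is non-canonical, which fits a general protection fault rather than a NULL dereference, and it may point to a link followed from an already-removed list node. That reading is a hint only, not a diagnosis.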
@mtippmann changed the title from "crash - maybe free'd nullptr - 0.8.0rc4" to "crash - maybe dereferenced nullptr - 0.8.0rc4" on Apr 26, 2019
@behlendorf added the "Type: Defect" label on Apr 27, 2019
@digdug3

digdug3 commented Oct 29, 2019

I think this is still an issue with 0.8.2: I had three complete lockups, two last week and one today. Here are the logs:

Oct 23 04:12:05 pve kernel: [1281392.431393] general protection fault: 0000 [#1] SMP PTI
Oct 23 04:12:05 pve kernel: [1281392.431432] CPU: 10 PID: 16249 Comm: zvol Tainted: P           O      5.0.21-2-pve #1
Oct 23 04:12:05 pve kernel: [1281392.431464] Hardware name: Supermicro Super Server/X10SRi-F, BIOS 3.1c 05/02/2019
Oct 23 04:12:05 pve kernel: [1281392.431564] RIP: 0010:zio_change_priority+0x84/0x110 [zfs]
Oct 23 04:12:05 pve kernel: [1281392.431594] Code: 00 5c 01 00 49 89 96 e8 03 00 00 49 8b 96 58 01 00 00 49 39 d4 74 48 49 8b 96 58 01 00 00 49 2b 96 50 01 00 00 48 89 d3 74 35 <48> 8b 7a 08 48 85 ff 74 2c 49 8b 8e 50 01 00 00 48 8b 14 0b 49 39
Oct 23 04:12:05 pve kernel: [1281392.431659] RSP: 0018:ffffb021d05dbac0 EFLAGS: 00010282
Oct 23 04:12:05 pve kernel: [1281392.431682] RAX: 0000000000000000 RBX: dead0000000000e0 RCX: 0000000000000000
Oct 23 04:12:05 pve kernel: [1281392.431704] RDX: dead0000000000e0 RSI: ffff9869f6d9e0e0 RDI: ffff98697ca66e50
Oct 23 04:12:05 pve kernel: [1281392.431725] RBP: ffffb021d05dbaf0 R08: f5ecd540a82d128e R09: 9ae16a3b2f90404f
Oct 23 04:12:05 pve kernel: [1281392.431746] R10: ffffb021d05dbbc8 R11: ffff98671c081820 R12: ffff98697ca66be8
Oct 23 04:12:05 pve kernel: [1281392.431767] R13: 0000000000000000 R14: ffff98697ca66a90 R15: ffff986a218c27b0
Oct 23 04:12:05 pve kernel: [1281392.431789] FS:  0000000000000000(0000) GS:ffff9881ffa80000(0000) knlGS:0000000000000000
Oct 23 04:12:05 pve kernel: [1281392.431812] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 23 04:12:05 pve kernel: [1281392.431830] CR2: 000000080140eb28 CR3: 0000001f7ce94006 CR4: 00000000001626e0
Oct 23 04:12:05 pve kernel: [1281392.431851] Call Trace:
Oct 23 04:12:05 pve kernel: [1281392.431888]  arc_read+0x7b2/0xff0 [zfs]
Oct 23 04:12:05 pve kernel: [1281392.431939]  ? dbuf_rele_and_unlock+0x640/0x640 [zfs]
Oct 23 04:12:05 pve kernel: [1281392.431991]  dbuf_read+0x269/0xb60 [zfs]
Oct 23 04:12:05 pve kernel: [1281392.432030]  dmu_buf_hold_array_by_dnode+0x109/0x450 [zfs]
Oct 23 04:12:05 pve kernel: [1281392.432072]  dmu_read_uio_dnode+0x49/0xf0 [zfs]
Oct 23 04:12:05 pve kernel: [1281392.432091]  ? generic_start_io_acct+0x108/0x120
Oct 23 04:12:05 pve kernel: [1281392.432150]  zvol_read+0x101/0x2c0 [zfs]
Oct 23 04:12:05 pve kernel: [1281392.432169]  taskq_thread+0x310/0x500 [spl]
Oct 23 04:12:05 pve kernel: [1281392.432185]  ? wake_up_q+0x80/0x80
Oct 23 04:12:05 pve kernel: [1281392.432206]  kthread+0x120/0x140
Oct 23 04:12:05 pve kernel: [1281392.432230]  ? task_done+0xb0/0xb0 [spl]
Oct 23 04:12:05 pve kernel: [1281392.432244]  ? __kthread_parkme+0x70/0x70
Oct 23 04:12:05 pve kernel: [1281392.432259]  ret_from_fork+0x35/0x40
Oct 23 04:12:05 pve kernel: [1281392.432271] Modules linked in: option usb_wwan usbserial uas usb_storage nft_chain_route_ipv4 nft_chain_nat_ipv4 nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c act_police cls_basic sch_ingress sch_htb nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache xt_multiport nft_counter ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nf_tables veth tcp_diag inet_diag ebtable_filter ebtables ip_set ip6table_filter ip6_tables iptable_filter bpfilter softdog nfnetlink_log nfnetlink intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul ast crc32_pclmul ttm ghash_clmulni_intel drm_kms_helper aesni_intel aes_x86_64 drm snd_pcm crypto_simd fb_sys_fops syscopyarea snd_timer zfs(PO) mei_me cryptd ipmi_ssif snd sysfillrect mei glue_helper sysimgblt soundcore zunicode(PO) ioatdma input_leds joydev intel_cstate intel_rapl_perf zlua(PO) pcspkr ipmi_si ipmi_devintf ipmi_msghandler mac_hid acpi_power_meter acpi_pad
Oct 23 04:12:05 pve kernel: [1281392.432300]  zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp sunrpc libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 hid_generic usbkbd usbmouse usbhid hid mpt3sas igb ahci raid_class i2c_algo_bit i2c_i801 lpc_ich libahci dca scsi_transport_sas wmi
Oct 23 04:12:05 pve kernel: [1281392.432656] ---[ end trace 607e6dc985be887f ]---
Oct 23 04:12:05 pve kernel: [1281392.499940] RIP: 0010:zio_change_priority+0x84/0x110 [zfs]
Oct 23 04:12:05 pve kernel: [1281392.499984] Code: 00 5c 01 00 49 89 96 e8 03 00 00 49 8b 96 58 01 00 00 49 39 d4 74 48 49 8b 96 58 01 00 00 49 2b 96 50 01 00 00 48 89 d3 74 35 <48> 8b 7a 08 48 85 ff 74 2c 49 8b 8e 50 01 00 00 48 8b 14 0b 49 39
Oct 23 04:12:05 pve kernel: [1281392.500086] RSP: 0018:ffffb021d05dbac0 EFLAGS: 00010282
Oct 23 04:12:05 pve kernel: [1281392.500123] RAX: 0000000000000000 RBX: dead0000000000e0 RCX: 0000000000000000
Oct 23 04:12:05 pve kernel: [1281392.500168] RDX: dead0000000000e0 RSI: ffff9869f6d9e0e0 RDI: ffff98697ca66e50
Oct 23 04:12:05 pve kernel: [1281392.500213] RBP: ffffb021d05dbaf0 R08: f5ecd540a82d128e R09: 9ae16a3b2f90404f
Oct 23 04:12:05 pve kernel: [1281392.500257] R10: ffffb021d05dbbc8 R11: ffff98671c081820 R12: ffff98697ca66be8
Oct 23 04:12:05 pve kernel: [1281392.500295] R13: 0000000000000000 R14: ffff98697ca66a90 R15: ffff986a218c27b0
Oct 23 04:12:05 pve kernel: [1281392.500334] FS:  0000000000000000(0000) GS:ffff9881ffa80000(0000) knlGS:0000000000000000
Oct 23 04:12:05 pve kernel: [1281392.500380] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 23 04:12:05 pve kernel: [1281392.500412] CR2: 000000080140eb28 CR3: 0000001f7ce94006 CR4: 00000000001626e0


Oct 26 00:49:27 pve kernel: [230475.545858] general protection fault: 0000 [#1] SMP PTI
Oct 26 00:49:27 pve kernel: [230475.545898] CPU: 5 PID: 15143 Comm: zvol Tainted: P           O      5.0.21-3-pve #1
Oct 26 00:49:27 pve kernel: [230475.545934] Hardware name: Supermicro Super Server/X10SRi-F, BIOS 3.1c 05/02/2019
Oct 26 00:49:27 pve kernel: [230475.546023] RIP: 0010:zio_change_priority+0x84/0x110 [zfs]
Oct 26 00:49:27 pve kernel: [230475.546041] Code: 00 5c 01 00 49 89 96 e8 03 00 00 49 8b 96 58 01 00 00 49 39 d4 74 48 49 8b 96 58 01 00 00 49 2b 96 50 01 00 00 48 89 d3 74 35 <48> 8b 7a 08 48 85 ff 74 2c 49 8b 8e 50 01 00 00 48 8b 14 0b 49 39
Oct 26 00:49:27 pve kernel: [230475.546092] RSP: 0018:ffffb9fbd43ebac0 EFLAGS: 00010282
Oct 26 00:49:27 pve kernel: [230475.546109] RAX: 0000000000000000 RBX: dead0000000000e0 RCX: 0000000000000000
Oct 26 00:49:27 pve kernel: [230475.546132] RDX: dead0000000000e0 RSI: ffff8fa9a668b548 RDI: ffff8faeaccbee50
Oct 26 00:49:27 pve kernel: [230475.546158] RBP: ffffb9fbd43ebaf0 R08: bae2c3abbe65738c R09: 9ae16a3b2f90404f
Oct 26 00:49:27 pve kernel: [230475.546185] R10: ffffb9fbd43ebbc8 R11: ffff8fb22d4c0980 R12: ffff8faeaccbebe8
Oct 26 00:49:27 pve kernel: [230475.546210] R13: 0000000000000000 R14: ffff8faeaccbea90 R15: ffff8fa3a047ccd0
Oct 26 00:49:27 pve kernel: [230475.546236] FS:  0000000000000000(0000) GS:ffff8fbeff940000(0000) knlGS:0000000000000000
Oct 26 00:49:27 pve kernel: [230475.546272] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 26 00:49:27 pve kernel: [230475.546291] CR2: 0000000000d8d000 CR3: 0000001f96404002 CR4: 00000000001626e0
Oct 26 00:49:27 pve kernel: [230475.546313] Call Trace:
Oct 26 00:49:27 pve kernel: [230475.546368]  arc_read+0x7b2/0xff0 [zfs]
Oct 26 00:49:27 pve kernel: [230475.546423]  ? dbuf_rele_and_unlock+0x640/0x640 [zfs]
Oct 26 00:49:27 pve kernel: [230475.546475]  dbuf_read+0x269/0xb80 [zfs]
Oct 26 00:49:27 pve kernel: [230475.546534]  dmu_buf_hold_array_by_dnode+0x109/0x480 [zfs]
Oct 26 00:49:27 pve kernel: [230475.546602]  dmu_read_uio_dnode+0x49/0xf0 [zfs]
Oct 26 00:49:27 pve kernel: [230475.546631]  ? __switch_to_asm+0x35/0x70
Oct 26 00:49:27 pve kernel: [230475.546655]  ? generic_start_io_acct+0x108/0x120
Oct 26 00:49:27 pve kernel: [230475.546721]  zvol_read+0x101/0x2d0 [zfs]
Oct 26 00:49:27 pve kernel: [230475.546741]  taskq_thread+0x310/0x500 [spl]
Oct 26 00:49:27 pve kernel: [230475.546765]  ? wake_up_q+0x80/0x80
Oct 26 00:49:27 pve kernel: [230475.546786]  kthread+0x120/0x140
Oct 26 00:49:27 pve kernel: [230475.546811]  ? task_done+0xb0/0xb0 [spl]
Oct 26 00:49:27 pve kernel: [230475.546833]  ? __kthread_parkme+0x70/0x70
Oct 26 00:49:27 pve kernel: [230475.546856]  ret_from_fork+0x35/0x40
Oct 26 00:49:27 pve kernel: [230475.546875] Modules linked in: act_police cls_basic sch_ingress sch_htb nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache tcp_diag inet_diag xt_multiport nft_counter ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nf_tables veth ebtable_filter ebtables ip_set ip6table_filter ip6_tables iptable_filter bpfilter softdog nfnetlink_log nfnetlink intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass zfs(PO) zunicode(PO) zlua(PO) ast crct10dif_pclmul crc32_pclmul ghash_clmulni_intel option usb_wwan aesni_intel ttm aes_x86_64 crypto_simd cryptd drm_kms_helper usbserial drm snd_pcm fb_sys_fops syscopyarea snd_timer sysfillrect snd glue_helper sysimgblt mei_me zcommon(PO) intel_cstate intel_rapl_perf ipmi_ssif soundcore ioatdma joydev znvpair(PO) mei input_leds pcspkr zavl(PO) ipmi_si ipmi_devintf ipmi_msghandler mac_hid icp(PO) acpi_pad acpi_power_meter spl(O) vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp
Oct 26 00:49:27 pve kernel: [230475.546905]  libiscsi sunrpc scsi_transport_iscsi ip_tables x_tables autofs4 uas hid_generic usbkbd usbmouse usbhid hid usb_storage i2c_i801 lpc_ich ahci libahci igb mpt3sas i2c_algo_bit raid_class dca scsi_transport_sas wmi
Oct 26 00:49:27 pve kernel: [230475.547355] ---[ end trace b55980b529d8ab1d ]---
Oct 26 00:49:27 pve kernel: [230475.619002] RIP: 0010:zio_change_priority+0x84/0x110 [zfs]
Oct 26 00:49:27 pve kernel: [230475.619039] Code: 00 5c 01 00 49 89 96 e8 03 00 00 49 8b 96 58 01 00 00 49 39 d4 74 48 49 8b 96 58 01 00 00 49 2b 96 50 01 00 00 48 89 d3 74 35 <48> 8b 7a 08 48 85 ff 74 2c 49 8b 8e 50 01 00 00 48 8b 14 0b 49 39
Oct 26 00:49:27 pve kernel: [230475.619122] RSP: 0018:ffffb9fbd43ebac0 EFLAGS: 00010282
Oct 26 00:49:27 pve kernel: [230475.619150] RAX: 0000000000000000 RBX: dead0000000000e0 RCX: 0000000000000000
Oct 26 00:49:27 pve kernel: [230475.619185] RDX: dead0000000000e0 RSI: ffff8fa9a668b548 RDI: ffff8faeaccbee50
Oct 26 00:49:27 pve kernel: [230475.619220] RBP: ffffb9fbd43ebaf0 R08: bae2c3abbe65738c R09: 9ae16a3b2f90404f
Oct 26 00:49:27 pve kernel: [230475.619255] R10: ffffb9fbd43ebbc8 R11: ffff8fb22d4c0980 R12: ffff8faeaccbebe8
Oct 26 00:49:27 pve kernel: [230475.619290] R13: 0000000000000000 R14: ffff8faeaccbea90 R15: ffff8fa3a047ccd0
Oct 26 00:49:27 pve kernel: [230475.619324] FS:  0000000000000000(0000) GS:ffff8fbeff940000(0000) knlGS:0000000000000000
Oct 26 00:49:27 pve kernel: [230475.619363] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 26 00:49:27 pve kernel: [230475.619395] CR2: 0000000000d8d000 CR3: 0000001f96404002 CR4: 00000000001626e0



Oct 29 04:20:45 pve kernel: [248804.886615] general protection fault: 0000 [#1] SMP PTI
Oct 29 04:20:45 pve kernel: [248804.886647] CPU: 6 PID: 14819 Comm: zvol Tainted: P           O      5.0.21-3-pve #1
Oct 29 04:20:45 pve kernel: [248804.886682] Hardware name: Supermicro Super Server/X10SRi-F, BIOS 3.1c 05/02/2019
Oct 29 04:20:45 pve kernel: [248804.886774] RIP: 0010:zio_change_priority+0x84/0x110 [zfs]
Oct 29 04:20:45 pve kernel: [248804.886792] Code: 00 5c 01 00 49 89 96 e8 03 00 00 49 8b 96 58 01 00 00 49 39 d4 74 48 49 8b 96 58 01 00 00 49 2b 96 50 01 00 00 48 89 d3 74 35 <48> 8b 7a 08 48 85 ff 74 2c 49 8b 8e 50 01 00 00 48 8b 14 0b 49 39
Oct 29 04:20:45 pve kernel: [248804.886874] RSP: 0018:ffffb7bf8eae7ac0 EFLAGS: 00010282
Oct 29 04:20:45 pve kernel: [248804.886898] RAX: 0000000000000000 RBX: dead0000000000e0 RCX: 0000000000000000
Oct 29 04:20:45 pve kernel: [248804.886931] RDX: dead0000000000e0 RSI: ffff92e34e62b548 RDI: ffff92e15d687cd8
Oct 29 04:20:45 pve kernel: [248804.886961] RBP: ffffb7bf8eae7af0 R08: d9bee87fa392cdbf R09: 9ae16a3b2f90404f
Oct 29 04:20:45 pve kernel: [248804.886993] R10: ffffb7bf8eae7bc8 R11: ffff92ef4ce19a90 R12: ffff92e15d687a70
Oct 29 04:20:45 pve kernel: [248804.887022] R13: 0000000000000000 R14: ffff92e15d687918 R15: ffff92d7c7356e18
Oct 29 04:20:45 pve kernel: [248804.887051] FS:  0000000000000000(0000) GS:ffff92f57f980000(0000) knlGS:0000000000000000
Oct 29 04:20:45 pve kernel: [248804.887074] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 29 04:20:45 pve kernel: [248804.887092] CR2: 00000000492af000 CR3: 0000000645c0e005 CR4: 00000000001626e0
Oct 29 04:20:45 pve kernel: [248804.887113] Call Trace:
Oct 29 04:20:45 pve kernel: [248804.887153]  arc_read+0x7b2/0xff0 [zfs]
Oct 29 04:20:45 pve kernel: [248804.887209]  ? dbuf_rele_and_unlock+0x640/0x640 [zfs]
Oct 29 04:20:45 pve kernel: [248804.887254]  dbuf_read+0x269/0xb80 [zfs]
Oct 29 04:20:45 pve kernel: [248804.887292]  dmu_buf_hold_array_by_dnode+0x109/0x480 [zfs]
Oct 29 04:20:45 pve kernel: [248804.887335]  dmu_read_uio_dnode+0x49/0xf0 [zfs]
Oct 29 04:20:45 pve kernel: [248804.887360]  ? generic_start_io_acct+0x108/0x120
Oct 29 04:20:45 pve kernel: [248804.887432]  zvol_read+0x101/0x2d0 [zfs]
Oct 29 04:20:45 pve kernel: [248804.887463]  taskq_thread+0x310/0x500 [spl]
Oct 29 04:20:45 pve kernel: [248804.887488]  ? wake_up_q+0x80/0x80
Oct 29 04:20:45 pve kernel: [248804.887508]  kthread+0x120/0x140
Oct 29 04:20:45 pve kernel: [248804.887531]  ? task_done+0xb0/0xb0 [spl]
Oct 29 04:20:45 pve kernel: [248804.887551]  ? __kthread_parkme+0x70/0x70
Oct 29 04:20:45 pve kernel: [248804.887574]  ret_from_fork+0x35/0x40
Oct 29 04:20:45 pve kernel: [248804.887594] Modules linked in: nft_chain_route_ipv4 nft_chain_nat_ipv4 nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c option usb_wwan usbserial uas usb_storage tcp_diag inet_diag act_police cls_basic sch_ingress sch_htb nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache xt_multiport nft_counter ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nf_tables veth ebtable_filter ebtables ip_set ip6table_filter ip6_tables iptable_filter bpfilter softdog nfnetlink_log nfnetlink intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper ast ttm drm_kms_helper intel_cstate drm snd_pcm intel_rapl_perf snd_timer zfs(PO) joydev fb_sys_fops snd input_leds syscopyarea ipmi_ssif sysfillrect soundcore ioatdma zunicode(PO) sysimgblt mei_me zlua(PO) mei pcspkr acpi_pad ipmi_si ipmi_devintf acpi_power_meter ipmi_msghandler mac_hid
Oct 29 04:20:45 pve kernel: [248804.887633]  zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 hid_generic usbmouse usbkbd usbhid hid i2c_i801 lpc_ich ahci libahci igb mpt3sas i2c_algo_bit raid_class dca scsi_transport_sas wmi
Oct 29 04:20:45 pve kernel: [248804.888101] ---[ end trace 00f77f69a3822fc5 ]---
Oct 29 04:20:45 pve kernel: [248804.955036] RIP: 0010:zio_change_priority+0x84/0x110 [zfs]
Oct 29 04:20:45 pve kernel: [248804.955075] Code: 00 5c 01 00 49 89 96 e8 03 00 00 49 8b 96 58 01 00 00 49 39 d4 74 48 49 8b 96 58 01 00 00 49 2b 96 50 01 00 00 48 89 d3 74 35 <48> 8b 7a 08 48 85 ff 74 2c 49 8b 8e 50 01 00 00 48 8b 14 0b 49 39
Oct 29 04:20:45 pve kernel: [248804.955156] RSP: 0018:ffffb7bf8eae7ac0 EFLAGS: 00010282
Oct 29 04:20:45 pve kernel: [248804.955184] RAX: 0000000000000000 RBX: dead0000000000e0 RCX: 0000000000000000
Oct 29 04:20:45 pve kernel: [248804.955219] RDX: dead0000000000e0 RSI: ffff92e34e62b548 RDI: ffff92e15d687cd8
Oct 29 04:20:45 pve kernel: [248804.955257] RBP: ffffb7bf8eae7af0 R08: d9bee87fa392cdbf R09: 9ae16a3b2f90404f
Oct 29 04:20:45 pve kernel: [248804.955294] R10: ffffb7bf8eae7bc8 R11: ffff92ef4ce19a90 R12: ffff92e15d687a70
Oct 29 04:20:45 pve kernel: [248804.955332] R13: 0000000000000000 R14: ffff92e15d687918 R15: ffff92d7c7356e18
Oct 29 04:20:45 pve kernel: [248804.955370] FS:  0000000000000000(0000) GS:ffff92f57f980000(0000) knlGS:0000000000000000
Oct 29 04:20:45 pve kernel: [248804.955409] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 29 04:20:45 pve kernel: [248804.955427] CR2: 00000000492af000 CR3: 0000000645c0e005 CR4: 00000000001626e0

If you need more information, please let me know.
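
To compare the faulting instruction across these reports, the `Code:` line can be decoded by hand: the byte wrapped in `<..>` is where the instruction at RIP starts. The Python sketch below decodes only the single form seen in these traces (`REX.W 8B /r` with an 8-bit displacement, a 64-bit load) and refuses anything else. The helper names `faulting_bytes` and `decode_mov_load` are hypothetical, and this is not a general disassembler.

```python
# Register numbering for the ModRM reg/rm fields when no REX.R/REX.B is set.
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def faulting_bytes(code_line: str) -> bytes:
    """Return the bytes from the <xx>-marked byte to the end of a Code: line."""
    tokens = code_line.split("Code:", 1)[1].split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:])


def decode_mov_load(insn: bytes) -> str:
    """Decode `48 8B modrm disp8` (mov r64, [r64+disp8]); raise on anything else."""
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a REX.W MOV r64, r/m64")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:  # rm == 4 would require a SIB byte
        raise ValueError("unsupported addressing form")
    disp = insn[3] - 256 if insn[3] > 127 else insn[3]  # disp8 is signed
    return f"mov {REG64[reg]}, [{REG64[rm]}{disp:+#x}]"


if __name__ == "__main__":
    line = "Code: 48 89 d3 74 35 <48> 8b 7a 08 48 85 ff 74 2c"
    print(decode_mov_load(faulting_bytes(line)))
```

Applied to the traces in this thread, the Ubuntu report decodes to a load through RAX and the Proxmox reports to a load through RDX. In each dump that base register holds `dead0000000000e0`, so the same load faults in both builds despite the different register allocation.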

@digdug3

digdug3 commented Dec 3, 2019

I'm currently on 0.8.2-pve2 (Proxmox: git) and have disabled all intensive tasks (Windows backups running inside a VM from one ZFS pool to another) that were causing the high I/O and the lockups.
Proxmox also added the SIMD save/restore patch in this version. Could it be related to my previous lockups?

@digdug3

digdug3 commented Dec 10, 2019

Spoke too early... Today I had the same crash again during high I/O.

Dec 10 04:24:46 pve kernel: [1910968.720945] general protection fault: 0000 [#1] SMP PTI
Dec 10 04:24:46 pve kernel: [1910968.720971] CPU: 8 PID: 20605 Comm: zvol Tainted: P O 5.0.21-5-pve #1
Dec 10 04:24:46 pve kernel: [1910968.720996] Hardware name: Supermicro Super Server/X10SRi-F, BIOS 3.1c 05/02/2019
Dec 10 04:24:46 pve kernel: [1910968.721075] RIP: 0010:zio_change_priority+0x84/0x110 [zfs]
Dec 10 04:24:46 pve kernel: [1910968.721095] Code: 00 5c 01 00 49 89 96 e8 03 00 00 49 8b 96 58 01 00 00 49 39 d4 74 48 49 8b 96 58 01 00 00 49 2b 96 50 01 00 00 48 89 d3 74 35 <48> 8b 7a 08 48 85 ff 74 2c 49 8b 8e 50 01 00 00 48 8b 14 0b 49 39
Dec 10 04:24:46 pve kernel: [1910968.721151] RSP: 0018:ffffbce402f67ac0 EFLAGS: 00010282
Dec 10 04:24:46 pve kernel: [1910968.721169] RAX: 0000000000000000 RBX: dead0000000000e0 RCX: 0000000000000000
Dec 10 04:24:46 pve kernel: [1910968.721193] RDX: dead0000000000e0 RSI: ffff9d231f505258 RDI: ffff9d231f505618
Dec 10 04:24:46 pve kernel: [1910968.721215] RBP: ffffbce402f67af0 R08: 389cf659bbfa4a70 R09: 9ae16a3b2f90404f
Dec 10 04:24:46 pve kernel: [1910968.721238] R10: ffffbce402f67bc8 R11: 0000000000000001 R12: ffff9d231f5053b0
Dec 10 04:24:46 pve kernel: [1910968.721262] R13: 0000000000000000 R14: ffff9d231f505258 R15: ffff9d26d93c5858
Dec 10 04:24:46 pve kernel: [1910968.721285] FS: 0000000000000000(0000) GS:ffff9d383fa00000(0000) knlGS:0000000000000000
Dec 10 04:24:46 pve kernel: [1910968.721311] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 10 04:24:46 pve kernel: [1910968.721330] CR2: 000000004f12e017 CR3: 000000046d40e004 CR4: 00000000001626e0
Dec 10 04:24:46 pve kernel: [1910968.721353] Call Trace:
Dec 10 04:24:46 pve kernel: [1910968.721394] arc_read+0x7b2/0xff0 [zfs]
Dec 10 04:24:46 pve kernel: [1910968.721437] ? dbuf_rele_and_unlock+0x640/0x640 [zfs]
Dec 10 04:24:46 pve kernel: [1910968.721479] dbuf_read+0x269/0xb80 [zfs]
Dec 10 04:24:46 pve kernel: [1910968.721521] dmu_buf_hold_array_by_dnode+0x109/0x480 [zfs]
Dec 10 04:24:46 pve kernel: [1910968.721566] dmu_read_uio_dnode+0x49/0xf0 [zfs]
Dec 10 04:24:46 pve kernel: [1910968.721585] ? __switch_to_asm+0x35/0x70
Dec 10 04:24:46 pve kernel: [1910968.721602] ? generic_start_io_acct+0x108/0x120
Dec 10 04:24:46 pve kernel: [1910968.721657] zvol_read+0x101/0x2d0 [zfs]
Dec 10 04:24:46 pve kernel: [1910968.721678] taskq_thread+0x2ec/0x4d0 [spl]
Dec 10 04:24:46 pve kernel: [1910968.721695] ? wake_up_q+0x80/0x80
Dec 10 04:24:46 pve kernel: [1910968.721709] kthread+0x120/0x140
Dec 10 04:24:46 pve kernel: [1910968.721725] ? task_done+0xb0/0xb0 [spl]
Dec 10 04:24:46 pve kernel: [1910968.721739] ? __kthread_parkme+0x70/0x70
Dec 10 04:24:46 pve kernel: [1910968.721754] ret_from_fork+0x35/0x40
Dec 10 04:24:46 pve kernel: [1910968.721768] Modules linked in: udp_diag nft_chain_route_ipv4 nft_chain_nat_ipv4 nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c act_police cls_basic sch_ingress sch_htb nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache xt_multiport nft_counter ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nf_tables veth tcp_diag inet_diag ebtable_filter ebtables ip_set ip6table_filter ip6_tables iptable_filter bpfilter softdog nfnetlink_log nfnetlink intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp ast coretemp ttm kvm_intel drm_kms_helper option drm kvm irqbypass fb_sys_fops syscopyarea sysfillrect usb_wwan crct10dif_pclmul crc32_pclmul sysimgblt ghash_clmulni_intel aesni_intel usbserial aes_x86_64 ipmi_ssif crypto_simd cryptd joydev input_leds glue_helper mei_me mei ioatdma intel_cstate intel_rapl_perf pcspkr zfs(PO) ipmi_si mac_hid acpi_power_meter ipmi_devintf ipmi_msghandler acpi_pad zunicode(PO) zlua(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO)
Dec 10 04:24:46 pve kernel: [1910968.721798] spl(O) vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core sunrpc iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 uas hid_generic usbmouse usbkbd usbhid usb_storage hid i2c_i801 lpc_ich ahci libahci igb mpt3sas i2c_algo_bit raid_class dca scsi_transport_sas wmi [last unloaded: msr]
Dec 10 04:24:46 pve kernel: [1910968.726204] ---[ end trace 662728a7f0642e93 ]---
Dec 10 04:24:46 pve kernel: [1910968.753990] RIP: 0010:zio_change_priority+0x84/0x110 [zfs]
Dec 10 04:24:46 pve kernel: [1910968.754968] Code: 00 5c 01 00 49 89 96 e8 03 00 00 49 8b 96 58 01 00 00 49 39 d4 74 48 49 8b 96 58 01 00 00 49 2b 96 50 01 00 00 48 89 d3 74 35 <48> 8b 7a 08 48 85 ff 74 2c 49 8b 8e 50 01 00 00 48 8b 14 0b 49 39
Dec 10 04:24:46 pve kernel: [1910968.756794] RSP: 0018:ffffbce402f67ac0 EFLAGS: 00010282
Dec 10 04:24:46 pve kernel: [1910968.757690] RAX: 0000000000000000 RBX: dead0000000000e0 RCX: 0000000000000000
Dec 10 04:24:46 pve kernel: [1910968.758619] RDX: dead0000000000e0 RSI: ffff9d231f505258 RDI: ffff9d231f505618
Dec 10 04:24:46 pve kernel: [1910968.759514] RBP: ffffbce402f67af0 R08: 389cf659bbfa4a70 R09: 9ae16a3b2f90404f
Dec 10 04:24:46 pve kernel: [1910968.760446] R10: ffffbce402f67bc8 R11: 0000000000000001 R12: ffff9d231f5053b0
Dec 10 04:24:46 pve kernel: [1910968.761373] R13: 0000000000000000 R14: ffff9d231f505258 R15: ffff9d26d93c5858
Dec 10 04:24:46 pve kernel: [1910968.762249] FS: 0000000000000000(0000) GS:ffff9d383fa00000(0000) knlGS:0000000000000000
Dec 10 04:24:46 pve kernel: [1910968.763114] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 10 04:24:46 pve kernel: [1910968.763963] CR2: 000000004f12e017 CR3: 000000046d40e004 CR4: 00000000001626e0

@gamanakis
Contributor

gamanakis commented Dec 23, 2019

I had the same crash today, with the same dmesg dump. I think it occurred when I set l2arc_noprefetch = 0 under heavy I/O with 0.8.2 and kernel 5.4.1 on Arch Linux. I need to verify this, though.

Edit: seems unrelated to l2arc_noprefetch = 0.

@gamanakis gamanakis mentioned this issue Dec 23, 2019
@gamanakis
Contributor

gamanakis commented Dec 24, 2019

I think 5ff2249 might have fixed this.

Edit: it is definitely fixed by that commit. It is also related to l2arc_noprefetch = 0.

@digdug3

digdug3 commented Dec 26, 2019

I too have l2arc_noprefetch = 0. I will set it back to the default to see if that works, and wait for the release that fixes this.
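
For anyone wanting to check whether a machine runs with the setting discussed here, a minimal Python sketch follows. It assumes the ZFS on Linux module exposes its tunables under `/sys/module/zfs/parameters/` and that the default for `l2arc_noprefetch` is 1; both should be verified against the installed version. The function name `read_zfs_param` is made up for illustration.

```python
from pathlib import Path
from typing import Optional

# Assumption: the zfs kernel module publishes its tunables in this directory.
ZFS_PARAM_DIR = Path("/sys/module/zfs/parameters")


def read_zfs_param(name: str, base: Path = ZFS_PARAM_DIR) -> Optional[int]:
    """Return the integer value of a ZFS module parameter, or None if absent."""
    try:
        return int((base / name).read_text().strip())
    except FileNotFoundError:
        return None  # module not loaded, or parameter unknown to this version


if __name__ == "__main__":
    value = read_zfs_param("l2arc_noprefetch")
    if value is None:
        print("l2arc_noprefetch not found (is the zfs module loaded?)")
    elif value == 0:
        print("l2arc_noprefetch = 0: the non-default setting mentioned in this thread")
    else:
        print(f"l2arc_noprefetch = {value}")
```

The `base` argument exists only so the function can be pointed at a scratch directory for testing. Reading the value needs no special privileges, while changing it would require root.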

@stale

stale bot commented Dec 25, 2020

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale No recent activity for issue label Dec 25, 2020
@mtippmann
Author

this is fixed as commented here: #8672 (comment)

5 participants
@behlendorf @mtippmann @digdug3 @gamanakis and others