
userquota_updates_task NULL deref #7147

Closed
lnicola opened this issue Feb 8, 2018 · 47 comments

@lnicola
Contributor

lnicola commented Feb 8, 2018

System information

Type                  Version/Name
Distribution Name     Arch Linux
Distribution Version
Linux Kernel          4.15.1-2-ARCH
Architecture          x86-64
ZFS Version           0.7.0-283_g6d82b7969
SPL Version           0.7.0-24_g23602fd

Describe the problem you're observing

Describe how to reproduce the problem

Include any warning/errors/backtraces from the system logs

[Feb 8 21:52] general protection fault: 0000 [#1] SMP PTI
[  +0.000786] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype xt_conntrack br_netfilter bridge stp llc tun nft_meta nf_conntrack_ipv6 nf_defrag_ipv6 nft_ct nft_set_bitmap nct6775 nft_set_hash nft_set_rbtree hwmon_vid xt_tcpudp nf_tables_inet nf_tables_ipv6 nf_tables_ipv4 iptable_filter nf_tables iptable_nat nf_conntrack_ipv4 nfnetlink nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c crc32c_generic i915 nls_iso8859_1 nls_cp437 vfat fat ofpart cmdlinepart intel_spi_platform intel_spi spi_nor iTCO_wdt iTCO_vendor_support mtd ppdev drm_kms_helper snd_usb_audio snd_usbmidi_lib snd_hwdep snd_rawmidi snd_seq_device drm intel_rapl hci_uart snd_pcm intel_soc_dts_thermal intel_soc_dts_iosf intel_powerclamp btbcm coretemp btqca btintel crct10dif_pclmul
[  +0.006028]  bluetooth crc32_pclmul crc32c_intel snd_timer ghash_clmulni_intel cryptd snd joydev intel_cstate input_leds led_class soundcore intel_gtt i2c_i801 cdc_acm agpgart r8169 mei_txe syscopyarea sysfillrect sysimgblt mei fb_sys_fops mii i2c_algo_bit shpchp parport_pc lpc_ich rtc_cmos ecdh_generic parport rfkill i2c_hid crc16 pwm_lpss_platform evdev pwm_lpss mac_hid sch_fq_codel ip_tables x_tables zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) hid_generic usbhid hid sd_mod ahci libahci libata xhci_pci xhci_hcd scsi_mod usbcore usb_common
[  +0.005001] CPU: 3 PID: 413 Comm: dp_sync_taskq Tainted: P           O     4.15.1-2-ARCH #1
[  +0.001799] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q1900DC-ITX, BIOS P1.50 03/31/2016
[  +0.002037] RIP: 0010:multilist_sublist_remove+0x10/0x30 [zfs]
[  +0.002025] RSP: 0018:ffffb19085d43d30 EFLAGS: 00010286
[  +0.002053] RAX: ffff9f621a45c8c0 RBX: ffff9f608780c368 RCX: 0000000000000000
[  +0.002112] RDX: dead000000000100 RSI: ffff9f608780c448 RDI: ffff9f621a45c880
[  +0.002127] RBP: ffff9f62e85ec000 R08: 0000000000000020 R09: 0000000000000002
[  +0.002163] R10: 0000000000000000 R11: ffffb19085d43cfe R12: ffff9f608780c488
[  +0.002198] R13: ffff9f62e8724180 R14: ffff9f621a45c880 R15: ffff9f608780c4a8
[  +0.002221] FS:  0000000000000000(0000) GS:ffff9f62ffd80000(0000) knlGS:0000000000000000
[  +0.002280] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.002299] CR2: 00007f3a81969778 CR3: 0000000422038000 CR4: 00000000001006e0
[  +0.002354] Call Trace:
[  +0.002424]  userquota_updates_task+0xd2/0x520 [zfs]
[  +0.002461]  ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[  +0.002499]  ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[  +0.002449]  taskq_thread+0x295/0x4a0 [spl]
[  +0.002467]  ? wake_up_q+0x80/0x80
[  +0.002513]  ? taskq_thread_should_stop+0x70/0x70 [spl]
[  +0.002547]  kthread+0x113/0x130
[  +0.002524]  ? kthread_create_on_node+0x70/0x70
[  +0.002544]  ret_from_fork+0x35/0x40
[  +0.002555] Code: 08 48 89 06 48 89 56 08 48 89 32 f3 c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 03 77 38 48 8b 46 08 48 8b 16 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 06 48 
[  +0.005711] RIP: multilist_sublist_remove+0x10/0x30 [zfs] RSP: ffffb19085d43d30
[  +0.002920] ---[ end trace 5d8b030e0c63f2c7 ]---
@tcaputi
Contributor

tcaputi commented Feb 9, 2018

As I said in #7059:

I noticed that your zfs version is the same as the guy in this issue who was having some strange problems as well. That version seems to point to a commit that doesn't exist in the ZoL codebase, and I don't know where it is coming from. I would suggest trying the latest code from master (assuming you are ok running on the bleeding edge) or the latest point release v0.7.6.

If you hit this issue again after that, let me know and I'll investigate further.

@lnicola
Contributor Author

lnicola commented Feb 9, 2018

I suppose you've realized by now that I'm the same guy in all of the threads 😄.

The situation is as follows. I have two pools, call them bike and tank. bike is faster, so I keep my root filesystem and a couple of other filesystems on it, make daily snapshots, and send them over to tank from time to time. The last snapshot on tank is from last night. EDIT: I was using large dnodes, so sends from bike to tank were not working for a while, until #6864. After that was merged, sends worked again, but were very slow on some filesystems.

I've first seen the userquota_updates_task error yesterday, but shrugged it off as "harmless".

This morning I started a scrub on tank, then one on bike, which got stuck. At the time I thought the first scrub was somehow blocking the second, but now I suspect that wasn't true (or maybe it was, see the end of the post). There was nothing in dmesg about this. Cancelling the first scrub worked, but everything got stuck waiting for disk on shutdown, so I power-cycled the box.

Now I'm no longer able to boot from bike, as it panics with VERIFY3(0 == dmu_object_claim_dnsize()) failed (0 == 28). But I can boot last night's snapshot of it from tank. I'm currently sending everything over to another pool. This worked fine until it hit the root filesystem on tank, which seems to be progressing very slowly. It's interesting that it's reading more data than it's writing, which didn't seem to happen on the other filesystems:

$ sudo zpool iostat tank freighter 10
              capacity     operations     bandwidth 
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      1    192  7.26K  64.0M
tank        1.12T   714G    139      0  61.1M  12.9K
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      0    294      0  3.48M
tank        1.12T   714G    128      0  16.6M      0
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      2    281  9.99K  3.31M
tank        1.12T   714G    115      0  14.5M      0
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      0    260  1.60K  3.20M
tank        1.12T   714G    143      0  17.1M      0
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      0    273      0  3.58M
tank        1.12T   714G    146      0  20.7M      0
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      0    287  3.60K  3.79M
tank        1.12T   714G    110      0  16.6M      0
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      6    221  27.2K  2.71M
tank        1.12T   714G     86      0  12.0M      0
----------  -----  -----  -----  -----  -----  -----
freighter   1.20T  3.33T      9    223  92.3K  11.0M
tank        1.12T   714G    359      1  22.8M  49.6K
----------  -----  -----  -----  -----  -----  -----

Additionally, I got the error in this issue at boot:

[   59.952154] general protection fault: 0000 [#1] SMP PTI
[   59.952213] Modules linked in: tun nft_meta nct6775 hwmon_vid nf_conntrack_ipv6 nf_defrag_ipv6 nft_ct nft_set_bitmap nft_set_hash nft_set_rbtree xt_tcpudp nf_tables_inet nf_tables_ipv6 nf_tables_ipv4 nf_tables nfnetlink iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c crc32c_generic nls_iso8859_1 nls_cp437 vfat fat intel_rapl intel_soc_dts_thermal intel_soc_dts_iosf i915 snd_usb_audio snd_usbmidi_lib intel_powerclamp ofpart cmdlinepart coretemp intel_spi_platform intel_spi snd_hwdep spi_nor iTCO_wdt crct10dif_pclmul drm_kms_helper snd_rawmidi crc32_pclmul drm iTCO_vendor_support mtd ppdev r8169 snd_seq_device crc32c_intel snd_pcm snd_timer mii snd cdc_acm intel_gtt ghash_clmulni_intel cryptd intel_cstate input_leds soundcore led_class agpgart hci_uart
[   59.952826]  joydev syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_algo_bit rtc_cmos btbcm shpchp mei_txe btqca mei btintel lpc_ich i2c_i801 bluetooth parport_pc parport ecdh_generic i2c_hid rfkill crc16 evdev pwm_lpss_platform pwm_lpss mac_hid sch_fq_codel ip_tables x_tables zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) hid_generic usbhid hid sd_mod ahci xhci_pci libahci xhci_hcd libata usbcore scsi_mod usb_common
[   59.953176] CPU: 1 PID: 322 Comm: dp_sync_taskq Tainted: P           O     4.15.1-2-ARCH #1
[   59.953245] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q1900DC-ITX, BIOS P1.50 03/31/2016
[   59.953404] RIP: 0010:multilist_sublist_remove+0x10/0x30 [zfs]
[   59.953455] RSP: 0018:ffffb0b145b7bd30 EFLAGS: 00010286
[   59.953501] RAX: ffff978bfb9688c0 RBX: ffff978bfacb28e0 RCX: 0000000000000000
[   59.953561] RDX: dead000000000100 RSI: ffff978bfacb29f0 RDI: ffff978bfb968880
[   59.953622] RBP: ffff978c68f1b800 R08: 0000000000000020 R09: 0000000000000008
[   59.953681] R10: 0000000000000000 R11: ffffb0b145b7bcfe R12: ffff978bfacb2a00
[   59.953741] R13: ffff978c1329aa80 R14: ffff978bfb968880 R15: ffff978bfacb2a20
[   59.953802] FS:  0000000000000000(0000) GS:ffff978c7fc80000(0000) knlGS:0000000000000000
[   59.953870] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   59.953919] CR2: 00007fe15b6d25b4 CR3: 00000003d389e000 CR4: 00000000001006e0
[   59.953979] Call Trace:
[   59.954061]  userquota_updates_task+0xd2/0x520 [zfs]
[   59.954159]  ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[   59.954264]  ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[   59.954324]  taskq_thread+0x295/0x4a0 [spl]
[   59.954367]  ? wake_up_q+0x80/0x80
[   59.954406]  ? taskq_thread_should_stop+0x70/0x70 [spl]
[   59.954452]  kthread+0x113/0x130
[   59.954484]  ? kthread_create_on_node+0x70/0x70
[   59.954527]  ret_from_fork+0x35/0x40
[   59.954561] Code: 08 48 89 06 48 89 56 08 48 89 32 f3 c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 03 77 38 48 8b 46 08 48 8b 16 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 06 48 
[   59.954818] RIP: multilist_sublist_remove+0x10/0x30 [zfs] RSP: ffffb0b145b7bd30
[   59.954886] general protection fault: 0000 [#2] SMP PTI
[   59.954909] ---[ end trace c381eb953d40c75a ]---
[   59.955056] Modules linked in: tun nft_meta nct6775 hwmon_vid nf_conntrack_ipv6 nf_defrag_ipv6 nft_ct nft_set_bitmap nft_set_hash nft_set_rbtree xt_tcpudp nf_tables_inet nf_tables_ipv6 nf_tables_ipv4 nf_tables nfnetlink iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c crc32c_generic nls_iso8859_1 nls_cp437 vfat fat intel_rapl intel_soc_dts_thermal intel_soc_dts_iosf i915 snd_usb_audio snd_usbmidi_lib intel_powerclamp ofpart cmdlinepart coretemp intel_spi_platform intel_spi snd_hwdep spi_nor iTCO_wdt crct10dif_pclmul drm_kms_helper snd_rawmidi crc32_pclmul drm iTCO_vendor_support mtd ppdev r8169 snd_seq_device crc32c_intel snd_pcm snd_timer mii snd cdc_acm intel_gtt ghash_clmulni_intel cryptd intel_cstate input_leds soundcore led_class agpgart hci_uart
[   59.956082]  joydev syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_algo_bit rtc_cmos btbcm shpchp mei_txe btqca mei btintel lpc_ich i2c_i801 bluetooth parport_pc parport ecdh_generic i2c_hid rfkill crc16 evdev pwm_lpss_platform pwm_lpss mac_hid sch_fq_codel ip_tables x_tables zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) hid_generic usbhid hid sd_mod ahci xhci_pci libahci xhci_hcd libata usbcore scsi_mod usb_common
[   59.956662] CPU: 3 PID: 321 Comm: dp_sync_taskq Tainted: P      D    O     4.15.1-2-ARCH #1
[   59.956779] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q1900DC-ITX, BIOS P1.50 03/31/2016
[   59.957028] RIP: 0010:multilist_sublist_remove+0x10/0x30 [zfs]
[   59.957112] RSP: 0000:ffffb0b145b73d30 EFLAGS: 00010286
[   59.957189] RAX: ffff978bfb968940 RBX: ffff978c14adf680 RCX: 0000000000000000
[   59.957290] RDX: dead000000000100 RSI: ffff978c14adf790 RDI: ffff978bfb968900
[   59.957390] RBP: ffff978c68f1b800 R08: 0000000000000020 R09: 0000000000000000
[   59.957490] R10: ffff978be53391e0 R11: ffffb0b145b73cff R12: ffff978c14adf7a0
[   59.957591] R13: ffff978c1329aa80 R14: ffff978bfb968900 R15: ffff978c14adf7c0
[   59.957692] FS:  0000000000000000(0000) GS:ffff978c7fd80000(0000) knlGS:0000000000000000
[   59.957805] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   59.957888] CR2: 00007f1296906000 CR3: 00000003a9448000 CR4: 00000000001006e0
[   59.957988] Call Trace:
[   59.958123]  userquota_updates_task+0xd2/0x520 [zfs]
[   59.958287]  ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[   59.958463]  ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[   59.958563]  taskq_thread+0x295/0x4a0 [spl]
[   59.958632]  ? wake_up_q+0x80/0x80
[   59.958697]  ? taskq_thread_should_stop+0x70/0x70 [spl]
[   59.958775]  kthread+0x113/0x130
[   59.958828]  ? kthread_create_on_node+0x70/0x70
[   59.958896]  ? kthread_create_on_node+0x70/0x70
[   59.958966]  ret_from_fork+0x35/0x40
[   59.959022] Code: 08 48 89 06 48 89 56 08 48 89 32 f3 c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 03 77 38 48 8b 46 08 48 8b 16 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 06 48 
[   59.959454] RIP: multilist_sublist_remove+0x10/0x30 [zfs] RSP: ffffb0b145b73d30
[   59.959599] ---[ end trace c381eb953d40c75b ]---
[   59.959999] general protection fault: 0000 [#3] SMP PTI
[   59.964598] Modules linked in: tun nft_meta nct6775 hwmon_vid nf_conntrack_ipv6 nf_defrag_ipv6 nft_ct nft_set_bitmap nft_set_hash nft_set_rbtree xt_tcpudp nf_tables_inet nf_tables_ipv6 nf_tables_ipv4 nf_tables nfnetlink iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c crc32c_generic nls_iso8859_1 nls_cp437 vfat fat intel_rapl intel_soc_dts_thermal intel_soc_dts_iosf i915 snd_usb_audio snd_usbmidi_lib intel_powerclamp ofpart cmdlinepart coretemp intel_spi_platform intel_spi snd_hwdep spi_nor iTCO_wdt crct10dif_pclmul drm_kms_helper snd_rawmidi crc32_pclmul drm iTCO_vendor_support mtd ppdev r8169 snd_seq_device crc32c_intel snd_pcm snd_timer mii snd cdc_acm intel_gtt ghash_clmulni_intel cryptd intel_cstate input_leds soundcore led_class agpgart hci_uart
[   59.989065]  joydev syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_algo_bit rtc_cmos btbcm shpchp mei_txe btqca mei btintel lpc_ich i2c_i801 bluetooth parport_pc parport ecdh_generic i2c_hid rfkill crc16 evdev pwm_lpss_platform pwm_lpss mac_hid sch_fq_codel ip_tables x_tables zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) hid_generic usbhid hid sd_mod ahci xhci_pci libahci xhci_hcd libata usbcore scsi_mod usb_common
[   60.003915] CPU: 3 PID: 323 Comm: dp_sync_taskq Tainted: P      D    O     4.15.1-2-ARCH #1
[   60.008742] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q1900DC-ITX, BIOS P1.50 03/31/2016
[   60.013596] RIP: 0010:multilist_sublist_remove+0x10/0x30 [zfs]
[   60.018280] RSP: 0000:ffffb0b145b83d30 EFLAGS: 00010286
[   60.022938] RAX: ffff978bfb9689c0 RBX: ffff978c0cdc57d8 RCX: 0000000000000000
[   60.027591] RDX: dead000000000100 RSI: ffff978c0cdc58e8 RDI: ffff978bfb968980
[   60.032209] RBP: ffff978c68f1b800 R08: 0000000000000020 R09: 0000000000000003
[   60.036838] R10: ffff978c14c0ac20 R11: ffffb0b145b83cff R12: ffff978c0cdc58f8
[   60.041458] R13: ffff978c1329aa80 R14: ffff978bfb968980 R15: ffff978c0cdc5918
[   60.046130] FS:  0000000000000000(0000) GS:ffff978c7fd80000(0000) knlGS:0000000000000000
[   60.050917] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   60.055546] CR2: 00007f1296906000 CR3: 00000003a9448000 CR4: 00000000001006e0
[   60.060209] Call Trace:
[   60.064905]  userquota_updates_task+0xd2/0x520 [zfs]
[   60.069575]  ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[   60.074250]  ? dmu_objset_userobjspace_upgradable+0x50/0x50 [zfs]
[   60.078835]  taskq_thread+0x295/0x4a0 [spl]
[   60.083416]  ? wake_up_q+0x80/0x80
[   60.087983]  ? taskq_thread_should_stop+0x70/0x70 [spl]
[   60.092578]  kthread+0x113/0x130
[   60.097155]  ? kthread_create_on_node+0x70/0x70
[   60.101754]  ret_from_fork+0x35/0x40
[   60.106337] Code: 08 48 89 06 48 89 56 08 48 89 32 f3 c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 03 77 38 48 8b 46 08 48 8b 16 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 06 48 
[   60.115975] RIP: multilist_sublist_remove+0x10/0x30 [zfs] RSP: ffffb0b145b83d30
[   60.120975] ---[ end trace c381eb953d40c75c ]---

Both pools are online right now, and zpool status reports no issues. I do have a nagging dmesg warning, do_IRQ: 3.33 No irq handler for vector, which is hopefully an unrelated kernel issue. The root filesystem on bike is not mounted.

EDIT: As mentioned below, I tried to start a scrub on the backup pool, then one on tank which got stuck. A snapshot on a tank filesystem also got stuck.

@lnicola
Contributor Author

lnicola commented Feb 9, 2018

The plot thickens...

I started a scrub on the backup pool, then tried to start one on tank. zpool scrub got stuck again:

[<0>] cv_wait_common+0x115/0x130 [spl]
[<0>] txg_wait_synced+0xdc/0x120 [zfs]
[<0>] dsl_sync_task+0x15c/0x240 [zfs]
[<0>] dsl_scan+0x91/0x100 [zfs]
[<0>] zfs_ioc_pool_scan+0x53/0xc0 [zfs]
[<0>] zfsdev_ioctl+0x1d8/0x610 [zfs]
[<0>] do_vfs_ioctl+0xa4/0x630
[<0>] SyS_ioctl+0x74/0x80
[<0>] entry_SYSCALL_64_fastpath+0x20/0x83
[<0>] 0xffffffffffffffff

while zpool status says the scrub hasn't started yet.

Since the root fs on bike is no longer involved, I'm not sure what to do next.

@tcaputi
Contributor

tcaputi commented Feb 9, 2018

While you have the system in this state, can you run echo t > /proc/sysrq-trigger and post the output that ends up in your kernel logs? It will be a lot of output. This will show me a stack trace of every process running on your system, so I understand if you do not wish to do this.

If your kernel is crashed there is no need to do this, the crash is almost certainly causing the problem.

@lnicola
Contributor Author

lnicola commented Feb 9, 2018

Some things are still working, but it's mostly crashed, I suppose. I was able to export the backup pool.

dmesg.zip

@tcaputi
Contributor

tcaputi commented Feb 9, 2018

OK. Sorry to ask this but humor me for a second. You have opened several issues and at this point I'm having a hard time keeping it all straight. Could you quickly go through the timeline of what you have, what you did, and when the issues you've encountered happened? I'm going to guess that they are all probably related, so having the full timeline for all of them will probably help a lot.

@lnicola
Contributor Author

lnicola commented Feb 9, 2018

That would be #7147 (comment) above, I suppose.

@tcaputi
Contributor

tcaputi commented Feb 9, 2018

Just so I have it right, here is my summary of that (along with the questions I still have). Can you confirm this and answer the questions in parentheses as best you can? I'm also not sure at what point you started encountering the dmu_object_claim_dnsize() issue, so please provide that info as well if you can.

  • bike crashed with userquota_updates_task panic
  • user attempted to run a scrub and noticed that it didn't return (almost definitely due to the crash)
  • user power-cycled the box and could no longer boot from bike (due to what issue? userquota_updates_task panic?)
  • user booted into a backup of his root filesystem on tank
  • user sent his dataset to another pool and noticed the root dataset was slow (probably unrelated)
  • once the send completed, user attempted to scrub tank and it didn't return again (did it crash again this time?)

@lnicola
Contributor Author

lnicola commented Feb 9, 2018

bike crashed with userquota_updates_task panic

I saw the error message, but there was no apparent crash, and everything seemed to work afterwards. I didn't reboot. I don't know for which pool it was.

user attempted to run a scrub and noticed that it didn't return (almost definitely due to the crash)

I started a scrub on tank, then tried to start one on bike, but it never started.

user power-cycled the box and could no longer boot from bike (due to what issue? userquota_updates_task panic?)

Due to a dmu_object_claim_dnsize() panic when mounting bike.

user booted into a backup of his root filesystem on tank

Right. The userquota_updates_task showed up again a couple of times on boot, again I don't know for which pool.

user sent his dataset to another pool and noticed the root dataset was slow (probably unrelated)

Before the first crash, sending bike/zroot and bike/home to tank was slow. Sending tank/zroot (copy of bike/zroot) to a backup pool was probably fine, but sending tank/bike/home was slow. bike/home works fine nonetheless.

once the send completed, user attempted to scrub tank and it didn't return again (did it crash again this time?)

No crash, same txg_wait_synced hang. Making a snapshot of another filesystem on tank also hangs, but copying the data from it with cp seems to work.

@tcaputi
Contributor

tcaputi commented Feb 9, 2018

Alright I think I have a grasp on everything now and I'll try to take a look today.

@tcaputi
Contributor

tcaputi commented Feb 9, 2018

My apologies. I have run out of time for today and will need to look into this more on Monday.

@tcaputi tcaputi added this to the 0.8.0 milestone Feb 9, 2018
@tcaputi
Contributor

tcaputi commented Feb 13, 2018

@lnicola Would you mind running:

# resolve the zfs module path and load it into gdb
gdb `modinfo zfs | grep filename | awk '{ print $2 }'`
# then, at the (gdb) prompt, resolve the faulting offset:
list *(userquota_updates_task+0xd2)

This should help me figure out where the initial crash happened.

@lnicola
Contributor Author

lnicola commented Feb 13, 2018

Sorry, I updated my kernel and ZFS version in the meantime, and my kernels have no debug info anyway. But there seems to be a single multilist_sublist_remove call in that function, so that might be a good place to start.

EDIT: The filesystem in question got rolled back somehow, so I'm no longer having any issues. If you feel like closing this, that's fine.

@lnicola
Contributor Author

lnicola commented Feb 13, 2018

Dump of assembler code for function multilist_sublist_remove:
   0x00000000000668c0 <+0>:	e8 00 00 00 00	callq  0x668c5 <multilist_sublist_remove+5>
   0x00000000000668c5 <+5>:	48 03 77 38	add    0x38(%rdi),%rsi
   0x00000000000668c9 <+9>:	48 8b 46 08	mov    0x8(%rsi),%rax
   0x00000000000668cd <+13>:	48 8b 16	mov    (%rsi),%rdx
   0x00000000000668d0 <+16>:	48 89 42 08	mov    %rax,0x8(%rdx) # crash here: RIP is +0x10, i.e. <+16>
   0x00000000000668d4 <+20>:	48 89 10	mov    %rdx,(%rax)
   0x00000000000668d7 <+23>:	48 b8 00 01 00 00 00 00 ad de	movabs $0xdead000000000100,%rax
   0x00000000000668e1 <+33>:	48 89 06	mov    %rax,(%rsi)
   0x00000000000668e4 <+36>:	48 05 00 01 00 00	add    $0x100,%rax
   0x00000000000668ea <+42>:	48 89 46 08	mov    %rax,0x8(%rsi)
   0x00000000000668ee <+46>:	c3	retq   
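
Worth noting: in the traces above the faulting instruction is the write at +0x10, and RDX holds 0xdead000000000100, which is the Linux LIST_POISON1 value that list_del() stores into a node's next pointer. So the node being removed had already been unlinked from the sublist. A minimal userspace sketch of that failure shape, assuming only the standard kernel poisoning convention (illustrative C, not the actual ZFS/SPL source):

struct list_node {
	struct list_node *next, *prev;
};

#define LIST_POISON1 ((struct list_node *)0xdead000000000100UL)
#define LIST_POISON2 ((struct list_node *)0xdead000000000200UL)

static void list_remove(struct list_node *n)
{
	n->next->prev = n->prev;	/* GPFs here once n->next is LIST_POISON1 */
	n->prev->next = n->next;
	n->next = LIST_POISON1;		/* poison the links so reuse is caught */
	n->prev = LIST_POISON2;
}

int main(void)
{
	struct list_node head = { 0 }, a = { 0 };

	/* build a circular two-node list: head <-> a */
	head.next = head.prev = &a;
	a.next = a.prev = &head;

	list_remove(&a);	/* fine: unlinks a and poisons its links */
	list_remove(&a);	/* double remove: faults on the poisoned pointer */
	return 0;
}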

@tcaputi
Contributor

tcaputi commented Feb 21, 2018

@lnicola would you mind trying this again with the most recent code from master? We recently fixed an issue related to this one, and I think the problem might be solved.

@lnicola
Contributor Author

lnicola commented Feb 21, 2018

I'll try it on the next distro package update, but the only issue I still have is the slow recv.

@tcaputi
Contributor

tcaputi commented Feb 21, 2018

OK. That problem should mostly be alleviated by #7197, so hopefully all your problems will be fixed soon.

@tcaputi
Contributor

tcaputi commented Feb 28, 2018

@lnicola any update here?

@lnicola
Contributor Author

lnicola commented Feb 28, 2018

As I mentioned before, the filesystem or pool got rolled back at the time. I lost about 9 hours of changes, and I didn't encounter any issues afterwards, even with that same ZFS version. So I don't know if it's fixed, but I no longer have a way to reproduce these issues.

@lnicola lnicola closed this as completed Feb 28, 2018
@lnicola
Contributor Author

lnicola commented Mar 8, 2018

I just got a new one, in 0.7.0-356_g639b18944:

[103861.445982] general protection fault: 0000 [#1] PREEMPT SMP PTI
[103861.446061] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype xt_conntrack br_netfilter bridge stp llc tun nft_meta nf_conntrack_ipv6 nf_defrag_ipv6 nft_ct xt_tcpudp nft_set_bitmap nft_set_hash nft_set_rbtree iptable_filter nf_tables_inet nf_tables_ipv6 nf_tables_ipv4 nf_tables nfnetlink iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c crc32c_generic nct6775 hwmon_vid nls_iso8859_1 nls_cp437 vfat fat i915 ofpart cmdlinepart intel_spi_platform intel_spi spi_nor mtd iTCO_wdt iTCO_vendor_support ppdev snd_usb_audio snd_usbmidi_lib i2c_algo_bit snd_hwdep drm_kms_helper snd_rawmidi intel_rapl intel_soc_dts_thermal snd_seq_device intel_soc_dts_iosf intel_powerclamp snd_pcm coretemp crct10dif_pclmul crc32_pclmul
[103861.446689]  hci_uart drm crc32c_intel ghash_clmulni_intel btbcm cryptd snd_timer intel_cstate btqca btintel pcspkr snd bluetooth soundcore mei_txe r8169 mei lpc_ich i2c_i801 intel_gtt agpgart mii syscopyarea cdc_acm sysfillrect shpchp sysimgblt fb_sys_fops rtc_cmos parport_pc parport ecdh_generic rfkill evdev crc16 mac_hid i2c_hid pwm_lpss_platform pwm_lpss ip_tables x_tables zfs(PO) zunicode(PO) zlua(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) hid_generic usbhid hid sd_mod ahci xhci_pci libahci xhci_hcd libata usbcore scsi_mod usb_common
[103861.447137] CPU: 0 PID: 433 Comm: dp_sync_taskq Tainted: P           O     4.15.7-1-ARCH #1
[103861.447209] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q1900DC-ITX, BIOS P1.50 03/31/2016
[103861.447378] RIP: 0010:multilist_sublist_remove+0x10/0x30 [zfs]
[103861.447433] RSP: 0018:ffffb7fd45e6bd20 EFLAGS: 00010286
[103861.447482] RAX: ffff94bc422f6e40 RBX: ffff94bd22559000 RCX: 0000000000000001
[103861.447545] RDX: dead000000000100 RSI: ffff94bb4fb44100 RDI: ffff94bc422f6e00
[103861.447607] RBP: ffff94bb4fb44000 R08: 0000000000000000 R09: 0000000000000003
[103861.447669] R10: ffffb7fd45e6bbe8 R11: ffffb7fd45e6bcdd R12: ffff94bb4fb44120
[103861.447731] R13: ffff94bce832c000 R14: ffff94bc422f6e00 R15: ffff94bb4fb44140
[103861.447794] FS:  0000000000000000(0000) GS:ffff94bd3fc00000(0000) knlGS:0000000000000000
[103861.447864] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[103861.447916] CR2: 00007f75bac13db8 CR3: 000000032ab9a000 CR4: 00000000001006f0
[103861.447978] Call Trace:
[103861.448075]  userquota_updates_task+0xdf/0x5b0 [zfs]
[103861.448180]  ? dmu_objset_is_dirty+0x20/0x20 [zfs]
[103861.448281]  ? dmu_objset_is_dirty+0x20/0x20 [zfs]
[103861.448378]  ? dmu_objset_is_dirty+0x20/0x20 [zfs]
[103861.448432]  taskq_thread+0x295/0x4a0 [spl]
[103861.448478]  ? wake_up_q+0x80/0x80
[103861.448523]  ? taskq_thread_should_stop+0x70/0x70 [spl]
[103861.448572]  kthread+0x113/0x130
[103861.448605]  ? kthread_create_on_node+0x70/0x70
[103861.448652]  ret_from_fork+0x35/0x40
[103861.448691] Code: 08 48 89 06 48 89 56 08 48 89 32 f3 c3 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 03 77 38 48 8b 46 08 48 8b 16 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 06 48 
[103861.448990] RIP: multilist_sublist_remove+0x10/0x30 [zfs] RSP: ffffb7fd45e6bd20
[103861.449108] ---[ end trace 4e6621a5dfbf95b1 ]---

The application (an rm) is now hung.

EDIT: Unlike the last time, the pool came back fine (I think) after the reboot.

EDIT 2: Happened again during an rm after running the same workload (compiling an AUR package).

@lnicola lnicola reopened this Mar 8, 2018
@lnicola
Contributor Author

lnicola commented Mar 11, 2018

Call traces of processes in D state after the hang:

[104651.891826] txg_sync        D    0   578      2 0x80000000
[104651.893818] Call Trace:
[104651.895780]  ? __schedule+0x24b/0x8c0
[104651.898000]  ? preempt_count_add+0x68/0xa0
[104651.899966]  schedule+0x32/0x90
[104651.901931]  taskq_wait+0x72/0xc0 [spl]
[104651.903896]  ? wait_woken+0x80/0x80
[104651.905920]  dsl_pool_sync+0x17c/0x420 [zfs]
[104651.907946]  spa_sync+0x425/0xcf0 [zfs]
[104651.909962]  txg_sync_thread+0x2cc/0x490 [zfs]
[104651.911978]  ? txg_delay+0x1a0/0x1a0 [zfs]
[104651.913929]  ? __thread_exit+0x20/0x20 [spl]
[104651.915899]  thread_generic_wrapper+0x6f/0x80 [spl]
[104651.917872]  kthread+0x113/0x130
[104651.919837]  ? kthread_create_on_node+0x70/0x70
[104651.921804]  ? kthread_create_on_node+0x70/0x70
[104651.923756]  ret_from_fork+0x35/0x40

[104655.089910] python2.7       D    0  8222      1 0x00000000
[104655.091994] Call Trace:
[104655.094057]  ? __schedule+0x24b/0x8c0
[104655.096120]  ? preempt_count_add+0x68/0xa0
[104655.098185]  schedule+0x32/0x90
[104655.100251]  cv_wait_common+0x113/0x130 [spl]
[104655.102340]  ? wait_woken+0x80/0x80
[104655.104488]  txg_wait_open+0xab/0xf0 [zfs]
[104655.106624]  dmu_free_long_range_impl+0x223/0x410 [zfs]
[104655.108776]  dmu_free_long_range+0x6a/0xb0 [zfs]
[104655.110929]  zfs_rmnode+0x24f/0x320 [zfs]
[104655.113102]  ? zfs_zinactive+0xd3/0xf0 [zfs]
[104655.115287]  zfs_inactive+0x7a/0x1f0 [zfs]
[104655.117415]  ? unmap_mapping_range+0x73/0x140
[104655.119608]  zpl_evict_inode+0x3c/0x50 [zfs]
[104655.121752]  evict+0xc4/0x190
[104655.123880]  do_unlinkat+0x1ab/0x320
[104655.126012]  do_syscall_64+0x74/0x190
[104655.128143]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

[104664.966746] postgres        D    0  4892   4612 0x00000000
[104664.968923] Call Trace:
[104664.971086]  ? __schedule+0x24b/0x8c0
[104664.973224]  ? preempt_count_add+0x68/0xa0
[104664.975376]  schedule+0x32/0x90
[104664.977518]  cv_wait_common+0x113/0x130 [spl]
[104664.979654]  ? wait_woken+0x80/0x80
[104664.981884]  txg_wait_open+0xab/0xf0 [zfs]
[104664.984086]  dmu_free_long_range_impl+0x223/0x410 [zfs]
[104664.986328]  dmu_free_long_range+0x6a/0xb0 [zfs]
[104664.988543]  zfs_rmnode+0x24f/0x320 [zfs]
[104664.990769]  ? zfs_zinactive+0xd3/0xf0 [zfs]
[104664.992991]  zfs_inactive+0x7a/0x1f0 [zfs]
[104664.995297]  ? unmap_mapping_range+0x73/0x140
[104664.997504]  zpl_evict_inode+0x3c/0x50 [zfs]
[104664.999648]  evict+0xc4/0x190
[104665.001771]  __dentry_kill+0xc3/0x150
[104665.003869]  dput.part.4+0x131/0x1e0
[104665.005979]  SyS_renameat2+0x3bc/0x590
[104665.008079]  do_syscall_64+0x74/0x190
[104665.010177]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

[104669.725973] rm              D    0 10213   2062 0x00000000
[104669.727987] Call Trace:
[104669.729983]  ? __schedule+0x24b/0x8c0
[104669.731985]  ? preempt_count_add+0x68/0xa0
[104669.733992]  schedule+0x32/0x90
[104669.735996]  cv_wait_common+0x113/0x130 [spl]
[104669.738012]  ? wait_woken+0x80/0x80
[104669.740096]  txg_wait_open+0xab/0xf0 [zfs]
[104669.742170]  dmu_free_long_range_impl+0x223/0x410 [zfs]
[104669.744268]  dmu_free_long_range+0x6a/0xb0 [zfs]
[104669.746355]  zfs_rmnode+0x24f/0x320 [zfs]
[104669.748447]  ? zfs_zinactive+0xd3/0xf0 [zfs]
[104669.750543]  zfs_inactive+0x7a/0x1f0 [zfs]
[104669.752565]  ? unmap_mapping_range+0x73/0x140
[104669.754637]  zpl_evict_inode+0x3c/0x50 [zfs]
[104669.756622]  evict+0xc4/0x190
[104669.758581]  do_unlinkat+0x1ab/0x320
[104669.760537]  do_syscall_64+0x74/0x190
[104669.762487]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

[104671.523356] redis-server    D    0 15279   4325 0x00000100
[104671.525335] Call Trace:
[104671.527306]  ? __schedule+0x24b/0x8c0
[104671.529294]  ? preempt_count_add+0x68/0xa0
[104671.531289]  schedule+0x32/0x90
[104671.533269]  cv_wait_common+0x113/0x130 [spl]
[104671.535262]  ? wait_woken+0x80/0x80
[104671.537313]  txg_wait_open+0xab/0xf0 [zfs]
[104671.539359]  dmu_free_long_range_impl+0x223/0x410 [zfs]
[104671.541418]  dmu_free_long_range+0x6a/0xb0 [zfs]
[104671.543483]  zfs_rmnode+0x24f/0x320 [zfs]
[104671.545519]  ? zfs_zinactive+0xd3/0xf0 [zfs]
[104671.547563]  zfs_inactive+0x7a/0x1f0 [zfs]
[104671.549556]  ? unmap_mapping_range+0x73/0x140
[104671.551592]  zpl_evict_inode+0x3c/0x50 [zfs]
[104671.553835]  evict+0xc4/0x190
[104671.555806]  __dentry_kill+0xc3/0x150
[104671.557788]  dput.part.4+0x131/0x1e0
[104671.559773]  SyS_renameat2+0x3bc/0x590
[104671.561739]  do_syscall_64+0x74/0x190
[104671.563719]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

[104671.873790] htop            D    0 15971  15826 0x00000004
[104671.875834] Call Trace:
[104671.877876]  ? __schedule+0x24b/0x8c0
[104671.879904]  ? preempt_count_add+0x68/0xa0
[104671.881958]  schedule+0x32/0x90
[104671.884022]  cv_wait_common+0x113/0x130 [spl]
[104671.886124]  ? wait_woken+0x80/0x80
[104671.888278]  txg_wait_open+0xab/0xf0 [zfs]
[104671.890420]  dmu_free_long_range_impl+0x223/0x410 [zfs]
[104671.892575]  dmu_free_long_range+0x6a/0xb0 [zfs]
[104671.894737]  zfs_trunc+0x77/0x1e0 [zfs]
[104671.897121]  ? sa_lookup_locked+0x89/0xb0 [zfs]
[104671.899190]  ? preempt_count_add+0x68/0xa0
[104671.901282]  ? _raw_spin_lock+0x13/0x40
[104671.903422]  zfs_freesp+0x9b/0x430 [zfs]
[104671.905508]  ? preempt_count_add+0x68/0xa0
[104671.907588]  ? _raw_spin_lock+0x13/0x40
[104671.909673]  ? _raw_spin_unlock+0x16/0x30
[104671.911806]  ? zfs_zaccess_aces_check+0x1b1/0x390 [zfs]
[104671.913970]  ? zfs_zaccess+0x10b/0x3f0 [zfs]
[104671.916126]  zfs_setattr+0xd1d/0x22a0 [zfs]
[104671.918234]  ? spl_kmem_alloc_impl+0xd6/0x190 [spl]
[104671.920332]  ? spl_kmem_alloc_impl+0xd6/0x190 [spl]
[104671.922472]  ? rrw_exit+0x55/0x140 [zfs]
[104671.924577]  ? zpl_xattr_get_dir+0x5d/0x110 [zfs]
[104671.926614]  ? preempt_count_add+0x68/0xa0
[104671.928626]  ? __kmalloc_node+0x21b/0x2d0
[104671.930649]  ? spl_kmem_zalloc+0xc7/0x180 [spl]
[104671.932731]  zpl_setattr+0xf8/0x170 [zfs]
[104671.934762]  notify_change+0x294/0x420
[104671.936746]  do_truncate+0x75/0xc0
[104671.939022]  path_openat+0xc3f/0x1320
[104671.940986]  do_filp_open+0x9b/0x110
[104671.942959]  ? __vfs_write+0x36/0x180
[104671.944938]  ? __check_object_size+0x9b/0x1a0
[104671.946918]  ? __alloc_fd+0xaf/0x160
[104671.948907]  ? do_sys_open+0x1bd/0x250
[104671.950879]  do_sys_open+0x1bd/0x250
[104671.952854]  do_syscall_64+0x74/0x190
[104671.954807]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

@tcaputi
Contributor

tcaputi commented Mar 11, 2018

@lnicola That is expected. Once the GPF happens, that thread will stop running, causing any threads that rely on it to also become stuck. Unfortunately, we need to find a way to reproduce the issue at least somewhat consistently if we are going to have a chance to fix it.
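
A minimal userspace analogy of that dependency, with pthreads standing in for the SPL taskq (illustrative, not the actual kernel code):

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cv = PTHREAD_COND_INITIALIZER;
static bool done = false;

/* stands in for the dp_sync_taskq worker */
static void *worker(void *arg)
{
	(void)arg;
	/* a GPF here kills the thread before it can set 'done' */
	pthread_exit(NULL);
}

/* stands in for txg_sync blocked in taskq_wait() */
static void *waiter(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	while (!done)			/* never becomes true */
		pthread_cond_wait(&done_cv, &lock);
	pthread_mutex_unlock(&lock);
	return NULL;
}

int main(void)
{
	pthread_t w, s;
	pthread_create(&w, NULL, worker, NULL);
	pthread_create(&s, NULL, waiter, NULL);
	pthread_join(s, NULL);	/* hangs forever, like the D-state stacks above */
	return 0;
}

Every process that later needs an open txg (the rm, the renames, the truncate) then queues up behind the stuck sync thread in txg_wait_open(), which is exactly what the stacks above show.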

@lnicola
Contributor Author

lnicola commented Mar 28, 2018

Well, it seems I can still reproduce this, so tell me if there's anything I can do...

@tcaputi
Contributor

tcaputi commented Mar 28, 2018

@lnicola can you provide steps that I can use to reproduce this? As of yet I have not been able to do so.

@nivedita76
Contributor

The PANIC actually shows up after the list_del corruption, though.

@nivedita76
Contributor

Here's the log

kern1.log

@nivedita76
Contributor

It should be large dnodes, I'll check once I reboot.

Looking at that function, I'm not sure how that VERIFY fails though: it checks that the ref count is not zero, then adds 1 and checks that the result is greater than 1, and it's that second check that fails.
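
If that reading is right, the increment itself is atomic, so the only way the second check can trip is if the count dropped to zero between the is-zero check and the add, i.e. a racing release of the last hold. A sketch of that shape (illustrative C with standard atomics, not the actual dnode code):

#include <stdatomic.h>
#include <assert.h>

static atomic_long holds;

static void take_hold(void)
{
	assert(atomic_load(&holds) != 0);		/* "ref count is not zero" */
	long after = atomic_fetch_add(&holds, 1) + 1;	/* "adds 1" */
	assert(after > 1);	/* the VERIFY: fails only if another thread
				 * dropped the last hold between the two checks */
}

int main(void)
{
	atomic_store(&holds, 1);
	take_hold();	/* single-threaded, so this always passes */
	return 0;
}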

@nivedita76
Contributor

Yes, large_dnodes is active.

@nivedita76
Contributor

In dnode_move, don't we have to worry about dn_dirty_link in addition to dn_link?

@tcaputi
Contributor

tcaputi commented Apr 2, 2018

@nivedita76 If you are able to reproduce the problem consistently, would it be possible to provide a stack trace of the VERIFY() you are hitting?

@nivedita76
Contributor

How should I do that? The last log I attached was what got dumped into dmesg.

@tcaputi
Contributor

tcaputi commented Apr 2, 2018

Sorry. I missed it (needed to scroll down a bit farther). What you have provided is fine.

@tcaputi
Contributor

tcaputi commented Apr 3, 2018

I might have an idea of what's causing this... I'll get back soon if I figure anything out.

@nivedita76
Contributor

I just had a list_del corruption even after configuring zfs_multilist_num_sublists to 1. Unfortunately this time it didn't save anything to the log before crashing.

@lnicola
Contributor Author

lnicola commented Apr 4, 2018

@nivedita76 You can try netconsole to catch crash dumps if you have another machine:

modprobe netconsole
dmesg -n 8                           # raise the console log level so everything is forwarded
cd /sys/kernel/config/netconsole     # requires configfs to be mounted
mkdir -p target1
cd target1
echo XXXX > local_ip                 # IP of the crashing machine
echo XXXX > remote_ip                # IP of the machine collecting the logs
echo enXXXX > dev_name               # network interface to send from
echo XX:XX:XX:XX:XX:XX > remote_mac  # MAC of the receiver (or of the gateway)
echo 1 > enabled
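
On the machine at remote_ip, any UDP listener should be enough to capture the messages (e.g. netcat or socat bound to port 6666, which is netconsole's default remote_port).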

@tcaputi
Contributor

tcaputi commented Apr 4, 2018

@nivedita76 Try applying this patch:

diff --git a/module/zfs/dnode.c b/module/zfs/dnode.c
index 596983b..17be48e 100644
--- a/module/zfs/dnode.c
+++ b/module/zfs/dnode.c
@@ -1100,6 +1100,11 @@ dnode_check_slots_free(dnode_children_t *children, int idx, int slots)
                        if (type != DMU_OT_NONE)
                                return (B_FALSE);
 
+                       for (int i = 0; i < TXG_SIZE; i++) {
+                               if (list_link_active(&dn->dn_dirty_link[i]))
+                                       return (B_FALSE);
+                       }
+
                        continue;
                } else {
                        return (B_FALSE);

This needs some work before I merge it, but in our testing it seemed to fix the problem.
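
The check leans on the link-poisoning convention discussed earlier in the thread: an unlinked dn_dirty_link carries the poison value, so anything else means the dnode is still queued on a dirty list for some txg. A sketch of the helper under that assumption (illustrative, not the actual SPL source):

typedef struct list_node {
	struct list_node *next;
	struct list_node *prev;
} list_node_t;

#define LIST_POISON1 ((struct list_node *)0xdead000000000100UL)

/* An unlinked node was poisoned on removal (or at init), so a
 * non-poison next pointer means the dnode is still on a dirty list
 * and its slot must not be treated as free yet. */
static inline int list_link_active(const list_node_t *node)
{
	return (node->next != LIST_POISON1);
}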

tcaputi pushed a commit to datto/zfs that referenced this issue Apr 4, 2018
Currently, dnode_check_slots_free() works by checking dn->dn_type
in the dnode to determine if the dnode is reclaimable. However,
there is a small window of time between dnode_free_sync() in the
first call to dsl_dataset_sync() and when the useraccounting code
is run when the type is set DMU_OT_NONE, but the dnode is not yet
evictable. This patch adds a check for whether dn_dirty_link is
active to determine if we are in this state.

This patch also corrects several instances when dn_dirty_link was
treated as a list_node_t when it is technically a multilist_node_t.

Fixes: openzfs#7147

Signed-off-by: Tom Caputi <[email protected]>
@tcaputi
Contributor

tcaputi commented Apr 4, 2018

@nivedita76 if you get a chance, try the patch from #7388 and see if the problem is fixed.

@nivedita76
Contributor

Got an oops.

[ 62.868131] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[ 62.868522] IP: multilist_sublist_lock_obj+0x9/0x20
[ 62.868868] PGD 0 P4D 0
[ 62.869210] Oops: 0000 [#1] SMP PTI
[ 62.869579] Modules linked in: macvtap macvlan tap bridge devlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables nvidia_uvm(OE) 8021q garp mrp stp llc dm_mod snd_hda_codec_hdmi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel vboxpci(OE) aes_x86_64 crypto_simd glue_helper cryptd vboxnetadp(OE) intel_cstate snd_usb_audio vboxnetflt(OE) vboxdrv(OE) intel_uncore snd_usbmidi_lib snd_rawmidi intel_rapl_perf snd_seq_device nvidia_drm(OE) nvidia_modeset(OE) snd_hda_codec_realtek snd_hda_codec_generic nvidia(OE) snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore
[ 62.872350] CPU: 55 PID: 7977 Comm: mysqld Tainted: G W OE 4.16.0-rani-zfs+ #7
[ 62.872806] Hardware name: ASUSTeK COMPUTER INC. Z10PE-D8 WS/Z10PE-D8 WS, BIOS 3407 03/10/2017
[ 62.873275] RIP: 0010:multilist_sublist_lock_obj+0x9/0x20
[ 62.873748] RSP: 0018:ffffbec4f0123828 EFLAGS: 00010246
[ 62.874222] RAX: ffffa1ae31c74000 RBX: 0000000000000000 RCX: 0000000000000000
[ 62.874706] RDX: ffffffff903a25c0 RSI: ffffa1ae07b47638 RDI: 0000000000000000
[ 62.875194] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
[ 62.875682] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa1ae07b47798
[ 62.876170] R13: ffffa1adf6edf0f8 R14: 0000000000000000 R15: 0000000000000002
[ 62.876659] FS: 00007f67fcdbbe00(0000) GS:ffffa1ae3e800000(0000) knlGS:0000000000000000
[ 62.877158] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 62.877656] CR2: 0000000000000018 CR3: 0000000ff68e0003 CR4: 00000000003626e0
[ 62.878161] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 62.878657] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 62.879185] Call Trace:
[ 62.879770] dnode_check_slots_free+0x107/0x1c0
[ 62.880397] dnode_hold_impl+0x5d5/0xf30
[ 62.880887] dmu_object_alloc_dnsize+0x241/0x520
[ 62.881385] zfs_mknode+0x130/0xf70
[ 62.881887] ? _raw_spin_unlock+0x24/0x30
[ 62.882384] ? dmu_tx_assign+0x230/0x560
[ 62.882885] zfs_create+0x6b9/0x9e0
[ 62.883388] zpl_create+0xaa/0x160
[ 62.883917] lookup_open+0x5da/0x780
[ 62.884446] path_openat+0x32c/0xd10
[ 62.884944] do_filp_open+0xb4/0x130
[ 62.885449] ? do_sys_open+0x1c7/0x250
[ 62.885938] do_sys_open+0x1c7/0x250
[ 62.886412] do_syscall_64+0x66/0x120
[ 62.886881] entry_SYSCALL_64_after_hwframe+0x3d/0xa2

tcaputi pushed a commit to datto/zfs that referenced this issue Apr 6, 2018
Currently, dnode_check_slots_free() works by checking dn->dn_type
in the dnode to determine if the dnode is reclaimable. However,
there is a small window of time between dnode_free_sync() in the
first call to dsl_dataset_sync() and when the useraccounting code
is run when the type is set DMU_OT_NONE, but the dnode is not yet
evictable. This patch adds the ability for dnodes to know which
txg they were last dirtied in and adds a check for this before
performing the reclaim.

This patch also corrects several instances when dn_dirty_link was
treated as a list_node_t when it is technically a multilist_node_t.

Fixes: openzfs#7147

Signed-off-by: Tom Caputi <[email protected]>
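
A sketch of the txg-tracking idea that commit message describes (the field and function names here are illustrative, not necessarily those of the merged patch):

#include <stdint.h>
#include <stdbool.h>

typedef struct dnode_sketch {
	uint64_t dn_dirty_txg;	/* txg in which this dnode was last dirtied */
} dnode_sketch_t;

/* record the dirtying txg whenever the dnode is modified */
static void dnode_mark_dirty(dnode_sketch_t *dn, uint64_t txg)
{
	dn->dn_dirty_txg = txg;
}

/* A freed slot is only reclaimable once every txg that dirtied it
 * has fully synced; inside that window the user-accounting task may
 * still hold the dnode even though dn_type is already DMU_OT_NONE. */
static bool dnode_slot_reclaimable(const dnode_sketch_t *dn, uint64_t last_synced_txg)
{
	return (dn->dn_dirty_txg <= last_synced_txg);
}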
behlendorf pushed a commit to behlendorf/zfs that referenced this issue Oct 10, 2018
Currently, dnode_check_slots_free() works by checking dn->dn_type
in the dnode to determine if the dnode is reclaimable. However,
there is a small window of time between dnode_free_sync() in the
first call to dsl_dataset_sync() and when the useraccounting code
is run when the type is set DMU_OT_NONE, but the dnode is not yet
evictable, leading to crashes. This patch adds the ability for
dnodes to track which txg they were last dirtied in and adds a
check for this before performing the reclaim.

This patch also corrects several instances when dn_dirty_link was
treated as a list_node_t when it is technically a multilist_node_t.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Requires-spl: spl-0.7-release
Issue openzfs#7147
Issue openzfs#7388
Issue openzfs#7997
tonyhutter pushed a commit that referenced this issue Nov 13, 2018
Currently, dnode_check_slots_free() works by checking dn->dn_type
in the dnode to determine if the dnode is reclaimable. However,
there is a small window of time between dnode_free_sync() in the
first call to dsl_dataset_sync() and when the useraccounting code
is run when the type is set DMU_OT_NONE, but the dnode is not yet
evictable, leading to crashes. This patch adds the ability for
dnodes to track which txg they were last dirtied in and adds a
check for this before performing the reclaim.

This patch also corrects several instances when dn_dirty_link was
treated as a list_node_t when it is technically a multilist_node_t.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7147
Closes #7388