Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Null pointer dereference on zfs 0.6.4.1-2 and Debian 8 #3678

Closed
Pentium100MHz opened this issue Aug 11, 2015 · 8 comments
Closed

Null pointer dereference on zfs 0.6.4.1-2 and Debian 8 #3678

Pentium100MHz opened this issue Aug 11, 2015 · 8 comments
Labels
Component: ZVOL ZFS Volumes
Milestone

Comments

@Pentium100MHz
Copy link

Hi, I saw that the system in my previous issue was not the latest version, so I updated, now I got this:

[324729.276946] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
[324729.276959] IP: [<ffffffffa08f70b8>] zvol_revalidate_disk+0x28/0x30 [zfs]
[324729.276973] PGD 0 
[324729.276976] Oops: 0002 [#1] SMP 
[324729.276979] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver binfmt_misc nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_physdev ip_set nfnetlink vhost_net vhost macvtap macvlan ip6table_filter ip6_tables iptable_filter ip_tables ebt_arp ebt_ip ebtable_nat ebtables x_tables tun nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc 8021q garp mrp bridge stp llc bonding snd_hda_codec_hdmi zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) x86_pkg_temp_thermal intel_powerclamp intel_rapl coretemp kvm_intel kvm crc32_pclmul iTCO_wdt eeepc_wmi iTCO_vendor_support asus_wmi ghash_clmulni_intel sparse_keymap rfkill snd_hda_codec_realtek snd_hda_codec_generic evdev nouveau aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd mxm_wmi video psmouse ttm serio_raw
[324729.277055]  snd_hda_intel sb_edac pcspkr snd_hda_controller drm_kms_helper edac_core snd_hda_codec drm snd_hwdep snd_pcm i2c_algo_bit snd_timer snd lpc_ich i2c_i801 mfd_core soundcore mei_me i2c_core mei shpchp wmi processor tpm_infineon thermal_sys tpm_tis tpm button autofs4 ext4 crc16 mbcache jbd2 dm_mod raid1 md_mod sg sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci ehci_pci libata xhci_hcd ehci_hcd firewire_ohci firewire_core e1000e crc_itu_t usbcore ptp scsi_mod usb_common pps_core
[324729.277116] CPU: 1 PID: 32452 Comm: systemd-udevd Tainted: P           O  3.16.0-4-amd64 #1 Debian 3.16.7-ckt11-1+deb8u3
[324729.277120] Hardware name: System manufacturer System Product Name/P9X79, BIOS 4608 12/24/2013
[324729.277123] task: ffff880c2a6641d0 ti: ffff880dbe938000 task.ti: ffff880dbe938000
[324729.277126] RIP: 0010:[<ffffffffa08f70b8>]  [<ffffffffa08f70b8>] zvol_revalidate_disk+0x28/0x30 [zfs]
[324729.277135] RSP: 0018:ffff880dbe93bb88  EFLAGS: 00010246
[324729.277138] RAX: 0000000000000000 RBX: ffff880ffe3114c0 RCX: 000000000000fac8
[324729.277141] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff880ffb5b2c00
[324729.277143] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
[324729.277146] R10: 00000005f3940000 R11: fffffffa0c6d7ce5 R12: ffffffffa092d3c0
[324729.277149] R13: 0000000000000001 R14: 000000000000005d R15: 0000000000000001
[324729.277152] FS:  00007f2069509880(0000) GS:ffff88103fc40000(0000) knlGS:0000000000000000
[324729.277155] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[324729.277157] CR2: 0000000000000050 CR3: 0000000e80089000 CR4: 00000000000427e0
[324729.277159] Stack:
[324729.277161]  ffffffff811dbf9b ffff880fead42800 ffff880ffe3114c0 00000000fffffffe
[324729.277166]  ffffffffa08f7a51 000000000e600050 ffff880dbe93bc14 ffff880ffe3114d8
[324729.277539]  ffff880ffe3114c0 ffff880ffb5b2c00 ffff880ffe3114d8 ffffffffa092d460
[324729.278205] Call Trace:
[324729.278878]  [<ffffffff811dbf9b>] ? check_disk_change+0x5b/0x70
[324729.279554]  [<ffffffffa08f7a51>] ? zvol_open+0xb1/0x2f0 [zfs]
[324729.280215]  [<ffffffff811dc87c>] ? __blkdev_get+0xcc/0x480
[324729.280885]  [<ffffffff811dcf80>] ? blkdev_get_by_dev+0x40/0x40
[324729.281560]  [<ffffffff811dcde6>] ? blkdev_get+0x1b6/0x310
[324729.282238]  [<ffffffff811dcf80>] ? blkdev_get_by_dev+0x40/0x40
[324729.282918]  [<ffffffff811a5c42>] ? do_dentry_open+0x1f2/0x330
[324729.283605]  [<ffffffff811a5f4d>] ? finish_open+0x2d/0x40
[324729.284283]  [<ffffffff811b6aaa>] ? do_last+0xa6a/0x11e0
[324729.284957]  [<ffffffff811b30c6>] ? link_path_walk+0x286/0x8a0
[324729.285624]  [<ffffffff811b72db>] ? path_openat+0xbb/0x680
[324729.286288]  [<ffffffff8117a565>] ? free_pages_and_swap_cache+0x95/0xb0
[324729.286948]  [<ffffffff811b804a>] ? do_filp_open+0x3a/0x90
[324729.287606]  [<ffffffff811c40ac>] ? __alloc_fd+0x7c/0x120
[324729.288261]  [<ffffffff811a7489>] ? do_sys_open+0x129/0x220
[324729.288909]  [<ffffffff8151158d>] ? system_call_fast_compare_end+0x10/0x15
[324729.289554] Code: 00 00 00 66 66 66 66 90 48 8b 87 80 03 00 00 c7 80 20 01 00 00 00 00 00 00 48 8b 90 98 05 00 00 48 8b 80 00 01 00 00 48 c1 e8 09 <48> 89 42 50 31 c0 c3 90 66 66 66 66 90 48 8b 87 98 00 00 00 48 
[324729.290954] RIP  [<ffffffffa08f70b8>] zvol_revalidate_disk+0x28/0x30 [zfs]
[324729.291600]  RSP <ffff880dbe93bb88>
[324729.292246] CR2: 0000000000000050
[324729.295066] ---[ end trace de6614df51fa0c39 ]---

The pool has a single raidz1 vdev with 3 SSDs.

modinfo zfs | grep version

modinfo zfs  | grep version
version:        0.6.4-1.2-1
srcversion:     D7F5DB3F148B3DF8B2FCEE3
vermagic:       3.16.0-4-amd64 SMP mod_unload modversions

modinfo spl | grep version

version:        0.6.4-1b
srcversion:     88A374A9A6ABDC4BD14DF0E
vermagic:       3.16.0-4-amd64 SMP mod_unload modversions

cat /etc/debian_version

8.1
@Pentium100MHz
Copy link
Author

Update: this may have something to do with trying to write to a zvol that is marked read only.
EDIT: more testing showed that no, it's not that.
This is how it happens:

ssh otherserver "zfs send -v tank/fish@snap" | zfs recv -Fduv tank
zfs set readonly=yes tank/fish
ssh otherserver "zfs send -v tank/fish/zvol@snap" | zfs recv -Fduv tank
zfs set readonly=yes tank/fish
ssh otherserver "zfs send -v tank/fish/zvol2@snap | zfs recv -Fduv tank

The last zfs recv hangs in D state and dmesg shows the Null Pointer Dereference error.
However, this only happens sometimes, for now, I cannot figure out why it only happens sometimes, perhaps some race condition or it is load dependant?

It may also depend on the size of the zvol.

@behlendorf
Copy link
Contributor

Yes, it definitely does sound like there some kind of race here but after a brief inspection it's not obvious. Clearly one of these fields was NULL, but an offset of 0x50 doesn't make sense for any of the members. We'll need to dig deeper.

@Pentium100MHz
Copy link
Author

It's somewhere that is running all the time though, since once that happens (the error message), zvols cannot be created, destroyed or opened. That is, if a process (virtual machine in my case) was accessing a zvol it continues to be able to access it, but starting new VMs, stopping VMs or even running dd to/from a zvol gets stuck in D state. This continues until I hard reboot the server (running reboot from CLI gets stuck). Creating/destroying regular datasets work normally as far as I noticed.

@Pentium100MHz
Copy link
Author

This happens also on zfs 0.6.5.1-4.
dmesg output:

[438685.162284] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
[438685.162412] IP: [<ffffffffa084bc28>] zvol_revalidate_disk+0x28/0x30 [zfs]
[438685.162557] PGD 0 
[438685.162846] Oops: 0002 [#1] SMP 
[438685.163271] Modules linked in: binfmt_misc vhost_net vhost macvtap macvlan fuse nls_utf8 ntfs rpcsec_gss_krb5 nfsv4 dns_resolver ip6table_filter ip6_tables ebt_arp ebt_ip ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_physdev ip_set nfnetlink iptable_filter ip_tables x_tables tun nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc 8021q garp mrp bridge stp llc bonding snd_hda_codec_hdmi zfs(PO) zunicode(PO) zcommon(PO) znvpair(PO) spl(O) zavl(PO) x86_pkg_temp_thermal intel_powerclamp intel_rapl coretemp kvm_intel kvm crc32_pclmul eeepc_wmi asus_wmi iTCO_wdt sparse_keymap iTCO_vendor_support rfkill evdev snd_hda_codec_realtek ghash_clmulni_intel snd_hda_codec_generic nouveau aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd mxm_wmi video
[438685.166277]  psmouse ttm sb_edac pcspkr serio_raw drm_kms_helper snd_hda_intel snd_hda_controller edac_core drm snd_hda_codec snd_hwdep i2c_i801 i2c_algo_bit snd_pcm i2c_core snd_timer snd mei_me soundcore mei lpc_ich shpchp mfd_core wmi tpm_infineon processor tpm_tis button tpm thermal_sys autofs4 ext4 crc16 mbcache jbd2 dm_mod raid1 md_mod sg sd_mod crc_t10dif crct10dif_generic firewire_ohci crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci ehci_pci libata xhci_hcd ehci_hcd firewire_core crc_itu_t e1000e usbcore ptp scsi_mod usb_common pps_core
[438685.169255] CPU: 7 PID: 9478 Comm: systemd-udevd Tainted: P           O  3.16.0-4-amd64 #1 Debian 3.16.7-ckt11-1+deb8u4
[438685.169902] Hardware name: System manufacturer System Product Name/P9X79, BIOS 4608 12/24/2013
[438685.170557] task: ffff880690f37430 ti: ffff8802a8d78000 task.ti: ffff8802a8d78000
[438685.171216] RIP: 0010:[<ffffffffa084bc28>]  [<ffffffffa084bc28>] zvol_revalidate_disk+0x28/0x30 [zfs]
[438685.171896] RSP: 0018:ffff8802a8d7bb90  EFLAGS: 00010246
[438685.172582] RAX: 0000000000000000 RBX: ffff880865de2800 RCX: 0000000000009882
[438685.173264] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88043f9fb800
[438685.173955] RBP: 0000000000000001 R08: ffff8802a8d78000 R09: 0000000000000067
[438685.174648] R10: 0000000000000030 R11: 0000000000000008 R12: ffffffffa08824e0
[438685.175334] R13: 0000000000000001 R14: 0000000000000000 R15: ffff8800b7adba10
[438685.176024] FS:  00007efeb6380880(0000) GS:ffff88103fdc0000(0000) knlGS:0000000000000000
[438685.176715] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[438685.177412] CR2: 0000000000000050 CR3: 00000006c3f21000 CR4: 00000000000427e0
[438685.178127] Stack:
[438685.178843]  ffffffff811dbfcb ffff88059db55800 ffff880865de2800 0000000000000000
[438685.179591]  ffffffffa084c601 000000000e6006f0 ffff8802a8d7bc1c ffff880865de2818
[438685.180325]  ffff880865de2800 ffff88043f9fb800 ffff880865de2818 ffffffffa0882580
[438685.181066] Call Trace:
[438685.181798]  [<ffffffff811dbfcb>] ? check_disk_change+0x5b/0x70
[438685.182545]  [<ffffffffa084c601>] ? zvol_open+0xb1/0x2f0 [zfs]
[438685.183266]  [<ffffffff811dc8ac>] ? __blkdev_get+0xcc/0x480
[438685.183985]  [<ffffffff811dcfb0>] ? blkdev_get_by_dev+0x40/0x40
[438685.184691]  [<ffffffff811dce16>] ? blkdev_get+0x1b6/0x310
[438685.185405]  [<ffffffff811dcfb0>] ? blkdev_get_by_dev+0x40/0x40
[438685.186106]  [<ffffffff811a5c02>] ? do_dentry_open+0x1f2/0x330
[438685.186799]  [<ffffffff811a5f0d>] ? finish_open+0x2d/0x40
[438685.187488]  [<ffffffff811b66d4>] ? do_last+0x654/0x11b0
[438685.188165]  [<ffffffff811b72eb>] ? path_openat+0xbb/0x680
[438685.188838]  [<ffffffff8117a525>] ? free_pages_and_swap_cache+0x95/0xb0
[438685.189508]  [<ffffffff811b805a>] ? do_filp_open+0x3a/0x90
[438685.190172]  [<ffffffff811c40dc>] ? __alloc_fd+0x7c/0x120
[438685.190828]  [<ffffffff811a7449>] ? do_sys_open+0x129/0x220
[438685.191492]  [<ffffffff8151164d>] ? system_call_fast_compare_end+0x10/0x15
[438685.192152] Code: 00 00 00 66 66 66 66 90 48 8b 87 80 03 00 00 c7 80 20 01 00 00 00 00 00 00 48 8b 90 98 05 00 00 48 8b 80 00 01 00 00 48 c1 e8 09 <48> 89 42 50 31 c0 c3 90 66 66 66 66 90 48 8b 87 98 00 00 00 48 
[438685.193579] RIP  [<ffffffffa084bc28>] zvol_revalidate_disk+0x28/0x30 [zfs]
[438685.194253]  RSP <ffff8802a8d7bb90>
[438685.194914] CR2: 0000000000000050
[438685.197795] ---[ end trace c35f2639f6ef4565 ]---

stack dump of a stuck "zfs create -V 1G home/test/testzvol" process

[<ffffffffa0810234>] zap_cursor_retrieve+0x74/0x2f0 [zfs]
[<ffffffffa084da45>] zvol_create_minor+0x15/0x60 [zfs]
[<ffffffffa084da9a>] zvol_create_minors_cb+0xa/0x10 [zfs]
[<ffffffffa07afef0>] dmu_objset_find_impl+0xf0/0x3d0 [zfs]
[<ffffffffa084da90>] zvol_create_minors_cb+0x0/0x10 [zfs]
[<ffffffffa084da90>] zvol_create_minors_cb+0x0/0x10 [zfs]
[<ffffffffa07b0213>] dmu_objset_find+0x43/0x70 [zfs]
[<ffffffffa082410d>] zfs_ioc_create+0x15d/0x280 [zfs]
[<ffffffffa0821d7f>] zfsdev_ioctl+0x1df/0x4c0 [zfs]
[<ffffffff811ba4ef>] do_vfs_ioctl+0x2cf/0x4b0
[<ffffffff811ba751>] SyS_ioctl+0x81/0xa0
[<ffffffff81513668>] page_fault+0x28/0x30
[<ffffffff8151164d>] system_call_fast_compare_end+0x10/0x15
[<ffffffffffffffff>] 0xffffffffffffffff

@dweeezil
Copy link
Contributor

dweeezil commented Oct 5, 2015

@behlendorf The 0x50 is in set_capacity():

(gdb) print &((struct gendisk *)0)->part0
$7 = (struct hd_struct *) 0x48 <buf_hash_remove+24>
(gdb) print &((struct gendisk *)0)->part0.nr_sects
$8 = (sector_t *) 0x50 <buf_hash_remove+32>

@Pentium100MHz
Copy link
Author

I got a different error message this time:

Oct  7 11:38:51 ohv1 kernel: [98083.739802] PGD 0 
Oct  7 11:38:51 ohv1 kernel: [98083.739921] Oops: 0002 [#1] SMP 
Oct  7 11:38:51 ohv1 kernel: [98083.740194] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ip6table_filter ip6_tables ebt_arp ebt_ip ebtable_nat ebtables tun xt_physdev ip_set nfnetlink iptable_filter ip_tables x_tables rpcsec_gss_krb5 nfsv4 dns_resolver nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc 8021q garp mrp bridge stp llc bonding snd_hda_codec_hdmi zfs(PO) zunicode(PO) x86_pkg_temp_thermal zcommon(PO) znvpair(PO) intel_powerclamp spl(O) zavl(PO) intel_rapl coretemp kvm_intel kvm crc32_pclmul ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic eeepc_wmi asus_wmi sparse_keymap iTCO_wdt iTCO_vendor_support rfkill evdev nouveau aesni_intel aes_x86_64 lrw gf128mul snd_hda_intel glue_helper psmouse ablk_helper snd_hda_controller cryptd mxm_wmi snd_hda_codec video snd_hwdep ttm sb_edac serio_raw drm_kms_helper snd_pcm pcspkr edac_core drm snd_timer i2c_i801 i2c_algo_bit i2c_core snd soundcore mei_me mei lpc_ich shpchp mfd_core wmi tpm_infineon processor tpm_tis thermal_sys tpm button fuse autofs4 ext4 crc16 mbcache jbd2 dm_mod raid1 md_mod sg sd_mod crc_t10dif crct10dif_generic ehci_pci xhci_hcd ehci_hcd crct10dif_pclmul crct10dif_common crc32c_intel ahci firewire_ohci libahci usbcore firewire_core crc_itu_t libata e1000e ptp scsi_mod usb_common pps_core
Oct  7 11:38:51 ohv1 kernel: [98083.745203] CPU: 6 PID: 1018 Comm: systemd-udevd Tainted: P           O  3.16.0-4-amd64 #1 Debian 3.16.7-ckt11-1+deb8u4
Oct  7 11:38:51 ohv1 kernel: [98083.745792] Hardware name: System manufacturer System Product Name/P9X79, BIOS 4608 12/24/2013
Oct  7 11:38:51 ohv1 kernel: [98083.746376] task: ffff880c44eec390 ti: ffff88015d028000 task.ti: ffff88015d028000
Oct  7 11:38:51 ohv1 kernel: [98083.746978] RIP: 0010:[<ffffffffa0b38c28>]  [<ffffffffa0b38c28>] zvol_revalidate_disk+0x28/0x30 [zfs]
Oct  7 11:38:51 ohv1 kernel: [98083.747595] RSP: 0018:ffff88015d02bb90  EFLAGS: 00010246
Oct  7 11:38:51 ohv1 kernel: [98083.748224] RAX: 0000000000000000 RBX: ffff880db102f1c0 RCX: 0000000000008853
Oct  7 11:38:51 ohv1 kernel: [98083.748853] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff880ff994d000
Oct  7 11:38:51 ohv1 kernel: [98083.749487] RBP: 0000000000000001 R08: ffffffff81610960 R09: 0000000000000001
Oct  7 11:38:51 ohv1 kernel: [98083.750125] R10: 0000000000012f00 R11: 0000000000000010 R12: ffffffffa0b6f4e0
Oct  7 11:38:51 ohv1 kernel: [98083.750778] R13: 0000000000000001 R14: 000000000000005d R15: 0000000000000001
Oct  7 11:38:51 ohv1 kernel: [98083.751434] FS:  00007fba1a94c880(0000) GS:ffff88103fd80000(0000) knlGS:0000000000000000
Oct  7 11:38:51 ohv1 kernel: [98083.752095] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct  7 11:38:51 ohv1 kernel: [98083.752757] CR2: 0000000000000050 CR3: 0000000340bb0000 CR4: 00000000000427e0
Oct  7 11:38:51 ohv1 kernel: [98083.753436] Stack:
Oct  7 11:38:51 ohv1 kernel: [98083.754104]  ffffffff811dbfcb ffff880a5f1c9800 ffff880db102f1c0 00000000fffffffe
Oct  7 11:38:51 ohv1 kernel: [98083.754795]  ffffffffa0b39601 000000000e600300 ffff88015d02bc1c ffff880db102f1d8
Oct  7 11:38:51 ohv1 kernel: [98083.755483]  ffff880db102f1c0 ffff880ff994d000 ffff880db102f1d8 ffffffffa0b6f580
Oct  7 11:38:51 ohv1 kernel: [98083.756168] Call Trace:
Oct  7 11:38:51 ohv1 kernel: [98083.756845]  [<ffffffff811dbfcb>] ? check_disk_change+0x5b/0x70
Oct  7 11:38:51 ohv1 kernel: [98083.757533]  [<ffffffffa0b39601>] ? zvol_open+0xb1/0x2f0 [zfs]
Oct  7 11:38:51 ohv1 kernel: [98083.758199]  [<ffffffff811dc8ac>] ? __blkdev_get+0xcc/0x480
Oct  7 11:38:51 ohv1 kernel: [98083.758878]  [<ffffffff811dcfb0>] ? blkdev_get_by_dev+0x40/0x40
Oct  7 11:38:51 ohv1 kernel: [98083.759564]  [<ffffffff811dce16>] ? blkdev_get+0x1b6/0x310
Oct  7 11:38:51 ohv1 kernel: [98083.760246]  [<ffffffff811dcfb0>] ? blkdev_get_by_dev+0x40/0x40
Oct  7 11:38:51 ohv1 kernel: [98083.760925]  [<ffffffff811a5c02>] ? do_dentry_open+0x1f2/0x330
Oct  7 11:38:51 ohv1 kernel: [98083.761612]  [<ffffffff811a5f0d>] ? finish_open+0x2d/0x40
Oct  7 11:38:51 ohv1 kernel: [98083.762293]  [<ffffffff811b66d4>] ? do_last+0x654/0x11b0
Oct  7 11:38:51 ohv1 kernel: [98083.762970]  [<ffffffff811b72eb>] ? path_openat+0xbb/0x680
Oct  7 11:38:51 ohv1 kernel: [98083.763642]  [<ffffffff8117a525>] ? free_pages_and_swap_cache+0x95/0xb0
Oct  7 11:38:51 ohv1 kernel: [98083.764323]  [<ffffffff811b805a>] ? do_filp_open+0x3a/0x90
Oct  7 11:38:51 ohv1 kernel: [98083.764998]  [<ffffffff811c40dc>] ? __alloc_fd+0x7c/0x120
Oct  7 11:38:51 ohv1 kernel: [98083.765671]  [<ffffffff811a7449>] ? do_sys_open+0x129/0x220
Oct  7 11:38:51 ohv1 kernel: [98083.766338]  [<ffffffff8151164d>] ? system_call_fast_compare_end+0x10/0x15
Oct  7 11:38:51 ohv1 kernel: [98083.766996] Code: 00 00 00 66 66 66 66 90 48 8b 87 80 03 00 00 c7 80 20 01 00 00 00 00 00 00 48 8b 90 98 05 00 00 48 8b 80 00 01 00 00 48 c1 e8 09 <48> 89 42 50 31 c0 c3 90 66 66 66 66 90 48 8b 87 98 00 00 00 48 
Oct  7 11:38:51 ohv1 kernel: [98083.769089]  RSP <ffff88015d02bb90>
Oct  7 11:38:51 ohv1 kernel: [98083.769745] CR2: 0000000000000050
Oct  7 11:38:51 ohv1 kernel: [98083.772607] ---[ end trace 4031de5a1eddb053 ]---
Oct  7 11:38:55 ohv1 kernel: [98087.007144] PGD 0 
Oct  7 11:38:55 ohv1 kernel: [98087.007745] Oops: 0002 [#2] SMP 
Oct  7 11:38:55 ohv1 kernel: [98087.008326] Modules linked in: nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ip6table_filter ip6_tables ebt_arp ebt_ip ebtable_nat ebtables tun xt_physdev ip_set nfnetlink iptable_filter ip_tables x_tables rpcsec_gss_krb5 nfsv4 dns_resolver nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc 8021q garp mrp bridge stp llc bonding snd_hda_codec_hdmi zfs(PO) zunicode(PO) x86_pkg_temp_thermal zcommon(PO) znvpair(PO) intel_powerclamp spl(O) zavl(PO) intel_rapl coretemp kvm_intel kvm crc32_pclmul ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic eeepc_wmi asus_wmi sparse_keymap iTCO_wdt iTCO_vendor_support rfkill evdev nouveau aesni_intel aes_x86_64 lrw gf128mul snd_hda_intel glue_helper psmouse ablk_helper snd_hda_controller cryptd mxm_wmi snd_hda_codec video snd_hwdep ttm sb_edac serio_raw drm_kms_helper snd_pcm pcspkr edac_core drm snd_timer i2c_i801 i2c_algo_bit i2c_core snd soundcore mei_me mei lpc_ich shpchp mfd_core wmi tpm_infineon processor tpm_tis thermal_sys tpm button fuse autofs4 ext4 crc16 mbcache jbd2 dm_mod raid1 md_mod sg sd_mod crc_t10dif crct10dif_generic ehci_pci xhci_hcd ehci_hcd crct10dif_pclmul crct10dif_common crc32c_intel ahci firewire_ohci libahci usbcore firewire_core crc_itu_t libata e1000e ptp scsi_mod usb_common pps_core
Oct  7 11:38:55 ohv1 kernel: [98087.015019] CPU: 1 PID: 1148 Comm: systemd-udevd Tainted: P      D    O  3.16.0-4-amd64 #1 Debian 3.16.7-ckt11-1+deb8u4
Oct  7 11:38:55 ohv1 kernel: [98087.015676] Hardware name: System manufacturer System Product Name/P9X79, BIOS 4608 12/24/2013
Oct  7 11:38:55 ohv1 kernel: [98087.016339] task: ffff880bd1810aa0 ti: ffff8809edb00000 task.ti: ffff8809edb00000
Oct  7 11:38:55 ohv1 kernel: [98087.017006] RIP: 0010:[<ffffffff8140c040>]  [<ffffffff8140c040>] skb_dequeue+0x40/0x70
Oct  7 11:38:55 ohv1 kernel: [98087.017688] RSP: 0018:ffff8809edb03e58  EFLAGS: 00010097
Oct  7 11:38:55 ohv1 kernel: [98087.018359] RAX: 0000000000000246 RBX: ffff880a00000000 RCX: 0000000000000000
Oct  7 11:38:55 ohv1 kernel: [98087.019039] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff880a5f1c9934
Oct  7 11:38:55 ohv1 kernel: [98087.019718] RBP: ffff880a5f1c9920 R08: 0000000000000000 R09: 0000000000000000
Oct  7 11:38:55 ohv1 kernel: [98087.020394] R10: ffff880ffa82fa10 R11: 0000000000000246 R12: ffff880a5f1c9934
Oct  7 11:38:55 ohv1 kernel: [98087.021062] R13: ffff881038fd57e0 R14: ffff880ffa2393d8 R15: ffff880e401495f0
Oct  7 11:38:55 ohv1 kernel: [98087.021729] FS:  00007fba1a94c880(0000) GS:ffff88103fc40000(0000) knlGS:0000000000000000
Oct  7 11:38:55 ohv1 kernel: [98087.022399] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct  7 11:38:55 ohv1 kernel: [98087.023065] CR2: 0000000000000008 CR3: 00000004b0179000 CR4: 00000000000427e0
Oct  7 11:38:55 ohv1 kernel: [98087.023735] Stack:
Oct  7 11:38:55 ohv1 kernel: [98087.024404]  ffff880a5f1c9920 ffff880e401495c0 ffff880a5f1c9998 ffffffff8140d250
Oct  7 11:38:55 ohv1 kernel: [98087.025107]  ffff880a5f1c9800 ffffffff81449cfb ffffea0000000000 0000000000000000
Oct  7 11:38:55 ohv1 kernel: [98087.025794]  ffff880e401495c0 0000000000000000 ffff880e401495f0 ffffffff8140327a
Oct  7 11:38:55 ohv1 kernel: [98087.026486] Call Trace:
Oct  7 11:38:55 ohv1 kernel: [98087.027175]  [<ffffffff8140d250>] ? skb_queue_purge+0x20/0x30
Oct  7 11:38:55 ohv1 kernel: [98087.027877]  [<ffffffff81449cfb>] ? netlink_release+0xfb/0x320
Oct  7 11:38:55 ohv1 kernel: [98087.028580]  [<ffffffff8140327a>] ? sock_release+0x1a/0x90
Oct  7 11:38:55 ohv1 kernel: [98087.029282]  [<ffffffff814032fe>] ? sock_close+0xe/0x20
Oct  7 11:38:55 ohv1 kernel: [98087.029980]  [<ffffffff811a9b4a>] ? __fput+0xca/0x1d0
Oct  7 11:38:55 ohv1 kernel: [98087.030675]  [<ffffffff810851a7>] ? task_work_run+0x97/0xd0
Oct  7 11:38:55 ohv1 kernel: [98087.031365]  [<ffffffff81012e99>] ? do_notify_resume+0x69/0xa0
Oct  7 11:38:55 ohv1 kernel: [98087.032068]  [<ffffffff8151190a>] ? int_signal+0x12/0x17
Oct  7 11:38:55 ohv1 kernel: [98087.032755] Code: 50 10 00 48 8b 5d 00 48 39 eb 74 3f 48 85 db 74 21 83 6d 10 01 48 8b 0b 48 8b 53 08 48 c7 03 00 00 00 00 48 c7 43 08 00 00 00 00 <48> 89 51 08 48 89 0a 48 89 c6 4c 89 e7 e8 6e 4d 10 00 48 89 d8 
Oct  7 11:38:55 ohv1 kernel: [98087.034883]  RSP <ffff8809edb03e58>
Oct  7 11:38:55 ohv1 kernel: [98087.035567] CR2: 0000000000000008
Oct  7 11:38:55 ohv1 kernel: [98087.036253] ---[ end trace 4031de5a1eddb054 ]---

This may or may not (I was not able to reproduce it at will, so it may be a coincidence) have something to do with creating a snapshot of a zvol, dd'ing it to a remote server and then creating a new zvol.

@ab-oe
Copy link
Contributor

ab-oe commented Jan 4, 2016

Hello,
I 'vealso reproduced this issue. The problem occurs when the zvol_state_lock mutex is held by remove function and meanwhile zvol_open is invoked. The zvol_open waits until zvol_state_lock is released and then does its job on already removed zvol. I resolved this by checking if zvol is still available on the zvol_state_list. I think also that check_disk_change should be invoked while the zvol_state_lock mutex is held.

Here's proposed patch:

--- zvol.c.orig
+++ zvol.c
@@ -142,6 +142,23 @@ zvol_find_by_name(const char *name)
    return (NULL);
 }

+/*
+ * Check if zvol is on zvol_state_list
+ * Return 0 if zvol is already removed or 1 otherwise.
+ */
+static int find_zv_on_list(zvol_state_t *pzv) {
+   zvol_state_t *zv;
+
+   ASSERT(MUTEX_HELD(&zvol_state_lock));
+   for (zv = list_head(&zvol_state_list); zv != NULL;
+       zv = list_next(&zvol_state_list, zv)) {
+       if (zv == pzv)
+           return 1;
+   }
+
+   return 0;
+}
+

 /*
  * Given a path, return TRUE if path is a ZVOL.
@@ -987,6 +1004,17 @@ zvol_open(struct block_device *bdev, fmo
        drop_mutex = 1;
    }

+   /* zvol_open can be called in parallel while _zvol_remove_minor are doing its job
+      the zvol_open function will continue after the zvol_state_lock mutex is released
+      but the zv data are already freed.
+   */
+
+   if (!find_zv_on_list(zv)) {
+       printk(KERN_DEBUG "Trying to open removed zvol");
+       error = -ENXIO;
+       goto out_mutex;
+   }
+
    ASSERT3P(zv, !=, NULL);

    if (zv->zv_open_count == 0) {
@@ -1006,12 +1034,12 @@ out_open_count:
    if (zv->zv_open_count == 0)
        zvol_last_close(zv);

+   check_disk_change(bdev);
+
 out_mutex:
    if (drop_mutex)
        mutex_exit(&zvol_state_lock);

-   check_disk_change(bdev);
-
    return (SET_ERROR(error));
 }

@radhus
Copy link

radhus commented Jan 31, 2016

Hi,

I think I've got the exact same backtraces four times when exporting my pool (simply 'zpool export pool'). Maybe it'll help that it occurred during export.
The same 0x50 offset occurs here as well:

[  595.312789] BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
[  595.312800] IP: [<ffffffffa0330c78>] zvol_revalidate_disk+0x28/0x30 [zfs]
[  595.312815] PGD 0
[  595.312817] Oops: 0002 [#1] SMP
[  595.312820] Modules linked in: xenfs xen_privcmd nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc algif_skciphe
r af_alg dm_crypt dm_mod x86_pkg_temp_thermal thermal_sys intel_rapl coretemp crc32_pclmul zfs(PO) zunicode(PO) evdev zcommon(PO
) znvpair(PO) spl(O) zavl(PO) aesni_intel aes_x86_64 lrw gf128mul glue_helper pcspkr ablk_helper cryptd autofs4 ext4 crc16 mbcac
he jbd2 sg sd_mod crc_t10dif crct10dif_generic ahci libahci libata scsi_mod xen_netfront xen_blkfront crct10dif_pclmul crct10dif
_common crc32c_intel xen_pcifront
[  595.312856] CPU: 0 PID: 1205 Comm: systemd-udevd Tainted: P           O  3.16.0-4-amd64 #1 Debian 3.16.7-ckt20-1+deb8u3
[  595.312860] task: ffff8802d2a36110 ti: ffff8802d0e5c000 task.ti: ffff8802d0e5c000
[  595.312862] RIP: e030:[<ffffffffa0330c78>]  [<ffffffffa0330c78>] zvol_revalidate_disk+0x28/0x30 [zfs]
[  595.312873] RSP: e02b:ffff8802d0e5fb88  EFLAGS: 00010246
[  595.312875] RAX: 0000000000000000 RBX: ffff8802d4f6eb40 RCX: 0000000000000192
[  595.312877] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8802d3ef1000
[  595.312880] RBP: 0000000000000001 R08: ffffffff81610960 R09: 0000000000000000
[  595.312882] R10: 0000000000000005 R11: 0000000000000010 R12: ffffffffa03674e0
[  595.312885] R13: 0000000000000001 R14: 0000000000000000 R15: ffff8802d4a2c310
[  595.312890] FS:  00007f50b9515880(0000) GS:ffff8802dea00000(0000) knlGS:0000000000000000
[  595.312893] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[  595.312896] CR2: 0000000000000050 CR3: 00000002ce13a000 CR4: 0000000000042660
[  595.312898] Stack:
[  595.312900]  ffffffff811dc6bb ffff8802d3c51800 ffff8802d4f6eb40 0000000000000000
[  595.312904]  ffffffffa0331651 ffff8802d4f6eb58 ffff8802d0f21100 000000000000005d
[  595.312907]  ffff8802d4f6eb40 ffff8802d3ef1000 ffff8802d4f6eb58 ffffffffa0367580
[  595.312911] Call Trace:
[  595.312916]  [<ffffffff811dc6bb>] ? check_disk_change+0x5b/0x70
[  595.312925]  [<ffffffffa0331651>] ? zvol_open+0xb1/0x2f0 [zfs]
[  595.312930]  [<ffffffff811dcf9c>] ? __blkdev_get+0xcc/0x480
[  595.312933]  [<ffffffff811dd6a0>] ? blkdev_get_by_dev+0x40/0x40
[  595.312936]  [<ffffffff811dd506>] ? blkdev_get+0x1b6/0x310
[  595.312939]  [<ffffffff811dd6a0>] ? blkdev_get_by_dev+0x40/0x40
[  595.312944]  [<ffffffff811a5f52>] ? do_dentry_open+0x1f2/0x330
[  595.312948]  [<ffffffff811a625d>] ? finish_open+0x2d/0x40
[  595.312953]  [<ffffffff811b6e4a>] ? do_last+0xaaa/0x1200
[  595.312957]  [<ffffffff811b3211>] ? link_path_walk+0x71/0x8a0
[  595.312960]  [<ffffffff811b765b>] ? path_openat+0xbb/0x680
[  595.312965]  [<ffffffff8117a8a5>] ? free_pages_and_swap_cache+0x95/0xb0
[  595.312969]  [<ffffffff811b83ca>] ? do_filp_open+0x3a/0x90
[  595.312974]  [<ffffffff811c442c>] ? __alloc_fd+0x7c/0x120
[  595.312978]  [<ffffffff811a7799>] ? do_sys_open+0x129/0x220
[  595.312984]  [<ffffffff81513d0d>] ? system_call_fast_compare_end+0x10/0x15
[  595.312987] Code: 00 00 00 0f 1f 44 00 00 48 8b 87 80 03 00 00 c7 80 20 01 00 00 00 00 00 00 48 8b 90 98 05 00 00 48 8b 80 00
 01 00 00 48 c1 e8 09 <48> 89 42 50 31 c0 c3 90 0f 1f 44 00 00 48 8b 87 98 00 00 00 48
[  595.313024] RIP  [<ffffffffa0330c78>] zvol_revalidate_disk+0x28/0x30 [zfs]
[  595.313034]  RSP <ffff8802d0e5fb88>
[  595.313037] CR2: 0000000000000050
[  595.313047] ---[ end trace 3e1e4a5d62e331b3 ]---

My versions:

# modinfo zfs|grep version
version:        0.6.5.2-2
srcversion:     B6DF70C0F3350F9F3D1A7C4
vermagic:       3.16.0-4-amd64 SMP mod_unload modversions

# modinfo spl |grep version
version:        0.6.5-1
srcversion:     8EE77DBC6AD9EB936333AAE
vermagic:       3.16.0-4-amd64 SMP mod_unload modversions

# cat /etc/debian_version
8.3

@behlendorf behlendorf added the Component: ZVOL ZFS Volumes label Jan 31, 2016
nedbass pushed a commit that referenced this issue Mar 23, 2016
zfsonlinux issue #2217 - zvol minor operations: check snapdev
property before traversing snapshots of a dataset

zfsonlinux issue #3681 - lock order inversion between zvol_open()
and dsl_pool_sync()...zvol_rename_minors()

Create a per-pool zvol taskq for asynchronous zvol tasks.
There are a few key design decisions to be aware of.

* Each taskq must be single threaded to ensure tasks are always
  processed in the order in which they were dispatched.

* There is a taskq per-pool in order to keep the pools independent.
  This way if one pool is suspended it will not impact another.

* The preferred location to dispatch a zvol minor task is a sync
  task.  In this context there is easy access to the spa_t and
  minimal error handling is required because the sync task must
  succeed.

Support for asynchronous zvol minor operations address issue #3681.

Signed-off-by: Boris Protopopov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2217
Closes #3678
Closes #3681
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: ZVOL ZFS Volumes
Projects
None yet
Development

No branches or pull requests

5 participants