-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve zvol initialization code #1477
Conversation
There is an extremely odd bug that causes zvols to fail to appear on some systems, but not others. Recently, I was able to consistently reproduce this issue over a period of 1 month. The issue disappeared after I applied this change from FreeBSD. This is from FreeBSD's pool version 28 import, which occurred in revision 219089. Ported-by: Richard Yao <[email protected]>
Upon further review, it appears that the spa.c changes from FreeBSD should not be necessary because zfs_ioc_pool_import should call zvol_create_minors(). However, the zvols will not appear without the spa.c changes and my trace statements in zfs_ioc_pool_import() to see what zvol_create_minors() is returning (or the error code that prevents it) are not being printed, which seems paradoxical. This merits investigation, but I currently have other priorities, so I am not sure how much time I will find to put into that. The spa.c changes should be safe to make even without more in-depth investigation. That is because zvol_create_minors() will not create minors for zvols whose minors are already created. |
These patches are now part of Gentoo's zfs kernel module package. |
@ryao Where do these changes stand currently? I still need to do a careful review but have they been working without trouble in Gentoo? |
The following error will occur on some (possibly all) kernels because blk_init_queue() will try to take the spinlock before we initialize it. [ 5.538871] BUG: spinlock bad magic on CPU#0, zpool/4054 [ 5.538885] lock: 0xffff88021a73de60, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0 [ 5.538888] Pid: 4054, comm: zpool Not tainted 3.9.3 openzfs#11 [ 5.538890] Call Trace: [ 5.538898] [<ffffffff81478ef8>] spin_dump+0x8c/0x91 [ 5.538902] [<ffffffff81478f1e>] spin_bug+0x21/0x26 [ 5.538906] [<ffffffff812da097>] do_raw_spin_lock+0x127/0x130 [ 5.538911] [<ffffffff81253301>] ? zvol_probe+0x91/0xf0 [ 5.538914] [<ffffffff8147d851>] _raw_spin_lock_irq+0x21/0x30 [ 5.538919] [<ffffffff812c2c1e>] cfq_init_queue+0x1fe/0x350 [ 5.538922] [<ffffffff81253360>] ? zvol_probe+0xf0/0xf0 [ 5.538926] [<ffffffff812aacb8>] elevator_init+0x78/0x140 [ 5.538930] [<ffffffff812b2677>] blk_init_allocated_queue+0x87/0xb0 [ 5.538933] [<ffffffff81253360>] ? zvol_probe+0xf0/0xf0 [ 5.538937] [<ffffffff812b26d5>] blk_init_queue_node+0x35/0x70 [ 5.538941] [<ffffffff812b271e>] blk_init_queue+0xe/0x10 [ 5.538944] [<ffffffff8125211b>] __zvol_create_minor+0x24b/0x620 [ 5.538947] [<ffffffff81253264>] zvol_create_minors_cb+0x24/0x30 [ 5.538952] [<ffffffff811bd9ca>] dmu_objset_find_spa+0xea/0x510 [ 5.538955] [<ffffffff81253240>] ? zvol_free+0x60/0x60 [ 5.538958] [<ffffffff811bda71>] dmu_objset_find_spa+0x191/0x510 [ 5.538962] [<ffffffff81253240>] ? zvol_free+0x60/0x60 [ 5.538965] [<ffffffff81253ea2>] zvol_create_minors+0x92/0x180 [ 5.538969] [<ffffffff811f8d80>] spa_open_common+0x250/0x380 [ 5.538973] [<ffffffff811f8ece>] spa_open+0xe/0x10 [ 5.538977] [<ffffffff8122817e>] pool_status_check.part.22+0x1e/0x80 [ 5.538980] [<ffffffff81228a55>] zfsdev_ioctl+0x155/0x190 [ 5.538984] [<ffffffff8116a695>] do_vfs_ioctl+0x325/0x5a0 [ 5.538989] [<ffffffff81163f1d>] ? final_putname+0x1d/0x40 [ 5.538992] [<ffffffff8116a950>] sys_ioctl+0x40/0x80 [ 5.538996] [<ffffffff814812c9>] ? do_page_fault+0x9/0x10 [ 5.539000] [<ffffffff81483929>] system_call_fastpath+0x16/0x1b [ 5.541118] zd0: unknown partition table We fix this by calling spin_lock_init before blk_init_queue. The manner in which zvol_init() initializes structures is suspectible to a race between initialization and a probe on a zvol. We reorganize zvol_init() to prevent that. Lastly, calling zvol_create_minors(NULL) in zvol_init() does nothing because no pools are imported, so we remove it. Signed-off-by: Richard Yao <[email protected]>
@behlendorf I do not see how that is possible. Multiple calls to __zvol_create_minor() should fail with EEXIST. It is also not clear to me that these patch was applied in issue #1528. |
@behlendorf I seemed to have missed your earlier query. No one has reported any problems with these patches in Gentoo. |
After talking on IRC, it looks like there is some sort of race in issue #1528 and this patch did play a role in it, but I am having trouble seeing how this happened. At this point, I am inclined to think that the compiler did something strange. |
There is an extremely odd bug that can cause zvols to fail to appear on some systems, but not others. Recently, I was able to consistently reproduce this issue over a period of 1 month. I found several things wrong in how zvols were being initialized, so I fixed them with some inspiration from FreeBSD. I can no longer reproduce the missing zvol issue after applying these fixes.