Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve zvol initialization code #1477

Closed
wants to merge 2 commits into from
Closed

Improve zvol initialization code #1477

wants to merge 2 commits into from

Conversation

ryao
Copy link
Contributor

@ryao ryao commented May 28, 2013

There is an extremely odd bug that can cause zvols to fail to appear on some systems, but not others. Recently, I was able to consistently reproduce this issue over a period of 1 month. I found several things wrong in how zvols were being initialized, so I fixed them with some inspiration from FreeBSD. I can no longer reproduce the missing zvol issue after applying these fixes.

There is an extremely odd bug that causes zvols to fail to appear on
some systems, but not others. Recently, I was able to consistently
reproduce this issue over a period of 1 month. The issue disappeared
after I applied this change from FreeBSD.

This is from FreeBSD's pool version 28 import, which occurred in
revision 219089.

Ported-by: Richard Yao <[email protected]>
@ryao
Copy link
Contributor Author

ryao commented May 28, 2013

Upon further review, it appears that the spa.c changes from FreeBSD should not be necessary because zfs_ioc_pool_import should call zvol_create_minors(). However, the zvols will not appear without the spa.c changes and my trace statements in zfs_ioc_pool_import() to see what zvol_create_minors() is returning (or the error code that prevents it) are not being printed, which seems paradoxical.

This merits investigation, but I currently have other priorities, so I am not sure how much time I will find to put into that. The spa.c changes should be safe to make even without more in-depth investigation. That is because zvol_create_minors() will not create minors for zvols whose minors are already created.

@ryao
Copy link
Contributor Author

ryao commented May 28, 2013

These patches are now part of Gentoo's zfs kernel module package.

@behlendorf
Copy link
Contributor

@ryao Where do these changes stand currently? I still need to do a careful review but have they been working without trouble in Gentoo?

The following error will occur on some (possibly all) kernels because
blk_init_queue() will try to take the spinlock before we initialize it.

[    5.538871] BUG: spinlock bad magic on CPU#0, zpool/4054
[    5.538885]  lock: 0xffff88021a73de60, .magic: 00000000, .owner:
<none>/-1, .owner_cpu: 0
[    5.538888] Pid: 4054, comm: zpool Not tainted 3.9.3 openzfs#11
[    5.538890] Call Trace:
[    5.538898]  [<ffffffff81478ef8>] spin_dump+0x8c/0x91
[    5.538902]  [<ffffffff81478f1e>] spin_bug+0x21/0x26
[    5.538906]  [<ffffffff812da097>] do_raw_spin_lock+0x127/0x130
[    5.538911]  [<ffffffff81253301>] ? zvol_probe+0x91/0xf0
[    5.538914]  [<ffffffff8147d851>] _raw_spin_lock_irq+0x21/0x30
[    5.538919]  [<ffffffff812c2c1e>] cfq_init_queue+0x1fe/0x350
[    5.538922]  [<ffffffff81253360>] ? zvol_probe+0xf0/0xf0
[    5.538926]  [<ffffffff812aacb8>] elevator_init+0x78/0x140
[    5.538930]  [<ffffffff812b2677>] blk_init_allocated_queue+0x87/0xb0
[    5.538933]  [<ffffffff81253360>] ? zvol_probe+0xf0/0xf0
[    5.538937]  [<ffffffff812b26d5>] blk_init_queue_node+0x35/0x70
[    5.538941]  [<ffffffff812b271e>] blk_init_queue+0xe/0x10
[    5.538944]  [<ffffffff8125211b>] __zvol_create_minor+0x24b/0x620
[    5.538947]  [<ffffffff81253264>] zvol_create_minors_cb+0x24/0x30
[    5.538952]  [<ffffffff811bd9ca>] dmu_objset_find_spa+0xea/0x510
[    5.538955]  [<ffffffff81253240>] ? zvol_free+0x60/0x60
[    5.538958]  [<ffffffff811bda71>] dmu_objset_find_spa+0x191/0x510
[    5.538962]  [<ffffffff81253240>] ? zvol_free+0x60/0x60
[    5.538965]  [<ffffffff81253ea2>] zvol_create_minors+0x92/0x180
[    5.538969]  [<ffffffff811f8d80>] spa_open_common+0x250/0x380
[    5.538973]  [<ffffffff811f8ece>] spa_open+0xe/0x10
[    5.538977]  [<ffffffff8122817e>] pool_status_check.part.22+0x1e/0x80
[    5.538980]  [<ffffffff81228a55>] zfsdev_ioctl+0x155/0x190
[    5.538984]  [<ffffffff8116a695>] do_vfs_ioctl+0x325/0x5a0
[    5.538989]  [<ffffffff81163f1d>] ? final_putname+0x1d/0x40
[    5.538992]  [<ffffffff8116a950>] sys_ioctl+0x40/0x80
[    5.538996]  [<ffffffff814812c9>] ? do_page_fault+0x9/0x10
[    5.539000]  [<ffffffff81483929>] system_call_fastpath+0x16/0x1b
[    5.541118]  zd0: unknown partition table

We fix this by calling spin_lock_init before blk_init_queue.

The manner in which zvol_init() initializes structures is
suspectible to a race between initialization and a probe on a zvol. We
reorganize zvol_init() to prevent that.

Lastly, calling zvol_create_minors(NULL) in zvol_init() does nothing
because no pools are imported, so we remove it.

Signed-off-by: Richard Yao <[email protected]>
@behlendorf
Copy link
Contributor

@ryao #1528 appears to have been caused by these changes. There's a race now where we can accidentally hand out the same minor twice. That's going to need explaining.

@ryao
Copy link
Contributor Author

ryao commented Jun 28, 2013

@behlendorf I do not see how that is possible. Multiple calls to __zvol_create_minor() should fail with EEXIST. It is also not clear to me that these patch was applied in issue #1528.

@ryao
Copy link
Contributor Author

ryao commented Jun 28, 2013

@behlendorf I seemed to have missed your earlier query. No one has reported any problems with these patches in Gentoo.

@ryao
Copy link
Contributor Author

ryao commented Jun 28, 2013

After talking on IRC, it looks like there is some sort of race in issue #1528 and this patch did play a role in it, but I am having trouble seeing how this happened. At this point, I am inclined to think that the compiler did something strange.

@behlendorf
Copy link
Contributor

@ryao Refreshed as #1564, please review.

@behlendorf behlendorf closed this Jul 2, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants