Skip to content

Commit

Permalink
Cleanup zvol initialization code
Browse files Browse the repository at this point in the history
The following error will occur on some (possibly all) kernels because
blk_init_queue() will try to take the spinlock before we initialize it.

[    5.538871] BUG: spinlock bad magic on CPU#0, zpool/4054
[    5.538885]  lock: 0xffff88021a73de60, .magic: 00000000, .owner:
<none>/-1, .owner_cpu: 0
[    5.538888] Pid: 4054, comm: zpool Not tainted 3.9.3 openzfs#11
[    5.538890] Call Trace:
[    5.538898]  [<ffffffff81478ef8>] spin_dump+0x8c/0x91
[    5.538902]  [<ffffffff81478f1e>] spin_bug+0x21/0x26
[    5.538906]  [<ffffffff812da097>] do_raw_spin_lock+0x127/0x130
[    5.538911]  [<ffffffff81253301>] ? zvol_probe+0x91/0xf0
[    5.538914]  [<ffffffff8147d851>] _raw_spin_lock_irq+0x21/0x30
[    5.538919]  [<ffffffff812c2c1e>] cfq_init_queue+0x1fe/0x350
[    5.538922]  [<ffffffff81253360>] ? zvol_probe+0xf0/0xf0
[    5.538926]  [<ffffffff812aacb8>] elevator_init+0x78/0x140
[    5.538930]  [<ffffffff812b2677>] blk_init_allocated_queue+0x87/0xb0
[    5.538933]  [<ffffffff81253360>] ? zvol_probe+0xf0/0xf0
[    5.538937]  [<ffffffff812b26d5>] blk_init_queue_node+0x35/0x70
[    5.538941]  [<ffffffff812b271e>] blk_init_queue+0xe/0x10
[    5.538944]  [<ffffffff8125211b>] __zvol_create_minor+0x24b/0x620
[    5.538947]  [<ffffffff81253264>] zvol_create_minors_cb+0x24/0x30
[    5.538952]  [<ffffffff811bd9ca>] dmu_objset_find_spa+0xea/0x510
[    5.538955]  [<ffffffff81253240>] ? zvol_free+0x60/0x60
[    5.538958]  [<ffffffff811bda71>] dmu_objset_find_spa+0x191/0x510
[    5.538962]  [<ffffffff81253240>] ? zvol_free+0x60/0x60
[    5.538965]  [<ffffffff81253ea2>] zvol_create_minors+0x92/0x180
[    5.538969]  [<ffffffff811f8d80>] spa_open_common+0x250/0x380
[    5.538973]  [<ffffffff811f8ece>] spa_open+0xe/0x10
[    5.538977]  [<ffffffff8122817e>] pool_status_check.part.22+0x1e/0x80
[    5.538980]  [<ffffffff81228a55>] zfsdev_ioctl+0x155/0x190
[    5.538984]  [<ffffffff8116a695>] do_vfs_ioctl+0x325/0x5a0
[    5.538989]  [<ffffffff81163f1d>] ? final_putname+0x1d/0x40
[    5.538992]  [<ffffffff8116a950>] sys_ioctl+0x40/0x80
[    5.538996]  [<ffffffff814812c9>] ? do_page_fault+0x9/0x10
[    5.539000]  [<ffffffff81483929>] system_call_fastpath+0x16/0x1b
[    5.541118]  zd0: unknown partition table

We fix this by calling spin_lock_init before blk_init_queue.

The manner in which zvol_init() initializes structures is
suspectible to a race between initialization and a probe on a zvol. We
reorganize zvol_init() to prevent that.

Lastly, calling zvol_create_minors(NULL) in zvol_init() does nothing
because no pools are imported, so we remove it.

Signed-off-by: Richard Yao <[email protected]>
  • Loading branch information
ryao committed Jun 21, 2013
1 parent 6a6dcc4 commit caea5cf
Showing 1 changed file with 17 additions and 12 deletions.
29 changes: 17 additions & 12 deletions module/zfs/zvol.c
Original file line number Diff line number Diff line change
Expand Up @@ -1217,6 +1217,9 @@ zvol_alloc(dev_t dev, const char *name)
if (zv == NULL)
goto out;

spin_lock_init(&zv->zv_lock);
list_link_init(&zv->zv_next);

zv->zv_queue = blk_init_queue(zvol_request, &zv->zv_lock);
if (zv->zv_queue == NULL)
goto out_kmem;
Expand Down Expand Up @@ -1250,9 +1253,6 @@ zvol_alloc(dev_t dev, const char *name)
sizeof (rl_t), offsetof(rl_t, r_node));
zv->zv_znode.z_is_zvol = TRUE;

spin_lock_init(&zv->zv_lock);
list_link_init(&zv->zv_next);

zv->zv_disk->major = zvol_major;
zv->zv_disk->first_minor = (dev & MINORMASK);
zv->zv_disk->fops = &zvol_ops;
Expand Down Expand Up @@ -1563,30 +1563,35 @@ zvol_init(void)
{
int error;

list_create(&zvol_state_list, sizeof (zvol_state_t),
offsetof(zvol_state_t, zv_next));
mutex_init(&zvol_state_lock, NULL, MUTEX_DEFAULT, NULL);

zvol_taskq = taskq_create(ZVOL_DRIVER, zvol_threads, maxclsyspri,
zvol_threads, INT_MAX, TASKQ_PREPOPULATE);
if (zvol_taskq == NULL) {
printk(KERN_INFO "ZFS: taskq_create() failed\n");
return (-ENOMEM);
error = -ENOMEM;
goto out1;
}

error = register_blkdev(zvol_major, ZVOL_DRIVER);
if (error) {
printk(KERN_INFO "ZFS: register_blkdev() failed %d\n", error);
taskq_destroy(zvol_taskq);
return (error);
goto out2;
}

blk_register_region(MKDEV(zvol_major, 0), 1UL << MINORBITS,
THIS_MODULE, zvol_probe, NULL, NULL);

mutex_init(&zvol_state_lock, NULL, MUTEX_DEFAULT, NULL);
list_create(&zvol_state_list, sizeof (zvol_state_t),
offsetof(zvol_state_t, zv_next));

(void) zvol_create_minors(NULL);

This comment has been minimized.

Copy link
@behlendorf

behlendorf Jul 2, 2013

This line actually isn't dead, although perhaps it should be. This will cause us to walk the spa config namespace which contains a list of pool names obtained from the zpool cache file. For each of these pools it is then opened via spa_open() which is why the zvols are create during module load. Doing this all at module load time is probably not the best of ideas but changing it will alter the long standing behavior in ZoL. As a half step we could just move this to the zvol taskq for now.

This comment has been minimized.

Copy link
@ryao

ryao Jul 2, 2013

Author Owner

That is correct. I made a comment in openzfs#1528 (comment) a few days ago, but I have yet to update the commit message to clarify that.

As far as I can tell, Illumos does not have this zvol_create_minors() call and it has no problem making the block devices. With that said, additional attempts to create minors will silently fail, which should make retaining this harmless.

This comment has been minimized.

Copy link
@behlendorf

behlendorf Jul 2, 2013

My concern is that it occurs during module init which could perhaps explain some of the weirdness. It appears almost everything is initialized at this point but not quite everything so there's the potential for bugs here. I'm thinking about removing it as you suggest although that will change the current behavior in the sense that block devices won't appear immediately after module load. Some userspace command such as zpool list would need to be run so the pools get opened and are not just listed in the configuration cache. Alternately, we could dispatch a work item to the system taskq to asynchronously open all the pools in the configuration cache.


return (0);

out2:
taskq_destroy(zvol_taskq);
out1:
mutex_destroy(&zvol_state_lock);
list_destroy(&zvol_state_list);
return (error);
}

void
Expand Down

0 comments on commit caea5cf

Please sign in to comment.