rolling back snapshot crashes system #1859

Closed
wsldankers opened this issue Nov 12, 2013 · 4 comments

@wsldankers

Hi,

After updating zfs to 0.6.2-2precise1.gbp8420db (spl 0.6.2-2precise2.gbp7a3237) I can no longer roll back snapshots:

root@marzipan:~# zfs rollback zut/foo@bar
cannot rollback 'zut/foo': dataset does not exist

Subsequent attempts to roll back the snapshot gradually hang the system: more and more processes block until the system is completely dysfunctional.

Subsequent attempts to umount crash the system immediately with a kernel oops.

root@marzipan:~# zpool create -f zut /dev/vdb
root@marzipan:~# zfs create zut/foo
root@marzipan:~# zfs snapshot zut/foo@bar
root@marzipan:~# zfs rollback zut/foo@bar
cannot rollback 'zut/foo': dataset does not exist
root@marzipan:~# umount /zut/foo
Segmentation fault
[   95.411337] general protection fault: 0000 [#1] SMP 
[   95.412022] CPU 0 
[   95.412022] Modules linked in: btrfs libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs reiserfs ext2 joydev psmouse parport_pc rfcomm bnep bluetooth ppdev serio_raw virtio_balloon mac_hid i2c_piix4 lp parport zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl(O) zlib_deflate usbhid hid floppy
[   95.412022] 
[   95.412022] Pid: 2045, comm: umount Tainted: P           O 3.2.0-56-generic #86-Ubuntu Bochs Bochs
[   95.412022] RIP: 0010:[<ffffffffa018f615>]  [<ffffffffa018f615>] zfs_sb_teardown+0x115/0x3a0 [zfs]
[   95.412022] RSP: 0018:ffff88003cf59de8  EFLAGS: 00010296
[   95.412022] RAX: dead0000000fff78 RBX: ffff88003290c000 RCX: dead000000200200
[   95.412022] RDX: ffff88003cf59fd8 RSI: 0000000000000282 RDI: ffff88003290c470
[   95.412022] RBP: ffff88003cf59e38 R08: 6000000000000000 R09: f018000000000000
[   95.412022] R10: ffaf001547f1de03 R11: 0000000000000003 R12: 0000000000000001
[   95.412022] R13: ffff88003290c390 R14: ffff88003290c420 R15: ffff88003290c418
[   95.412022] FS:  00007f2932ba8800(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[   95.412022] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   95.412022] CR2: 00007f29321c8100 CR3: 0000000032894000 CR4: 00000000000006f0
[   95.412022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   95.412022] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   95.412022] Process umount (pid: 2045, threadinfo ffff88003cf58000, task ffff88003cfb0000)
[   95.412022] Stack:
[   95.412022]  ffff88003cf59e08 ffff88003cf59e38 ffff88003290c470 ffff88003290c450
[   95.412022]  ffff88003cf59e28 ffff88003290c000 ffffffffa01bf2e0 ffff8800368828c0
[   95.412022]  0000000000000000 0000000000000000 ffff88003cf59e68 ffffffffa018f8ed
[   95.412022] Call Trace:
[   95.412022]  [<ffffffffa018f8ed>] zfs_umount+0x2d/0xb0 [zfs]
[   95.412022]  [<ffffffffa01ac65e>] zpl_put_super+0xe/0x10 [zfs]
[   95.412022]  [<ffffffff8117c502>] generic_shutdown_super+0x62/0xe0
[   95.412022]  [<ffffffff8117c616>] kill_anon_super+0x16/0x30
[   95.412022]  [<ffffffffa01ac48e>] zpl_kill_sb+0x1e/0x30 [zfs]
[   95.412022]  [<ffffffff8117cc5c>] deactivate_locked_super+0x3c/0xa0
[   95.412022]  [<ffffffff8117d4de>] deactivate_super+0x4e/0x70
[   95.412022]  [<ffffffff81199c9d>] mntput_no_expire+0x9d/0xf0
[   95.412022]  [<ffffffff8119afcb>] sys_umount+0x5b/0xd0
[   95.412022]  [<ffffffff81669802>] system_call_fastpath+0x16/0x1b
[   95.412022] Code: 50 04 00 00 48 89 45 c8 48 8b 83 50 04 00 00 48 39 45 c8 74 41 48 2b 83 48 04 00 00 75 0d eb 36 66 0f 1f 44 00 00 48 29 d0 74 2b <48> 83 b8 98 01 00 00 00 74 10 48 89 c7 48 89 45 b8 e8 25 a8 00 
[   95.412022] RIP  [<ffffffffa018f615>] zfs_sb_teardown+0x115/0x3a0 [zfs]
[   95.412022]  RSP <ffff88003cf59de8>
[   95.471656] ---[ end trace ab04e63751c8b5b5 ]---
Linux marzipan 3.2.0-56-generic #86-Ubuntu SMP Wed Oct 23 09:20:45 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
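
The reproduction steps from the transcript above can be collected into a small script. This is only a sketch: the names `POOL`, `DEVICE`, `RUN_REPRO` and `repro_commands` are introduced here for illustration and do not appear in the report. By default it just prints the commands. It executes them only when `RUN_REPRO=1` is set, and that should only be done in a disposable VM, because `zpool create -f` overwrites the device and the final `umount` is reported to oops the kernel on an affected build.

```shell
#!/bin/sh
# Sketch of the reproduction steps from the report.
# POOL, DEVICE and RUN_REPRO are hypothetical knobs added for this example.
# WARNING: "zpool create -f" destroys data on DEVICE, and on an affected
# build the last step is reported to crash the kernel.

POOL="${POOL:-zut}"
DEVICE="${DEVICE:-/dev/vdb}"

# Print the commands from the report, one per line, for a pool and device.
repro_commands() {
    pool="$1"
    device="$2"
    printf '%s\n' \
        "zpool create -f $pool $device" \
        "zfs create $pool/foo" \
        "zfs snapshot $pool/foo@bar" \
        "zfs rollback $pool/foo@bar" \
        "umount /$pool/foo"
}

if [ "${RUN_REPRO:-0}" = "1" ]; then
    # Run each step, echoing it first. Keep going after a failure, since
    # the report shows the rollback failing before the umount crash.
    repro_commands "$POOL" "$DEVICE" | while IFS= read -r cmd; do
        echo "+ $cmd"
        sh -c "$cmd" || echo "exit status: $?"
    done
else
    # Default: dry run, only show what would be executed.
    repro_commands "$POOL" "$DEVICE"
fi
```

Running it with no environment variables set prints the five commands for pool `zut` on `/dev/vdb`, which makes it easy to review the steps before running them for real.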

Please let me know if there's additional information I can supply.

cheers,

@dweeezil
Contributor

The 831baf0 commit in the master branch causes that problem. I'm guessing that your 0.6.2-2precise1.gbp8420db version contains that commit (I'm not sure how to find out). You'll likely have to revert to a 0.6.2 release or, if you want to compile it yourself, try dweeezil/zfs@393b281.

unya pushed a commit to unya/zfs that referenced this issue Dec 13, 2013
The Illumos openzfs#3875 patch reverted a part of ZoL's 7b3e34b which added
special-case error handling for zfs_rezget().  The error handling dealt
with the case in which an all-ones object number ended up being passed
to dnode_hold() and causing an EINVAL to be returned from zfs_rezget().

Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#1859
Closes openzfs#1861
@fractal-prakash

Is this issue resolved? We are seeing it more often in version 0.6.3.

@fractal-prakash

zfs rollback frequently hangs with the messages below. The only option is to reboot the node.

Nov 20 09:40:57 prakash kernel: Call Trace:
Nov 20 09:40:57 prakash kernel: [] rwsem_down_failed_common+0x95/0x1d0
Nov 20 09:40:57 prakash kernel: [] ? mutex_lock+0x1e/0x50
Nov 20 09:40:57 prakash kernel: [] ? avl_find+0x60/0xb0 [zavl]
Nov 20 09:40:57 prakash kernel: [] rwsem_down_read_failed+0x26/0x30
Nov 20 09:40:57 prakash kernel: [] call_rwsem_down_read_failed+0x14/0x30
Nov 20 09:40:57 prakash kernel: [] ? down_read+0x24/0x30
Nov 20 09:40:57 prakash kernel: [] zfs_inactive+0x56/0x220 [zfs]
Nov 20 09:40:57 prakash kernel: [] ? zpl_inode_delete+0x0/0x30 [zfs]
Nov 20 09:40:57 prakash kernel: [] zpl_clear_inode+0xe/0x10 [zfs]
Nov 20 09:40:57 prakash kernel: [] clear_inode+0xac/0x140
Nov 20 09:40:57 prakash kernel: [] zpl_inode_delete+0x20/0x30 [zfs]
Nov 20 09:40:57 prakash kernel: [] generic_delete_inode+0xde/0x1d0
Nov 20 09:40:57 prakash kernel: [] generic_drop_inode+0x65/0x80
Nov 20 09:40:57 prakash kernel: [] iput+0x62/0x70
Nov 20 09:40:57 prakash kernel: [] zfs_rezget+0xec/0x490 [zfs]
Nov 20 09:40:57 prakash kernel: [] zfs_resume_fs+0x24e/0x3a0 [zfs]
Nov 20 09:40:57 prakash kernel: [] zfs_ioc_rollback+0x96/0xa0 [zfs]
Nov 20 09:40:57 prakash kernel: [] zfsdev_ioctl+0x1c0/0x4d0 [zfs]
Nov 20 09:40:57 prakash kernel: [] vfs_ioctl+0x22/0xa0
Nov 20 09:40:57 prakash kernel: [] do_vfs_ioctl+0x84/0x580
Nov 20 09:40:57 prakash kernel: [] sys_ioctl+0x81/0xa0
Nov 20 09:40:57 prakash kernel: [] system_call_fastpath+0x16/0x1b
Nov 20 09:42:57 prakash kernel: INFO: task zfs:3588 blocked for more than 120 second

@behlendorf
Contributor

@fractal-prakash could you try the master version from GitHub and see whether the issue is reproducible there?
