zfs rollback crash #431

Closed
asnow0 opened this issue Nov 1, 2011 · 6 comments
asnow0 commented Nov 1, 2011

Running latest ZFS git on Ubuntu 10.04 (using kernel 3.0.0) and I'm seeing a crash mainly during ZFS rollbacks. The rollback hangs, and any processes that were accessing data on the ZFS volumes become blocked. Below is the call trace I get when this issue occurs:

[ 670.770635] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 670.770814] IP: [] sa_object_size+0x9/0x20 [zfs]
[ 670.770989] PGD 137afa067 PUD 125c39067 PMD 0
[ 670.771103] Oops: 0000 [#2] SMP
[ 670.771184] CPU 2
[ 670.771227] Modules linked in: sch_sfq cls_u32 sch_cbq vboxnetadp vboxnetflt vboxdrv zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl zlib_deflate hwmon_vid coretemp ipmi_msghandler ppdev i2c_i801 i915 lp drm_kms_helper serio_raw parport_pc parport raid10 raid456 async_pq async_xor xor async_memcpy async_raid6_recov pata_jmicron ahci libahci raid6_pq async_tx raid1 raid0 multipath linear
[ 670.780014]
[ 670.780014] Pid: 14320, comm: php Tainted: P D 3.0.0-datto9 #23 Gigabyte Technology Co., Ltd. D525TUD/D525TUD
[ 670.780014] RIP: 0010:[] [] sa_object_size+0x9/0x20 [zfs]
[ 670.780014] RSP: 0018:ffff8801292e1e18 EFLAGS: 00010206
[ 670.780014] RAX: 0000000028c6acc8 RBX: ffff880137c681b8 RCX: 0000000000000009
[ 670.780014] RDX: ffff8801292e1f58 RSI: ffff8801292e1f50 RDI: 0000000000000000
[ 670.780014] RBP: ffff8801292e1e18 R08: ffffffff81148d65 R09: ffff880104e6fb81
[ 670.780014] R10: ffff88013f006400 R11: ffff8801292e1d48 R12: ffff8801292e1ef8
[ 670.780014] R13: ffff880137c68048 R14: ffff880129195000 R15: 0000000000000000
[ 670.830096] FS: 00007f64a2e54720(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
[ 670.830096] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 670.830096] CR2: 0000000000000020 CR3: 00000001297c7000 CR4: 00000000000006e0
[ 670.830096] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 670.830096] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 670.830096] Process php (pid: 14320, threadinfo ffff8801292e0000, task ffff880137111440)
[ 670.830096] Stack:
[ 670.830096] ffff8801292e1e48 ffffffffa0273243 ffff880104e6fb40 ffff880123e8d400
[ 670.830096] ffff880137c681b8 ffff8801292e1ef8 ffff8801292e1e58 ffffffffa0288195
[ 670.830096] ffff8801292e1e98 ffffffff8114211e 0000000000000000 ffff88013b2edf00
[ 670.830096] Call Trace:
[ 670.830096] [] zfs_getattr_fast+0x73/0xb0 [zfs]
[ 670.881069] [] zpl_getattr+0x15/0x20 [zfs]
[ 670.881069] [] vfs_getattr+0x4e/0x80
[ 670.881069] [] vfs_fstatat+0x70/0x90
[ 670.881069] [] vfs_stat+0x1b/0x20
[ 670.881069] [] sys_newstat+0x24/0x50
[ 670.881069] [] system_call_fastpath+0x16/0x1b
[ 670.881069] Code: 48 89 f3 e8 3a 77 fc ff 48 c7 43 20 00 00 00 00 48 83 c4 08 5b c9 c3 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 66 66 66 66 90
[ 670.881069] 8b 7f 20 e8 3e ee fc ff c9 c3 66 66 66 2e 0f 1f 84 00 00 00
[ 670.881069] RIP [] sa_object_size+0x9/0x20 [zfs]
[ 670.881069] RSP
[ 670.881069] CR2: 0000000000000020
[ 670.882719] ---[ end trace e090e4ed91176baa ]---

If there's any more information that would be helpful, just ask and I'll try and provide it.

@rohan-puri
Contributor

Can you provide reproduction steps? That would be helpful.

@asnow0
Author

asnow0 commented Nov 2, 2011

Here's what seems to be causing the issue:
- A ZFS snapshot is taken on a particular ZFS filesystem
- Data in that filesystem is modified
- A zfs rollback is run on the ZFS filesystem to the snapshot that was previously created
- The rollback hangs and the call trace I posted appears in dmesg

This seems to only happen on certain ZFS filesystems, since we have several machines where everything's working fine but certain machines are seeing this same failure.

Alex Snow
Developer/System Administrator
Datto Inc
(203)665-6423

@rohan-puri
Contributor

Thanks for the steps. I tried to reproduce it, but it is not reproducible on my end.

@gunnarbeutner
Contributor

Steps to reproduce:

  1. zfs snapshot tank@v1
  2. Terminal 1: while true; do ls /tank; done
  3. Terminal 2: zfs rollback tank@v1
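
The three steps above can be combined into one script. This is only a rough sketch: it assumes the pool is named tank and mounted at /tank, as in the steps, and the names POOL, SNAP, DRY_RUN, run and reproduce are made up for illustration. By default it only prints the commands; set DRY_RUN=0 on a disposable test machine to actually run them.

```shell
#!/bin/sh
# Sketch: race directory listings against a rollback, following the steps above.
# POOL, SNAP and DRY_RUN are illustrative names, not part of the original report.
POOL="${POOL:-tank}"
SNAP="${SNAP:-v1}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    # Step 1: take the snapshot that will be rolled back to.
    run zfs snapshot "$POOL@$SNAP"

    # Step 2 (Terminal 1): keep listing the mountpoint in the background.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ while true; do ls /$POOL; done &"
    else
        ( while true; do ls "/$POOL" >/dev/null 2>&1; done ) &
        loop_pid=$!
    fi

    # Step 3 (Terminal 2): roll back while the listing loop is running.
    run zfs rollback "$POOL@$SNAP"

    # Stop the background loop (not reached if the rollback hangs).
    if [ "$DRY_RUN" != "1" ]; then
        kill "$loop_pid" 2>/dev/null
    fi
}

reproduce
```

On an affected build the rollback may hang, so the background loop would have to be killed by hand; check dmesg for the call trace afterwards.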

@behlendorf
Contributor

This looks pretty clear-cut. Thanks, Gunnar, I'll get your fix merged in.
