Nasty deadlock on zpool import #4006
Comments
I had to boot without ZFS and then run zpool import by hand; this is how I obtained the output:
zpool import -f data
And here is one with SELinux disabled, so SELinux is not at fault:
@mihaivint We seem to have had similar issues in the past, but I'm not sure whether they were fixed.
I'm not exactly sure whether it was on 0.6.5.1 from the start, or on 0.6.4.1 and then upgraded.
Same here with my pool: #3814
At least there is a way to recover the data. In this case there was no need, as I had a replica of the data, but the read-only mount is good to know about.
@mihaivint Can you apply the patches in #4123? They are safe and should resolve the problem. Definitely let us know if they don't fix things.
Unfortunately I no longer have that drive to test with, so I can't provide additional info.
Truncate and remove need to be in the same tx when doing zfs_rmnode on an xattr dir. Otherwise, if we truncate and then crash, we end up with an inconsistent zap object on the delete queue. We do this by skipping dmu_free_long_range and letting zfs_znode_delete do the work.
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#4114 Issue openzfs#4052 Issue openzfs#4006 Issue openzfs#3018 Issue openzfs#2861
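The atomicity argument in this commit message can be illustrated with a toy model. This is only a sketch: `sketch_obj_t`, `sketch_remove_two_tx` and `sketch_remove_one_tx` are hypothetical names invented for illustration and are not ZFS code. A "transaction" here simply means a group of changes that either all reach disk or none do.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an xattr directory object; not a real ZFS type. */
typedef struct sketch_obj {
    int  entries;   /* number of entries still stored in the object */
    bool exists;    /* false once the object has been removed */
} sketch_obj_t;

/*
 * Truncate and remove in two separate transactions.
 * If the crash happens after the first one commits, the object
 * survives but is already empty: the inconsistent state described above.
 */
static sketch_obj_t
sketch_remove_two_tx(sketch_obj_t o, bool crash_between)
{
    o.entries = 0;              /* tx 1: truncate, committed on its own */
    if (crash_between)
        return o;               /* emptied object is what is found after reboot */
    o.exists = false;           /* tx 2: remove */
    return o;
}

/*
 * Truncate and remove staged in one transaction.
 * A crash before commit leaves the original object untouched.
 */
static sketch_obj_t
sketch_remove_one_tx(sketch_obj_t o, bool crash_before_commit)
{
    sketch_obj_t staged = o;
    staged.entries = 0;
    staged.exists = false;
    return crash_before_commit ? o : staged;
}
```

Calling `sketch_remove_two_tx(dir, true)` returns an object that still exists with zero entries, while `sketch_remove_one_tx(dir, true)` returns the object unchanged, so there is no intermediate state left to trip over at the next mount.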
During zfs_rmnode on an xattr dir, if the system crashes just after dmu_free_long_range, we get an empty xattr dir in the delete queue. This causes blkid=0 to be passed into zap_get_leaf_byblk when doing zfs_purgedir during mount, which then tries to do rw_enter on the wrong structure and locks up the system. We fix this by returning ENOENT when blkid is zero in zap_get_leaf_byblk.
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4114 Closes #4052 Closes #4006 Closes #3018 Closes #2861
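The guard described in this commit message might look roughly like the following. This is a minimal, self-contained sketch and not the actual patch: `sketch_leaf_t` and `sketch_get_leaf_byblk` are hypothetical stand-ins, and the array lookup replaces the real buffer and lock handling. Only the early `ENOENT` return for a zero block id comes from the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a ZAP leaf; not the real ZFS structure. */
typedef struct sketch_leaf {
    uint64_t blkid;
} sketch_leaf_t;

/*
 * Look up a leaf by block id in a toy leaf table.
 * Returns 0 and sets *lp on success, or ENOENT if there is no such leaf.
 */
static int
sketch_get_leaf_byblk(sketch_leaf_t *leaves, size_t nleaves,
    uint64_t blkid, sketch_leaf_t **lp)
{
    *lp = NULL;

    /*
     * The guard from the commit message: a block id of zero is what an
     * emptied xattr dir yields, so report "not found" instead of
     * going on to take a lock on the wrong structure.
     */
    if (blkid == 0)
        return ENOENT;

    /* Bounds check for the toy table (an assumption of this sketch). */
    if (blkid >= nleaves)
        return ENOENT;

    *lp = &leaves[blkid];
    return 0;
}
```

With this shape, a caller walking an emptied directory sees `ENOENT` and can treat the directory as having nothing left to purge, rather than blocking during mount.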
The fixes also worked for me.
I have a new issue today. One of the nodes locked up after an update. I'm not sure what triggered it, as the other nodes didn't have the same issue. This node was running 0.6.5.1; I have now updated it to 0.6.5.3, but the behavior is the same: ZFS is not able to import the pool, and the import hangs the whole system.
The environment is a VMware VM:
free -m
              total        used        free      shared  buff/cache   available
Mem:           7823         474        7050           8         298        5877
Swap:          4095           0        4095
cat /proc/cpuinfo |grep -ic process
4
uname -a
Linux f34bd24f63.es.private.redlight.hubgets.com 3.10.0-229.20.1.el7.x86_64 #1 SMP Tue Nov 3 19:10:07 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux