Panic when running 'zpool split' #5565
Comments
After forcibly rebooting, zpool status shows as below:
And after zpool online
@smaeul thanks for reporting this issue. It appears the failure was caused by a zero length entry in the dirty time log (DTL) for the mirror when the split was requested. The DTL is used to track the set of transaction groups for which the vdev has less than perfect replication. You should be able to recover full redundancy in the pool by performing a
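To make the DTL description above concrete, here is a minimal sketch in Python of a DTL modeled as a list of transaction group (txg) ranges. This is a simplified illustration, not ZFS's actual data structure: `TxgRange`, `dtl_contains`, and `zero_length_entries` are hypothetical names, and the txg numbers are made up. It only shows why a zero length entry is odd: it covers no txg at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TxgRange:
    """Hypothetical DTL entry: `size` txgs starting at `start`."""
    start: int
    size: int

    def covers(self, txg: int) -> bool:
        # Half-open interval [start, start + size).
        return self.start <= txg < self.start + self.size


def dtl_contains(dtl, txg):
    """True if any entry marks `txg` as less than fully replicated."""
    return any(entry.covers(txg) for entry in dtl)


def zero_length_entries(dtl):
    """Entries that cover no txg at all."""
    return [entry for entry in dtl if entry.size == 0]


# Made-up example: one normal entry and one zero length entry.
dtl = [TxgRange(100, 5), TxgRange(200, 0)]
```

In this model the second entry is present in the log but describes nothing, so any code that assumes every entry spans at least one txg would have to handle it specially.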
A possible fix. We tested it internally, but it would be nice to get confirmation from a ZFS Jedi. From an available crash dump I see that a freed vdev is accessed because it is still on the txg's DTL list of the synced vdev: it was transferred there by vdev_top_transfer(), which was called by vdev_remove_parent(), which was called by vdev_split(). "detach" does this cleanup, but it seems the cleanup was not implemented for "split".
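The shape of the bug described above can be sketched in Python as a stale reference left on a per-txg list. This is a toy model under stated assumptions, not ZFS code: `Vdev`, `TxgDirtyList`, and `release` are hypothetical names, and the `cleanup` flag merely stands in for the difference between a path that removes the vdev from the list before freeing it and a path that does not.

```python
class Vdev:
    """Hypothetical stand-in for a vdev; `freed` marks released memory."""

    def __init__(self, name):
        self.name = name
        self.freed = False

    def free(self):
        self.freed = True


class TxgDirtyList:
    """Hypothetical per-txg list of vdevs whose DTL must be synced."""

    def __init__(self):
        self.entries = []

    def add(self, vdev):
        if vdev not in self.entries:
            self.entries.append(vdev)

    def remove(self, vdev):
        if vdev in self.entries:
            self.entries.remove(vdev)

    def sync(self):
        # Touching a freed vdev here models the access seen in the crash dump.
        for vdev in self.entries:
            if vdev.freed:
                raise RuntimeError(f"use after free: {vdev.name}")
        self.entries.clear()


def release(vdev, dirty, cleanup):
    """Free a vdev, optionally removing it from the dirty list first."""
    if cleanup:
        dirty.remove(vdev)
    vdev.free()


def run(cleanup):
    """Return True if sync completes, False if it hits a freed vdev."""
    dirty = TxgDirtyList()
    vdev = Vdev("mirror-0")
    dirty.add(vdev)
    release(vdev, dirty, cleanup)
    try:
        dirty.sync()
    except RuntimeError:
        return False
    return True
```

Calling `run(cleanup=True)` models the path that cleans up and lets sync finish, while `run(cleanup=False)` leaves the freed object on the list and sync trips over it.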
@ramzec thanks for digging into this. Your analysis and fix look correct to me. Would you mind opening a PR with the proposed change?
OK, will do.
Roman, I think I hit this years ago, but don’t have any remaining evidence. Nor do I recall filing a bug at Nexenta at the time. Do you believe it is reproducible in ZTS?
We caught this error a month ago during a split of the system pool. We tried many times to reproduce it on different hosts, without success. We have only one host where it can be reproduced 100% of the time, so I'm not sure it can be easily reproduced in ZTS.
System information
Describe the problem you're observing
I received this panic when trying to run zpool split system blah:
The zpool split command, any further zfs/zpool commands, and sync all became uninterruptible (D state).
Describe how to reproduce the problem
I have the following pool I attempted to split. It is a mirror of two dm-crypt volumes. It is less than a year old. It was originally created with a single disk, and the second was attached several hours later. No further changes to the layout were made until now.
Include any warning/errors/backtraces from the system logs
See above.