After zpool replace ZFS restarts resilvering process and does not finish #9551
Comments
As the resilver is currently running: stop the ZED daemon and wait for the resilver to complete?
Oh! I didn't mention it, but I already did this. It was one of the first suggestions I read in a forum, and it did not work for me.
I have seen similar behaviour with zfs 0.8.1-r0-gentoo on 4.19.67_p2-r2-debian-sources-lts when turning a single-drive pool into a triple mirror: after the two Detaching only one of the new mirror sides didn't stop the resilver from restarting. I needed to detach both new mirror sides, attach only one, wait for the resilver to complete, and then attach the next. Getting into a state with two drives being resilvered at the same time brought back restart-o-mania. No ZED running.
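The "attach one, wait, then attach the next" workaround described above could be scripted roughly as follows. This is only a sketch under stated assumptions: it assumes `zpool status` reports an active resilver with the phrase "resilver in progress", and the helper names (`resilver_in_progress`, `wait_for_resilver`, `attach_serially`) and the polling interval are made up for illustration. It is not a fix for the underlying restart bug.

```python
import subprocess
import time
from typing import Sequence


def resilver_in_progress(status_text: str) -> bool:
    """Return True if `zpool status` output reports an active resilver.

    Assumption: an active resilver shows up as "resilver in progress"
    on the "scan:" line of the status output.
    """
    return "resilver in progress" in status_text


def wait_for_resilver(pool: str, poll_seconds: int = 60) -> None:
    """Poll `zpool status <pool>` until no resilver is reported."""
    while True:
        out = subprocess.run(
            ["zpool", "status", pool],
            check=True, capture_output=True, text=True,
        ).stdout
        if not resilver_in_progress(out):
            return
        time.sleep(poll_seconds)


def attach_serially(pool: str, existing: str, new_devices: Sequence[str]) -> None:
    """Attach each new mirror side only after the previous resilver is done."""
    for dev in new_devices:
        # `zpool attach <pool> <existing-device> <new-device>`
        subprocess.run(["zpool", "attach", pool, existing, dev], check=True)
        wait_for_resilver(pool)
```

A call such as `attach_serially("tank", "sda", ["sdb", "sdc"])` (pool and device names hypothetical) would then never have two new mirror sides resilvering at once.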
Hello, I just recently looked at the output from dmesg:
I'm not sure what that means ;( Maybe someone can enlighten me? Gothard
If a device is participating in an active resilver, then it will have a non-empty DTL. Operations like vdev_{open,reopen,probe}() can cause the resilver to be restarted (or deferred to be restarted later), which is unnecessary if the DTL is still covered by the current scan range. This is similar to the logic in vdev_dtl_should_excise() where the DTL can only be excised if its max txg is in the resilvered range.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: John Gallagher <[email protected]>
Reviewed-by: Kjeld Schouten <[email protected]>
Signed-off-by: John Poduska <[email protected]>
Issue openzfs#840
Closes openzfs#9155
Closes openzfs#9378
Closes openzfs#9551
Closes openzfs#9588
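To make the idea in this commit message concrete, here is a simplified Python model of the decision it describes. It is not the ZFS implementation: the `Scan` class, the `dtl_bounds` and `needs_new_resilver` helpers, and the representation of a DTL as a list of inclusive txg ranges are all assumptions made for illustration. The model also checks both ends of the DTL against the scan range, whereas the commit message only mentions the max txg explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

TxgRange = Tuple[int, int]  # inclusive (min_txg, max_txg), hypothetical model


@dataclass
class Scan:
    """Hypothetical stand-in for the txg range of an active resilver."""
    min_txg: int
    max_txg: int


def dtl_bounds(dtl: List[TxgRange]) -> Optional[TxgRange]:
    """Return the overall (min, max) txg of a DTL, or None if it is empty."""
    if not dtl:
        return None
    return min(lo for lo, _ in dtl), max(hi for _, hi in dtl)


def needs_new_resilver(dtl: List[TxgRange], active_scan: Optional[Scan]) -> bool:
    """Decide whether a vdev open/reopen/probe should (re)start a resilver."""
    bounds = dtl_bounds(dtl)
    if bounds is None:
        return False  # empty DTL: the device is not missing any data
    if active_scan is None:
        return True  # missing data and no resilver running: start one
    lo, hi = bounds
    # If the running scan already covers the whole DTL, restarting it
    # would only throw away progress.
    covered = active_scan.min_txg <= lo and hi <= active_scan.max_txg
    return not covered
```

In this model, a device whose DTL spans txgs 10 to 40 does not trigger a restart while a scan over txgs 5 to 50 is running, but a DTL reaching txg 60 does.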
I have just seen potentially similar behaviour on my RAIDZ2 array. As soon as I I'll do these replaces in serial now, but it feels like the wrong answer.
I'm seeing the same problem. The resilver runs in a loop. I just disabled zfs-zed and now need to wait ~20 hours for a single resilver pass to finish.
System information
Describe the problem you're observing
One of my HDDs needed to be replaced because it was no longer being recognized. Judging from the sound of the drive, my guess is that the motor inside broke.
I shut down the NAS, disconnected the HDD, replaced it with a new one, booted, and ran "zpool replace" to swap in the new drive. I have done this a couple of times over the last few years whenever an HDD seemed to have an issue, and everything worked fine. This time, however, the resilvering process keeps restarting: sometimes it gets above 70%, and sometimes it restarts at about 20%. I also tried a second newly bought HDD, in case the first replacement was faulty, and ran the replace a second time; that didn't work either. I did some research but could not figure out the problem, and the hints from the forums didn't work for me.
Thank you for reading and for the time you spend helping me. If you need additional information, just ask and I will try to provide it.
Describe how to reproduce the problem
I can't tell when the resilvering process will restart. I just have to wait, and it never seems to finish; it keeps restarting.
Include any warning/errors/backtraces from the system logs
pool status looks like this:
event log
History Log