ASSERT at zdb.c:3715:load_concrete_ms_allocatable_trees() #7672
Comments
Running zdb -cccvvAAA, it can get past this and finally dies with a SEGV. No error from the operating system. No errors found after two scrub passes.
@dioni21 Mixing and matching userland and kernel versions is gonna produce exciting results. I would suggest that, if you want to try using a git version, purge all traces of the old SPL/ZFS packages, and then install the git version. What do you mean by "have differing checksums"? According to e.g. md5sum, or zdb examining on the affected files, or ...? You haven't included anything about the source pool layout, or the properties on the datasets, or even the arguments for running send|recv.
@rincebrain Thanks for your answer. Using a mixed zdb was a last resort after a SIGSEGV with no further information. I know it is not recommended. Since the matching zdb/kernel setup did not hit this assertion, maybe I should close this issue? Sorry for that... I think I found the reason for this SEGV (default inflight I/Os); I'll file another issue as soon as I confirm. Since my disks are SATA, every full-disk operation takes a long time. "differing checksums" => According to md5sum or, to be more specific, mtree (yes, I'm a FreeBSD user running Linux). This is a simple home setup. I am upgrading a 2x4TB pool to a 2x10TB pool, both configured as simple mirrors, and both with log and cache on SSD LVM partitions. There are about 11 datasets, some with dedup, some with compression, some without. Also, I think the file corruption was caused by running zfs send with the full set of options (--dedup --large-block --replicate --embed --compressed --props). I've seen a previous bug with this configuration, but it is marked as solved.
@rincebrain If I understood correctly, #6224 is not applied to my setup (0.7.9). This may explain why I got errors even without any send options. I'll try with only --large-block --replicate --props as soon as the current zdb run finishes. #4809 looks exactly like my problem, though I'm not sure it is the one I read about before. The feature@hole_birth is active on both my pools. The source pool is very old, but the destination pool has just been created, in 0.7.9. Also, it is currently disabled in parameters:
Should I worry?
@dioni21 If the source doing the sending has either of those tunables, #4809 shouldn't happen. What makes you think it's #4809 and not some other kind of data mangling? Have you looked to see if the affected files are the same every time you send/recv and how they differ between src and dst using e.g. zdb?
@rincebrain I do not yet know the reason for the corruption. Still searching. The zdb failure is what started this issue. Right now I can only use mtree/md5 and/or rsync -c to check file consistency. What I already know:
Note that I made these new copies without deleting the previous ones, since the new pool has much more free space. Also, the source pool is still in "production" with all my personal stuff, changing content as we talk...
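For anyone wanting to repeat the kind of consistency check mentioned above (mtree/md5 or rsync -c), here is a minimal Python sketch that hashes every regular file under two directory trees and reports the differences. It is not the tooling used in this thread; the function names and the idea of pointing it at the mounted source and destination datasets are assumptions for illustration. Since the source pool is still changing, files modified after the snapshot will also show up as differences, so comparing mounted snapshots would be more reliable.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, read in chunks to keep memory use flat."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each regular file's path (relative to root) to its MD5 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only file contents are compared.
            if os.path.isfile(full) and not os.path.islink(full):
                digests[os.path.relpath(full, root)] = file_md5(full)
    return digests


def compare_trees(src_root, dst_root):
    """Return (changed, missing, extra) relative paths between two trees."""
    src = tree_digests(src_root)
    dst = tree_digests(dst_root)
    changed = sorted(p for p in src.keys() & dst.keys() if src[p] != dst[p])
    missing = sorted(src.keys() - dst.keys())  # present in src, absent in dst
    extra = sorted(dst.keys() - src.keys())    # present in dst, absent in src
    return changed, missing, extra
```

Calling `compare_trees("/oldpool/data", "/newpool/data")` (hypothetical mount points) returns three sorted lists; running it after each send/recv attempt would show whether the same files are affected every time.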
My latest tests lead me to believe the reason for the corruption is send --dedup. Removing it was enough to copy all datasets with no md5sum errors in the content. I opened a new issue, #7703. Now that I could copy all data without corruption, I'll try zdb -ccc again ASAP.
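As a rough illustration of the workaround described above, the sketch below assembles a send/recv command line with --dedup left out by default. It only builds the argument lists and does not run anything. The option names are the ones quoted earlier in this thread; the snapshot and target names are hypothetical, and whether recv needs extra options for a given pool is not covered here.

```python
import shlex


def build_send_recv(snapshot, target, dedup=False):
    """Build argv lists for `zfs send ... | zfs recv ...` without running them."""
    # Options taken from this thread: the reduced set reported to copy cleanly.
    send = ["zfs", "send", "--large-block", "--replicate", "--props"]
    if dedup:
        # Left off by default: the thread points at --dedup as the suspect.
        send.append("--dedup")
    send.append(snapshot)
    recv = ["zfs", "recv", target]
    return send, recv


def as_pipeline(send, recv):
    """Render both argv lists as one shell pipeline string for review."""
    return " | ".join(
        " ".join(shlex.quote(arg) for arg in cmd) for cmd in (send, recv)
    )
```

Printing `as_pipeline(*build_send_recv("oldpool@migrate", "newpool/copy"))` (both names made up) gives a line that can be reviewed before being run by hand.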
@rincebrain Closing this issue since, as you pointed out, it could have been caused by using mixed kernel/userland binary versions. I'll open another if I can find more details about the SIGSEGV. Thanks a lot...
System information
Distribution Name | Fedora
Distribution Version | 27
Linux Kernel | 4.16.16-200.fc27.x86_64
Architecture | x86_64
ZFS Version | zfs-0.7.9-1.fc27.x86_64 (from yum repo)
SPL Version | spl-0.7.9-1.fc27.x86_64
Describe the problem you're observing
I am debugging a problem while copying a whole pool to a new drive using zfs send/recv. Some files on the receiving side have different checksums.
While running zdb -cccv with the installed version, I got a segmentation fault. So I tried a newer version and compiled zdb from the master repo (commit id e03a41a). Now I get an assertion.
Describe how to reproduce the problem
Include any warning/errors/backtraces from the system logs
Nothing useful:
How could I help more?