Permanent errors detected in <metadata>:<0x0> ; <metadata>:<0x1> ; <metadata>:<0x1dd> ; bolaLNXBak/opt/MacPorts:<0x0> #10615

Closed
RJVB opened this issue Jul 23, 2020 · 1 comment
Labels
Status: Stale No recent activity for issue

Comments

RJVB commented Jul 23, 2020

System information

Type | Version/Name
Linux
Distribution Name | Kubuntu
Distribution Version | 14.04.6 with lots of updates
Linux Kernel | 4.19.133-ck1
Architecture | x86_64
ZFS Version | 0.8.4
SPL Version | idem

Describe the problem you're observing

This is a "root-on-zfs" system that I have been running under VirtualBox off an actual hard disk, via VBox's "raw disk access". This never gave issues with kernels up to and including 4.14.23. Importing the pool under 4.19.133 has led to the following report:

  pool: bolaLNXBak
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: resilvered 222G in 0 days 04:48:06 with 0 errors on Tue Jul 14 04:04:08 2020
config:

        NAME                                           STATE     READ WRITE CKSUM  SLOW
        bolaLNXBak                                     ONLINE       0     0     0     -
          ata-VBOX_HARDDISK_VBe9d3fab6-73a1dc23-part6  ONLINE       0     0     0     0  (trim unsupported)

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>
        <metadata>:<0x1>
        <metadata>:<0x1dd>
        bolaLNXBak/opt/MacPorts:<0x0>
        bolaLNXBak/opt/MacPorts:<0x42606>
        bolaLNXBak/opt/MacPorts:<0x44108>
        bolaLNXBak/opt/MacPorts:<0xf209>
        bolaLNXBak/opt/MacPorts:<0x4260e>
        bolaLNXBak/opt/MacPorts:<0x42614>
        bolaLNXBak/opt/MacPorts:<0x44117>
        bolaLNXBak/opt/MacPorts:<0x4652f>
        bolaLNXBak/opt/MacPorts:<0x44130>
        bolaLNXBak/opt/MacPorts:<0x23a41>
        bolaLNXBak/opt/MacPorts:<0x44143>
        bolaLNXBak/opt/MacPorts:<0x44164>
        bolaLNXBak/opt/MacPorts:<0x44264>
        bolaLNXBak/opt/MacPorts:<0x6b369>
        bolaLNXBak/opt/MacPorts:<0x2628d>
        bolaLNXBak/opt/MacPorts:<0x45ebc>
        bolaLNXBak/opt/MacPorts:<0x442c0>
        bolaLNXBak/opt/MacPorts:<0x442d3>
        bolaLNXBak/opt/MacPorts:<0x45edd>
        bolaLNXBak/opt/MacPorts:<0xf0>
        bolaLNXBak/opt/MacPorts:<0x440ff>
        bolaLNXBak/opt/MacPorts:<0x463ff>
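
For anyone wanting to look at these entries more closely, here is a minimal sketch (not part of my original report) that groups the `dataset:<0x...>` entries from the list above by dataset and converts the hex object IDs to decimal. The function name `parse_error_entries` and the regular expression are my own assumptions about the entry format shown here; it does not touch the pool and only parses text.

```python
import re
from collections import defaultdict

# Assumed entry format, taken from the listing above:
#   "bolaLNXBak/opt/MacPorts:<0x42606>" or "<metadata>:<0x1dd>"
ENTRY_RE = re.compile(r"^\s*(?P<dataset>\S+):<0x(?P<obj>[0-9a-fA-F]+)>\s*$")


def parse_error_entries(text):
    """Return {dataset name: [object numbers as ints]} for matching lines."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            # Object IDs are printed in hex; store them as plain integers.
            grouped[match.group("dataset")].append(int(match.group("obj"), 16))
    return dict(grouped)


if __name__ == "__main__":
    sample = """
        <metadata>:<0x0>
        <metadata>:<0x1dd>
        bolaLNXBak/opt/MacPorts:<0x42606>
        bolaLNXBak/opt/MacPorts:<0xf0>
    """
    for dataset, objects in parse_error_entries(sample).items():
        print(dataset, sorted(objects))
```

Lines that name an actual file path instead of an object ID are skipped by this pattern, so it only covers the hex-numbered entries.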

Many of the hex-numbered entries were first listed as actual files that supposedly had an error, but which had the exact same md5sum as a valid copy on a different system. Simply restoring the file from backup (or cloning it and then moving the clone over the original) replaced the named error with the (to me) useless listings above.
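
The md5sum comparison mentioned above can be sketched roughly as follows. This is an illustrative example only: `md5_of` and `same_content` are hypothetical helper names, and the paths passed in would be the flagged file and its known-good copy (for instance one fetched from the other system).

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(flagged_path, reference_path):
    """True if both files have the same MD5 digest."""
    return md5_of(flagged_path) == md5_of(reference_path)
```

A matching digest only says that the data that could be read is identical to the reference copy; it says nothing about why the pool flagged the file.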

Describe how to reproduce the problem

The first errors appeared out of the blue and concerned only directories that AFAIK were never written to. I've also seen new errors appear when restoring packages that had supposedly faulty files.

Apart from the 3 <metadata> entries above, the errors are all in a single dataset (bolaLNXBak/opt/MacPorts). I noticed that sync was disabled on that dataset; I have not yet seen new errors appear since re-enabling sync.

My kernel has the Con Kolivas patches (as do all my kernel builds), but also a submitted patch to make zswap use B-Trees (probably irrelevant) and the NixOS patch posted by @mskarbek (#8836 (comment)) to re-enable the export of 2 SIMD functions. Could this be the culprit?
I know ZFS tests for the existence of all the functions it needs, but could it have an additional hardcoded assumption that 4.19 doesn't support using SIMD operations?

Include any warning/errors/backtraces from the system logs

None.

I'll be testing some more with a different (root on ~zfs) system to see if I can observe other cases of supposed errors.

stale bot commented Jul 23, 2021

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale No recent activity for issue label Jul 23, 2021
@stale stale bot closed this as completed Oct 22, 2021