Full Issue 2094 Fix #2112
Conversation
I quite like this concept and it is definitely a step in the right direction as far as regression testing is concerned. Given recent experience, however, it seems it would be a good idea to eventually have reference pools that represent the myriad of optional states in which a pool can exist. It would be nice to document all of them (another issue altogether), but some of the ones that come to mind are (in no particular order):
@dweeezil I couldn't agree more. The more reference images we have in different states the better. We do some of this testing today with ztest, just not across different versions of the code base.
I've refreshed this patch stack. The reference images have been moved to a separate repository, https://github.com/zfsonlinux/zfs-images, to avoid bloating the core zfs repository. The new repository has been added as a git submodule under scripts/zfs-images. The git submodule isn't required for anything but running the tests; a tarball of the images is also available at https://github.com/zfsonlinux/zfs-images/tarball/master
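As a minimal sketch of the submodule mechanics described above, the following uses throwaway local repositories as stand-ins for zfs.git and zfs-images.git (the real repositories are not cloned here; names and paths besides scripts/zfs-images are illustrative):

```shell
# Demonstrate how a repository is wired in as a git submodule, using
# local stand-in repositories instead of the real zfs / zfs-images repos.
set -e
tmp=$(mktemp -d)

# Stand-in for the zfs-images repository.
git init -q "$tmp/images"
git -C "$tmp/images" -c user.email=dev@example.com -c user.name=dev \
    commit -q --allow-empty -m "initial commit"

# Stand-in for the main zfs repository; register the images repository as
# a submodule at scripts/zfs-images, mirroring the layout in this patch.
git init -q "$tmp/zfs"
cd "$tmp/zfs"
git -c protocol.file.allow=always submodule add -q "$tmp/images" scripts/zfs-images
git -c user.email=dev@example.com -c user.name=dev \
    commit -qm "Add zfs-images submodule"

# A fresh clone would fetch the submodule on demand with:
#   git submodule update --init scripts/zfs-images
git config -f .gitmodules --get submodule.scripts/zfs-images.path
```

Note that `protocol.file.allow=always` is only needed on newer git versions, which restrict local-path submodule clones by default.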
@dweeezil @ryao I've refreshed this patch stack to include the full set of proposed changes to address #2094. It obsoletes #2109 and I'd appreciate your feedback and any additional testing you can offer. To summarize, the changes are largely based on your work.
@@ -171,6 +171,7 @@ struct spa {
	uint8_t spa_scrub_reopen;	/* scrub doing vdev_reopen */
	uint64_t spa_scan_pass_start;	/* start time per pass/reboot */
	uint64_t spa_scan_pass_exam;	/* examined bytes per pass */
	uint64_t spa_scan_pass_errata;	/* errata issues detected */
To ensure that this is meant for reuse, we probably should pick a more general name for it and split it out into a separate commit.
That's a good thought. I suspect we may end up preserving this functionality just in case we ever need it again. I'll rename the variable to spa_errata and refactor the commit when I update this branch.
@behlendorf I agree with @dweeezil's points. Also, I have read the C code in depth and skimmed through the shell scripts; and added some comments. I think this should be fine to merge after the one about the fall-thru handling is resolved. My other comments on the code are less important, although I hope that you will consider them.
I finally got a chance to look at this (behlendorf/zfs@7e35fd5). Here are my comments:
I've not had a chance to actually try running with this commit yet but hopefully I'll get a chance tomorrow (Wednesday) during the day.
@behlendorf I was able to test your patch and it works perfectly well, however, I realized it would be nice for a plain "zpool import" to be able to report the erratum as well. To that end, I've worked up a slightly different approach to reporting the error in dweeezil/zfs@f115c41 which layers on your patch. It adds a new pool status which allows for the erratum to be reported in a more "natural" way within the zpool command.
@dweeezil @ryao I've refreshed the patch stack to incorporate your feedback. The major changes are as follows.
@behlendorf I visually reviewed the patch and it looks good to me. I'm going to run it on one of my test systems this evening and plan on (trying to) make test pools with an in-progress async destroy and one with an in-progress scrub. This should give us a good foundation going forward should any other such incompatibilities come up.
@behlendorf Quick status update since I obviously didn't get to this yesterday: I've managed to make a test pool with a bptree_obj in it. For some reason, the patch isn't detecting it and, to make matters worse, I bump into an assert the next time I try to destroy a file system (somewhat as expected because the contents of the object are bogus). More later when I get a chance to look at it a bit closer.
@dweeezil If you could open a pull request with the test image against https://github.com/zfsonlinux/zfs-images I can dig in to this as well. I've also started carefully testing this patch, and I'm keen to finalize these changes and move on.
I'm putting together the new errata page; I should have something in the next day or so.
@behlendorf I was in a bit of a hurry before and forgot that we can't detect the problem if there's only a bptree_obj but no scan (I wonder if there would be a reasonable way to do that). In any case, if you want to take a peek at my bogus pool, it's at http://images.chase2k.com/tank.bptree.xz. It has only a bptree_obj but no scan. I'll make one with both a scan and a bptree_obj tomorrow and I'm sure that'll work.
@behlendorf I'll be making a pull request against https://github.com/zfsonlinux/zfs-images with some bptree_obj entries. I'm going to delete the one referenced in my last message.
@behlendorf My root pool is damaged and I have not yet diagnosed the failure mode. I am out of action while I address that. I wanted to use Illumos to perform repairs, but my pool is affected by this erratum, so I reached out to the Illumos developers in #illumos on freenode for some assistance in getting an Illumos build with this correction. igork in #illumos agreed to help me and went through the process of building a dilos ISO with the erratum fix applied. This process was very informative and I have some remarks. In particular, I think we should split the erratum fix into three patches. The first would print a notice to dmesg. The second would add the erratum reporting infrastructure. The third would hook the erratum fix into the erratum reporting infrastructure. We both have written non-trivial portions of this patch and it is not clear who wrote what to external parties. Splitting it would ensure that attribution is clear. Additionally, a split would make it easier for other Open ZFS implementations to grab the errata fix without having to go through the reporting infrastructure changes, which break the JNI bindings.
@behlendorf Any chance of the "Revert changes to zbookmark_t" portion of this fix getting applied to main? Would help prevent further spread of the issue. I've been manually applying it on the systems I'm using to test other fixes but for the unwary it's a potential gotcha. |
@olw2005 It's on its way, as soon as they're happy with it (i.e., being sure it doesn't introduce yet more problems). Expect it in a few days...
@olw2005 If it is applied without the rest of this, everyone affected who updates to HEAD will be unable to use their pools. That is something we would like to avoid.
@ryao @FransUrbo facepalm Of course, that makes perfect sense. At least I learned more about "advanced git operations" in the process so it's all good. =) |
@ryao How about two patches? The latest incarnation of the patch doesn't include any logging to dmesg and I don't believe there's any need for it. The issue is clearly described through the output of 'zpool import' / 'zpool status' and the errata webpages I'll be posting (hopefully I can finish that up soon). I have no objection to splitting up the rest of the change as you described and generally agree that's a cleaner way to go. I'll get the patches refreshed over the next couple of days to split the patch stack. @ryao @dweeezil Aside from this minor cleanup, are both of you OK with merging this fix?
@behlendorf Yes, I'm fine with the fix as it stands. As a side note, if anyone felt so inclined, I realized it would be possible to somewhat hack-ishly detect the "incompatible bptree_obj with no scan" situation by dmu_read()-ing the bptree_phys_t entries and then reading one more. If the extra chunk had any nonzero bytes, the erratum is present. The number of entries to read is available in the bonus as bt_end - bt_begin in a bptree_phys_t struct.
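A rough userspace sketch of the over-read idea above: it operates on a raw dump of the bptree object rather than via dmu_read(), and both the dump format and the per-entry size here are assumptions for illustration only.

```shell
# Check whether the chunk just past the known bptree entries contains any
# nonzero bytes (which would suggest stale entries left by the erratum).
# Usage: bptree_extra_nonzero <dumpfile> <entry_size> <entry_count>
bptree_extra_nonzero() {
    dump=$1; entry_size=$2; entry_count=$3
    # Bytes covered by the bt_end - bt_begin known-good entries.
    offset=$((entry_size * entry_count))
    # Read one extra entry-sized chunk past the known entries and count
    # its nonzero bytes.
    nonzero=$(dd if="$dump" bs=1 skip="$offset" count="$entry_size" 2>/dev/null \
              | tr -d '\0' | wc -c | tr -d ' ')
    if [ "$nonzero" -gt 0 ]; then
        echo "erratum-suspected"
    else
        echo "clean"
    fi
}
```

For example, a dump whose trailing chunk is all zeros would report "clean", while any nonzero byte past the known entries reports "erratum-suspected".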
Verify that an assortment of known good reference pools can be imported using different versions of the ZoL code. By default, reference pools for the major ZFS implementations will be checked against the most recent ZoL tags and the master development branch. Alternate tags or branches may be verified with the '-s <src-tag>' option. Passing the keyword "installed" will instruct the script to test whatever version is installed.

Preferentially, a reference pool is used for all tests. However, if one does not exist and the pool-tag matches one of the src-tags then a new reference pool will be created using binaries from that source build. This is particularly useful when you need to test your changes before opening a pull request. New reference pools may be added by placing a bzip2 compressed tarball of the pool in the scripts/zpool-example directory and then passing the '-p <pool-tag>' option.

To increase the test coverage, reference pools should be collected for all the major ZFS implementations. Having these pools easily available is also helpful to the developers. Care should be taken to run these tests with a kernel supported by all the listed tags. Otherwise build failures will cause false positives.

EXAMPLES:

The following example will verify that the zfs-0.6.2 tag, the master branch, and the installed zfs version can correctly import the listed pools. Note there is no reference pool available for master and installed, but because binaries are available one is automatically constructed. The working directory is also preserved between runs (-k), preventing the need to rebuild from source for multiple runs.
zimport.sh -k -f /var/tmp/zimport \
    -s "zfs-0.6.1 zfs-0.6.2 master installed" \
    -p "all master installed"

--------------------- ZFS on Linux Source Versions --------------
                zfs-0.6.1  zfs-0.6.2  master     0.6.2-180
-----------------------------------------------------------------
Clone SPL       Skip       Skip       Skip       Skip
Clone ZFS       Skip       Skip       Skip       Skip
Build SPL       Skip       Skip       Skip       Skip
Build ZFS       Skip       Skip       Skip       Skip
-----------------------------------------------------------------
zevo-1.1.1      Pass       Pass       Pass       Pass
zol-0.6.1       Pass       Pass       Pass       Pass
zol-0.6.2-173   Fail       Fail       Pass       Pass
zol-0.6.2       Pass       Pass       Pass       Pass
master          Fail       Fail       Pass       Pass
installed       Pass       Pass       Pass       Pass

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Issue openzfs#2094
Commit 1421c89 added a field to zbookmark_t that unintentionally caused a disk format change. This negatively affected backward compatibility and platform portability. Therefore, this field is being removed. The functionality that field permitted is left unimplemented until a later patch that will reimplement the field in a way that does not affect the disk format.

Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#2094
Both the zpool_import_status() and zpool_get_status() functions return the zpool_status_t enum. This explicit type should be used rather than the more generic int type. This patch makes no functional change and should only be considered code cleanup. It happens to have been done in the context of openzfs#2094 because that's when I noticed this issue.

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Issue openzfs#2094
From time to time it may be necessary to inform the pool administrator about an errata which impacts their pool. These errata will be shown to the administrator through the 'zpool status' and 'zpool import' output as appropriate. The errata must clearly describe the issue detected, how the pool is impacted, and what action should be taken to resolve the situation. Additional information for each errata will be provided at http://zfsonlinux.org/msg/ZFS-8000-ER.

To accomplish the above, this patch adds the required infrastructure to allow the kernel modules to notify the utilities that an errata has been detected. This is done through the ZPOOL_CONFIG_ERRATA uint64_t which has been added to the pool configuration nvlist.

To add a new errata the following changes must be made:

* A new errata identifier must be assigned by adding a new enum value to the zpool_errata_t type. New enums must be added to the end to preserve the existing ordering.

* Code must be added to detect the issue. This does not strictly need to be done at pool import time, but doing so will make the errata visible in 'zpool import' as well as 'zpool status'. Once detected, the spa->spa_errata member should be set to the new enum.

* If possible, code should be added to clear the spa->spa_errata member once the errata has been resolved.

* The show_import() and status_callback() functions must be updated to include an informational message describing the errata. This should include an action message describing what an administrator should do to address the errata.

* The documentation at http://zfsonlinux.org/msg/ZFS-8000-ER must be updated to describe the errata. This space can be used to provide as much additional information as needed to fully describe the errata. A link to this documentation will be automatically generated in the output of 'zpool import' and 'zpool status'.
Original-idea-by: Time Chaos <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Issue openzfs#2094
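A sketch of what the reporting described above looks like from userspace: pulling the errata identifier out of 'zpool status' / 'zpool import' text. The "Errata #N detected." wording follows the example output quoted elsewhere in this thread; treat the exact format as an assumption.

```shell
# Extract the errata number from zpool status/import text on stdin;
# prints the number, or nothing if no errata line is present.
errata_from_status() {
    sed -n 's/^.*Errata #\([0-9][0-9]*\) detected\..*$/\1/p'
}

# Example against the quoted output format:
errata_from_status <<'EOF'
  pool: zol-0.6.2-173
 state: ONLINE
status: Errata #1 detected.
EOF
```

Run against the sample text above, this prints the errata number on its own line; feeding it unaffected status output would print nothing.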
ZoL commit 1421c89 unintentionally changed the disk format in a forward-compatible, but not backward compatible way. This was accomplished by adding an entry to zbookmark_t, which is included in a couple of on-disk structures. That led to the creation of pools with incorrect dsl_scan_phys_t objects that could only be imported by versions of ZoL containing that commit. Such pools cannot be imported by other versions of ZFS or past versions of ZoL.

The additional field has been removed by the previous commit. However, affected pools must be imported and scrubbed using a version of ZoL with this commit applied. This will return the pools to a state in which they may be imported by other implementations.

The 'zpool import' or 'zpool status' command can be used to determine if a pool is impacted. A message similar to one of the following means your pool must be scrubbed to restore compatibility.

  $ zpool import
     pool: zol-0.6.2-173
       id: 1165955789558693437
    state: ONLINE
   status: Errata #1 detected.
   action: The pool can be imported using its name or numeric identifier,
           however there is a compatibility issue which should be corrected
           by running 'zpool scrub'
      see: http://zfsonlinux.org/msg/ZFS-8000-ER
   config: ...

  $ zpool status
     pool: zol-0.6.2-173
    state: ONLINE
     scan: pool compatibility issue detected.
      see: openzfs#2094
   action: To correct the issue run 'zpool scrub'.
   config: ...

If there was an async destroy in progress, 'zpool import' will prevent the pool from being imported. Further advice on how to proceed will be provided by the error message as follows.

  $ zpool import
     pool: zol-0.6.2-173
       id: 1165955789558693437
    state: ONLINE
   status: Errata #2 detected.
   action: The pool can not be imported with this version of ZFS due to an
           active asynchronous destroy. Revert to an earlier version and
           allow the destroy to complete before updating.
      see: http://zfsonlinux.org/msg/ZFS-8000-ER
   config: ...
Pools affected by the damaged dsl_scan_phys_t can be detected prior to an upgrade by running the following command as root:

  zdb -dddd poolname 1 | grep -P '^\t\tscan = ' | sed -e 's;scan = ;;' | wc -w

Note that `poolname` must be replaced with the name of the pool you wish to check. A value of 25 indicates the dsl_scan_phys_t has been damaged. A value of 24 indicates that the dsl_scan_phys_t is normal. A value of 0 indicates that there has never been a scrub run on the pool.

The regression caused by the change to zbookmark_t never made it into a tagged release, Gentoo backports, Ubuntu, Debian, Fedora, or EPEL stable repositories. Only those using the HEAD version directly from Github after the 0.6.2 tag but before the 0.6.3 tag are affected.

This patch does have one limitation that should be mentioned. It will not detect errata #2 on a pool unless errata #1 is also present. It is expected this will not be a significant problem because pools impacted by errata #2 have a high probability of being impacted by errata #1. End users can ensure they do not hit this unlikely case by waiting for all asynchronous destroy operations to complete before updating ZoL. The presence of any background destroys on any imported pools can be checked by running `zpool get freeing` as root. This will display a non-zero value for any pool with an active asynchronous destroy.

Lastly, it is expected that no user data has been lost as a result of this erratum.

Original-patch-by: Tim Chase <[email protected]>
Reworked-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#2094
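The interpretation of the word count from the zdb pipeline above can be sketched as a small helper; the 25/24/0 values come directly from the commit message, and running zdb itself needs root and a real pool, so only the interpretation step is shown.

```shell
# Classify the word count produced by the zdb scan-field pipeline.
classify_scan_words() {
    case "$1" in
        25) echo "damaged dsl_scan_phys_t: scrub required" ;;
        24) echo "normal dsl_scan_phys_t" ;;
        0)  echo "pool never scrubbed" ;;
        *)  echo "unexpected count: $1" ;;
    esac
}

# Typical use (run as root; 'tank' is a hypothetical pool name):
#   classify_scan_words "$(zdb -dddd tank 1 | grep -P '^\t\tscan = ' \
#       | sed -e 's;scan = ;;' | wc -w)"
classify_scan_words 24
```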
Refreshed based on above comments. My plan is to merge this to master tomorrow unless I hear an objection.
@behlendorf I have no objections. |
@behlendorf Looks good to me save for the typo in my name in behlendorf/zfs@070572f. |
I've no idea how I mangled your name like that. I'll fix it when I merge it. |
Merged as 3965d2e (Merge branch 'issue-2094'). Everyone, thank you for helping us get this cleanly sorted out!
Just reporting in that the osx.zfs pull of ZoL detected that my pool had the errata, and 'zpool scrub' corrected everything. Thank you all.
An additional test script to ensure 'zpool import' regressions are caught prior to those changes being merged.