Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cancel initialize and TRIM before vdev_metaslab_fini() #9751

Merged
merged 1 commit into from
Dec 26, 2019

Conversation

behlendorf
Copy link
Contributor

Motivation and Context

Resolve #8602 which is one of the more common ztest failures
encountered by the CI.

Description

Any running zpool initialize or TRIM must be cancelled prior
to the vdev_metaslab_fini() call in spa_vdev_remove_log() which
will unload the metaslabs and set ms->ms_group == NULL.

How Has This Been Tested?

Unfortunately, I've been unable to consistently reproduce this
issue locally. I've run ztest with this PR applied for several
hours and was unable to reproduce the issue. However, since the
CI does relatively frequently hit this issue I've opened this PR for
additional testing and increased the ztest run time to 2 hours.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

@behlendorf behlendorf added Component: Test Suite Indicates an issue with the test framework or a test case Status: Code Review Needed Ready for review and testing labels Dec 19, 2019
Any running 'zpool initialize' or TRIM must be cancelled prior
to the vdev_metaslab_fini() call in spa_vdev_remove_log() which
will unload the metaslabs and set ms->ms_group == NULL.

TEST_ZTEST_TIMEOUT=7200
Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#8602
@codecov
Copy link

codecov bot commented Dec 20, 2019

Codecov Report

Merging #9751 into master will increase coverage by <1%.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff            @@
##           master    #9751    +/-   ##
========================================
+ Coverage      80%      80%   +<1%     
========================================
  Files         385      385            
  Lines      121302   121302            
========================================
+ Hits        96442    96775   +333     
+ Misses      24860    24527   -333
Flag Coverage Δ
#kernel 80% <100%> (ø) ⬆️
#user 67% <100%> (+1%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 523fc80...4447e29. Read the comment docs.

Copy link
Contributor

@PrivatePuffin PrivatePuffin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a reasonable cause and fix!
Thanks for your work on this issue, it was a pain when doing testing on zstd last few weeks!

I can confirm your testing is valid @behlendorf
While doing many runs of zstd for zstd last few weeks, it hits about 0.6 times in every push on buildbot.
A few hours of zloop on a faster machine would've atleast had one of these.

LGTM.

@PrivatePuffin
Copy link
Contributor

@ikozhukhov might want to review this.

@behlendorf behlendorf requested a review from shartse December 20, 2019 16:45
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Dec 26, 2019
@behlendorf behlendorf merged commit 635a01a into openzfs:master Dec 26, 2019
allanjude pushed a commit to allanjude/zfs that referenced this pull request Dec 28, 2019
Any running 'zpool initialize' or TRIM must be cancelled prior
to the vdev_metaslab_fini() call in spa_vdev_remove_log() which
will unload the metaslabs and set ms->ms_group == NULL.

Reviewed-by: Igor Kozhukhov <[email protected]>
Reviewed-by: Kjeld Schouten <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#8602
Closes openzfs#9751
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Jan 2, 2020
Any running 'zpool initialize' or TRIM must be cancelled prior
to the vdev_metaslab_fini() call in spa_vdev_remove_log() which
will unload the metaslabs and set ms->ms_group == NULL.

Reviewed-by: Igor Kozhukhov <[email protected]>
Reviewed-by: Kjeld Schouten <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#8602
Closes openzfs#9751
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Jan 7, 2020
Any running 'zpool initialize' or TRIM must be cancelled prior
to the vdev_metaslab_fini() call in spa_vdev_remove_log() which
will unload the metaslabs and set ms->ms_group == NULL.

Reviewed-by: Igor Kozhukhov <[email protected]>
Reviewed-by: Kjeld Schouten <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#8602
Closes openzfs#9751
tonyhutter pushed a commit that referenced this pull request Jan 23, 2020
Any running 'zpool initialize' or TRIM must be cancelled prior
to the vdev_metaslab_fini() call in spa_vdev_remove_log() which
will unload the metaslabs and set ms->ms_group == NULL.

Reviewed-by: Igor Kozhukhov <[email protected]>
Reviewed-by: Kjeld Schouten <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #8602
Closes #9751
@behlendorf behlendorf deleted the issue-8602 branch April 19, 2021 19:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Test Suite Indicates an issue with the test framework or a test case Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ztest: segfault when hitting race in metaslab_enable
3 participants