Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZTS: Improve redundancy/* test scripts #11906

Merged
merged 1 commit into from
Apr 19, 2021

Conversation

behlendorf
Copy link
Contributor

Motivation and Context

The redundancy_draid3 test case occasionally fails in the CI
but it's unclear why, and I haven't been able to reproduce the
issue locally. This PR leverages the CI to see if we can either
reproduce the issue, or if the small fixes in this PR resolve the
issue which is possible.

http://build.zfsonlinux.org/builders/FreeBSD%20stable%2F13%20amd64%20%28TEST%29/builds/215

Description

  • Add additional logging to provide more information about why the
    test failed. This including logging more of the individual commands
    and the contents and differences of the record files on failure.

  • Updated get_vdevs() to properly exclude all top-level vdevs
    including raidz3 and draid[1-3].

  • Replaced gnudd with dd. This is the only remaining place in the
    test suite gnudd is used and it shouldn't be needed.

How Has This Been Tested?

Locally ran the updated tests but I was unable to reproduce the
issue. This PR leverages the CI to run the redundancy tests 100
times per builder to see if it can reproduce the issue.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@behlendorf behlendorf added the Status: Work in Progress Not yet ready for general review label Apr 16, 2021
@behlendorf behlendorf force-pushed the zts-redundancy branch 3 times, most recently from 998c5ec to b10bccb Compare April 18, 2021 23:00
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Work in Progress Not yet ready for general review labels Apr 19, 2021
@behlendorf behlendorf marked this pull request as ready for review April 19, 2021 04:21
- Add additional logging to provide more information about why the
  test failed.  This including logging more of the individual commands
  and the contents and differences of the record files on failure.

- Updated get_vdevs() to properly exclude all top-level vdevs
  including raidz3 and draid[1-3].

- Replaced gnudd with dd.  This is the only remaining place in the
  test suite gnudd is used and it shouldn't be needed.

- The refill_test_env function expects the pool as the first argument
  but never sets the pool variable.

- Only fill the test pools to 50% of capacity instead of 75% to help
  speed up the tests.

- Fix replace_missing_devs() calculation, MINDEVSIZE should be
  MINVDEVSIZE.

- Fix damage_devs() so it overwrites almost all of the device so
  we're guaranteed to damage filesystem blocks.

- redundancy_stripe.ksh should not use log_mustnot to check if the
  pool is healthy since the return value may be misinterpreted.
  Just perform a normal conditional check and log the failure.

Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#11906
@behlendorf behlendorf merged commit 50d9ff9 into openzfs:master Apr 19, 2021
@behlendorf behlendorf deleted the zts-redundancy branch April 19, 2021 19:53
behlendorf added a commit to behlendorf/zfs that referenced this pull request Apr 21, 2021
- Add additional logging to provide more information about why the
  test failed.  This including logging more of the individual commands
  and the contents and differences of the record files on failure.

- Updated get_vdevs() to properly exclude all top-level vdevs
  including raidz3 and draid[1-3].

- Replaced gnudd with dd.  This is the only remaining place in the
  test suite gnudd is used and it shouldn't be needed.

- The refill_test_env function expects the pool as the first argument
  but never sets the pool variable.

- Only fill the test pools to 50% of capacity instead of 75% to help
  speed up the tests.

- Fix replace_missing_devs() calculation, MINDEVSIZE should be
  MINVDEVSIZE.

- Fix damage_devs() so it overwrites almost all of the device so
  we're guaranteed to damage filesystem blocks.

- redundancy_stripe.ksh should not use log_mustnot to check if the
  pool is healthy since the return value may be misinterpreted.
  Just perform a normal conditional check and log the failure.

Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#11906
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021
- Add additional logging to provide more information about why the
  test failed.  This including logging more of the individual commands
  and the contents and differences of the record files on failure.

- Updated get_vdevs() to properly exclude all top-level vdevs
  including raidz3 and draid[1-3].

- Replaced gnudd with dd.  This is the only remaining place in the
  test suite gnudd is used and it shouldn't be needed.

- The refill_test_env function expects the pool as the first argument
  but never sets the pool variable.

- Only fill the test pools to 50% of capacity instead of 75% to help
  speed up the tests.

- Fix replace_missing_devs() calculation, MINDEVSIZE should be
  MINVDEVSIZE.

- Fix damage_devs() so it overwrites almost all of the device so
  we're guaranteed to damage filesystem blocks.

- redundancy_stripe.ksh should not use log_mustnot to check if the
  pool is healthy since the return value may be misinterpreted.
  Just perform a normal conditional check and log the failure.

Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#11906
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants