chore(upgrade): fix rebuild validation #314

Merged
merged 1 commit into from
Aug 22, 2023

@sinhaashish sinhaashish commented Aug 17, 2023

After the addition of partial rebuild, the rebuild validations for upgrade have been changed.

The logic to check if any rebuild is in progress has been changed.

Description

The new rebuild validation is based on the logic below.

  1. If all the volumes are online (across the cluster), continue with the upgrade.

  2. If any volumes are degraded/faulted, check whether a replica rebuild is in progress.

    • If a replica rebuild is in progress, wait for the rebuild to complete.
    • If no replica rebuild is in progress, wait for a period of 10+ minutes, checking every 35 seconds. At each check, if a replica is rebuilding, wait for it to complete; otherwise, once the 10 minutes have elapsed, continue with the upgrade.
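
The decision logic above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `VolumeState`, `Volume`, `Decision`, `decide`, `GRACE_PERIOD` and `POLL_INTERVAL` are hypothetical names, and the real upgrade job works with the control plane's own volume types.

```rust
use std::time::Duration;

/// Hypothetical volume states, named after the wording in the description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VolumeState {
    Online,
    Degraded,
    Faulted,
}

/// Hypothetical per-volume summary (not the control-plane type).
#[derive(Debug, Clone)]
pub struct Volume {
    pub state: VolumeState,
    /// True if any replica of this volume is currently rebuilding.
    pub rebuild_in_progress: bool,
}

/// What the upgrade job should do after one check.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    /// Continue with the upgrade.
    Proceed,
    /// A replica rebuild is running; wait for it to complete.
    WaitForRebuild,
    /// Nothing is rebuilding yet; sleep for POLL_INTERVAL and check again.
    Recheck,
}

/// Assumed values taken from the description: 10 minutes, 35-second interval.
pub const GRACE_PERIOD: Duration = Duration::from_secs(10 * 60);
pub const POLL_INTERVAL: Duration = Duration::from_secs(35);

/// Decide the next step given the current volumes and how long the job
/// has already waited without seeing any rebuild.
pub fn decide(volumes: &[Volume], waited: Duration) -> Decision {
    let unhealthy: Vec<&Volume> = volumes
        .iter()
        .filter(|v| v.state != VolumeState::Online)
        .collect();

    // Step 1: every volume is online.
    if unhealthy.is_empty() {
        return Decision::Proceed;
    }
    // Step 2, first bullet: a degraded/faulted volume is rebuilding.
    if unhealthy.iter().any(|v| v.rebuild_in_progress) {
        return Decision::WaitForRebuild;
    }
    // Step 2, second bullet: no rebuild seen; proceed only after the grace period.
    if waited >= GRACE_PERIOD {
        Decision::Proceed
    } else {
        Decision::Recheck
    }
}
```

A caller would invoke `decide` once per interval, sleeping for `POLL_INTERVAL` on `Recheck` and adding that interval to `waited`.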

There can be a scenario where a volume that was already checked becomes degraded later, so a vector of unhealthy volumes is maintained, and the wait is driven by the size of that vector.
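
One possible reading of this bookkeeping is sketched below. It assumes the job re-lists all volumes at every check and keeps waiting while the vector is non-empty; both points are assumptions, as are the names `refresh_unhealthy` and the id-to-online map used as a snapshot.

```rust
use std::collections::BTreeMap;

/// Refresh the list of unhealthy volume ids from a fresh cluster snapshot.
///
/// `snapshot` maps a (hypothetical) volume id to `true` if the volume is
/// online. Volumes that recovered or disappeared are dropped, and volumes
/// that became degraded since the previous check are added.
pub fn refresh_unhealthy(unhealthy: &mut Vec<String>, snapshot: &BTreeMap<String, bool>) {
    // Keep only volumes that are still present and still not online.
    unhealthy.retain(|id| snapshot.get(id).map_or(false, |&online| !online));

    // Pick up volumes that were healthy earlier but are unhealthy now.
    for (id, &online) in snapshot {
        if !online && !unhealthy.contains(id) {
            unhealthy.push(id.clone());
        }
    }
}
```

Under this reading, the job would call `refresh_unhealthy` at each interval and treat an empty vector as "all volumes online".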

The rebuild check in pre-validation, i.e. on the client side, is not modified with the above logic, as it just checks for rebuilding. Once the job is created, the rebuild is verified again, so all of the above logic resides in the upgrade job.

Motivation and Context

The rebuild status would be incorrect if we only checked child.rebuild_progress.is_some().

Regression

No

How Has This Been Tested?

Ran the manual upgrade.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added unit tests to cover my changes.

@sinhaashish sinhaashish force-pushed the rebuild-check branch 5 times, most recently from 93af180 to 4525463 Compare August 21, 2023 04:11
@sinhaashish sinhaashish force-pushed the rebuild-check branch 2 times, most recently from e892ba4 to d82a0f5 Compare August 21, 2023 13:21
@sinhaashish
Member Author

bors merge

bors bot pushed a commit that referenced this pull request Aug 22, 2023
314: chore(upgrade): fix rebuild validation r=sinhaashish a=sinhaashish

After the addition of partial rebuild, the rebuild validations has been changed for upgrade.

Co-authored-by: sinhaashish <[email protected]>
@bors
Contributor

bors bot commented Aug 22, 2023

This PR was included in a batch that successfully built, but then failed to merge into develop. It will not be retried.

Additional information:

Response status code: 422
{"message":"All comments must be resolved.","documentation_url":"https://docs.github.com/articles/about-protected-branches"}

@sinhaashish
Member Author

bors merge

@bors
Contributor

bors bot commented Aug 22, 2023

Build succeeded!

@bors bors bot merged commit 0196fd9 into openebs:develop Aug 22, 2023
@sinhaashish sinhaashish deleted the rebuild-check branch August 22, 2023 10:45
4 participants