Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce max headroom for disk watermark stages #88639

Merged
merged 76 commits into from
Sep 19, 2022

Conversation

kingherc
Copy link
Contributor

@kingherc kingherc commented Jul 20, 2022

Introduce max headroom settings for the low, high, and flood disk watermark stages, similar to the existing max headroom setting for the flood stage of the frozen tier. Also, convert the disk watermarks to RelativeByteSizeValue, similar to the existing setting for the flood stage of the frozen tier.

Introduce new max headrooms in HealthMetadata and in ReactiveStorageDeciderService.

Add multiple tests in DiskThresholdDeciderUnitTests, DiskThresholdDeciderTests and DiskThresholdMonitorTests.

Fixes #81406

@kingherc kingherc force-pushed the feature/81406-disk-max-headrooms branch from 6c1bd74 to 10bca27 Compare July 20, 2022 10:42
Introduce max headroom settings for the low, high, and
flood disk watermark stages, similar to the existing
max headroom setting for the flood stage of the frozen
tier. Also, convert the disk watermarks to
RelativeByteSizeValue, similar to the existing setting
for the flood stage of the frozen tier.

Introduce new max headrooms in HealthMetadata and in
ReactiveStorageDeciderService.

Add multiple tests in DiskThresholdDeciderUnitTests,
DiskThresholdDeciderTests and DiskThresholdMonitorTests.

Fixes elastic#81406
@kingherc kingherc force-pushed the feature/81406-disk-max-headrooms branch from 10bca27 to 61d4f32 Compare July 20, 2022 11:40
@kingherc kingherc self-assigned this Jul 20, 2022
@kingherc kingherc added >enhancement Team:Core/Infra Meta label for core/infra team Team:Data Management Meta label for data/management team Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. :Core/Infra/Settings Settings infrastructure and APIs :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) :Data Management/Other Supportability Improve our (devs, SREs, support eng, users) ability to troubleshoot/self-service product better. labels Jul 20, 2022
@elasticsearchmachine
Copy link
Collaborator

Hi @kingherc, I've created a changelog YAML for you.

@kingherc kingherc marked this pull request as ready for review July 20, 2022 14:15
@kingherc kingherc requested review from DaveCTurner and gmarouli July 20, 2022 14:16
@kingherc
Copy link
Contributor Author

Hi!

@DaveCTurner , because @henningandersen may see this later , I added you as reviewer because I noticed your name in several tests I modified; please tell me if you'd prefer to add someone else or wait for Henning. @gmarouli , I added you as secondary reviewer because I touched some HealthMetadata parts you had recently written -- with any luck if this PR makes the 8.4.0 version, there will not be any compatibility issues to handle. But feel free to tell me any tips for handling backwards compatibility in case this PR slips the deadline for the 8.4.0 version.

Since this is my first large PR, please tell me tips if you think I should do something differently.

Some notes and questions for you:

  • Feel free to follow this reviewing order for your ease:
  1. disk_allocator.asciidoc
  2. DiskThresholdSettings.java
  3. DiskThresholdMonitor.java
  4. DiskThresholdDecider.java
  5. HealthMetadata.java
  6. HealthMetadataService.java
  7. Then the rest, including the tests.
  • I introduced a boolean switch in several tests to make sure we test the new max headroom settings. On purpose, I split tests to test both percentage watermarks and the new max headroom settings. I could, alternatively, not split the tests and rather introduce the switch with a testMaxHeadroom=randomBoolean() inside the test, so as to randomly test either the percentage settings or the max headrooms. Please tell me if you think this would be better.
  • In modifying disk_allocator.asciidoc to add the new max headroom settings, I saw a {ess-icon} next to the watermark settings. I added the same icon. But, should I do anything related to ESS or notify the ESS team?
  • There was a rounding error happening calculating disk usage in some cases. Corrected it in a commit.

@kingherc kingherc added the cloud-deploy Publish cloud docker image for Cloud-First-Testing label Jul 20, 2022
@kingherc kingherc requested a review from henningandersen July 20, 2022 14:49
Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change touches a lot of code (+2,021 −885 !!) and does a handful of different things and I wonder if it would be possible for you to split it up into steps for ease of reviewing. For instance, it looks like we could move DiskThresholdSettings to use RelativeByteSizeValue as a preliminary step before introducing the max headroom parameters. Likewise the move to using formatNoTrailingZerosPercent and the change from 100.0 * (freeBytes / totalBytes) to (100.0 * freeBytes) / totalBytes seem independent of this change (or maybe they're prerequisites?) so could happen first.

That said the overall shape of the change looks sensible to me. I think adding a boolean parameter to each test and running them all each time is fine.

@DaveCTurner
Copy link
Contributor

I wonder if it would be possible for you to split it up into steps for ease of reviewing.

By "split it up into steps" I probably mean doing each step as a separate PR. Sometimes it works to have a sequence of commits in a single PR but it's hard to iterate on that kind of change.

@kingherc
Copy link
Contributor Author

Hi @DaveCTurner , I get you. I also thought of this problem in the last couple of days. I was mostly working on the tests so that they are comprehensive, and the test-related changes of the new max headroom settings are the bulk of the PR. The good thing is that the test changes can be reviewed without going back-and-forward. I believe following the order of reading I recommended will get you through the main code changes, while the rest of the test-related changes can be read in a forward-reading manner at your own leisure/pace.

Some of the changes that you mentioned could be done in separate commits, to ease reviewing. I can do that tomorrow. I am a bit hesitant to split into separate PRs, because:

  • I would ideally see this through sooner than later to make it in the 8.4.0 version FF, if there are no major concerns of course, else I would need to figure out the compatibility issues with the HealthMetadataService (which may not be major, but I did not need to care for them now), and
  • of the additional effort to extricate seemingly orthogonal changes (which I am not yet 100% sure they will be totally orthogonal) that can be self-standing PRs passing the merge gates separately. While with separate commits, this will not be an issue for the merge gate.

Taking the above in consideration, if it is OK with you to review separate commits, it would be great. However, if you believe that I should split in separate PRs, that is also perfectly fine, I can go ahead and do that!

@DaveCTurner
Copy link
Contributor

As things stand I don't think I'll be able to review this satisfactorily by 8.4.0 FF - there's just too much going on in this one PR and this stuff is important to get right. Let's not fixate on that date too much, we will be able to make this compatible in a later version if needed.

@kingherc
Copy link
Contributor Author

Hi @DaveCTurner , that makes sense, it is a very tight window until FF. No problem, please review when you can, and indeed I will make it compatible as/if needed after the FF. I will go ahead and split commits for easier reviewing, but please tell me if I should split PRs. BTW, if you do not review it this week, then @henningandersen can review instead of you next week, since I believe either one of your reviews would be sufficient and since he seems more relevant to how he introduced the max headroom setting for the frozen tier. Of course if you still want to review, feel free, your comments are definitely appreciated :)

@gmarouli , when you also review, please comment on compatibility implications, assuming this PR makes it after the 8.4.0 FF. E.g., if there would be any issues introducing the new max headroom in the HealthMetadata after 8.4.0 with some version if switches in the serialization and/or XContent (is HealthMetadata exposed to REST?).

I am thinking I should not merge this with only one approval. So I will wait for at least two approvals (@gmarouli and at least one of @DaveCTurner or @henningandersen ). Please tell me if I should do differently. Thanks so much for taking the time to review this!

@DaveCTurner
Copy link
Contributor

👍 I think it might be tight to have Henning review all of this before FF too but let's see. I'm very comfortable reviewing the bits that I highlighted as possible preliminaries, especially since they are refactorings which shouldn't change behaviour, and I expect that will make the behaviour-changing side of this PR much more manageable. That's why I'd rather approve it in steps so I'd still like to see separate PRs please.

One LGTM is usually acceptable unless the reviewer indicates otherwise.

@kingherc
Copy link
Contributor Author

Thanks for the tips @DaveCTurner ! If one LGTM is enough, I believe I could disengage you & Henning, and ask @gmarouli if you would be comfortable to review this, maybe in time for the FF? Feel free to tell me if you'd like separate commits/PRs and I can do them immediately.

@kingherc
Copy link
Contributor Author

Hi @henningandersen ! gentle reminder for giving another review

@henningandersen henningandersen self-requested a review August 18, 2022 16:02
@kingherc
Copy link
Contributor Author

Gentle ping @henningandersen :)

@kingherc
Copy link
Contributor Author

kingherc commented Sep 7, 2022

@henningandersen gentle reminder for a final review/approval now that we know we can merge this without unintended repercussions.

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had another look and left a few more comments. I did not check the health part though, but I think @gmarouli went into detail there.

kingherc and others added 5 commits September 12, 2022 16:05
To
org.elasticsearch.cluster.routing.allocation.decider.MockDiskUsagesIT#testRerouteOccursOnDiskPassingHighWatermark
That if it is set, then the watermarks need to be ratios.
@kingherc
Copy link
Contributor Author

Thanks @henningandersen for leaving feedback! 👍
I resolved them, feel free to check again. Thanks!

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only have one comment left on this now.

And tests for corner edge cases where they could default to -1
and the user needs to explicitly set any higher-level headrooms.
@kingherc
Copy link
Contributor Author

Hi @henningandersen ! Just pushed a commit to disallow explicitly setting headrooms to -1. There are still the corner edge cases where they could default to -1, so I leave those checks in place for these corner cases. I added appropriate tests for all these.

Please check the commit, and if that was your last worry, feel free to approve, or otherwise comment. Thanks!

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

docs/reference/modules/cluster/disk_allocator.asciidoc Outdated Show resolved Hide resolved
@kingherc kingherc merged commit 34471b1 into elastic:main Sep 19, 2022
@kingherc kingherc deleted the feature/81406-disk-max-headrooms branch September 19, 2022 11:59
+ lowHeadroom.getStringRep()
+ "]"
);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is an issue with this validation. Imagine a user is upgrading a cluster from a version prior to this change to a version with it.

The prior settings are:

cluster.routing.allocation.disk.watermark.low=null (is not set)
cluster.routing.allocation.disk.watermark.high=91% (any non-default value)

This way the new settings are initialized as:

cluster.routing.allocation.disk.watermark.low.max_headroom=200GB
cluster.routing.allocation.disk.watermark.high.max_headroom=-1

that would fail this validation.

I believe the similar could happen for below validation of highHeadroom & floodHeadroom.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @idegtiarenko, indeed, reading the code I believe you're right. This would make the user need to define the max headroom explicitly. Seems like an issue that could break upgrades indeed.

I think a solution out of this would be that the default values of the headrooms return -1 if any of the watermarks are explicitly set. Currently they return -1 only if the respective watermark is explicity set. So with this correction, it would mean that the low max headroom in your example would return -1 instead of 200GB.

Have you created a ticket for this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is covered by this issue: #93468

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cloud-deploy Publish cloud docker image for Cloud-First-Testing :Core/Infra/Settings Settings infrastructure and APIs :Distributed Coordination/Allocation All issues relating to the decision making around placing a shard (both master logic & on the nodes) >enhancement Supportability Improve our (devs, SREs, support eng, users) ability to troubleshoot/self-service product better. Team:Core/Infra Meta label for core/infra team Team:Data Management Meta label for data/management team Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v8.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Revisit default disk watermark on different data tiers
8 participants