-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-28068 : Add hbase.normalizer.merge.merge_request_max_number_of_regions property … #5403
Conversation
…to limit max number of regions to merge for normalizer
…to limit max number of regions to merge for normalizer
💔 -1 overall
This message was automatically generated. |
…to limit max number of regions to merge for normalizer
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ouch @rahulLiving , really sorry that we left this time-bomb in the code for you. I'm glad you were able to recover your cluster.
I think that adding a safety guard here is a good idea. Should we also have a safety guard that limits the number of merge requests that can be produced in a single pass? for a single table? Perhaps that's for a separate Jira.
@@ -656,6 +656,12 @@ possible configurations would overwhelm and obscure the important. | |||
<description>The minimum size for a region to be considered for a merge, in whole | |||
MBs.</description> | |||
</property> | |||
<property> | |||
<name>hbase.normalizer.merge.max.region.count</name> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Careful here. This config name looks like a counter-point to hbase.normalizer.merge.min.region.count
but in fact it's not. The min
config is placing a guard-rail on the minimum number of regions a table contain in order to be considered for merge normalization. This new guard-rail is limiting the number of regions that can be merged together in a single operation.
I instead suggest a config named something like hbase.normalizer.merge.merge_request_max_number_of_regions
. Also, the description text should be updated accordingly -- this is just a copy-paste of the other config's description.
@@ -81,6 +81,8 @@ class SimpleRegionNormalizer implements RegionNormalizer, ConfigurationObserver | |||
static final int DEFAULT_MERGE_MIN_REGION_AGE_DAYS = 3; | |||
static final String MERGE_MIN_REGION_SIZE_MB_KEY = "hbase.normalizer.merge.min_region_size.mb"; | |||
static final int DEFAULT_MERGE_MIN_REGION_SIZE_MB = 0; | |||
static final String MERGE_MAX_REGION_COUNT_KEY = "hbase.normalizer.merge.max.region.count"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However you chance the configuration key, please also change these variable names. As it is, they look identical to the other, subtly-different MAX config variable, and it will be confusing to maintainers.
private static long parseMergeMaxRegionCount(final Configuration conf) { | ||
final long parsedValue = | ||
conf.getLong(MERGE_MAX_REGION_COUNT_KEY, DEFAULT_MERGE_MAX_REGION_COUNT); | ||
final long settledValue = Math.max(1, parsedValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The minimum value should be 2 ; it makes no sense to merge a single region.
|| (rangeMembers.size() == 1 && sumRangeMembersSizeMb == 0) // when there is only one | ||
// region and the size is 0, | ||
// seed the range with | ||
// whatever we have. | ||
|| regionSizeMb == 0 // always add an empty region to the current range. | ||
|| (regionSizeMb + sumRangeMembersSizeMb <= avgRegionSizeMb) | ||
|| (regionSizeMb + sumRangeMembersSizeMb <= avgRegionSizeMb)) | ||
&& (rangeMembers.size() < getMergeMaxRegionCount()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please invert the logic to be another || condition
-- adding the &&
makes the logic more confusing.
@@ -81,6 +81,8 @@ class SimpleRegionNormalizer implements RegionNormalizer, ConfigurationObserver | |||
static final int DEFAULT_MERGE_MIN_REGION_AGE_DAYS = 3; | |||
static final String MERGE_MIN_REGION_SIZE_MB_KEY = "hbase.normalizer.merge.min_region_size.mb"; | |||
static final int DEFAULT_MERGE_MIN_REGION_SIZE_MB = 0; | |||
static final String MERGE_MAX_REGION_COUNT_KEY = "hbase.normalizer.merge.max.region.count"; | |||
static final long DEFAULT_MERGE_MAX_REGION_COUNT = Long.MAX_VALUE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to put a large but limiting default, so that no one else hits the same problem you do, right? In order to execute a merge procedure, the master must assign all involved regions to the same region server. So likely a majority of the regions involved have to be moved first -- all those assign and close and open procedures. How about a default of 100? 1_000?
Remember, this limits the size of a single merge request. With a default of 100, your problematic situation with 25k regions would still generate 250 merge procedures (much better, but still a lot all at once).
rangeMembers.isEmpty() // when there are no range members, seed the range with whatever | ||
// we have. this way we're prepared in case the next region is | ||
// 0-size. | ||
(rangeMembers.isEmpty() // when there are no range members, seed the range with whatever |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you're here, do you mind cleaning up these comment strings? They're awkwardly broken over several lines, probably because of the spotless:apply automation. Something like,
if (
// when there are no range members, seed the range with whatever we have. this way we're
// prepared in case the next region is0-size.
rangeMembers.isEmpty()
// when there is only one region and the size is 0, seed the range with whatever we
// have.
|| (rangeMembers.size() == 1 && sumRangeMembersSizeMb == 0)
// always add an empty region to the current range.
|| regionSizeMb == 0
|| (regionSizeMb + sumRangeMembersSizeMb <= avgRegionSizeMb)
) {
// add the current region to the range when there's capacity remaining.
rangeMembers.add(new NormalizationTarget(regionInfo, regionSizeMb));
@ndimiduk Thanks a lot for the review. Have addressed your comments. Can you please have a look again
Sure. Let me try to create a JIRA for same. Not sure if I am familiar with all the details here, but let me have a look and create one. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Please also update the Jira title with the new config name and update the git commit to match the jira title. Thanks!
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
@ndimiduk Fixed the PR and JIRA title, I see 2 known test failure and 0 new failure. Can you please help with the merge? Thanks |
…gions property … (apache#5403) Authored-by: Rahul Kumar <[email protected]> Signed-off-by: Nick Dimiduk <[email protected]>
…gions property … (#5403) Authored-by: Rahul Kumar <[email protected]> Signed-off-by: Nick Dimiduk <[email protected]>
…gions property … (apache#5403) Authored-by: Rahul Kumar <[email protected]> Signed-off-by: Nick Dimiduk <[email protected]>
…gions property … (apache#5403) Authored-by: Rahul Kumar <[email protected]> Signed-off-by: Nick Dimiduk <[email protected]>
…to limit max number of regions in a merge request for merge normalization