-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-26320 Implement a separate thread pool for the LogCleaner #3712
Conversation
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR is straight forward, in general I think it is fine to have different scan pools for wal and hfile, since they may on different storage types, where the speed for scanning is different.
Only have some concerns on the naming so left some comments, please take a look.
Thanks,
@@ -367,7 +367,8 @@ | |||
|
|||
private HbckChore hbckChore; | |||
CatalogJanitor catalogJanitorChore; | |||
private DirScanPool cleanerPool; | |||
private DirScanPool archiveCleanerPool; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better add some comments here? I think the logCleanerPool is for removing wals, and the archiveCleanerPool is for all other cleaners? And do we need DirScanPool for cleaners other than log and hfile? If not, let's just name them logCleanerPool and hfileCleaner pool?
|
||
public DirScanPool(Configuration conf) { | ||
String poolSize = conf.get(CleanerChore.CHORE_POOL_SIZE, CleanerChore.DEFAULT_CHORE_POOL_SIZE); | ||
private DirScanPool(Configuration conf, String cleanerPoolConfig, String cleanerPoolConfigDefault, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CleanerPoolConfig is too general and I thought it was a class...
Let's just name them cleanerPoolSizeConfigName, and cleanerPoolSizeConfigDefault?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for this suggested name. Looks clean.
Can we think of making this more clean?
DirScanPool can have a private Enum Type (for ArchivedHFilesCleaner, OldLogsCleaner)
The constructor can only take the type. So the other details of name, configName, configDefault are considered based on type. Infact type can have a method to return the poolSize.
We dont have to even store the cleanerPoolConfigDefault
I think its used only on configUpdate method. That can use Configuration#get(String) and have a null check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for "DirScanPool can have a private Enum Type (for ArchivedHFilesCleaner, OldLogsCleaner)"
pool.setCorePoolSize(size); | ||
} | ||
|
||
public int getSize() { | ||
return size; | ||
} | ||
|
||
public static DirScanPool getArchiveScanner(Configuration conf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GetScanPool instead of getScanner? Scanner has special meanings in HBase which may confuse developers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. Got the same Q when read the method name at begin of patch.
CleanerChore.DEFAULT_CHORE_POOL_SIZE, "Archive"); | ||
} | ||
|
||
public static DirScanPool getOldLogsScanner(Configuration conf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
@@ -64,6 +64,8 @@ | |||
*/ | |||
public static final String CHORE_POOL_SIZE = "hbase.cleaner.scan.dir.concurrent.size"; | |||
static final String DEFAULT_CHORE_POOL_SIZE = "0.25"; | |||
public static final String LOG_CLEANER_CHORE_SIZE = "hbase.log.cleaner.scan.dir.concurrent.size"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now we will have (by default) a pool of 1 thread for scan and cleaning of Archived HFiles.
For old WALs the thread pool's default size will be 25% of #cores
Can we add some code level comments here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, pool of 1 thread (by default) is for scanning and cleaning the old WALs. And pool with 25% of #cores (by default) is for archived store files.
+1 adding code level comments.
|
||
public DirScanPool(Configuration conf) { | ||
String poolSize = conf.get(CleanerChore.CHORE_POOL_SIZE, CleanerChore.DEFAULT_CHORE_POOL_SIZE); | ||
private DirScanPool(Configuration conf, String cleanerPoolConfig, String cleanerPoolConfigDefault, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for this suggested name. Looks clean.
Can we think of making this more clean?
DirScanPool can have a private Enum Type (for ArchivedHFilesCleaner, OldLogsCleaner)
The constructor can only take the type. So the other details of name, configName, configDefault are considered based on type. Infact type can have a method to return the poolSize.
We dont have to even store the cleanerPoolConfigDefault
I think its used only on configUpdate method. That can use Configuration#get(String) and have a null check?
pool.setCorePoolSize(size); | ||
} | ||
|
||
public int getSize() { | ||
return size; | ||
} | ||
|
||
public static DirScanPool getArchiveScanner(Configuration conf) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. Got the same Q when read the method name at begin of patch.
Thanks @Apache9, @anoopsjohn and @pankaj72981 for taking a look. I have updated based on your suggestions. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/DirScanPool.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/DirScanPool.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/DirScanPool.java
Outdated
Show resolved
Hide resolved
hbase-server/src/test/java/org/apache/hadoop/hbase/master/region/MasterRegionTestBase.java
Outdated
Show resolved
Hide resolved
This avoids starvation when the archive directory is large and takes a long time to iterate through.
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
The unit test failures don't seem related and only fail on Java 11 it seems. @Apache9 Do you have any further comments? Thanks |
Thanks for all the reviews! |
…he#3712) This avoids starvation when the archive directory is large and takes a long time to iterate through. Signed-off-by: Duo Zhang <[email protected]> Signed-off-by: Anoop Sam John <[email protected]> Signed-off-by: Pankaj <[email protected]>
…he#3712) This avoids starvation when the archive directory is large and takes a long time to iterate through. Signed-off-by: Duo Zhang <[email protected]> Signed-off-by: Anoop Sam John <[email protected]> Signed-off-by: Pankaj <[email protected]>
… (#4460) This avoids starvation when the archive directory is large and takes a long time to iterate through. Signed-off-by: Duo Zhang <[email protected]> Signed-off-by: Anoop Sam John <[email protected]> Signed-off-by: Pankaj <[email protected]>
… (#4460) This avoids starvation when the archive directory is large and takes a long time to iterate through. Signed-off-by: Duo Zhang <[email protected]> Signed-off-by: Anoop Sam John <[email protected]> Signed-off-by: Pankaj <[email protected]>
…he#3712) (apache#4460) This avoids starvation when the archive directory is large and takes a long time to iterate through. Signed-off-by: Duo Zhang <[email protected]> Signed-off-by: Anoop Sam John <[email protected]> Signed-off-by: Pankaj <[email protected]>
This avoids starvation when the archive directory is large and takes a long time
to iterate through.