HDFS-16386.Reduce DataNode load when FsDatasetAsyncDiskService is working. #3806
These changes LGTM, pending a good CI run.
Thank you @sodonnel for your comments and reviews.
💔 -1 overall
This message was automatically generated.
There seems to be one checkstyle violation. Could you fix that, please? Then I think we are good to commit this.
Thank you @sodonnel.
82937e6 to fbb76d2
💔 -1 overall
This message was automatically generated.
Could you help review this PR again, @sodonnel?
<name>dfs.datanode.fsdatasetasyncdisk.max.threads.per.volume</name>
<value>4</value>
<description>
The maximum number of threads per volume used to delete blocks on
Maybe this could be used for other asynchronous operations in the future, not just deleting blocks, what do you think?
Thanks @tomscut for your comments and review.
I agree with your suggestion, but for now FsDatasetAsyncDiskService is mainly aimed at removing blocks.
Do you have a better description of this property?
I think it could be changed to this: "delete blocks" -> "process async disk operations".
fbb76d2 to dc62588
💔 -1 overall
This message was automatically generated.
Could you help review this PR, @jojochuang @ferhui?
LGTM. For what it's worth, we don't need two committers to approve a PR :) Stephen alone is a gold standard.
Thanks for working on this. Can you commit it to branch-3.2.3 as well?
Thank you for your reminder and help, @jojochuang.
Thank you for your attention and comments, @brahmareddybattula.
This will not go onto branch-3.2 cleanly due to HADOOP-17126 (new Preconditions class). However, it is a trivial change in the import statement, so I have gone ahead and made it and committed to 3.2.3 too.
@jianghuazhu I'm sorry to bring this issue up again.
The ThreadPoolExecutor uses an unbounded LinkedBlockingQueue, so the actual number of threads will always be less than or equal to corePoolSize. When the NN needs one DN to delete a large number of blocks, that DN creates a large number of ReplicaFileDeleteTask objects and stores them all in the ThreadPoolExecutor's LinkedBlockingQueue, resulting in increased memory usage or even OOM. Feel free to correct me if I'm mistaken.
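The queueing behaviour described above can be reproduced with plain `java.util.concurrent`, outside Hadoop. The following is only a minimal sketch: the class name `UnboundedQueueDemo`, the method `observe`, and the pool sizes are made up for illustration and are not taken from `FsDatasetAsyncDiskService`. It assumes at least as many tasks as core threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Hadoop.
public class UnboundedQueueDemo {

  /**
   * Submits blocking tasks to a pool backed by an unbounded queue and
   * returns {poolSize, queuedTasks} observed while the workers are blocked.
   */
  static int[] observe(int core, int max, int tasks) throws InterruptedException {
    ThreadPoolExecutor executor = new ThreadPoolExecutor(
        core, max, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    CountDownLatch release = new CountDownLatch(1);
    CountDownLatch started = new CountDownLatch(Math.min(core, tasks));

    for (int i = 0; i < tasks; i++) {
      executor.execute(() -> {
        started.countDown();
        try {
          release.await(); // keep the worker busy until we have measured
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }

    // Wait until every core thread is occupied, then take the measurement.
    started.await();
    int[] observed = {executor.getPoolSize(), executor.getQueue().size()};

    release.countDown();
    executor.shutdown();
    executor.awaitTermination(10, TimeUnit.SECONDS);
    return observed;
  }

  public static void main(String[] args) throws InterruptedException {
    int[] observed = observe(1, 4, 100);
    System.out.println("pool size = " + observed[0] + ", queued = " + observed[1]);
  }
}
```

Because an unbounded `LinkedBlockingQueue` never rejects an offer, the executor has no reason to grow past `corePoolSize`, so the maximum pool size has no effect and pending tasks accumulate on the heap instead.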
Thanks @ZanderXu for following.
Thanks @jianghuazhu for your comment.
@ZanderXu, nice to communicate with you.
Thanks, and I will create a new PR to do it.
Description of PR
When FsDatasetAsyncDiskService is working on a DataNode that has many disks, it puts a higher load on the DataNode; for example, a lot of memory is used.
This can affect the stability of the DataNode.
Details: HDFS-16386
How was this patch tested?
In testing, the patch put little pressure on the DataNode.