Support writing multi files of single partition to improve speed in HDFS storage #396
Conversation
int endPartition,
String storageBasePath,
String fileNamePrefix,
Configuration hadoopConf,
Maybe we can get the `concurrency` parameter from `hadoopConf`, which we can pass from our client. It would be more flexible.
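The suggestion above can be sketched as a config lookup with a default. This is a minimal, hypothetical sketch: the key name `rss.storage.hdfs.write.concurrency` is invented for illustration, and a plain `Map` stands in for Hadoop's `Configuration` (which offers the equivalent `getInt(key, default)` lookup) so the example has no Hadoop dependency:

```java
import java.util.HashMap;
import java.util.Map;

public class ConcurrencyFromConf {
    // Hypothetical key name; the real key chosen in the PR may differ.
    static final String KEY = "rss.storage.hdfs.write.concurrency";

    // Mirrors Configuration.getInt(KEY, defaultValue): use the client-provided
    // value when present, otherwise fall back to the server-side default.
    static int resolveConcurrency(Map<String, String> hadoopConf, int defaultValue) {
        String raw = hadoopConf.get(KEY);
        return raw == null ? defaultValue : Integer.parseInt(raw);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(resolveConcurrency(conf, 5));  // 5, the default
        conf.put(KEY, "10");
        System.out.println(resolveConcurrency(conf, 5));  // 10, from the client
    }
}
```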
Yes. Although it’s a little bit strange.
I have added a TODO comment to support this in the future.
Could you create some issues for these TODOs?
Yes, I will do that after this PR is merged.
Codecov Report
@@ Coverage Diff @@
## master #396 +/- ##
============================================
- Coverage 58.77% 58.54% -0.23%
- Complexity 1602 1607 +5
============================================
Files 193 195 +2
Lines 10939 11021 +82
Branches 955 963 +8
============================================
+ Hits 6429 6452 +23
- Misses 4132 4193 +61
+ Partials 378 376 -2
Are there any read-side changes that should be applied for this PR? And, if possible, could you add an integration test for Spark 3 with the concurrent writer enabled?
There is no need for any compatibility change in the read client, because the original logic already handles multiple files per partition due to the retry-times prefix in the file names.
It's OK. But a Spark client test may not be accurate, because it's hard to control the concurrent writes there. I think the integration test is sufficient and accurate.
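The read-side compatibility argument rests on distinct file names per writer: each concurrent writer, like each retry, gets its own index in the file name, so the reader simply lists and consumes every data file for the partition. A sketch of that naming idea (the format here is purely illustrative, not Uniffle's exact one):

```java
public class PartitionFileNames {
    // Illustrative format "<prefix>_<index>.data"; the real Uniffle file
    // name layout differs, but the uniqueness-per-writer idea is the same.
    static String dataFileName(String fileNamePrefix, int writerIndex) {
        return fileNamePrefix + "_" + writerIndex + ".data";
    }

    public static void main(String[] args) {
        // Three concurrent writers for the same partition produce
        // three distinct, non-conflicting files.
        for (int i = 0; i < 3; i++) {
            System.out.println(dataFileName("appId_shuffleId_1", i));
        }
    }
}
```

Because every writer's output file is unique, no coordination is needed on the read path beyond listing the partition's directory.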
The overall logic LGTM; left some minor comments.
Resolved review threads:
- ...est/common/src/test/java/org/apache/uniffle/test/ShuffleServerConcurrentWriteOfHdfsTest.java (outdated)
- ...est/common/src/test/java/org/apache/uniffle/test/ShuffleServerConcurrentWriteOfHdfsTest.java (outdated)
- ...est/common/src/test/java/org/apache/uniffle/test/ShuffleServerConcurrentWriteOfHdfsTest.java (outdated)
- ...est/common/src/test/java/org/apache/uniffle/test/ShuffleServerConcurrentWriteOfHdfsTest.java
- integration-test/common/src/test/java/org/apache/uniffle/test/ShuffleServerWithHdfsTest.java (outdated)
- server/src/main/java/org/apache/uniffle/server/ShuffleServerConf.java (outdated)
- storage/src/main/java/org/apache/uniffle/storage/common/HdfsStorage.java (outdated)
LGTM
Do you have other comments? @jerqi
LGTM
… distribute pressure (#452)

### What changes were proposed in this pull request?
[Improvement] Read HDFS data files with random sequence to distribute pressure #452

### Why are the changes needed?
After PR #396 added support for concurrently writing a single partition's data into multiple HDFS files, it is better to read those HDFS data files in random order to distribute the stress on the client side.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing UTs
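The randomized read order described in #452 can be sketched as a shuffle of the partition's file list before reading. This is a hypothetical simplification, not the PR's actual code; the point is that parallel readers spread their first reads across different files (and hence different DataNodes):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class RandomizedReadOrder {
    // Return a randomly ordered copy of the partition's data files.
    // Reading in this order distributes load instead of having every
    // reader start on the same file.
    static List<String> readOrder(List<String> partitionFiles, Random random) {
        List<String> order = new ArrayList<>(partitionFiles);
        Collections.shuffle(order, random);
        return order;
    }

    public static void main(String[] args) {
        List<String> files = List.of("part_0.data", "part_1.data", "part_2.data");
        System.out.println(readOrder(files, new Random()));
    }
}
```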
…cy to write in client side (#815)

### What changes were proposed in this pull request?
1. Support specifying the per-partition max write concurrency on the client side

### Why are the changes needed?
PR #396 introduced concurrent HDFS writing for one partition, but the concurrency is determined by the server side. To increase flexibility, this PR supports specifying the per-partition max write concurrency on the client side.

### Does this PR introduce _any_ user-facing change?
Yes. The client conf `<client_type>.rss.client.max.concurrency.per-partition.write` and the server conf `rss.server.client.max.concurrency.limit.per-partition.write` are introduced.

### How was this patch tested?
1. UTs
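The interaction between the two configuration keys above can be read as the server capping the client's request. The sketch below shows only that capping idea; the actual negotiation code in #815 may differ, and the fallback behavior for an unset value is an assumption:

```java
public class ConcurrencyNegotiation {
    // The client requests a per-partition write concurrency; the server
    // enforces its configured upper limit. A non-positive request is
    // treated here (assumption) as "use the server's limit".
    static int effectiveConcurrency(int clientRequested, int serverLimit) {
        if (clientRequested <= 0) {
            return serverLimit;
        }
        return Math.min(clientRequested, serverLimit);
    }

    public static void main(String[] args) {
        System.out.println(effectiveConcurrency(10, 20)); // 10, client within limit
        System.out.println(effectiveConcurrency(50, 20)); // 20, capped by server
    }
}
```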
What changes were proposed in this pull request?
Introduce `PooledHdfsShuffleWriteHandler` to support writing a single partition to multiple HDFS files concurrently.

Why are the changes needed?
As mentioned in #378 (comment), the HDFS writing speed is too slow and writes cannot proceed concurrently. Especially when a huge partition exists, this slows down other apps due to memory pressure.
So improving the writing speed is an important factor in flushing a huge partition to HDFS quickly.
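The mechanism behind this change can be sketched as a writer pool. This is a hypothetical simplification of what `PooledHdfsShuffleWriteHandler` does, not the actual implementation: real pooled writers each append to their own HDFS file, while here `Writer` is a stand-in interface so the sketch is self-contained:

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class PooledWriteSketch {
    // Stand-in for an HDFS writer; each instance would own its own file.
    interface Writer { void write(byte[] data); }

    private final BlockingQueue<Writer> pool = new LinkedBlockingQueue<>();

    PooledWriteSketch(List<Writer> writers) {
        pool.addAll(writers);
    }

    // Borrow a free writer, append the data, and return the writer.
    // With N pooled writers, up to N flushes can run concurrently.
    void flush(byte[] data) {
        try {
            Writer w = pool.take();     // blocks while all writers are busy
            try {
                w.write(data);
            } finally {
                pool.offer(w);          // hand the writer back for reuse
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("interrupted while waiting for a writer", e);
        }
    }

    public static void main(String[] args) {
        AtomicInteger writes = new AtomicInteger();
        PooledWriteSketch handler = new PooledWriteSketch(
            List.of(d -> writes.incrementAndGet(), d -> writes.incrementAndGet()));
        handler.flush(new byte[]{1});
        handler.flush(new byte[]{2});
        System.out.println(writes.get()); // 2
    }
}
```

Because every pooled writer targets a distinct file, no locking is needed around the HDFS appends themselves; the queue alone serializes access to each writer.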
Does this PR introduce any user-facing change?
Yes.
How was this patch tested?