Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HADOOP-17165. Implement service-user feature in DecayRPCScheduler. #2240

Merged
merged 3 commits into from
Sep 9, 2020

Conversation

tasanuma
Copy link
Member

JIRA: https://issues.apache.org/jira/browse/HADOOP-17165

This PR is following the HADOOP-17165.

Diff from HADOOP-17165.002.patch

  • Fixed a failed test of TestCommonConfigurationFields.
  • Use Set for return type and field type.
  • Save the priorities of service users in cache.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 24m 27s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 28m 47s trunk passed
+1 💚 compile 19m 37s trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚 compile 16m 57s trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+1 💚 checkstyle 0m 53s trunk passed
+1 💚 mvnsite 1m 27s trunk passed
+1 💚 shadedclient 17m 50s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 37s trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚 javadoc 1m 46s trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+0 🆗 spotbugs 2m 52s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 2m 50s trunk passed
_ Patch Compile Tests _
+1 💚 mvninstall 1m 4s the patch passed
+1 💚 compile 22m 5s the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚 javac 22m 5s the patch passed
+1 💚 compile 20m 16s the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+1 💚 javac 20m 16s the patch passed
+1 💚 checkstyle 0m 51s the patch passed
+1 💚 mvnsite 1m 28s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 xml 0m 2s The patch has no ill-formed XML file.
+1 💚 shadedclient 14m 0s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 39s the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚 javadoc 1m 47s the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+1 💚 findbugs 3m 32s the patch passed
_ Other Tests _
-1 ❌ unit 9m 34s hadoop-common in the patch passed.
+1 💚 asflicense 0m 55s The patch does not generate ASF License warnings.
193m 59s
Reason Tests
Failed junit tests hadoop.metrics2.source.TestJvmMetrics
Subsystem Report/Notes
Docker ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2240/1/artifact/out/Dockerfile
GITHUB PR #2240
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml markdownlint
uname Linux 3649a30e91a1 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 83fd15b
Default Java Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
unit https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2240/1/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2240/1/testReport/
Max. process+thread count 3360 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2240/1/console
versions git=2.17.1 maven=3.6.0 findbugs=4.0.6
Powered by Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@tasanuma
Copy link
Member Author

The failed test succeeded in my local computer.

@@ -483,6 +501,12 @@ private void recomputeScheduleCache() {

for (Map.Entry<Object, List<AtomicLong>> entry : callCosts.entrySet()) {
Object id = entry.getKey();
// The priority for service users is always 0
if (isServiceUser((String)id)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tasanuma Thanks for your proposal. I am concerned any corner case here if put service users's request to the priority forever rather than flow controls. Such as user hdfs is one service user, and submit one big job which involve massive RPC request to NameNode, other normal users' request may postpone serious.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comment, @Hexiaoqiao.
I think you can adjust the existing FairCallQueue settings to deal with that case to a certain extent. You could consider increasing the weight of the highest priority queue, or extending the length of the highest priority queue. However, this simple feature is not a feature that can be used in all cases. If more fairness is needed among service users, then I believe that HADOOP-15016 would be required.

The case where I think this feature would be useful is when you have service users who are regularly running critical ETL jobs. A user who can make huge RPC requests at a moment may not be suite for a service user.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tasanuma curious why you choose to do it here and cachedOrComputedPriorityLevel instead of getPriorityLevel in the previous patch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sunchao Thanks for your review.

  • When DecayRpcScheduler decays, it calls recomputeScheduleCache to recalculate the priority of all users, including service users, and put them in the cache. So we had to do this in recomputeScheduleCache as well. Lines 415-421 of the unit test cover this case.
  • I moved it from getPriorityLevel to cachedOrComputedPriorityLevel because I thought it would be a little more efficient to determine if a user is a service user after looking at the cache information.

Copy link
Member

@sunchao sunchao Sep 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Thanks @tasanuma . This looks most good for me. One minor nit is that the logic for handling service users now is separated into two places. Another way perhaps is to add this in computePriorityLevel which is called by recomputeScheduleCache and cacheOrComputePriorityLevel. We need to add an extra parameter for this method though. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review, @sunchao! I agreed. Updated the PR addressing it.

@tasanuma
Copy link
Member Author

tasanuma commented Sep 1, 2020

I've read the design documentation for HADOOP-15016 and I believe that HADOOP-17165 could co-exist with HADOOP-15016.

  • HADOOP-17165 adds the service user functionality to the existing prioritized FairCallQueue, while HADOOP-15016 proposes to add new Reserved Queue.
  • HADOOP-17165 uses the term service user, while HADOOP-15016's POC patch uses the term reserved user.

@tasanuma
Copy link
Member Author

tasanuma commented Sep 1, 2020

@sunchao Could you review this PR?

Copy link
Member

@sunchao sunchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tasanuma . Sorry I just noticed this PR.

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 2s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 15s trunk passed
+1 💚 compile 21m 1s trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚 compile 17m 32s trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+1 💚 checkstyle 0m 46s trunk passed
+1 💚 mvnsite 1m 26s trunk passed
+1 💚 shadedclient 18m 5s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 29s trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚 javadoc 1m 25s trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+0 🆗 spotbugs 2m 14s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 2m 12s trunk passed
_ Patch Compile Tests _
+1 💚 mvninstall 0m 50s the patch passed
+1 💚 compile 20m 13s the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚 javac 20m 14s the patch passed
+1 💚 compile 17m 37s the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+1 💚 javac 17m 37s the patch passed
+1 💚 checkstyle 0m 45s the patch passed
+1 💚 mvnsite 1m 22s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 xml 0m 2s The patch has no ill-formed XML file.
+1 💚 shadedclient 15m 40s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 28s the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚 javadoc 1m 25s the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+1 💚 findbugs 2m 19s the patch passed
_ Other Tests _
+1 💚 unit 9m 39s hadoop-common in the patch passed.
+1 💚 asflicense 0m 46s The patch does not generate ASF License warnings.
169m 20s
Subsystem Report/Notes
Docker ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2240/2/artifact/out/Dockerfile
GITHUB PR #2240
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml markdownlint
uname Linux e3071abad3ef 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 1841a5b
Default Java Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2240/2/testReport/
Max. process+thread count 1346 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2240/2/console
versions git=2.17.1 maven=3.6.0 findbugs=4.0.6
Powered by Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

Copy link
Member

@sunchao sunchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one nit. Thanks @tasanuma ! @Hexiaoqiao let me know if you want to take another look.

@@ -178,6 +188,7 @@
private static final double PRECISION = 0.0001;
private MetricsProxy metricsProxy;
private final CostProvider costProvider;
private Set<String> serviceUsernames;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit nit: usually it is UserName instead of Username, so better follow that pattern.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review. Updated PR following UserName.

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 43s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 36m 54s trunk passed
+1 💚 compile 26m 34s trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚 compile 21m 34s trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+1 💚 checkstyle 1m 0s trunk passed
+1 💚 mvnsite 1m 47s trunk passed
+1 💚 shadedclient 20m 25s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 30s trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚 javadoc 1m 24s trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+0 🆗 spotbugs 2m 20s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 2m 18s trunk passed
_ Patch Compile Tests _
+1 💚 mvninstall 0m 53s the patch passed
+1 💚 compile 20m 19s the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚 javac 20m 19s the patch passed
+1 💚 compile 17m 42s the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+1 💚 javac 17m 42s the patch passed
+1 💚 checkstyle 0m 45s the patch passed
+1 💚 mvnsite 1m 23s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 xml 0m 1s The patch has no ill-formed XML file.
+1 💚 shadedclient 15m 36s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 29s the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚 javadoc 1m 28s the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+1 💚 findbugs 2m 20s the patch passed
_ Other Tests _
+1 💚 unit 9m 36s hadoop-common in the patch passed.
+1 💚 asflicense 0m 46s The patch does not generate ASF License warnings.
186m 56s
Subsystem Report/Notes
Docker ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2240/3/artifact/out/Dockerfile
GITHUB PR #2240
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml markdownlint
uname Linux 9dfc335fc413 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 0d85515
Default Java Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2240/3/testReport/
Max. process+thread count 1344 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2240/3/console
versions git=2.17.1 maven=3.6.0 findbugs=4.0.6
Powered by Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

Copy link
Contributor

@Hexiaoqiao Hexiaoqiao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tasanuma and @sunchao for your works. +1 from my side.

@sunchao sunchao merged commit e5fe326 into apache:trunk Sep 9, 2020
@sunchao
Copy link
Member

sunchao commented Sep 9, 2020

Merged. Thanks @tasanuma !

@tasanuma
Copy link
Member Author

tasanuma commented Sep 9, 2020

Thanks for reviewing and merging it, @sunchao and @Hexiaoqiao!

@tasanuma tasanuma deleted the HADOOP-17165 branch September 9, 2020 17:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants