Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDFS-16535. SlotReleaser should reuse the domain socket based on socket paths #4158

Merged
merged 3 commits into from
Apr 18, 2022

Conversation

stiga-huang
Copy link
Contributor

@stiga-huang stiga-huang commented Apr 11, 2022

Description of PR

HDFS-13639 (#1885) improves the performance of short-circuit shm slot releasing by reusing the domain socket that the client previously used to send release request to the DataNode.

This is good when there are only one DataNode locates with the client (truth in most of the production environment). However, if we launch multiple DataNodes on a machine (usually for testing, e.g. Impala's end-to-end tests), the request could be sent to the wrong DataNode. See an example in IMPALA-11234.

We should only reuse the domain socket when it corresponds to the same socket path. This change introduces a map in ShortCircuitCache to track the mapping from socket path to the domain socket. SlotReleaser will pick the correct domain socket (if exists) based on the socket path.

How was this patch tested?

I deploy the fix in my Impala minicluster with multiple DataNodes launched on my machine. Verified that the error logs of slot release failures disappeared.

Also add a regression test: TestShortCircuitCache#testDomainSocketClosedByMultipleDNs().

For code changes:

  • Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 39s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 39m 2s trunk passed
+1 💚 compile 1m 2s trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 compile 0m 54s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 0m 36s trunk passed
+1 💚 mvnsite 1m 1s trunk passed
+1 💚 javadoc 0m 50s trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 39s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 2m 46s trunk passed
+1 💚 shadedclient 22m 15s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 50s the patch passed
+1 💚 compile 0m 53s the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javac 0m 53s the patch passed
+1 💚 compile 0m 46s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 0m 46s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 19s the patch passed
+1 💚 mvnsite 0m 50s the patch passed
+1 💚 javadoc 0m 34s the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 31s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 2m 30s the patch passed
+1 💚 shadedclient 21m 47s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 24s hadoop-hdfs-client in the patch passed.
+1 💚 asflicense 0m 37s The patch does not generate ASF License warnings.
100m 52s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4158/1/artifact/out/Dockerfile
GITHUB PR #4158
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 1d56d1eec633 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 45f926d
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4158/1/testReport/
Max. process+thread count 546 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client U: hadoop-hdfs-project/hadoop-hdfs-client
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4158/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@jojochuang
Copy link
Contributor

@leosunli is this something you'd be interested in reviewing?

@leosunli
Copy link
Contributor

@jojochuang thank for your reminder, I will review this PR.

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 38s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 16m 10s Maven dependency ordering for branch
+1 💚 mvninstall 25m 11s trunk passed
+1 💚 compile 6m 9s trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 compile 5m 46s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 16s trunk passed
+1 💚 mvnsite 2m 29s trunk passed
+1 💚 javadoc 1m 50s trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 2m 18s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 6m 0s trunk passed
+1 💚 shadedclient 22m 42s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 28s Maven dependency ordering for patch
+1 💚 mvninstall 2m 4s the patch passed
+1 💚 compile 5m 59s the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javac 5m 59s the patch passed
+1 💚 compile 5m 49s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 5m 49s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 5s the patch passed
+1 💚 mvnsite 2m 8s the patch passed
+1 💚 javadoc 1m 28s the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 1m 57s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 5m 50s the patch passed
+1 💚 shadedclient 22m 47s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 23s hadoop-hdfs-client in the patch passed.
+1 💚 unit 233m 13s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 44s The patch does not generate ASF License warnings.
374m 32s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4158/2/artifact/out/Dockerfile
GITHUB PR #4158
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 07f572f61ba0 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 02bc8ad
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4158/2/testReport/
Max. process+thread count 3097 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4158/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

Copy link
Contributor

@leosunli leosunli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank @stiga-huang for catching it. I add two simple comments.

@@ -240,7 +245,7 @@ public void run() {
} else {
shm.getEndpointShmManager().shutdown(shm);
IOUtilsClient.cleanupWithLogger(LOG, domainSocket, out);
domainSocket = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you remove the line 243? If remove it, the line 228 looks redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you remove the line 243?

domainSocket is now a local variable (defined at line 192). We don't need to update it at the end of this method. Note that this finally clause belongs to the try-clause starts at line 195.

If remove it, the line 228 looks redundant.

For line 228, it's needed in the next retry of the while-loop. The check at line 199 will catch it and trigger a re-connect.

@@ -957,6 +958,83 @@ public void testDomainSocketClosedByDN() throws Exception {
}
}

// Regression test for HDFS-16473
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand the connection between this UT and HDFS-16473.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I gave a wrong JIRA. Thanks for catching this! It should be HDFS-16535.

@stiga-huang
Copy link
Contributor Author

Thank @leosunli! I updated the comments.

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 38s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 15m 52s Maven dependency ordering for branch
+1 💚 mvninstall 28m 29s trunk passed
+1 💚 compile 7m 0s trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 compile 6m 23s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 1m 26s trunk passed
+1 💚 mvnsite 2m 34s trunk passed
+1 💚 javadoc 1m 51s trunk passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 2m 19s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 6m 38s trunk passed
+1 💚 shadedclient 25m 53s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 29s Maven dependency ordering for patch
+1 💚 mvninstall 2m 13s the patch passed
+1 💚 compile 6m 45s the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javac 6m 45s the patch passed
+1 💚 compile 6m 25s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 6m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 12s the patch passed
+1 💚 mvnsite 2m 24s the patch passed
+1 💚 javadoc 1m 34s the patch passed with JDK Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 2m 4s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 6m 16s the patch passed
+1 💚 shadedclient 26m 9s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 31s hadoop-hdfs-client in the patch passed.
+1 💚 unit 247m 8s hadoop-hdfs in the patch passed.
+1 💚 asflicense 0m 49s The patch does not generate ASF License warnings.
402m 38s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4158/3/artifact/out/Dockerfile
GITHUB PR #4158
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 3fe566d3fcbc 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 850b657
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.14.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4158/3/testReport/
Max. process+thread count 3451 (vs. ulimit of 5500)
modules C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4158/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@leosunli
Copy link
Contributor

LGFM +1.

@jojochuang jojochuang merged commit 35d4c02 into apache:trunk Apr 18, 2022
asfgit pushed a commit that referenced this pull request Apr 18, 2022
…et paths (#4158)

Reviewed-by: Lisheng Sun <[email protected]>
(cherry picked from commit 35d4c02)
HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
…ased on socket paths (apache#4158)

Reviewed-by: Lisheng Sun <[email protected]>
(cherry picked from commit 35d4c02)

Change-Id: Ie4609bf36ada0156af684ace80222f699dad41d1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants