-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-18217. ExitUtil synchronized blocks reduced to avoid exit bloc… #4255
Conversation
…king halt + enlarged catches (robustness) to all Throwables (not just Exceptions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good; but the variable will need to be called `caught"
how to test this?
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ExitUtil.java
Outdated
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ExitUtil.java
Outdated
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ExitUtil.java
Outdated
Show resolved
Hide resolved
checkstyle is complaining about line length. move the comments to the lines above the source and all is good. we don't normally have comments of any kind on the same lines as code |
Hi, Thanks for the reviews. I've started to adjust my code to integrate them. Ok, I'll adjust code style about comment + rename the variable to caught. I'm also adjusting the javadoc to make it clear that in enable mode all throwables are caught, even errors (except maybe only ThreadDeath if it happen after the last try/catch). And in disable mode, the argument exception is re-throw suppressing other internally thrown throwables unless at least one is an error, in which case the 1st error one is rethrow, suppressing anything else. like before the patch: I don't document about RuntimeException subclasses or Error that could be thrown by System.exit/Runtime.halt. About atomic versus volatile. The "disable" flags (for halt and exit) are not used in test-and-set pattern, nor any kind of do-several-actions that would have to be atomically done so volatile has enough thread sharing capabilities. The "first(Exit|Halt)Exception" fields would benefit from it though so I'm changing them to AtomicReference. It the "disable" flags were more tightly coupled with the "first" fields, then a synchronized would be needed to handle both field values and ensure they always describe a legit ("flag", "first") tuple but that's not the case here too: each can be resetted whatever the other value is. Moving to Atomic has also another benefit: it allows to remove all synchronized blocks on the ExitUtil class, so even if badly/evil code enters a synchronized blocks on ExitUtil.class and never leaves it, ExitUtil terminate/halt code will still be able run without blocking. About tests : Since i'm changing to AtomicReference I will add some (i still have to check about maybe some existing one) but only when in disable mode. When enable mode, I'm a bit afraid of the impact if I miss something in the tests. A real test would require to spawn an external JVM with a valid classpath that would need to call ExitUtil with expected codes and check its return code when it dies or kills it after a timeout (the expected codes need to differ from those due to a kill on timeout). If the test for some obscur reason really go wrong and do not kill the external JVM I don't want to impact other tests because of a hanging JVM. If the server/platform is supposed to have a long uptime, I don't want to leak spawned processes, making it an unstable/unreliable testing environment after several test runs. I don't know if such tests would need to be cross platform too. I've access only to windows and linux, not MacOS so i won't be able to check if I really kill the spawned JVM (i would use java.lang.Process which should be portable, even for the kill). I also don't know the impact on other developpers if they run tests on their own machine. |
thanks for that detailed explanation. atomic references sound file |
The JDK11 javadoc failures seems to all come from an InodeTree.java class this patch does not touch. The only 2 classes (ExitUtil.java and TestExitUtil.java) it is responsible for neither raise javadoc errors not warnings. But i may have miss-read the report :) If i want to fix the javadoc issue on InodeTree i should open another JIRA/check one is not already working on it ? Also my last pushed commit triggered this last failed build, if another pull request for another JIRA fixes the InodeTree javadoc issue, will it also trigger another (hopefully good) build for this patch pull request ? (sorry I'm not new to git but very new to pull requests :/) |
the inode stuff has been fixed elsewhere. if you rebase or merge your pr, it will go away. or we just ignore it, which simplifies keeping this commit history/discussion valid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
happy with the production code, now it's test tuning
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ExitUtil.java
Outdated
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ExitUtil.java
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestExitUtil.java
Outdated
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestExitUtil.java
Outdated
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestExitUtil.java
Outdated
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestExitUtil.java
Outdated
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestExitUtil.java
Outdated
Show resolved
Hide resolved
Hi @HerCath Thanks for the PR. Did you get a chance to address the comments from @steveloughran ? Seems to be pending for a while. If you busy, I will happy to make some progress. |
… intercept+messages+assertSame
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this has now got big...propose some refactoring of the common setter.
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ExitUtil.java
Outdated
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestExitUtil.java
Outdated
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestExitUtil.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. i would have expected javadoc to complain about undocumented params; maybe it is different on private methods. anyway, add those and the patch is ready to go
+1 pending the javadoc tweak
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ExitUtil.java
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ExitUtil.java
Show resolved
Hide resolved
🎊 +1 overall
This message was automatically generated. |
+1, merging. thank you for your diligence here. |
Reduce the ExitUtil synchronized block scopes so System.exit and Runtime.halt calls aren't within their boundaries, so ExitUtil wrappers do not block each other. Enlarged catches to all Throwables (not just Exceptions). Contributed by Remi Catherinot
+cherrypicked to branch-3.3; ran new test suite and all is good. |
Reduce the ExitUtil synchronized block scopes so System.exit and Runtime.halt calls aren't within their boundaries, so ExitUtil wrappers do not block each other. Enlarged catches to all Throwables (not just Exceptions). Contributed by Remi Catherinot
…king halt + enlarged catches (robustness) to all Throwables (not just Exceptions)
Description of PR
I've reduced the synchronized blocks scope so System.exit and Runtime.halt calls aren't within their boundaries, so ExitUtil wrappers do not block each others (System.exit never returns if called and no SecurityException is raised, so ExitUtil.terminate was never releasing the acquired lock, thus forbidding ExitUtil.halt to be able to halt the JVM even when in the middle of a graceful shutdown, thus it was not behaving like the 2 wrapped java's methods System.exit/Runtime.halt which do not block each other)
I've altered throwable handling:
How was this patch tested?
No more tests than the existing ones (if any). This case is not really hard to reproduce but the test would need to exit a JVM. I've not added such tests because if unit does not fork, it would kills the test suite (thus impacting all tests). I think developing a robust test for this specific case is way more hard and dangerous to offset the cost of a review, the risk of what could be missed by this review.
Easiest way can be reproduced the initial bug: having a shutdown hook call ExitUtil.terminate, have another thread that will call ExitUtil.halt after (use pauses to ensure it calls it after the hook), witness the JVM not stopping and needing either an external kill or a internal Runtime.halt call, maybe check the JVM threads' stacks too to view the ExitUtil.terminate call stuck on System.exit, and ExitUtil.halt call stuck on ExitUtil.terminate.
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?