Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-1397. Avoid the usage of signal handlers in datanodes of the MiniOzoneClusters #701

Closed
wants to merge 1 commit into from

Conversation

elek
Copy link
Member

@elek elek commented Apr 5, 2019

[~arpaga] showed me a problem that TestQueryNode.testHealthyNodesCount is failed in the CI check of HDDS-1339.

According to the logs the test is timed out because only 4 datanodes are started out of the 5.

The log also contained an exception from one datanode:

2019-04-04 00:26:33,583 WARN  ozone.HddsDatanodeService (LogAdapter.java:warn(59)) - failed to register any UNIX signal loggers: 
java.lang.IllegalStateException: Can't re-install the signal handlers.
    at org.apache.hadoop.util.SignalLogger.register(SignalLogger.java:77)
    at org.apache.hadoop.util.StringUtils.startupShutdownMessage(StringUtils.java:718)
    at org.apache.hadoop.util.StringUtils.startupShutdownMessage(StringUtils.java:707)
    at org.apache.hadoop.ozone.HddsDatanodeService.createHddsDatanodeService(HddsDatanodeService.java:126)
    at org.apache.hadoop.ozone.HddsDatanodeService.createHddsDatanodeService(HddsDatanodeService.java:108)
    at org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.createHddsDatanodes(MiniOzoneClusterImpl.java:552)

The code which requires the signal handler is the following (signal handler is registered in the startupShutdownMessage)

/**
   * Create an Datanode instance based on the supplied command-line arguments.
   * <p>
   * This method is intended for unit tests only. It suppresses the
   * startup/shutdown message and skips registering Unix signal handlers.
   *
   * @param args        command line arguments.
   * @param conf        HDDS configuration
   * @param printBanner if true, then log a verbose startup message.
   * @return Datanode instance
   */
  private static HddsDatanodeService createHddsDatanodeService(
      String[] args, Configuration conf, boolean printBanner) {
    if (args.length == 0 && printBanner) {
      StringUtils
          .startupShutdownMessage(HddsDatanodeService.class, args, LOG);
      return new HddsDatanodeService(conf);
    } else {
      new HddsDatanodeService().run(args);
      return null;
   }

As you can read from the comment it's expected to be called with printBanner=false to avoid the creation of the signal handler.

Note: In the startupShutdownMessage method a new signal handler is registered and signal handlers can be registered only once:

//SignalLogger
  void register(final LogAdapter LOG) {
   if (registered) {
      throw new IllegalStateException("Can't re-install the signal handlers.");
    }
 ....

We have a dedicated method to create datanode service for the unit tests. The only thing what we need is to turn OFF the signal handler registration here. (The following code fragment shows the original state where the signal handler creation is requested with the true parameter value)

  @VisibleForTesting
  public static HddsDatanodeService createHddsDatanodeService(
      String[] args, Configuration conf) {
    return createHddsDatanodeService(args, conf, true);
  }

See: https://issues.apache.org/jira/browse/HDDS-1397

@elek elek added the ozone label Apr 5, 2019
@hadoop-yetus

This comment has been minimized.

@hadoop-yetus

This comment has been minimized.

@elek
Copy link
Member Author

elek commented Apr 9, 2019

/retest

@hanishakoneru
Copy link
Contributor

hanishakoneru commented Apr 9, 2019

Thanks @elek for fixing this. LGTM. +1 pending Yetus/CI.

…iOzoneClusters. Contributed by Elek, Marton.
@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 518 Docker mode activated.
_ Prechecks _
+1 @author 0 The patch does not contain any @author tags.
-1 test4tests 0 The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 mvninstall 1410 trunk passed
+1 compile 54 trunk passed
+1 checkstyle 23 trunk passed
+1 mvnsite 40 trunk passed
+1 shadedclient 831 branch has no errors when building and testing our client artifacts.
+1 findbugs 60 trunk passed
+1 javadoc 29 trunk passed
_ Patch Compile Tests _
+1 mvninstall 44 the patch passed
+1 compile 33 the patch passed
+1 javac 33 the patch passed
+1 checkstyle 16 the patch passed
+1 mvnsite 35 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 869 patch has no errors when building and testing our client artifacts.
+1 findbugs 65 the patch passed
+1 javadoc 28 the patch passed
_ Other Tests _
+1 unit 75 container-service in the patch passed.
+1 asflicense 28 The patch does not generate ASF License warnings.
4249
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-701/3/artifact/out/Dockerfile
GITHUB PR #701
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 67bfc5eb1ea6 4.4.0-143-generic #169~14.04.2-Ubuntu SMP Wed Feb 13 15:00:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 32722d2
maven version: Apache Maven 3.3.9
Default Java 1.8.0_191
findbugs v3.1.0-RC1
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-701/3/testReport/
Max. process+thread count 316 (vs. ulimit of 5500)
modules C: hadoop-hdds/container-service U: hadoop-hdds/container-service
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-701/3/console
Powered by Apache Yetus 0.9.0 http://yetus.apache.org

This message was automatically generated.

@elek
Copy link
Member Author

elek commented Apr 10, 2019

@elek elek closed this in dfb518b Apr 10, 2019
shanthoosh pushed a commit to shanthoosh/hadoop that referenced this pull request Oct 15, 2019
Side input stores are non-changelog stores that still needs to use logged stored directory to guarantee durability.

Author: bharathkk <[email protected]>

Reviewers: Prateek Maheshwari <[email protected]>

Closes apache#701 from bharathkk/side-input-fix
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants