[CI] Had to resort to force-closing job, something went wrong? #30300

Closed
DaveCTurner opened this issue May 1, 2018 · 2 comments · Fixed by #37770
Labels
:ml Machine learning >test-failure Triaged test failures from CI

Comments

@DaveCTurner
Contributor

https://internal-ci.elastic.co/job/elastic+x-pack-elasticsearch+6.2+matrix-java-periodic/ES_BUILD_JAVA=java10,ES_RUNTIME_JAVA=java8,nodes=linux/25/console failed. The REPRODUCE WITH line does not reproduce this for me:

REPRODUCE WITH: ./gradlew :x-pack-elasticsearch:plugin:ml:internalClusterTest \
  -Dtests.seed=AE4F7001DFF3186A \
  -Dtests.class=org.elasticsearch.xpack.ml.integration.TooManyJobsIT \
  -Dtests.method="testMultipleNodes" \
  -Dtests.security.manager=true \
  -Dtests.locale=en-CA \
  -Dtests.timezone=Indian/Maldives

It seems that there was some confusion about the existence or otherwise of job max-number-of-jobs-limit-job-130:

  1> [2018-05-01T04:45:24,023][INFO ][o.e.x.m.j.p.a.AutodetectProcessManager] [node_t3] Closing job [max-number-of-jobs-limit-job-91], because [close job (api)]
  1> [2018-05-01T04:45:24,025][INFO ][o.e.x.m.j.p.a.o.AutoDetectResultProcessor] [max-number-of-jobs-limit-job-91] 0 buckets parsed from autodetect output
  1> [2018-05-01T04:45:24,037][INFO ][o.e.x.m.j.p.a.AutodetectCommunicator] [max-number-of-jobs-limit-job-91] job closed
  1> [2018-05-01T04:45:24,176][INFO ][o.e.x.m.j.p.a.AutodetectProcessManager] [node_t3] Closing job [max-number-of-jobs-limit-job-148], because [close job (api)]
  1> [2018-05-01T04:45:24,179][INFO ][o.e.x.m.j.p.a.o.AutoDetectResultProcessor] [max-number-of-jobs-limit-job-148] 0 buckets parsed from autodetect output
  1> [2018-05-01T04:45:24,188][INFO ][o.e.x.m.j.p.a.AutodetectCommunicator] [max-number-of-jobs-limit-job-148] job closed
  1> [2018-05-01T04:45:24,277][INFO ][o.e.x.m.j.p.a.AutodetectProcessManager] [node_t3] Closing job [max-number-of-jobs-limit-job-73], because [close job (api)]
  2> NOTE: leaving temporary files on disk at: /var/lib/jenkins/workspace/elastic+x-pack-elasticsearch+6.2+matrix-java-periodic/ES_BUILD_JAVA/java10/ES_RUNTIME_JAVA/java8/nodes/linux/elasticsearch-extra/x-pack-elasticsearch/plugin/ml/build/testrun/internalClusterTest/J0/temp/org.elasticsearch.xpack.ml.integration.TooManyJobsIT_AE4F7001DFF3186A-001
  2> NOTE: test params are: codec=Asserting(Lucene70): {}, docValues:{}, maxPointsInLeafNode=422, maxMBSortInHeap=7.53989832981585, sim=RandomSimilarity(queryNorm=true): {}, locale=en-CA, timezone=Indian/Maldives
  2> NOTE: Linux 3.10.0-693.17.1.el7.x86_64 amd64/Oracle Corporation 1.8.0_162 (64-bit)/cpus=8,threads=1,free=316431048,total=508559360
  2> NOTE: All tests run in this JVM: [AutodetectResultProcessorIT, JobProviderIT, TooManyJobsIT]

FAILURE: Build failed with an exception.

  1> [2018-05-01T04:45:24,280][INFO ][o.e.x.m.j.p.a.o.AutoDetectResultProcessor] [max-number-of-jobs-limit-job-73] 0 buckets parsed from autodetect output
* What went wrong:
Execution failed for task ':x-pack-elasticsearch:plugin:ml:internalClusterTest'.
  1> [2018-05-01T04:45:24,289][INFO ][o.e.x.m.j.p.a.AutodetectCommunicator] [max-number-of-jobs-limit-job-73] job closed
> There were test failures: 9 suites, 41 tests, 1 error [seed: AE4F7001DFF3186A]
  1> [2018-05-01T04:45:24,370][INFO ][o.e.x.m.j.p.a.AutodetectProcessManager] [node_t3] Closing job [max-number-of-jobs-limit-job-205], because [close job (api)]

  1> [2018-05-01T04:45:24,382][INFO ][o.e.x.m.j.p.a.o.AutoDetectResultProcessor] [max-number-of-jobs-limit-job-205] 0 buckets parsed from autodetect output
* Try:
  1> [2018-05-01T04:45:24,394][INFO ][o.e.x.m.j.p.a.AutodetectCommunicator] [max-number-of-jobs-limit-job-205] job closed
Run with --stacktrace option to get the stack trace. Run with --debug option to get more log output. Run with --scan to get full insights.
  1> [2018-05-01T04:45:24,455][INFO ][o.e.x.m.j.p.a.AutodetectProcessManager] [node_t3] Closing job [max-number-of-jobs-limit-job-55], because [close job (api)]

  1> [2018-05-01T04:45:24,457][INFO ][o.e.x.m.j.p.a.o.AutoDetectResultProcessor] [max-number-of-jobs-limit-job-55] 0 buckets parsed from autodetect output
* Get more help at https://help.gradle.org
  1> [2018-05-01T04:45:24,467][INFO ][o.e.x.m.j.p.a.AutodetectCommunicator] [max-number-of-jobs-limit-job-55] job closed

  1> [2018-05-01T04:45:24,552][INFO ][o.e.x.m.j.p.a.AutodetectProcessManager] [node_t3] Closing job [max-number-of-jobs-limit-job-130], because [close job (api)]
BUILD FAILED in 32m 43s
  1> [2018-05-01T04:45:24,555][INFO ][o.e.x.m.j.p.a.o.AutoDetectResultProcessor] [max-number-of-jobs-limit-job-130] 0 buckets parsed from autodetect output
  1> [2018-05-01T04:45:24,564][INFO ][o.e.x.m.j.p.a.AutodetectCommunicator] [max-number-of-jobs-limit-job-130] job closed
  1> [2018-05-01T04:45:44,660][WARN ][o.e.x.m.i.TooManyJobsIT  ] Force-closing jobs failed.
  1> java.util.concurrent.ExecutionException: RemoteTransportException[[node_t1][127.0.0.1:55882][cluster:admin/xpack/ml/job/close]]; nested: ElasticsearchException[Failed to force close job [_all] with [1] failures, rethrowing last, all Exceptions: [the task with id job-max-number-of-jobs-limit-job-130 doesn't exist]]; nested: ResourceNotFoundException[the task with id job-max-number-of-jobs-limit-job-130 doesn't exist];
  1>  at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.getValue(BaseFuture.java:265) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:252) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:94) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.xpack.ml.support.BaseMlIntegTestCase.deleteAllJobs(BaseMlIntegTestCase.java:337) [test/:?]
  1>  at org.elasticsearch.xpack.ml.support.BaseMlIntegTestCase.cleanupWorkaround(BaseMlIntegTestCase.java:214) [test/:?]
  1>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
  1>  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
  1>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
  1>  at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_162]
  1>  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:965) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) [lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
  1>  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) [lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
  1>  at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) [lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
  1>  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) [lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
  1>  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) [lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
  1>  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) [lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
  1>  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41) [lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
  1>  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53) [lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
  1>  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) [lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
  1>  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) [lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
  1>  at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54) [lucene-test-framework-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:44]
  1>  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368) [randomizedtesting-runner-2.5.2.jar:?]
  1>  at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
  1> Caused by: org.elasticsearch.transport.RemoteTransportException: [node_t1][127.0.0.1:55882][cluster:admin/xpack/ml/job/close]
  1> Caused by: org.elasticsearch.ElasticsearchException: Failed to force close job [_all] with [1] failures, rethrowing last, all Exceptions: [the task with id job-max-number-of-jobs-limit-job-130 doesn't exist]
  1>  at org.elasticsearch.xpack.ml.action.TransportCloseJobAction$2.sendResponseOrFailure(TransportCloseJobAction.java:367) ~[main/:?]
  1>  at org.elasticsearch.xpack.ml.action.TransportCloseJobAction$2.onFailure(TransportCloseJobAction.java:346) ~[main/:?]
  1>  at org.elasticsearch.action.ActionListener$1.onFailure(ActionListener.java:68) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.action.support.ContextPreservingActionListener.onFailure(ContextPreservingActionListener.java:50) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.action.support.TransportAction$1.onFailure(TransportAction.java:91) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$1.onFailure(TransportMasterNodeAction.java:160) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.xpack.core.persistent.RemovePersistentTaskAction$TransportAction$1.onFailure(RemovePersistentTaskAction.java:164) ~[x-pack-core-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.xpack.core.persistent.PersistentTasksClusterService$3.onFailure(PersistentTasksClusterService.java:162) ~[x-pack-core-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.cluster.service.MasterService$SafeClusterStateTaskListener.onFailure(MasterService.java:473) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.cluster.service.MasterService$TaskOutputs.notifyFailedTasks(MasterService.java:408) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:199) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:133) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_162]
  1>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_162]
  1>  ... 1 more
  1> Caused by: org.elasticsearch.ResourceNotFoundException: the task with id job-max-number-of-jobs-limit-job-130 doesn't exist
  1>  at org.elasticsearch.xpack.core.persistent.PersistentTasksClusterService$3.execute(PersistentTasksClusterService.java:156) ~[x-pack-core-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:45) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:643) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:273) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:198) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:133) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207) ~[elasticsearch-6.2.5-SNAPSHOT.jar:6.2.5-SNAPSHOT]
  1>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_162]
  1>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_162]
  1>  ... 1 more

This led to the following failure:

java.lang.RuntimeException: Had to resort to force-closing job, something went wrong?
  at __randomizedtesting.SeedInfo.seed([AE4F7001DFF3186A:DADB3F1D7CC6C5]:0)
  at org.elasticsearch.xpack.ml.support.BaseMlIntegTestCase.deleteAllJobs(BaseMlIntegTestCase.java:342)
  at org.elasticsearch.xpack.ml.support.BaseMlIntegTestCase.cleanupWorkaround(BaseMlIntegTestCase.java:214)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:965)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
  at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: RemoteTransportException[[node_t1][127.0.0.1:55882][cluster:admin/xpack/ml/job/close]]; nested: IllegalStateException[timed out after 20s];
  at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.getValue(BaseFuture.java:265)
  at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:252)
  at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:94)
  at org.elasticsearch.xpack.ml.support.BaseMlIntegTestCase.deleteAllJobs(BaseMlIntegTestCase.java:329)
  ... 36 more
Caused by: RemoteTransportException[[node_t1][127.0.0.1:55882][cluster:admin/xpack/ml/job/close]]; nested: IllegalStateException[timed out after 20s];
Caused by: java.lang.IllegalStateException: timed out after 20s
  at org.elasticsearch.xpack.core.persistent.PersistentTasksService$2.onTimeout(PersistentTasksService.java:181)
  at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:317)
  at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:244)
  at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:581)
  at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
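
Reading the two traces together, the cleanup appears to work roughly like this: a normal close is attempted first (the call at `BaseMlIntegTestCase.java:329`, which timed out after 20s), then a force-close is tried (line 337, whose failure is only logged as a WARN), and finally a `RuntimeException` wrapping the original close failure is thrown (line 342). The following is a minimal sketch of that control flow as inferred from the traces, not the actual `BaseMlIntegTestCase` code. The class name `CloseFallbackSketch`, the method `closeAll`, and the two `Callable` parameters are hypothetical stand-ins for the real close requests.

```java
import java.util.concurrent.Callable;

public class CloseFallbackSketch {

    /**
     * Tries a normal close first. If that fails, attempts a force-close as a
     * best-effort cleanup and then fails the test anyway, reporting the
     * ORIGINAL close failure as the cause.
     *
     * Both callables are hypothetical stand-ins for the real close requests.
     */
    static void closeAll(Callable<Void> normalClose, Callable<Void> forceClose) {
        try {
            normalClose.call();
        } catch (Exception closeFailure) {
            try {
                forceClose.call();
            } catch (Exception forceFailure) {
                // Corresponds to the "Force-closing jobs failed." WARN line:
                // the force-close failure is logged, not rethrown.
                System.err.println("Force-closing jobs failed: " + forceFailure.getMessage());
            }
            // Thrown whether or not the force-close succeeded, so needing
            // the fallback at all is treated as a test failure.
            throw new RuntimeException(
                "Had to resort to force-closing job, something went wrong?", closeFailure);
        }
    }
}
```

Under this reading, the `ResourceNotFoundException` from the force-close is a secondary symptom: the reported failure is caused by the 20s timeout of the first close.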

@DaveCTurner added the >test (Issues or PRs that are addressing/adding tests) and :ml (Machine learning) labels on May 1, 2018
@elasticmachine
Collaborator

Pinging @elastic/ml-core

@droberts195
Contributor

I saw this again in a local test run. Closing the large number of jobs that TooManyJobsIT opens takes so long because each job does a renormalization on close. That would be justified if the jobs had processed data, but in this test they have not, so it isn't. I opened elastic/ml-cpp#393 to fix the root cause.
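
The idea of skipping renormalization for jobs that never processed data can be illustrated with a small sketch. This is not the ml-cpp change (that code is C++ and is not shown in this thread); the class `RenormalizeOnCloseSketch`, the `Job` type, and its `bucketsProcessed` field are all hypothetical, chosen to mirror the "0 buckets parsed from autodetect output" lines in the log above.

```java
public class RenormalizeOnCloseSketch {

    /** Hypothetical stand-in for a job's state at close time. */
    static final class Job {
        final long bucketsProcessed;
        boolean renormalized = false;

        Job(long bucketsProcessed) {
            this.bucketsProcessed = bucketsProcessed;
        }
    }

    /** Closes a job, running the costly renormalization only if there are results to renormalize. */
    static void close(Job job) {
        if (job.bucketsProcessed > 0) {
            renormalize(job);
        }
        // ... remaining (cheap) close steps would go here ...
    }

    /** Placeholder for the expensive work; here it only records that it ran. */
    static void renormalize(Job job) {
        job.renormalized = true;
    }
}
```

With a guard like this, the empty jobs that TooManyJobsIT opens would skip the expensive step on close, while jobs that had produced results would still be renormalized.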
