[TEST] TransportClientNodesServiceTests#testListenerFailures() fails #37567

Closed
ywelsch opened this issue Jan 17, 2019 · 2 comments
Labels: :Distributed Coordination/Network (Http and internode communication implementations), >test-failure (Triaged test failures from CI)

ywelsch commented Jan 17, 2019

Failure:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+matrix-java-periodic/ES_BUILD_JAVA=openjdk12,ES_RUNTIME_JAVA=openjdk12,nodes=virtual&&linux/184/

This test has been failing quite often over the last few days (probably due to slow CI), but it has also failed a few times over the past year for the same reason.

Relevant log lines:

1> [2019-01-17T05:17:32,346][INFO ][o.e.c.t.TransportClientNodesServiceTests] [testListenerFailures] before test
  1> [2019-01-17T05:17:58,696][INFO ][o.e.c.t.TransportClientNodesService] [transport-client-nodes-service-tests] failed to get node info for {#transport#-1}{3XpWofOQTZeOBYTNvrnkDA}{0.0.0.0}{0.0.0.0:150}, disconnecting...
  1> org.elasticsearch.transport.SendRequestTransportException: [][0.0.0.0:150][cluster:monitor/nodes/liveness]
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:640) ~[main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesServiceTests$TestIteration$2$1.sendRequest(TransportClientNodesServiceTests.java:156) ~[test/:?]
  1> 	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:543) ~[main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler.doSample(TransportClientNodesService.java:420) [main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$NodeSampler.sample(TransportClientNodesService.java:361) [main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$ScheduledNodeSampler.run(TransportClientNodesService.java:393) [main/:?]
  1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:660) [main/:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
  1> 	at java.lang.Thread.run(Thread.java:835) [?:?]
  1> Caused by: org.elasticsearch.transport.ConnectTransportException: [][0.0.0.0:150] node not available
  1> 	at org.elasticsearch.client.transport.FailAndRetryMockTransport$1.sendRequest(FailAndRetryMockTransport.java:121) ~[test/:?]
  1> 	at org.elasticsearch.test.transport.StubbableTransport$WrappedConnection.sendRequest(StubbableTransport.java:219) ~[framework-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:628) ~[main/:?]
  1> 	... 9 more
  1> [2019-01-17T05:17:58,698][INFO ][o.e.c.t.TransportClientNodesService] [transport-client-nodes-service-tests] failed to get node info for {#transport#-2}{9-sT8EriRtKHTmK8yQWJrw}{0.0.0.0}{0.0.0.0:151}, disconnecting...
  1> org.elasticsearch.transport.SendRequestTransportException: [][0.0.0.0:151][cluster:monitor/nodes/liveness]
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:640) ~[main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesServiceTests$TestIteration$2$1.sendRequest(TransportClientNodesServiceTests.java:156) ~[test/:?]
  1> 	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:543) ~[main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler.doSample(TransportClientNodesService.java:420) [main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$NodeSampler.sample(TransportClientNodesService.java:361) [main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$ScheduledNodeSampler.run(TransportClientNodesService.java:393) [main/:?]
  1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:660) [main/:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
  1> 	at java.lang.Thread.run(Thread.java:835) [?:?]
  1> Caused by: org.elasticsearch.transport.ConnectTransportException: [][0.0.0.0:151] node not available
  1> 	at org.elasticsearch.client.transport.FailAndRetryMockTransport$1.sendRequest(FailAndRetryMockTransport.java:121) ~[test/:?]
  1> 	at org.elasticsearch.test.transport.StubbableTransport$WrappedConnection.sendRequest(StubbableTransport.java:219) ~[framework-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:628) ~[main/:?]
  1> 	... 9 more
  1> [2019-01-17T05:17:58,699][INFO ][o.e.c.t.TransportClientNodesService] [transport-client-nodes-service-tests] failed to get node info for {#transport#-3}{0xsVbXXuTuepxFpPlRKteQ}{0.0.0.0}{0.0.0.0:152}, disconnecting...
  1> org.elasticsearch.transport.SendRequestTransportException: [][0.0.0.0:152][cluster:monitor/nodes/liveness]
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:640) ~[main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesServiceTests$TestIteration$2$1.sendRequest(TransportClientNodesServiceTests.java:156) ~[test/:?]
  1> 	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:543) ~[main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler.doSample(TransportClientNodesService.java:420) [main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$NodeSampler.sample(TransportClientNodesService.java:361) [main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$ScheduledNodeSampler.run(TransportClientNodesService.java:393) [main/:?]
  1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:660) [main/:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
  1> 	at java.lang.Thread.run(Thread.java:835) [?:?]
  1> Caused by: org.elasticsearch.transport.ConnectTransportException: [][0.0.0.0:152] node not available
  1> 	at org.elasticsearch.client.transport.FailAndRetryMockTransport$1.sendRequest(FailAndRetryMockTransport.java:121) ~[test/:?]
  1> 	at org.elasticsearch.test.transport.StubbableTransport$WrappedConnection.sendRequest(StubbableTransport.java:219) ~[framework-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:628) ~[main/:?]
  1> 	... 9 more
  1> [2019-01-17T05:17:58,711][INFO ][o.e.c.t.TransportClientNodesService] [transport-client-nodes-service-tests] failed to get node info for {#transport#-4}{XR_xNdMoSku8lHI1XT5pEw}{0.0.0.0}{0.0.0.0:153}, disconnecting...
  1> org.elasticsearch.transport.SendRequestTransportException: [][0.0.0.0:153][cluster:monitor/nodes/liveness]
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:640) ~[main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesServiceTests$TestIteration$2$1.sendRequest(TransportClientNodesServiceTests.java:156) ~[test/:?]
  1> 	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:543) ~[main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler.doSample(TransportClientNodesService.java:420) [main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$NodeSampler.sample(TransportClientNodesService.java:361) [main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$ScheduledNodeSampler.run(TransportClientNodesService.java:393) [main/:?]
  1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:660) [main/:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
  1> 	at java.lang.Thread.run(Thread.java:835) [?:?]
  1> Caused by: org.elasticsearch.transport.ConnectTransportException: [][0.0.0.0:153] node not available
  1> 	at org.elasticsearch.client.transport.FailAndRetryMockTransport$1.sendRequest(FailAndRetryMockTransport.java:121) ~[test/:?]
  1> 	at org.elasticsearch.test.transport.StubbableTransport$WrappedConnection.sendRequest(StubbableTransport.java:219) ~[framework-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:628) ~[main/:?]
  1> 	... 9 more
  1> [2019-01-17T05:17:58,712][INFO ][o.e.c.t.TransportClientNodesService] [transport-client-nodes-service-tests] failed to get node info for {#transport#-5}{yJimFiniQpqkPM4LrLCGaQ}{0.0.0.0}{0.0.0.0:154}, disconnecting...
  1> org.elasticsearch.transport.SendRequestTransportException: [][0.0.0.0:154][cluster:monitor/nodes/liveness]
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:640) ~[main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesServiceTests$TestIteration$2$1.sendRequest(TransportClientNodesServiceTests.java:156) ~[test/:?]
  1> 	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:543) ~[main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler.doSample(TransportClientNodesService.java:420) [main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$NodeSampler.sample(TransportClientNodesService.java:361) [main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$ScheduledNodeSampler.run(TransportClientNodesService.java:393) [main/:?]
  1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:660) [main/:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
  1> 	at java.lang.Thread.run(Thread.java:835) [?:?]
  1> Caused by: org.elasticsearch.transport.TransportException: TransportService is closed stopped can't send request
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:622) ~[main/:?]
  1> 	... 9 more
  1> [2019-01-17T05:17:58,714][INFO ][o.e.c.t.TransportClientNodesService] [transport-client-nodes-service-tests] failed to get node info for {#transport#-6}{-wz-iMWkQIKOsu6RnuQ1cQ}{0.0.0.0}{0.0.0.0:155}, disconnecting...
  1> org.elasticsearch.transport.SendRequestTransportException: [][0.0.0.0:155][cluster:monitor/nodes/liveness]
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:640) ~[main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesServiceTests$TestIteration$2$1.sendRequest(TransportClientNodesServiceTests.java:156) ~[test/:?]
  1> 	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:543) ~[main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler.doSample(TransportClientNodesService.java:420) [main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$NodeSampler.sample(TransportClientNodesService.java:361) [main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$ScheduledNodeSampler.run(TransportClientNodesService.java:393) [main/:?]
  1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:660) [main/:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
  1> 	at java.lang.Thread.run(Thread.java:835) [?:?]
  1> Caused by: org.elasticsearch.transport.TransportException: TransportService is closed stopped can't send request
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:622) ~[main/:?]
  1> 	... 9 more
  1> [2019-01-17T05:17:58,714][INFO ][o.e.c.t.TransportClientNodesService] [transport-client-nodes-service-tests] failed to get node info for {#transport#-7}{SwYHcB1aT1yBjV5JB_15Ug}{0.0.0.0}{0.0.0.0:156}, disconnecting...
  1> org.elasticsearch.transport.SendRequestTransportException: [][0.0.0.0:156][cluster:monitor/nodes/liveness]
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:640) ~[main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesServiceTests$TestIteration$2$1.sendRequest(TransportClientNodesServiceTests.java:156) ~[test/:?]
  1> 	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:543) ~[main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$SimpleNodeSampler.doSample(TransportClientNodesService.java:420) [main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$NodeSampler.sample(TransportClientNodesService.java:361) [main/:?]
  1> 	at org.elasticsearch.client.transport.TransportClientNodesService$ScheduledNodeSampler.run(TransportClientNodesService.java:393) [main/:?]
  1> 	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:660) [main/:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
  1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
  1> 	at java.lang.Thread.run(Thread.java:835) [?:?]
  1> Caused by: org.elasticsearch.transport.TransportException: TransportService is closed stopped can't send request
  1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:622) ~[main/:?]
  1> 	... 9 more
  1> [2019-01-17T05:17:58,718][INFO ][o.e.c.t.TransportClientNodesServiceTests] [testListenerFailures] after test
FAILURE 26.4s J0 | TransportClientNodesServiceTests.testListenerFailures <<< FAILURES!
   > Throwable #1: java.lang.AssertionError: 
  2> NOTE: leaving temporary files on disk at: /var/lib/jenkins/workspace/elastic+elasticsearch+master+matrix-java-periodic/ES_BUILD_JAVA/openjdk12/ES_RUNTIME_JAVA/openjdk12/nodes/virtual&&linux/server/build/testrun/unitTest/J0/temp/org.elasticsearch.client.transport.TransportClientNodesServiceTests_3B98F6E6CDBD99BB-001
  2> NOTE: test params are: codec=Asserting(Lucene80): {}, docValues:{}, maxPointsInLeafNode=1410, maxMBSortInHeap=6.779936463485111, sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@4bc6e4ed), locale=wo-SN, timezone=Pacific/Enderbury
  2> NOTE: Linux 3.16.0-7-amd64 amd64/Oracle Corporation 12-ea (64-bit)/cpus=16,threads=1,free=355963064,total=536870912
   > Expected: <6>
  2> NOTE: All tests run in this JVM: [ScriptContextTests, NodeJoinControllerTests, NamedXContentProviderTests, DeDuplicatingTokenFilterTests, IdsQueryBuilderTests, LagDetectorTests, SourceFieldTypeTests, UUIDTests, XContentParserUtilsTests, SearchServiceTests, ClusterStatsNodesTests, PeerFinderMessagesTests, IndexShardRetentionLeaseTests, DeleteResponseTests, TransportClientNodesServiceTests]
   >      but: was <4>
   > 	at __randomizedtesting.SeedInfo.seed([3B98F6E6CDBD99BB:72EEE27E4B78DA8C]:0)
   > 	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
   > 	at org.elasticsearch.client.transport.TransportClientNodesServiceTests.testListenerFailures(TransportClientNodesServiceTests.java:299)
   > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   > 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   > 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   > 	at java.base/java.lang.reflect.Method.invoke(Method.java:567)
   > 	at java.base/java.lang.Thread.run(Thread.java:835)

To me it looks as if we're not properly handling the SendRequestTransportException in TransportClientNodesService.
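
For illustration only, here is a minimal Java sketch of the general pattern I have in mind: a send call that throws synchronously instead of reporting through its handler, so that anything counting outcomes via the handler comes up short. This is not the Elasticsearch code and not a confirmed diagnosis; every name in it (`Handler`, `Sender`, `CountingHandler`, `throwingFor`, `sampleSwallowing`, `sampleRouting`) is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: every name here is hypothetical, none is Elasticsearch API.
public class SyncSendFailureSketch {

    /** Stand-in for a response handler that a caller uses to count outcomes. */
    interface Handler {
        void onResponse(String node);
        void onFailure(String node, Exception e);
    }

    /** Stand-in for a transport: it may call the handler, or throw before doing so. */
    interface Sender {
        void send(String node, Handler handler);
    }

    /** Counts every outcome that is reported through the handler. */
    static final class CountingHandler implements Handler {
        final AtomicInteger successes = new AtomicInteger();
        final AtomicInteger failures = new AtomicInteger();

        @Override public void onResponse(String node) { successes.incrementAndGet(); }
        @Override public void onFailure(String node, Exception e) { failures.incrementAndGet(); }

        int total() { return successes.get() + failures.get(); }
    }

    /** Sender that throws synchronously for one node and reports success for the rest. */
    static Sender throwingFor(String badNode) {
        return (node, handler) -> {
            if (node.equals(badNode)) {
                throw new IllegalStateException("send failed for " + node);
            }
            handler.onResponse(node);
        };
    }

    /** Swallows a synchronous throw: the handler never hears about that node. */
    static int sampleSwallowing(List<String> nodes, Sender sender) {
        CountingHandler handler = new CountingHandler();
        for (String node : nodes) {
            try {
                sender.send(node, handler);
            } catch (RuntimeException e) {
                // Dropped here, so the outcome is missing from the handler's counts.
            }
        }
        return handler.total();
    }

    /** Routes a synchronous throw to the same failure callback as an async failure. */
    static int sampleRouting(List<String> nodes, Sender sender) {
        CountingHandler handler = new CountingHandler();
        for (String node : nodes) {
            try {
                sender.send(node, handler);
            } catch (RuntimeException e) {
                handler.onFailure(node, e);
            }
        }
        return handler.total();
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("n1", "n2", "n3");
        System.out.println("swallowing: " + sampleSwallowing(nodes, throwingFor("n3"))); // swallowing: 2
        System.out.println("routing: " + sampleRouting(nodes, throwingFor("n3")));       // routing: 3
    }
}
```

With three nodes and one synchronous failure, the swallowing variant reports two outcomes and the routing variant reports three. Whether the real sampler behaves like the first variant is exactly what would need checking.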

I will mute the test for now.

ywelsch added the :Distributed Coordination/Network and >test-failure labels on Jan 17, 2019

elasticmachine commented

Pinging @elastic/es-distributed

cshtdd pushed a commit to cshtdd/elasticsearch that referenced this issue Jan 17, 2019

ywelsch commented Jun 6, 2019

With the transport client going away, I don't think it makes sense to spend much more effort on this test. The issue I've identified is not causing any harm; it only means that disconnects are not reported as quickly.
