[CI] SearchStatsIT & IndexingIT failures #37179

Closed
davidkyle opened this issue Jan 7, 2019 · 8 comments · Fixed by #37180
Labels: :Search/Search (Search-related issues that do not fall into other categories), >test-failure (Triaged test failures from CI)


@davidkyle
Member

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+intake/1090/console

This failure is reproducible. It fails with:

  2> java.lang.AssertionError
  2>    at __randomizedtesting.SeedInfo.seed([33185699574722A3]:0)
  2>    at org.elasticsearch.action.search.SearchPhaseController$TopDocsStats.getTotalHits(SearchPhaseController.java:763)
  2>    at org.elasticsearch.action.search.SearchPhaseController.reducedQueryPhase(SearchPhaseController.java:510)
  2>    at org.elasticsearch.action.search.SearchPhaseController.reducedQueryPhase(SearchPhaseController.java:421)
  2>    at org.elasticsearch.action.search.SearchPhaseController.reducedScrollQueryPhase(SearchPhaseController.java:412)
  2>    at org.elasticsearch.action.search.SearchScrollQueryThenFetchAsyncAction$1.run(SearchScrollQueryThenFetchAsyncAction.java:71)
  2>    at org.elasticsearch.action.search.SearchScrollAsyncAction$1.innerOnResponse(SearchScrollAsyncAction.java:189)
  2>    at org.elasticsearch.action.search.SearchActionListener.onResponse(SearchActionListener.java:45)

It reproduces with:

./gradlew :server:integTest \
  -Dtests.seed=33185699574722A3 \
  -Dtests.class=org.elasticsearch.search.stats.SearchStatsIT \
  -Dtests.method="testOpenContexts" \
  -Dtests.security.manager=true \
  -Dtests.locale=ga-IE \
  -Dtests.timezone=Africa/Windhoek \
  -Dcompiler.java=11 \
  -Druntime.java=8

@davidkyle davidkyle added :Search/Search Search-related issues that do not fall into other categories >test-failure Triaged test failures from CI labels Jan 7, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-search

@jimczi jimczi self-assigned this Jan 7, 2019
@davidkyle
Member Author

Muted on master in 7cc749d

@davidkyle
Member Author

FieldLevelSecurityTests failed with the same assertion.

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+matrix-java-periodic/ES_BUILD_JAVA=openjdk12,ES_RUNTIME_JAVA=java11,nodes=virtual&&linux/165/console

It does not reproduce with:

./gradlew :x-pack:plugin:security:unitTest \
  -Dtests.seed=6624CB3E3B1A0E76 \
  -Dtests.class=org.elasticsearch.integration.FieldLevelSecurityTests \
  -Dtests.method="testScroll" \
  -Dtests.security.manager=true \
  -Dtests.locale=sr-Latn-ME \
  -Dtests.timezone=Africa/Malabo \
  -Dcompiler.java=12 \
  -Druntime.java=11

Caused by: java.lang.AssertionError
	at __randomizedtesting.SeedInfo.seed([6624CB3E3B1A0E76]:0)
	at org.elasticsearch.action.search.SearchPhaseController$TopDocsStats.getTotalHits(SearchPhaseController.java:763)
	at org.elasticsearch.action.search.SearchPhaseController.reducedQueryPhase(SearchPhaseController.java:510)
	at org.elasticsearch.action.search.SearchPhaseController.reducedQueryPhase(SearchPhaseController.java:421)
	at org.elasticsearch.action.search.SearchPhaseController.reducedScrollQueryPhase(SearchPhaseController.java:412)

@droberts195
Contributor

@jimczi please could you check if the problem in
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.6+matrix-java-periodic/ES_BUILD_JAVA=java11,ES_RUNTIME_JAVA=zulu11,nodes=virtual&&linux/38/consoleText is the same?

This failure happened in an ML test which timed out because it took more than 20 minutes. However, the root cause is that an assertion tripped while initializing a scroll:

ERROR   0.00s J3 | TooManyJobsIT (suite) <<< FAILURES!
   > Throwable #1: java.lang.Exception: Suite timeout exceeded (>= 1200000 msec).
   >    at __randomizedtesting.SeedInfo.seed([ACEE7CCD596311C5]:0)
   > Throwable #2: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=1882, name=elasticsearch[node_t1][search][T#3], state=RUNNABLE, group=TGRP-TooManyJobsIT]
   > Caused by: java.lang.AssertionError
   >    at __randomizedtesting.SeedInfo.seed([ACEE7CCD596311C5]:0)
   >    at org.elasticsearch.index.search.stats.ShardSearchStats.lambda$onQueryPhase$2(ShardSearchStats.java:101)
   >    at org.elasticsearch.index.search.stats.ShardSearchStats.computeStats(ShardSearchStats.java:142)
   >    at org.elasticsearch.index.search.stats.ShardSearchStats.onQueryPhase(ShardSearchStats.java:93)
   >    at org.elasticsearch.index.shard.SearchOperationListener$CompositeListener.onQueryPhase(SearchOperationListener.java:155)
   >    at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:407)
   >    at org.elasticsearch.search.SearchService.access$100(SearchService.java:126)
   >    at org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:360)
   >    at org.elasticsearch.search.SearchService$2.onResponse(SearchService.java:356)
   >    at org.elasticsearch.search.SearchService$4.doRun(SearchService.java:1117)
   >    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:759)
   >    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
   >    at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41)
   >    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
   >    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
   >    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
   >    at java.base/java.lang.Thread.run(Thread.java:834)

That failure was in 6.6.

If it's a completely different problem and you'd prefer it raised in a new issue let me know.

@jimczi
Contributor

jimczi commented Jan 7, 2019

> If it's a completely different problem and you'd prefer it raised in a new issue let me know.

Yes, this is a different issue; the one described here should only happen on master.

@droberts195
Contributor

> Yes this is a different issue, the one described here should only happen on master

OK cool. I transferred it to #37185.

@davidkyle
Member Author

I think this assertion is also causing failures in the mixed-cluster test IndexingIT, because the error takes down the node.

From the log file:

[2019-01-07T00:42:11,829][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [node-3] fatal error in thread [elasticsearch[node-3][search][T#24]], exiting
java.lang.AssertionError: null
	at org.elasticsearch.action.search.SearchPhaseController$TopDocsStats.getTotalHits(SearchPhaseController.java:763) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
	at org.elasticsearch.action.search.SearchPhaseController.reducedQueryPhase(SearchPhaseController.java:510) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
	at org.elasticsearch.action.search.SearchPhaseController.reducedQueryPhase(SearchPhaseController.java:421) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
	at org.elasticsearch.action.search.SearchPhaseController$1.reduce(SearchPhaseController.java:737) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
	at org.elasticsearch.action.search.FetchSearchPhase.innerRun(FetchSearchPhase.java:101) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
	at org.elasticsearch.action.search.FetchSearchPhase.access$000(FetchSearchPhase.java:44) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
	at org.elasticsearch.action.search.FetchSearchPhase$1.doRun(FetchSearchPhase.java:86) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:759) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.0.0-SNAPSHOT.jar:7.0.0-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
	at java.lang.Thread.run(Thread.java:834) [?:?]

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=centos/165/console
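
The log above shows the uncaught exception handler treating the AssertionError as a fatal error and exiting. A minimal sketch of why an assertion failure on a worker thread can be more disruptive than an ordinary exception, assuming a policy that treats any `java.lang.Error` as fatal. The class `FatalErrorSketch` and its methods are hypothetical illustrations, not Elasticsearch code, and the sketch only records the classification instead of exiting the JVM:

```java
import java.util.concurrent.atomic.AtomicReference;

public class FatalErrorSketch {
    // Assumed policy for this sketch: any java.lang.Error counts as fatal.
    // AssertionError extends Error, so a tripped assert falls into this bucket.
    static boolean isFatal(Throwable t) {
        return t instanceof Error;
    }

    // Runs the task on its own thread and reports how an uncaught throwable
    // would be classified. A real node would exit on "fatal"; here we only record it.
    static String runAndClassify(Runnable task) {
        AtomicReference<String> outcome = new AtomicReference<>("completed");
        Thread worker = new Thread(task, "search-worker");
        worker.setUncaughtExceptionHandler((thread, t) ->
            outcome.set(isFatal(t) ? "fatal" : "non-fatal"));
        worker.start();
        try {
            // The handler runs on the dying thread, so it has finished once join() returns.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
        return outcome.get();
    }

    public static void main(String[] args) {
        // An AssertionError is an Error, so it is classified as fatal.
        System.out.println(runAndClassify(() -> { throw new AssertionError(); }));
        // An IllegalStateException is a RuntimeException, so it is not.
        System.out.println(runAndClassify(() -> { throw new IllegalStateException("bad state"); }));
    }
}
```

Running `main` prints `fatal` and then `non-fatal`. Under this assumed policy, a failed assertion in a search thread is enough to stop the whole node, which would be consistent with an unrelated test such as IndexingIT failing as a side effect.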

@davidkyle davidkyle changed the title from "[CI] SearchStatsIT failure" to "[CI] SearchStatsIT & IndexingIT failures" Jan 7, 2019
jimczi added a commit that referenced this issue Jan 8, 2019
This change fixes an unreleased bug that assigns the wrong totalHits to scroll
queries.

Closes #37179
@jimczi
Contributor

jimczi commented Jan 9, 2019

IndexingIT is still failing with the same failure:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+bwc-tests/307/console
I am not able to reproduce it yet, but the previous fix is not enough, so I am reopening.

@jimczi jimczi reopened this Jan 9, 2019
jimczi added a commit that referenced this issue Jan 9, 2019
…etTotalHits

This change turns an assertion into an IllegalStateException in SearchPhaseController#getTotalHits.
The goal is to help identify the cause of the failures in #37179
which seems to fail only in CI.
The assertion will be restored when the issue is solved (NORELEASE).
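
As a rough illustration of the change described in this commit message, the sketch below contrasts a bare `assert` with an explicit `IllegalStateException`. The class `TotalHitsSketch`, its field, and the `totalHits >= 0` condition are hypothetical stand-ins chosen for the example; the actual condition checked in `SearchPhaseController` is not shown in this thread.

```java
// Hypothetical stand-in for a hit counter; not the real SearchPhaseController code.
public class TotalHitsSketch {
    // Assumption for this sketch: a negative value means the count is in an invalid state.
    private final long totalHits;

    public TotalHitsSketch(long totalHits) {
        this.totalHits = totalHits;
    }

    // Variant 1: bare assert. It is only evaluated when the JVM runs with -ea,
    // and the resulting AssertionError carries no message (compare the
    // "java.lang.AssertionError: null" line in the log earlier in this thread).
    public long getTotalHitsWithAssert() {
        assert totalHits >= 0;
        return totalHits;
    }

    // Variant 2: explicit check. It is always evaluated, and the message records
    // the offending value, which helps when a failure only shows up in CI.
    public long getTotalHitsWithCheck() {
        if (totalHits < 0) {
            throw new IllegalStateException("unexpected total hits [" + totalHits + "]");
        }
        return totalHits;
    }

    public static void main(String[] args) {
        TotalHitsSketch broken = new TotalHitsSketch(-1);
        try {
            broken.getTotalHitsWithCheck();
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Running `main` prints `caught: unexpected total hits [-1]`. Besides the richer message, an `IllegalStateException` is an ordinary runtime exception and not a `java.lang.Error`, so callers can catch and report it in the usual way.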
@jimczi jimczi closed this as completed in 95479f1 Jan 9, 2019
jimczi added a commit that referenced this issue Jan 9, 2019
This commit fixes the clone of TopFieldDocs.

Relates #37179
Relates #37266

4 participants