[CI] SearchWithRandomIOExceptionsIT testRandomDirectoryIOExceptions failing #106752

Closed
alex-spies opened this issue Mar 26, 2024 · 7 comments · Fixed by #107128
Labels:
low-risk (An open issue or test failure that is a low risk to future releases)
:Search Foundations/Search (Catch all for Search Foundations)
Team:Search Foundations (Meta label for the Search Foundations team in Elasticsearch)
>test-failure (Triaged test failures from CI)

Comments

@alex-spies
Contributor

Did not reproduce with Java 21 (the project doesn't build with Java 22 for me).

Build scan:
https://gradle-enterprise.elastic.co/s/nnfrfw6wwuiss/tests/:server:internalClusterTest/org.elasticsearch.search.basic.SearchWithRandomIOExceptionsIT/testRandomDirectoryIOExceptions

Reproduction line:

./gradlew ':server:internalClusterTest' --tests "org.elasticsearch.search.basic.SearchWithRandomIOExceptionsIT.testRandomDirectoryIOExceptions" -Dtests.seed=9B348C8D920C550D -Dtests.locale=ar-SA -Dtests.timezone=Pacific/Efate -Druntime.java=22

Applicable branches:
8.13

Reproduces locally?:
Didn't try

Failure history:
Failure dashboard for org.elasticsearch.search.basic.SearchWithRandomIOExceptionsIT#testRandomDirectoryIOExceptions

Failure excerpt:

java.lang.AssertionError: All incoming requests on node [node_s2] should have finished. Expected 0 bytes for requests in-flight but got 78 bytes; pending tasks [[]]

  at org.elasticsearch.test.InternalTestCluster.lambda$assertRequestsFinished$43(InternalTestCluster.java:2521)
  at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1255)
  at org.elasticsearch.test.InternalTestCluster.assertRequestsFinished(InternalTestCluster.java:2512)
  at org.elasticsearch.test.InternalTestCluster.assertAfterTest(InternalTestCluster.java:2486)
  at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:593)
  at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2316)
  at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
  at java.lang.reflect.Method.invoke(Method.java:580)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:1004)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
  at java.lang.Thread.run(Thread.java:1570)

@alex-spies alex-spies added :Search/Search Search-related issues that do not fall into other categories >test-failure Triaged test failures from CI labels Mar 26, 2024
@elasticsearchmachine elasticsearchmachine added blocker Team:Search Meta label for search team labels Mar 26, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@original-brownbear original-brownbear self-assigned this Apr 4, 2024
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this issue Apr 4, 2024
…HasValueListener

We're in some cases tripping an assertion (`assertSearcherIsWarmedUp`) when we run the logic and no refresh actually happened
because of induced exceptions.
This really should only run if the refresh actually went through in any case.

fixes elastic#106752
@original-brownbear
Member

This is coming from the new field caps functionality for filtering out empty fields:

WARNING: Uncaught exception in thread: Thread[#3248,elasticsearch[node_s1][refresh][T#1],5,TGRP-SearchWithRandomIOExceptionsIT]
java.lang.AssertionError: searcher was not warmed up yet for source[field_has_value]
	at __randomizedtesting.SeedInfo.seed([EAC4AFEF743AA0AD]:0)
	at org.elasticsearch.index.engine.InternalEngine.assertSearcherIsWarmedUp(InternalEngine.java:509)
	at org.elasticsearch.index.engine.Engine$1.acquireSearcherInternal(Engine.java:741)
	at org.elasticsearch.index.engine.Engine$SearcherSupplier.acquireSearcher(Engine.java:1351)
	at org.elasticsearch.index.engine.Engine.acquireSearcher(Engine.java:792)
	at org.elasticsearch.index.engine.Engine.acquireSearcher(Engine.java:785)
	at org.elasticsearch.index.engine.Engine.acquireSearcher(Engine.java:781)
	at org.elasticsearch.index.shard.IndexShard$RefreshFieldHasValueListener.afterRefresh(IndexShard.java:4007)
	at org.apache.lucene.search.ReferenceManager.notifyRefreshListenersRefreshed(ReferenceManager.java:275)
	at org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:182)
	at org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:240)
	at org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:2044)
	at org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:2015)
	at org.elasticsearch.index.engine.Engine.lambda$externalRefresh$9(Engine.java:1119)

fix incoming in #107128
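
To illustrate the idea behind the fix (the post-refresh logic "should only run if the refresh actually went through"), here is a minimal, self-contained sketch. It does not reproduce the actual change in #107128. `RefreshListener` below is a hypothetical stand-in shaped like Lucene's `ReferenceManager.RefreshListener`, and `GuardedListener` and `simulateRefresh` are illustrative names, not Elasticsearch code. The sketch assumes the listener is notified even when the refresh attempt fails, with `didRefresh` set to `false` in that case.

```java
import java.util.ArrayList;
import java.util.List;

class RefreshGuardSketch {

    /** Hypothetical stand-in shaped like Lucene's ReferenceManager.RefreshListener. */
    interface RefreshListener {
        void beforeRefresh();

        void afterRefresh(boolean didRefresh);
    }

    /** Illustrative listener: post-refresh work runs only if a refresh really happened. */
    static final class GuardedListener implements RefreshListener {
        final List<String> events = new ArrayList<>();

        @Override
        public void beforeRefresh() {
            // nothing to prepare in this sketch
        }

        @Override
        public void afterRefresh(boolean didRefresh) {
            if (didRefresh == false) {
                // No new searcher was published, so acquiring one here could
                // observe a searcher that was never warmed up. Skip the work.
                events.add("skipped");
                return;
            }
            // In the real listener this is where a searcher would be acquired.
            events.add("ran");
        }
    }

    /** Simulates one refresh attempt that may fail, e.g. due to an induced I/O error. */
    static void simulateRefresh(RefreshListener listener, boolean refreshSucceeds) {
        listener.beforeRefresh();
        boolean didRefresh = false;
        try {
            if (refreshSucceeds == false) {
                throw new RuntimeException("induced I/O failure");
            }
            didRefresh = true;
        } catch (RuntimeException e) {
            // swallowed for the purposes of the sketch
        } finally {
            // The listener is always notified, with the outcome of the attempt.
            listener.afterRefresh(didRefresh);
        }
    }

    public static void main(String[] args) {
        GuardedListener listener = new GuardedListener();
        simulateRefresh(listener, false); // failed refresh: work is skipped
        simulateRefresh(listener, true);  // successful refresh: work runs
        System.out.println(listener.events); // prints [skipped, ran]
    }
}
```

Without the `didRefresh` check, the first call would run the post-refresh work after a failed refresh, which is the situation the stack trace above points at.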

@elasticsearchmachine elasticsearchmachine added the needs:risk Requires assignment of a risk label (low, medium, blocker) label Apr 24, 2024
@cbuescher
Member

Hi, just checking in: is there something missing from this fix that prevented it from being merged? @original-brownbear I'm assigning low risk for now, assuming this is mostly a test setup issue.

@cbuescher cbuescher added low-risk An open issue or test failure that is a low risk to future releases and removed needs:risk Requires assignment of a risk label (low, medium, blocker) labels Apr 26, 2024
@kkrik-es
Contributor

kkrik-es commented May 1, 2024

Seems like there's almost one flaky run per day. Muting for now.

@javanna javanna removed the :Search/Search Search-related issues that do not fall into other categories label Jul 17, 2024
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label and removed Team:Search Meta label for search team labels Jul 17, 2024
@javanna javanna added :Search Foundations/Search Catch all for Search Foundations and removed needs:triage Requires assignment of a team area label labels Jul 17, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine elasticsearchmachine added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Jul 17, 2024
original-brownbear added a commit that referenced this issue Sep 30, 2024
…HasValueListener (#107128)

We're in some cases tripping an assertion (`assertSearcherIsWarmedUp`) when we run the logic and no refresh actually happened because of induced exceptions.
This really should only run if the refresh actually went through in any case.

fixes #106752
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this issue Sep 30, 2024
…HasValueListener (elastic#107128)
elasticsearchmachine pushed a commit that referenced this issue Sep 30, 2024
…HasValueListener (#107128) (#113826)
matthewabbott pushed a commit to matthewabbott/elasticsearch that referenced this issue Oct 4, 2024
…HasValueListener (elastic#107128)
9 participants