Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Concurrent Segment Search] Fix SignificantTerms related ITs #8509

Closed
jed326 opened this issue Jul 6, 2023 · 3 comments · Fixed by #8807
Closed

[Concurrent Segment Search] Fix SignificantTerms related ITs #8509

jed326 opened this issue Jul 6, 2023 · 3 comments · Fixed by #8807
Assignees
Labels
distributed framework :test Adding or fixing a test

Comments

@jed326
Copy link
Collaborator

jed326 commented Jul 6, 2023

Subtask for #7357 to focus on the test failures related to SigTerms:

org.opensearch.search.aggregations.bucket.TermsShardMinDocCountIT.testShardMinDocCountSignificantTermsTest

REPRODUCE WITH: ./gradlew 'null' --tests "org.opensearch.search.aggregations.bucket.TermsShardMinDocCountIT.testShardMinDocCountSignificantTermsTest" -Dtests.seed=5A2A89155E5AADBB -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=nb -Dtests.timezone=Africa/Addis_Ababa -Druntime.java=17

java.lang.AssertionError: 
Expected: <0>
     but: was <2>
Expected :<0>
Actual   :<2>
<Click to see difference>


	at __randomizedtesting.SeedInfo.seed([5A2A89155E5AADBB:2F32E39B05FAA329]:0)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
	at org.junit.Assert.assertThat(Assert.java:964)
	at org.junit.Assert.assertThat(Assert.java:930)
	at org.opensearch.search.aggregations.bucket.TermsShardMinDocCountIT.testShardMinDocCountSignificantTermsTest(TermsShardMinDocCountIT.java:103)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testBackgroundVsSeparateSet
REPRODUCE WITH: ./gradlew 'null' --tests "org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testBackgroundVsSeparateSet" -Dtests.seed=6D7B107FC7821E62 -Dtests.locale=mk -Dtests.timezone=America/Atka -Druntime.java=17
REPRODUCE WITH: ./gradlew 'null' --tests "org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testBackgroundVsSeparateSet" -Dtests.seed=6D7B107FC7821E62 -Dtests.locale=mk -Dtests.timezone=America/Atka -Druntime.java=17

java.lang.NullPointerException: Cannot invoke "org.opensearch.search.aggregations.bucket.terms.SignificantTerms$Bucket.getSignificanceScore()" because the return value of "org.opensearch.search.aggregations.bucket.terms.SignificantTerms.getBucketByKey(String)" is null

	at org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testBackgroundVsSeparateSet(SignificantTermsSignificanceScoreIT.java:358)
	at org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testBackgroundVsSeparateSet(SignificantTermsSignificanceScoreIT.java:261)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testScoresEqualForPositiveAndNegative
REPRODUCE WITH: ./gradlew 'null' --tests "org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testScoresEqualForPositiveAndNegative" -Dtests.seed=6D7B107FC7821E62 -Dtests.locale=mk -Dtests.timezone=America/Atka -Druntime.java=17
REPRODUCE WITH: ./gradlew 'null' --tests "org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testScoresEqualForPositiveAndNegative" -Dtests.seed=6D7B107FC7821E62 -Dtests.locale=mk -Dtests.timezone=America/Atka -Druntime.java=17

Failed to execute phase [query], all shards failed
; shardFailures {[ssHwNGysTLmOcFBG_NA27w][test][0]: RemoteTransportException[[node_s0][127.0.0.1:51969][indices:data/read/search[phase/query]]]; nested: QueryPhaseExecutionException[Query Failed [Failed to execute main query]]; nested: NotSerializableExceptionWrapper[runtime_exception: java.util.concurrent.ExecutionException: java.lang.AssertionError: StoredFieldsReader are only supposed to be consumed in the thread in which they have been acquired. But was acquired in Thread[opensearch[node_s0][index_searcher][T#3],5,TGRP-SignificantTermsSignificanceScoreIT] and consumed in Thread[opensearch[node_s0][index_searcher][T#2],5,TGRP-SignificantTermsSignificanceScoreIT].]; nested: NotSerializableExceptionWrapper[execution_exception: java.lang.AssertionError: StoredFieldsReader are only supposed to be consumed in the thread in which they have been acquired. But was acquired in Thread[opensearch[node_s0][index_searcher][T#3],5,TGRP-SignificantTermsSignificanceScoreIT] and consumed in Thread[opensearch[node_s0][index_searcher][T#2],5,TGRP-SignificantTermsSignificanceScoreIT].]; nested: NotSerializableExceptionWrapper[assertion_error: StoredFieldsReader are only supposed to be consumed in the thread in which they have been acquired. But was acquired in Thread[opensearch[node_s0][index_searcher][T#3],5,TGRP-SignificantTermsSignificanceScoreIT] and consumed in Thread[opensearch[node_s0][index_searcher][T#2],5,TGRP-SignificantTermsSignificanceScoreIT].]; }
	at org.opensearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:665)

org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testXContentResponse

REPRODUCE WITH: ./gradlew 'null' --tests "org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testXContentResponse" -Dtests.seed=6D7B107FC7821E62 -Dtests.locale=mk -Dtests.timezone=America/Atka -Druntime.java=17

java.lang.AssertionError: 
Expected: "{\"class\":{\"doc_count_error_upper_bound\":0,\"sum_other_doc_count\":0,\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"sig_terms\":{\"doc_count\":4,\"bg_count\":7,\"buckets\":[{\"key\":0,\"doc_count\":4,\"score\":0.39999999999999997,\"bg_count\":5}]}},{\"key\":\"1\",\"doc_count\":3,\"sig_terms\":{\"doc_count\":3,\"bg_count\":7,\"buckets\":[{\"key\":1,\"doc_count\":3,\"score\":0.75,\"bg_count\":4}]}}]}}"
     but: was "{\"class\":{\"doc_count_error_upper_bound\":0,\"sum_other_doc_count\":0,\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"sig_terms\":{\"doc_count\":4,\"bg_count\":14,\"buckets\":[{\"key\":0,\"doc_count\":4,\"score\":0.39999999999999997,\"bg_count\":10}]}},{\"key\":\"1\",\"doc_count\":3,\"sig_terms\":{\"doc_count\":3,\"bg_count\":14,\"buckets\":[{\"key\":1,\"doc_count\":3,\"score\":0.75,\"bg_count\":8}]}}]}}"
Expected :{\"class\":{\"doc_count_error_upper_bound\":0,\"sum_other_doc_count\":0,\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"sig_terms\":{\"doc_count\":4,\"bg_count\":7,\"buckets\":[{\"key\":0,\"doc_count\":4,\"score\":0.39999999999999997,\"bg_count\":5}]}} ...

Actual   :{\"class\":{\"doc_count_error_upper_bound\":0,\"sum_other_doc_count\":0,\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"sig_terms\":{\"doc_count\":4,\"bg_count\":14,\"buckets\":[{\"key\":0,\"doc_count\":4,\"score\":0.39999999999999997,\"bg_count\":10}] ...

org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testReduceFromSeveralShards
REPRODUCE WITH: ./gradlew 'null' --tests "org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testReduceFromSeveralShards" -Dtests.seed=6D7B107FC7821E62 -Dtests.locale=mk -Dtests.timezone=America/Atka -Druntime.java=17

java.lang.AssertionError: Request breaker not reset to 0 on node: node_s0
Expected: <0L>
     but: was <240L>
Expected :<0L>
Actual   :<240L>
<Click to see difference>


	at __randomizedtesting.SeedInfo.seed([6D7B107FC7821E62:13A8213967CADF8]:0)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
	at org.junit.Assert.assertThat(Assert.java:964)
	at org.opensearch.test.InternalTestCluster.lambda$ensureEstimatedStats$41(InternalTestCluster.java:2659)
	at org.opensearch.test.OpenSearchTestCase.assertBusy(OpenSearchTestCase.java:1084)
	at org.opensearch.test.OpenSearchTestCase.assertBusy(OpenSearchTestCase.java:1057)

org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testPopularTermManyDeletedDocs
REPRODUCE WITH: ./gradlew 'null' --tests "org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testPopularTermManyDeletedDocs" -Dtests.seed=6D7B107FC7821E62 -Dtests.locale=mk -Dtests.timezone=America/Atka -Druntime.java=17

java.lang.AssertionError: Request breaker not reset to 0 on node: node_s0
Expected: <0L>
     but: was <240L>
Expected :<0L>
Actual   :<240L>
<Click to see difference>


	at __randomizedtesting.SeedInfo.seed([6D7B107FC7821E62:C84246803B2AAFB0]:0)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
	at org.junit.Assert.assertThat(Assert.java:964)
	at org.opensearch.test.InternalTestCluster.lambda$ensureEstimatedStats$41(InternalTestCluster.java:2659)
	at org.opensearch.test.OpenSearchTestCase.assertBusy(OpenSearchTestCase.java:1084)
	at org.opensearch.test.OpenSearchTestCase.assertBusy(OpenSearchTestCase.java:1057)
	at org.opensearch.test.InternalTestCluster.ensureEstimatedStats(InternalTestCluster.java:2657)
	at org.opensearch.test.TestCluster.assertAfterTest(TestCluster.java:104)
@jed326 jed326 added enhancement Enhancement or improvement to existing feature or request untriaged labels Jul 6, 2023
@jed326 jed326 added :test Adding or fixing a test and removed enhancement Enhancement or improvement to existing feature or request labels Jul 6, 2023
@jed326 jed326 self-assigned this Jul 10, 2023
@jed326 jed326 moved this from Todo to In Progress in Concurrent Search Jul 10, 2023
@jed326
Copy link
Collaborator Author

jed326 commented Jul 11, 2023

Breaking down the different failures here:

Test Failure Reason Solution
SignificantTermsSignificanceScoreIT.testBackgroundVsSeparateSet The test is not finding the correct buckets in the aggs response
SignificantTermsSignificanceScoreIT.testScoresEqualForPositiveAndNegative Shard queries are failing. Seems to be the similar issue as #8095, however did not get resolved after #8303
SignificantTermsSignificanceScoreIT.testXContentResponse bg_count is getting summed across slices so the output is NUM_SLICES times the expected output
SignificantTermsSignificanceScoreIT.testReduceFromSeveralShards After test execution failures, likely due to other tests failing. Does not fail when run on its own. Does not fail running by itself
SignificantTermsSignificanceScoreIT.testPopularTermManyDeletedDocs After test execution failures, likely due to other tests failing. Does not fail when run on its own. Does not fail running by itself
TermsShardMinDocCountIT.testShardMinDocCountSignificantTermsTest Ran this several times with the reproduction seed and not seeing failures anymore

@jed326
Copy link
Collaborator Author

jed326 commented Jul 14, 2023

Took a deeper look at this and there are 3 main issues here, which I'm going to further break into other tasks to have a clear conversation thread.

  1. bg_count in SignificantTerms performs a shard level query, so when we reduce the slice level aggs these counts get incorrectly summed together -- Tracking in [Concurrent Segment Search] SignificantTerms agg should not gather bg_count for each slice #8703
  2. significantText queries are encountering assert codec failures even after CardinalityIT/NestedIT test failures with concurrent search enabled and AssertingCodec #8303 is merged -- Tracking in [Concurrent Segment Search] SourceLookup is not thread safe across segment slices #8790
  3. SignificantTermsSignificanceScoreIT.testBackgroundVsSeparateSet is not finding the expected number of buckets. Currently I suspect this is due to the bg_count issue messing with scoring so planning on revisiting this later.

@jed326
Copy link
Collaborator Author

jed326 commented Jul 25, 2023

I ran some more tests and confirmed that #8790 should resolve the rest of the test failures under SignificantTermsSignificanceScoreIT.
I am still seeing test failures for TermsShardMinDocCountIT.testShardMinDocCountSignificantTermsTest, however these seem related to concurrent aggs in general rather than specifically significant terms aggs, so I will track that in a different issue #8860

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
distributed framework :test Adding or fixing a test
Projects
Status: Done
1 participant