Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Concurrent Segment Search] SourceLookup is not thread safe across segment slices #8790

Closed
jed326 opened this issue Jul 20, 2023 · 1 comment · Fixed by #8807
Closed

[Concurrent Segment Search] SourceLookup is not thread safe across segment slices #8790

jed326 opened this issue Jul 20, 2023 · 1 comment · Fixed by #8807
Assignees
Labels
bug Something isn't working distributed framework Search:Aggregations :test Adding or fixing a test

Comments

@jed326
Copy link
Collaborator

jed326 commented Jul 20, 2023

This is a subtask for #8509 (comment)

Reproduction:

./gradlew ':server:internalClusterTest' --tests "org.opensearch.search.aggregations.bucket.SignificantTermsSignificanceScoreIT.testScoresEqualForPositiveAndNegative" -Dtests.seed=6D7B107FC7821E62 -Dtests.locale=mk -Dtests.timezone=America/Atka

This is a race condition though so the same seed does not necessarily guarantee replication

The above IT sometimes fails with an assert codec error:

cause = {AssertionError@15455} "java.lang.AssertionError: StoredFieldsReader are only supposed to be consumed in the thread in which they have been acquired. But was acquired in Thread[opensearch[node_s0][index_searcher][T#2],5,TGRP-SignificantTermsSignificanceScoreIT] and consumed in Thread[opensearch[node_s0][index_searcher][T#1],5,TGRP-SignificantTermsSignificanceScoreIT]."
stackTrace = {StackTraceElement[29]@15492} 
 0 = {StackTraceElement@15494} "org.apache.lucene.tests.codecs.asserting.AssertingCodec.assertThread(AssertingCodec.java:44)"
 1 = {StackTraceElement@15495} "org.apache.lucene.tests.codecs.asserting.AssertingStoredFieldsFormat$AssertingStoredFieldsReader.document(AssertingStoredFieldsFormat.java:74)"
 2 = {StackTraceElement@15496} "org.opensearch.search.lookup.SourceLookup.loadSourceIfNeeded(SourceLookup.java:102)"
 3 = {StackTraceElement@15497} "org.opensearch.search.lookup.SourceLookup.extractRawValues(SourceLookup.java:178)"
 4 = {StackTraceElement@15498} "org.opensearch.search.aggregations.bucket.terms.SignificantTextAggregatorFactory$SignificantTextCollectorSource$1.collectFromSource(SignificantTextAggregatorFactory.java:240)"
 5 = {StackTraceElement@15499} "org.opensearch.search.aggregations.bucket.terms.SignificantTextAggregatorFactory$SignificantTextCollectorSource$1.collect(SignificantTextAggregatorFactory.java:221)"
 6 = {StackTraceElement@15500} "org.opensearch.search.aggregations.LeafBucketCollectorBase.collect(LeafBucketCollectorBase.java:75)"
 7 = {StackTraceElement@15501} "org.opensearch.search.aggregations.bucket.terms.MapStringTermsAggregator$SignificantTermsResults$1.collect(MapStringTermsAggregator.java:508)"

This is because SourceLookup is not thread safe and it's fields are being updated by multiple threads at the same time. This is happening here:

private void collectFromSource(int doc, long owningBucketOrd, DuplicateByteSequenceSpotter spotter) throws IOException {
sourceLookup.setSegmentAndDocument(ctx, doc);
BytesRefHash inDocTerms = new BytesRefHash(256, bigArrays);
try {
for (String sourceField : sourceFieldNames) {
Iterator<String> itr = sourceLookup.extractRawValues(sourceField).stream().map(obj -> {
if (obj == null) {
return null;
}
if (obj instanceof BytesRef) {
return fieldType.valueForDisplay(obj).toString();
}
return obj.toString();
}).iterator();

There are a few possible ways to make this thread safe, currently exploring the following:

  1. Create a new SourceLookup object per slice
  2. Make the fields of SourceLookup ThreadLocal
@jed326
Copy link
Collaborator Author

jed326 commented Jul 20, 2023

sourceLookup is final in SearchLookup. On the Aggregator side SignificantTextAggregatorFactory is the only place where SearchLookup::source is used, so in the aggregator side it doesn’t look like we need to access SourceLookup through SearchLookup again.

PR #8807 contains an easy fix for this wrt SignificantText, however it looks like it's possible there is a similar problem for SearchLookup in the other scripting use cases. Since this focus of this issue and parent issue is to resolve failing SignificantTerms tests, I will open a separate issue to track investigation into LeafSearchLookup use cases.

@github-project-automation github-project-automation bot moved this from In Progress to Done in Concurrent Search Jul 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working distributed framework Search:Aggregations :test Adding or fixing a test
Projects
Status: Done
2 participants