-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Profiler: Don’t profile NEXTDOC for ConstantScoreQuery. #33196
Conversation
A ConstantScore query will return the iterator of its inner query. However, when profiling, the constant score query is wrapped separately from its inner query, which distorts the times emitted by the profiler. Return the iterator directly in such a case. Closes elastic#23430
Pinging @elastic/es-search-aggs |
|
||
assertThat(breakdownConstantScoreQuery.get(QueryTimingType.CREATE_WEIGHT.toString() + "_count").longValue(), greaterThan(0L)); | ||
assertThat(breakdownConstantScoreQuery.get(QueryTimingType.BUILD_SCORER.toString() + "_count").longValue(), greaterThan(0L)); | ||
assertThat(breakdownConstantScoreQuery.get(QueryTimingType.NEXT_DOC.toString() + "_count").longValue(), equalTo(0L)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually it shouldn't be 0 but the same value as the wrapped query since timings are supposed to include timings of children
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Adrien! I see what you mean, somewhat the constant score query should inherit the timers from their children.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly, or in that particular case, from their child since constant-score queries can only have one child.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems there are at least three cases here when building a profile scorer of a constant score query:
- Normal case: The provided scorer is a normal scorer with one child that is an instance of ProfileScorer.
- Total hits query: The provided scorer is already a profile scorer with no children.
- Cached total hits query: The provided scorer is a normal scorer with no children.
I think the difference in total hits queries come from this method inside Lucene:
public Weight createWeight(Query query, boolean needsScores, float boost) throws IOException {
final QueryCache queryCache = this.queryCache;
Weight weight = query.createWeight(this, needsScores, boost);
if (needsScores == false && queryCache != null) {
weight = queryCache.doCache(weight, queryCachingPolicy);
}
return weight;
}
Where the weight created is a profiled weight but get transformed into a CachingWrapperWeight. Then depending if the query is cached or not the scorer gets wrapped into a profile scorer or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't want to do anything if the underlying scorer is cached since it will be consumed differently from this scorer. We should only try to reuse timing information if the underlying scorer is a ProfileScorer too?
…ying iterator when it is not cached
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a question.
profileScorer = (ProfileScorer) scorer; | ||
} else if (w.getQuery() instanceof ConstantScoreQuery && scorer.getChildren().size() == 1) { | ||
profileScorer = (ProfileScorer) scorer.getChildren().iterator().next().child; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand when the if
case may happen but I'm less sure about the second else if
, when does it happen? Can you leave comments explaining?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, here is where the complexity is:
-
The first
if
is hit when you have a totalHits query that is not cached. (Test testConstantScoreTotalHitsQuery and cache ==false). -
The
else if
is hit when you have a top N query. (Test testConstantScoreQuery) -
There is a third case where we have a totalHits query and it is cached. In this case the provided scorer is not an instance of ProfileScorer and has no children. In that case we do not do any special treatment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Can you add a comment to explain this? Also should we do an instanceof check on the scorer in the second case to be safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add a comment. Random test shows that the second check is always true but I will add a instanceof check to be on the safe side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @iverase !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
* Profiler: Don’t profile NEXTDOC for ConstantScoreQuery. A ConstantScore query will return the iterator of its inner query. However, when profiling, the constant score query is wrapped separately from its inner query, which distorts the times emitted by the profiler. Return the iterator directly in such a case. Closes #23430
* master: (46 commits) Fixing assertions in integration test (elastic#33833) [CCR] Rename idle_shard_retry_delay to poll_timout in auto follow patterns (elastic#33821) HLRC: Delete ML calendar (elastic#33775) Move DocsStats into Engine (elastic#33835) [Docs] Clarify accessing Date methods in painless (elastic#33560) add elasticsearch-shard tool (elastic#32281) Cut over to unwrap segment reader (elastic#33843) SQL: Fix issue with options for QUERY() and MATCH(). (elastic#33828) Emphasize that filesystem-level backups don't work (elastic#33102) Use the global doc id to generate a random score (elastic#33599) Add minimal sanity checks to custom/scripted similarities. (elastic#33564) Profiler: Don’t profile NEXTDOC for ConstantScoreQuery. (elastic#33196) [CCR] Change FollowIndexAction.Request class to be more user friendly (elastic#33810) SQL: day and month name functions tests locale providers enforcement (elastic#33653) TESTS: Set SO_LINGER = 0 for MockNioTransport (elastic#32560) Test: Relax jarhell gradle test (elastic#33787) [CCR] Fail with a descriptive error if leader index does not exist (elastic#33797) Add ES version 6.4.2 (elastic#33831) MINOR: Remove Some Dead Code in Scripting (elastic#33800) Ensure realtime `_get` and `_termvectors` don't run on the network thread (elastic#33814) ...
…ner scorer anymore
A ConstantScore query will return the iterator of its inner query.
However, when profiling, the constant score query is wrapped separately
from its inner query, which distorts the times emitted by the profiler.
Return the iterator directly in such a case.
Closes #23430