Use ValueFetcher when loading text snippets to highlight #63572
Pinging @elastic/es-search (:Search/Highlighting)
return textsToHighlight;
MapperService mapperService,
FetchSubPhase.HitContext hitContext) throws IOException {
ValueFetcher fetcher = fieldType.valueFetcher(mapperService, null, null);
null looks like it is going to cause this to fail on runtime fields. Do we filter those out in other places?
I meant to highlight this with a comment of 'not sure about this' - I'm pretty sure that we filter out non-text fields further up the stack, but I need to double-check.
I think we do try to highlight runtime fields. I'm not sure how we manage not to fail here. Runtime fields really need the SearchLookup to do anything.
I vaguely remember looking at this too, and finding that for runtime fields we simply don't return anything highlighted rather than failing, as they are not in _source nor stored.
Given that runtime fields can't be highlighted anyway, I just added a null check for SearchLookup inside the value fetcher in AbstractScriptFieldType.
server/src/main/java/org/elasticsearch/search/fetch/subphase/highlight/HighlightUtils.java
It's great to see the value fetcher abstraction simplify some special cases! One general comment -- now that we use value fetchers, we will highlight copy_to content. (For multi-fields, I think they were already handled before this change.) Perhaps we could add tests for this enhancement and make it more visible in the version 'changes' docs?
@romseygeek just a heads up that I marked this PR as closing #59931.
We have an interesting test failure here:
Keyword value fetchers normalize their data, but this highlighter test is expecting normalization not to be applied to the highlighter output. I'm thinking we should change the test, as it makes more sense for the behaviour to be consistent across |
@@ -60,7 +60,7 @@ public SourceValueFetcher(String fieldName, QueryShardContext context, Object nu
             for (String path : sourcePaths) {
                 Object sourceValue = lookup.extractValue(path, nullValue);
                 if (sourceValue == null) {
-                    return List.of();
+                    continue;
This was a bug - if the first of multiple source paths didn't have any values, we returned early instead of continuing to check further paths.
Thank you for catching this! Maybe we could fix this in a separate PR (with a quick test) to keep this one scoped to highlighting?
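To illustrate the bug described above, here is a minimal, self-contained sketch. It does not use the real SourceValueFetcher or its source lookup: the class SourcePathsSketch, both method names, and the flat Map standing in for _source are hypothetical, chosen only to show how an early return hides values found under later source paths.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the loop in the diff; not the Elasticsearch class.
final class SourcePathsSketch {

    // Old behaviour: the first path without a value discards everything.
    static List<Object> fetchWithEarlyReturn(Map<String, Object> source, List<String> sourcePaths) {
        List<Object> values = new ArrayList<>();
        for (String path : sourcePaths) {
            Object sourceValue = source.get(path);
            if (sourceValue == null) {
                return List.of();
            }
            values.add(sourceValue);
        }
        return values;
    }

    // Fixed behaviour: a path without a value is skipped and the loop goes on.
    static List<Object> fetchSkippingEmptyPaths(Map<String, Object> source, List<String> sourcePaths) {
        List<Object> values = new ArrayList<>();
        for (String path : sourcePaths) {
            Object sourceValue = source.get(path);
            if (sourceValue == null) {
                continue;
            }
            values.add(sourceValue);
        }
        return values;
    }

    public static void main(String[] args) {
        // Only the second path ("body") has a value in this document.
        Map<String, Object> source = Map.of("body", "quick brown fox");
        List<String> paths = List.of("title", "body");
        System.out.println(fetchWithEarlyReturn(source, paths));    // prints []
        System.out.println(fetchSkippingEmptyPaths(source, paths)); // prints [quick brown fox]
    }
}
```

A test for the separate fix could follow the same shape: a document in which only a later source path is populated.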
@@ -198,6 +199,9 @@ protected final void checkAllowExpensiveQueries(QueryShardContext context) {

     @Override
     public ValueFetcher valueFetcher(QueryShardContext context, SearchLookup lookup, String format) {
+        if (lookup == null) {
+            return v -> Collections.emptyList();
+        }
Another option here would be to pass the Fetch-phase search lookup through all the highlighting code to HighlightUtils.loadValues(), and I think we probably will want to do that eventually, but it's not really ready yet given our uncertainty around how that lookup is built and passed around anyway.
I was going to ask why it is null: the supplier could throw an UnsupportedOperationException rather than be null, but there is no supplier in this case. I don't much like that it can be null, but I am not sure how we can avoid that in this scenario.
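For readers following the thread, the guard under discussion can be sketched in isolation. The Lookup and Fetcher interfaces below are hypothetical simplifications, not the real SearchLookup and ValueFetcher types, and the script evaluation is reduced to a plain callback. The only point is the shape of the null check: with no lookup, the fetcher returns nothing instead of failing.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical simplification of the null-lookup guard; not Elasticsearch code.
final class NullLookupSketch {

    // Stand-in for whatever a script-backed field needs to compute its value.
    interface Lookup {
        Object scriptValue(String field, Map<String, Object> doc);
    }

    // Stand-in for a value fetcher: document in, list of values out.
    interface Fetcher {
        List<Object> fetchValues(Map<String, Object> doc);
    }

    static Fetcher valueFetcher(String field, Lookup lookup) {
        if (lookup == null) {
            // No lookup available (the highlighting path here): yield no values.
            return doc -> Collections.emptyList();
        }
        return doc -> {
            Object value = lookup.scriptValue(field, doc);
            if (value == null) {
                return Collections.emptyList();
            }
            return List.of(value);
        };
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of("message", "hello");
        Lookup lookup = (field, d) -> d.get(field);
        System.out.println(valueFetcher("message", null).fetchValues(doc));   // prints []
        System.out.println(valueFetcher("message", lookup).fetchValues(doc)); // prints [hello]
    }
}
```

The trade-off raised in the comments is visible here: a null lookup silently produces an empty result, which avoids a failure but also hides the fact that the field could not be evaluated.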
//percolator needs to always load from source, thus it sets the global force source to true
List<Object> textsToHighlight;
if (forceSource == false && fieldType.isStored()) {
boolean storedFieldsAvailable) throws IOException {
To me it'd be best to keep the original forceSource name and value -- it matches the force_source REST option that this value comes from and avoids having mixed concepts in the code.
}
assert textsToHighlight != null;
return textsToHighlight;
ValueFetcher fetcher = fieldType.valueFetcher(qsc, null, null);
I just opened #65292 -- if we like the direction and it gets merged, then MappedFieldType#valueFetcher won't need a dedicated SearchLookup at all. I think highlighting on runtime fields would 'just work' here (although it'd be worth a test!)
I opened #65375 to fix the copy_to fields bug separately.
The change looks great to me. Some last high-level comments:
- We're now able to highlight runtime keyword fields. It could also be good to add a REST test for runtime keywords, as I don't think we have any tests for this yet. And I think we can remove the highlight test exclusion from xpack/plugin/runtime-fields/qa/build.gradle.
- This PR makes some changes to the default highlighting behavior: we now include copy_to fields, and keywords are normalized. We could make a small note in the migration docs about these changes?
Unfortunately this doesn't quite work yet, because the highlighters try to re-analyze the values from source, and trying to pull an index analyzer for runtime fields will fail - @jimczi found this elsewhere. I'll try and fix this in a follow up, I think it's just a case of amending
++, will push a change for this
Actually this will have to wait for the backport, as there's no 7.11 migration doc in master
HighlighterUtils.loadFieldValues() loads values directly from the source, and then callers have to deal with filtering out values that would have been removed by an ignore_above filter on keyword fields. Instead, we can use the ValueFetcher for the relevant field, which handles all this logic for us. Closes elastic#59931.
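As a rough illustration of what "handles all this logic for us" means for keyword fields, the sketch below applies an ignore_above length filter and a normalizer in one place, so a caller such as a highlighter no longer filters values itself. The class, the method, and the order of the two steps are assumptions made for this example; it is not the actual keyword field value fetcher.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

// Hypothetical sketch of fetcher-side filtering and normalization; not Elasticsearch code.
final class KeywordFetcherSketch {

    static List<String> fetchKeywordValues(List<String> sourceValues,
                                           int ignoreAbove,
                                           UnaryOperator<String> normalizer) {
        List<String> result = new ArrayList<>();
        for (String value : sourceValues) {
            // Values longer than ignore_above are treated as not indexed, so drop them.
            if (value.length() > ignoreAbove) {
                continue;
            }
            // Assumed here: normalization is applied to the values that remain.
            result.add(normalizer.apply(value));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sourceValues = List.of("Foo", "a-much-longer-value");
        UnaryOperator<String> lowercase = s -> s.toLowerCase(Locale.ROOT);
        System.out.println(fetchKeywordValues(sourceValues, 5, lowercase)); // prints [foo]
    }
}
```

This also shows why the highlighter test mentioned earlier started failing: once values come from the fetcher, the highlighted text is the normalized form rather than the raw source text.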
For 7.11 release notes: This change means that you can highlight fields populated via |
…5441) HighlighterUtils.loadFieldValues() loads values directly from the source, and then callers have to deal with filtering out values that would have been removed by an ignore_above filter on keyword fields. Instead, we can use the ValueFetcher for the relevant field, which handles all this logic for us. Closes #59931.