Unified highlighter: include additional context outside of highlighted sentence to reach target fragment_size #28089
Labels
>feature
:Search Relevance/Highlighting
How a query matched a document
Team:Search Relevance
Meta label for the Search Relevance team in Elasticsearch
Comments
marshalium added the :Search Relevance/Highlighting and >feature labels on Jan 5, 2018
jimczi added a commit to jimczi/elasticsearch that referenced this issue on Jan 8, 2018
…ighter The unified highlighter selects a single sentence per fragment from the offset of the first highlighted term. This change modifies this selection and allows more than one sentence in a single fragment. The expansion is done forward (on the right of the matching offset), sentences are added to the current fragment iff the overall size of the fragment is smaller than the maximum length (fragment_size). We should also add a way to expand the left context with the surrounding sentences but this is currently avoided because the unified highlighter in Lucene uses only the first offset that matches the query to derive the start and end offset of the next fragment. If we expand on the left we could split multiple terms that would be grouped otherwise. Breaking this limitation implies some changes in the core of the unified highlighter. Closes elastic#28089
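The forward-only expansion described in this commit message can be illustrated with a small sketch. This is not the Lucene or Elasticsearch implementation: the regex-based sentence splitter and the names `sentence_spans` and `expand_fragment` are assumptions made for illustration, and the sketch reads the rule as "append the next sentence only while the fragment stays within `fragment_size`".

```python
import re


def sentence_spans(text):
    """Naive sentence splitter (an assumption of this sketch).

    Returns (start, end) character spans, splitting after '.', '!' or '?'
    followed by whitespace.
    """
    spans = []
    start = 0
    for m in re.finditer(r"[.!?]\s+", text):
        spans.append((start, m.end()))
        start = m.end()
    if start < len(text):
        spans.append((start, len(text)))
    return spans


def expand_fragment(text, match_offset, fragment_size):
    """Build a fragment starting at the sentence that contains match_offset.

    Following sentences are appended while the fragment stays within
    fragment_size. Sentences to the left of the match are never added.
    """
    spans = sentence_spans(text)
    # Index of the sentence that contains the first matching offset.
    idx = next(i for i, (s, e) in enumerate(spans) if s <= match_offset < e)
    start, end = spans[idx]
    for _, next_end in spans[idx + 1:]:
        if next_end - start > fragment_size:
            break  # adding this sentence would exceed the target size
        end = next_end
    return text[start:end].strip()
```

Note that the sentence containing the match is always kept whole in this sketch, even when it is longer than `fragment_size`, and that a leading sentence before the match is never pulled in, which mirrors the left-context limitation the commit message describes.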
jimczi added a commit that referenced this issue on Jan 11, 2018
…ighter (#28132) The unified highlighter selects a single sentence per fragment from the offset of the first highlighted term. This change modifies this selection and allows more than one sentence in a single fragment. The expansion is done forward (on the right of the matching offset), sentences are added to the current fragment iff the overall size of the fragment is smaller than the maximum length (fragment_size). We should also add a way to expand the left context with the surrounding sentences but this is currently avoided because the unified highlighter in Lucene uses only the first offset that matches the query to derive the start and end offset of the next fragment. If we expand on the left we could split multiple terms that would be grouped otherwise. Breaking this limitation implies some changes in the core of the unified highlighter. Closes #28089
Thank you @jimczi!
Sorry to revive an old thread. Which version first includes this fix?
@lfplazas10 you can see the version on the linked PR.
Note that the PR only expands context to the right of the match. Sentences to the left (i.e., leading context) are not included at the moment.
javanna added the Team:Search Relevance label on Jul 12, 2024
Describe the feature
Currently, the unified highlighter can only provide context by including the sentence the highlighted word is in. This is sometimes a very short highlight. For example, given text in a field like this:
Running a query for the term `sentence` using the unified highlighter with `fragment_size` set to `300` results in a highlight that, while it includes the word we're looking for, does not provide much context and is nowhere close to the requested target size:
In contrast, running the same query with the plain highlighter results in a highlight with much more useful context (and in this case another highlighted word!):
The unified highlighter should include as much context as possible without going over the target fragment size. This will result in more consistently sized highlights (which is nice for visual consistency) and will provide more useful context in cases where the highlight occurs in a short sentence.
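For readers who want to reproduce the comparison, the two requests might look roughly like the sketch below. The original example document and responses are not shown in this thread, so the field name `body` and the helper `build_highlight_request` are hypothetical; only the `type` and `fragment_size` highlight options come from the description above.

```python
import json


def build_highlight_request(field, term, highlighter_type="unified", fragment_size=300):
    """Build a search request body that highlights `field` for `term`.

    `field` is whatever text field holds the document content; the value
    used below ("body") is a placeholder, not taken from the issue.
    """
    return {
        "query": {"match": {field: term}},
        "highlight": {
            "fields": {
                field: {
                    "type": highlighter_type,
                    "fragment_size": fragment_size,
                }
            }
        },
    }


# Same query and target size, differing only in the highlighter type.
unified_request = build_highlight_request("body", "sentence")
plain_request = build_highlight_request("body", "sentence", highlighter_type="plain")

print(json.dumps(unified_request, indent=2))
```

Sending both bodies to the same index and comparing the returned fragments should show the size difference described above.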
cc: @colings86