-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limit the analyzed text for highlighting #27934
Limit the analyzed text for highlighting #27934
Conversation
- Introduce index level settings to control the max number of character to be analyzed for highlighting - Throw an error if analysis is required on a larger text Closes elastic#27517
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good, I left some comment regarding naming.
* indexing with offsets or term vectors is recommended. | ||
*/ | ||
public static final Setting<Integer> MAX_ANALYZED_OFFSET_SETTING = | ||
Setting.intSetting("index.analyze.max_analyzed_offset", 10000, 1, Property.Dynamic, Property.IndexScope); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe index.highlight.max_analyzed_offset
to indicate that it applies to highlighting only ?
/** | ||
* Returns the maximum number of chars that will be analyzed in a highlight request | ||
*/ | ||
public int getMaxAnalyzedOffset() { return this.maxAnalyzedOffset; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, getHighlightMaxAnalyzedOffset ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@jimczi thanks for the review, Jim |
- Introduce index level settings to control the max number of character to be analyzed for highlighting - Create a deprecation warning if analysis is required on a larger text Closes #27517
Hi @mayya-sharipova and @jimczi. I was just testing this change and it looks like it's going to break Kibana for users with large fields. It's quite common for users to have fields with >10000 characters so with this change in 6.2 Kibana's Discover (which uses highlighting) will start throwing errors. Is it necessary to throw an error? Without the error, users might see a change in highlighting on large fields but at least Kibana wouldn't stop working. The downside is that users may think highlighting is broken if they see matching text that is not highlighted. That might be an acceptable tradeoff and breaking change in a major release but I think either change might be surprising in a minor. |
The setting should be unlimited by default in 6.2. @mayya-sharipova can you change the value for that branch and maybe add a deprecation warning for texts bigger than the default limit in 7 ? |
@Bargs thank you for reporting this. Have you tested this on 6.2 and experienced the error? Errors are thrown only on 7.x |
@jimczi So, for 6.2 do we want to do following: Sorry, slack has connection problems for me, can't open it for 30 mins |
Sorry, like a fool I only tested quickly in master. If 6.2 doesn't respond with an error, Kibana should still work fine. Though we might end up spamming their deprecation logs on every search request. |
For 6.2 I would set the limit to -1 and allow updating this setting only if the value is > 0. This way if the value is -1 you know that the setting is not explicitly set. Then if the value is -1 you can issue a deprecation warning when the text exceeds the default value set in 7 ( |
- Introduce index level setting "index.highlight.max_analyzed_offset" to control the max number of character to be analyzed for highlighting - Make this setting to be unset by default (equal to -1) - Issue a deprecation warning if setting is unset and analysis is required on a text larger than ES v.7.x max setting (10000) - Throw IllegalArgumentException is setting is set by a user, and analysis is required on a text larger than the user's set value. Closes elastic#27517 Adding validator for index.highlight.max_analyzed_offset setting
Increase the default limit of index.highlight.max_analyzed_offset to 1M instead of previous 10K. Enhance an error message when offset increased to include field name, index name and doc_id. Relates to elastic#27934, elastic/kibana#16764
Increase the default limit of index.highlight.max_analyzed_offset to 1M instead of previous 10K. Enhance an error message when offset increased to include field name, index name and doc_id. Relates to #27934, elastic/kibana#16764
Increase the default limit of index.highlight.max_analyzed_offset to 1M instead of previous 10K. Enhance an error message when offset increased to include field name, index name and doc_id. Relates to #27934, elastic/kibana#16764
👍 |
to be analyzed for highlighting
Closes #27517