Kibana use highlighting only on small fields or fields indexed with offsets #16764
Comments
This definitely needs to be a coordinated effort to ensure that Kibana does not simply break because of the change in ES. At the moment it breaks only in proxy scenarios (when there is a proxy between ES and Kibana or Kibana and the user) like https://github.com/elastic/cloud/issues/10532 (still very painful when you hit it), but going forward it will break every deployment with large fields indexed by Logstash or Filebeat.
Would it make sense to somehow equip users on a deprecation path with an enumeration (heck, perhaps a visualization) of all the searches that need some highlight text truncation? Would it make sense to offer a link to such a report within deprecation warnings? (And if we're already doing it, apologies in advance.)
Thanks for the heads up! I'll admit my highlighting knowledge is pretty rudimentary, so excuse me if these questions are pretty basic:
Is there any API we can use to determine the average or median size of a given field?
What work does this require from the user ahead of time? Most Kibana users are used to being able to point Kibana at any index and get highlighting for their searches. Making any sort of assumption about the state of their index settings and mappings is difficult unless it's an extremely common default, like text/keyword multi-fields for example. It's also going to be tough to tell users that highlighting is going to suddenly stop working on their old indices unless they reindex.
I think we need an explicit parameter to truncate highlighting. If I understand correctly, Kibana sends a query to highlight all fields ( @Bargs what is the current behavior for large texts? Are they truncated in the visualization? What is the limit (if there is one)?
@jimczi I think this is a very good suggestion. Do you mean we should add another parameter to the highlight request, something like
We will actually show you the entire field in the Discover doc table. 10,000 characters may seem like a lot, but I could see this being useful if you're searching for a needle in a haystack. If highlighting suddenly stops working after n number of characters (a limit the user doing the querying may not have set up and may be unaware of), I expect they'll perceive highlighting as being broken or unreliable. Should we be using the unified highlighter anyway? Does the unified highlighter require the user to set things up at index creation time or can we use it dynamically?
@Bargs thanks for the update. Looks like the limit would not help in your current Kibana workflow. About
I marked this as a blocker for 7.0 since we'll need to figure out what to do before then.
@jimczi I wonder if we can change the behaviour of unified/plain highlighter NOT to produce any errors/warnings, and instead just analyze and highlight ONLY the number of chars set in
I think throwing an error is fine, but we should maybe revise the default limit. 10K is maybe too restrictive, especially for Kibana. Would 1MB be enough to consider that it's too big to present in a viz?
@jimczi thanks Jim, sounds reasonable
1MB sounds reasonable; users who really need highlighting on larger fields can handle the additional setup. In Kibana it would be nice if we could catch the error and provide a user-friendly message about the failure and how it can be fixed. So my last question is, if someone runs into this limit, what do we need to ask them to change in their mappings and is there anything we need to update in Kibana's search request body?
@Bargs Answering your questions:
We would recommend to always use
The thrown error will say it all. The current error message is something like this: So, basically, they would either need to increase
What happens if you have a field over 1MB but you don't want to increase the max_analyzed_offset or reindex the field, and you are OK with not having the search on it highlighted? Is there any way to have Kibana ignore that field during searching so that an exception isn't thrown?
@trevan It is a good question. In the ES search/highlight request there is no way to exclude certain fields, only to include specific fields:
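To illustrate the include-only shape described above, here is a minimal sketch of a search body whose `highlight.fields` object names each field to highlight. The helper name `highlight_request` and the field names `message` and `host.name` are hypothetical, and the `query_string` query is only one possible choice of query; none of this is taken from Kibana's actual request.

```python
import json


def highlight_request(query_text: str, include_fields: list[str]) -> dict:
    """Build a search body that highlights only the listed fields."""
    return {
        "query": {"query_string": {"query": query_text}},
        "highlight": {
            # Each key is a field to highlight; an empty object means
            # "use the default highlighting options for this field".
            "fields": {name: {} for name in include_fields},
        },
    }


if __name__ == "__main__":
    # Hypothetical field names, for illustration only.
    body = highlight_request("error", ["message", "host.name"])
    print(json.dumps(body, indent=2))
```

Any field not named in `highlight.fields` is simply not highlighted, which is why leaving a field out means listing all the others.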
@mayya-sharipova, using an include list might be a problem for those of us with 1000s of fields.
Yeah it would be nice to have the option to exclude fields in ES's highlight API. That said, we might be able to fake it in Kibana by including all the fields stored in the index pattern except for the fields the user wants to exclude. @trevan would you only want to exclude those large fields from highlighting, or would you want to exclude them from Discover completely? If the latter, we could tie this into the existing "Source filters". But perhaps it's a bad idea to couple those two concepts. It might also make more sense to persist this setting at the search level, it's hard to say. In any event, I'm thinking highlight exclusions is more of a separate enhancement request. If a user has to increase the limit and they don't want to reindex, it doesn't put them in a different spot than where they are today where there's no limit at all. At least by having a limit, users will be more aware of the impact of highlighting and can choose to turn it off completely for the time being if they need to. @mayya-sharipova we're not specifying a highlighter in our request, so it sounds like it should use the unified highlighter by default? If that's the case, I don't think there's anything we need to update on the Kibana side of things as long as the limit is a reasonable default and the error message returned from ES gives the user actionable advice.
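The "fake it" idea above could look roughly like the following sketch: take every field known to the index pattern, drop the ones the user wants excluded, and send the rest as the include list. The function name `highlight_fields_excluding` and the sample field names are hypothetical; this is not how Kibana implements anything, only an illustration of the set difference.

```python
def highlight_fields_excluding(all_fields: list[str], excluded: list[str]) -> dict:
    """Return a highlight `fields` object covering every field except the excluded ones."""
    excluded_set = set(excluded)  # set lookup keeps this cheap for 1000s of fields
    return {name: {} for name in all_fields if name not in excluded_set}


if __name__ == "__main__":
    # Hypothetical index-pattern fields; `payload` stands in for an occasionally huge field.
    index_pattern_fields = ["message", "host.name", "payload"]
    fields = highlight_fields_excluding(index_pattern_fields, ["payload"])
    print({"highlight": {"fields": fields}})
```

With thousands of fields the resulting request body grows accordingly, which is the concern raised earlier in the thread about include lists.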
@Bargs, no I don't want to exclude them from Discover. My specific situation is that we have a few fields that occasionally get a value that size. The field value is normally less than 1k. It will be really frustrating to have Discover sometimes work and then sometimes not work just because the really large field is sometimes present.
@trevan gotcha. Do you agree that, once there's a limit, manually increasing the limit should effectively create the same situation we have today, where there is no limit? If so, could you create a separate ticket to track the enhancement request for excluding individual fields from highlighting? I think it's a great idea and I want to make sure we don't lose it, but I think it'll require some more discussion than we should get into in this comment thread.
Increase the default limit of `index.highlight.max_analyzed_offset` to 1M instead of previous 10K. Enhance an error message when offset increased to include field name, index name and doc_id. Relates to elastic/kibana#16764
Increase the default limit of index.highlight.max_analyzed_offset to 1M instead of previous 10K. Enhance an error message when offset increased to include field name, index name and doc_id. Relates to elastic#27934, elastic/kibana#16764
Increase the default limit of index.highlight.max_analyzed_offset to 1M instead of previous 10K. Enhance an error message when offset increased to include field name, index name and doc_id. Relates to #27934, elastic/kibana#16764
@mayya-sharipova @jimczi I was just re-reading this thread, trying to remember if there's anything we still need to update in Kibana, and it seems like there isn't. We're already using the unified highlighter since we just use the default. As I understand it, everything else is up to the user. Am I missing anything or can we close this ticket out?
@Bargs I think we can close this ticket.
With the upgrade to ES 7 there are changes to avoid default limits on highlight with ES structural apps matching strings longer than 10k characters. More here: elastic/kibana#16764 (comment) Signed-off-by: Luis Guzman <[email protected]>
Kibana version: 6.2
Elasticsearch version: 6.2
Describe the feature:
Since 6.2, we have modified highlighting to limit the analyzed text for highlighting to 10000 chars. See elastic/elasticsearch#27934.
Unless a field is indexed with offsets (`"index_options": "offsets"` or `"term_vector": "with_positions_offsets"`), or unless the index setting `index.highlight.max_analyzed_offset` is set higher than 10K chars, an attempt to highlight a field with more than 10K chars will produce a deprecation warning.
We have seen people getting a lot of deprecation warnings from highlighting when trying to hit a default Kibana page in 6.2.
To avoid this and prevent future errors in 7.0, we propose that Kibana use highlighting only on small fields or fields indexed with offsets, and also use the `unified` highlighter (unless Kibana already uses it).
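As a rough sketch of the two escape hatches mentioned in this description, the snippet below builds a text field mapping fragment that stores offsets (either via `index_options` or via `term_vector`) and an index settings body that raises `index.highlight.max_analyzed_offset`. The helper names and the example limit of 2,000,000 are assumptions for illustration; how the fragments are nested inside a full mapping depends on the Elasticsearch version, so only the per-field fragment is shown.

```python
import json


def text_field_with_postings_offsets() -> dict:
    """Field mapping fragment that stores offsets in the postings list."""
    return {"type": "text", "index_options": "offsets"}


def text_field_with_term_vector_offsets() -> dict:
    """Field mapping fragment that stores term vectors with positions and offsets."""
    return {"type": "text", "term_vector": "with_positions_offsets"}


def raise_highlight_limit(max_chars: int) -> dict:
    """Index settings body that raises the analyzed-offset limit for highlighting."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return {"index.highlight.max_analyzed_offset": max_chars}


if __name__ == "__main__":
    print(json.dumps(text_field_with_postings_offsets()))
    print(json.dumps(text_field_with_term_vector_offsets()))
    # 2_000_000 is an arbitrary example value, not a recommendation.
    print(json.dumps(raise_highlight_limit(2_000_000)))
```

Changing a field's mapping this way only affects newly indexed data, so existing indices would need a reindex, whereas the settings body is the option that avoids reindexing.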