Kibana 5.5.0 breaks visualizations where field type conflict exists. #12728
Absolutely! This makes Kibana almost completely useless for us. We have about 1TB of log data (1.2 billion entries) in ES, and Kibana is now preventing us from using it in a meaningful way, for no apparent reason. For example, it's also not possible to click "Filter for value" (the magnifying glass with the plus sign) for these fields, but of course when I manually copy the field name and value into the search field, it works as expected.
This is the unintended result of our move to field_caps to determine whether a field is searchable or aggregatable: field_caps takes the mappings as a whole into account, rather than Kibana determining this for the given time range. Moving away from field_caps isn't really an option, but we really should continue to support searching/aggregating against fields even in circumstances where field_caps is conservative in its results. Would it be helpful if you could explicitly override searchable/aggregatable on those fields for your index pattern?
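For illustration, this is roughly what a field_caps response looks like for a conflicted field; the index pattern and field name below are hypothetical, not taken from this thread:

```
GET /logstash-*/_field_caps?fields=status

{
  "fields": {
    "status": {
      "text":    { "type": "text",    "searchable": true, "aggregatable": false },
      "keyword": { "type": "keyword", "searchable": true, "aggregatable": true }
    }
  }
}
```

Because the reported capabilities differ per mapped type, Kibana treats the field as conflicted for the entire pattern, regardless of which indices the currently selected time range actually touches.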
That sounds like a great alternative. We have to customize index mappings anyway for formatting of numeric data, etc.
Yes, that would be good. A bulk option would then also be good. We have about 700 fields (I know, too many), but even marking the most important 20 or so would be quite an effort. Ideally there should be an option to only scan the X most recent indices for the field type and use that. I would guess that most of our inconsistent field mappings come from changes we have made in the past. But then again, it could happen any time again if, for example, a newer version of ES or LS mapped things differently than before.
One benefit of the configurable searchable/aggregatable approach is that it remains useful even if we also add some sort of fields-only-from-recent-indices feature. I doubt we can implement a performant, scalable feature that definitively answers what you can do with a given field in all circumstances, so being able to override our default behavior for a given field, based on your internal knowledge of your data, remains useful.
Hey folks, thanks for your patience. I'm going to investigate this more and I'll keep this thread updated.
Update on this. In 5.4 and below, for time-based data, Kibana used the field_stats API to expand index patterns into time-constrained ES indices (assuming this option was selected during index pattern creation). For setups with conflicting field types, this meant that if the time window didn't include an index with a conflicting mapping, everything worked fine. In 5.5, we deprecated that field_stats-based expansion in favor of field_caps, which evaluates mappings across all matching indices. There is a recent change within ES that seems to function as a workaround: we've found that performing a search with a low pre_filter_shard_size (for example, 1) forces the shard pre-filter phase to run, so shards that cannot match the time range are skipped instead of failing.
However, before assuming that will work, we want to get some input from folks on the ES team and understand if this proposed solution will work.
In a nutshell, what you are saying is correct. I personally wouldn't recommend setting pre_filter_shard_size to a low value by default, though.
@s1monw Yes! I have a scenario set up with two indices whose mappings conflict on one field. If I run a search across both indices without pre_filter_shard_size, I see shard failures from the index with the old mapping.
But there is no error if I run the same search with pre_filter_shard_size set to 1.
Here is the full search for clarity:
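The search body itself did not survive in this copy of the thread; based on the surrounding discussion, it was presumably a time-filtered aggregation issued with the pre-filter forced on. A hypothetical reconstruction (index, field, and timestamp names are made up):

```
POST /index-a,index-b/_search?pre_filter_shard_size=1
{
  "size": 0,
  "query": {
    "range": { "@timestamp": { "gte": "now-24h" } }
  },
  "aggs": {
    "by_status": {
      "terms": { "field": "status" }
    }
  }
}
```

With pre_filter_shard_size=1, the can_match pre-filter phase runs even for a small number of shards, so shards whose @timestamp range cannot match the query are skipped before the aggregation ever touches their conflicting mapping.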
Fair enough, but it's still a workaround and I think we should not design defaults for this. What is the fix you had in mind? Make it optional?
We wouldn't enable it by default for every request. We can go a few different ways in Kibana, and possibly others that I'm not aware of (paging @skearns64 for more information here).
Sounds good to me.
I chatted with @chrisronline a bit, and my favored approach in the short term is to make this an index setting that is automatically enabled at index pattern creation and update time if the index pattern contains conflicts. For the first iteration, I suggest leaving out editability of this setting via the UI, but ensuring it appears in the index pattern doc as a setting, so support could disable it and we could take the time to build an advanced UI for manual configuration if we need it.
I'd like to understand the problem more fully, and am happy to get on a call if it would be easier to talk through in person. The way I see it is as follows: let's say you have two indices, one old and one new, where the same field is mapped with conflicting types. If you aggregate on that field across both indices, the shards of the old index fail. If you query just the new index, you get results and no failures, and if you query just the old index, you get a hard failure. Adding pre_filter_shard_size just hides those partial failures. Wouldn't it be better to just show the vis and add an annotation that it contains shard failures?
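The scenario described here can be reproduced with two hypothetical indices whose mappings disagree on a single field (the index names, field name, and 5.x type-level mapping syntax are assumptions, not taken from this thread):

```
PUT /logs-old
{
  "mappings": {
    "log": {
      "properties": { "status": { "type": "text" } }
    }
  }
}

PUT /logs-new
{
  "mappings": {
    "log": {
      "properties": { "status": { "type": "keyword" } }
    }
  }
}
```

A terms aggregation on status against logs-new alone succeeds, against logs-old alone fails hard (fielddata is disabled on text fields by default), and against logs-* produces partial results with shard failures on the logs-old shards.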
I wonder if that works; it depends which shard we hit first and which one returns the first result to dominate the reduction of results. It might be that the wrong one wins and we fail the wrong shards, depending on who you ask, I guess?! :)
That definitely makes sense to me. Kibana actually does still show the visualization with a shard error, but the visualization is just using potentially incomplete data. I don't think this is a new issue; I'm sure this behavior occurred in 5.4 and below, but the user could get around it by narrowing the time scope of their dashboards/queries. However, with the move away from field_stats-based index expansion in 5.5, that escape hatch is gone. @skearns64, does that seem right to you too?
++, I think that captures it @chrisronline
We do need to return conflicted fields as searchable/aggregatable, but I don't think we need to solve this problem in a post-field_stats world.

This issue arises when people have to change their mappings but are at a scale where reindexing their existing data to have the correct mappings isn't easy or practical. At that scale they are going to have far more than 128 shards, which is the default value for pre_filter_shard_size, so the pre-filter phase already kicks in for them.

If you were searching on a time range that spanned a mapping change and were thus getting incomplete data, then we're going to fire warnings due to the shard failures, which seems like exactly what we should be doing. If a person has fewer than 128 shards, then it should be a painless reindex to fix their data, which is the right thing to do in any circumstance where it's practical.
I agree with what @epixa said. I don't think there is much harm in setting pre_filter_shard_size explicitly on a request where it's genuinely needed, though.
@s1monw Is the default value of pre_filter_shard_size configurable on Elasticsearch globally, or only per-query?
It's only settable per query.
@epixa Our current plan is to reuse the existing concept of a per-index-pattern setting and automatically apply pre_filter_shard_size to requests when the index pattern contains conflicted fields.

However, reading through the latest comments, it'd be nice if we could forgo this code within Kibana and rely on a global ES config to alter this value for users who have fewer than 128 shards and for whom reindexing is not an option.
I don't really think that Kibana should be making it convenient for people to follow bad practices, and having a relatively small amount of data (<128 shards) with conflicting mappings while refusing to reindex to fix them seems like a Bad Thing to me. If you have enough data that reindexing becomes impractical, that's completely understandable, and the default pre-filter behavior already kicks in at that scale.

If you have <128 shards, it's a good thing that Kibana warns you of shard failures when aggregating on conflicted fields, so that you can fix the underlying issue before your data grows to a point where it's no longer practical to resolve.

I don't think we should have any configuration for toggling this behavior. Am I missing anything here? Is there some circumstance I'm not considering where a person has fewer than 128 shards and a conflicting mapping that they want to aggregate on, and there is a legitimate reason for not fixing the conflicting mappings?

Edit: I don't think a global configuration in Elasticsearch to address this use-case is really appropriate either.
Honestly, I don't know the scale at which reindexing becomes unrealistic. I didn't assume that 128 correlated to that, but if it does, then I don't think we need to do anything more than updating the UI logic to allow conflicted fields to be aggregatable and searchable. How much work would be involved in making that change?
@epixa - I think having this as a (preferably hidden, automatically configured) setting is important and useful. Imagine someone with daily indices and moderate indexing rates, where they have 1 shard per day and a 90-day retention period. They discover their mapping error after a few weeks and fix it. But since they don't have 128 shards, every single request they make that includes the conflicting field will return a toast error message about a shard failure. We could encourage them to reindex, but that may or may not be practical/possible on their hardware and their mappings. I don't think it's a good experience, and it is a change from how this was handled in Kibana previously. The proposed solution here would detect the field conflict and set pre_filter_shard_size automatically for that index pattern.
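Since pre_filter_shard_size is a per-request parameter, the automatic behavior proposed here would amount to Kibana attaching it to the searches it issues for a conflicted index pattern. A sketch of the idea on a plain search request (index name hypothetical; this is not Kibana's actual implementation):

```
POST /logstash-*/_search?pre_filter_shard_size=1
{
  "size": 0,
  "query": { "match_all": {} }
}
```

Deployments with more than 128 shards would behave the same with or without the parameter, since the pre-filter phase runs by default at that scale; the explicit setting only changes behavior for small clusters like the 90-shard example above.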
Thanks for the scenario, Steve, that helps.
Kibana version: 5.5.0
Elasticsearch version: 5.5.0
Server OS version: Red Hat Enterprise Linux Server release 7.3
Browser version: Chrome 59.0.3071.115
Browser OS version: Fedora release 23
Original install method (e.g. download page, yum, from source, etc.): YUM
Description of the problem including expected versus actual behavior:
In versions 5.4 and below, when visualizing or running an aggregation on a field that has historically had different string mappings (i.e. older index mappings use "text", newer mappings use "keyword"), no error would be thrown as long as the selected time range didn't include indices with the "text" mapping. Now, with 5.5.0, if a field has multiple mappings over a range of indices and Kibana detects this, the field is completely unavailable for any analysis. This affects visualizations as well as Discover, where those fields no longer allow quick filters (magnifying glass).
The proposed solution, reindexing every index to be homogeneous with respect to field mappings, is unacceptable with large and numerous indices. It was also unnecessary in 5.4 and below, as those versions would only throw a warning or fail silently, allowing continued use of the more recent indices.
Downgrading Kibana to 5.4.x provides a workaround.