
Kibana 5.5.0 breaks visualizations where field type conflict exists. #12728

Closed
ilezcano opened this issue Jul 10, 2017 · 26 comments
Labels
bug Fixes for quality problems that affect the customer experience regression Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc


@ilezcano

Kibana version: 5.5.0

Elasticsearch version: 5.5.0

Server OS version: Red Hat Enterprise Linux Server release 7.3

Browser version: Chrome 59.0.3071.115

Browser OS version: Fedora release 23

Original install method (e.g. download page, yum, from source, etc.): YUM

Description of the problem including expected versus actual behavior:
In versions 5.4 and below, when visualizing or running an aggregation on a field that has historically had different string mappings (e.g. older index mappings use "text" while newer mappings use "keyword"), no error was thrown as long as the selected time range did not include indices with the text mapping. Now, with 5.5.0, if a field has multiple mappings across a range of indices and Kibana detects this, the field is completely unavailable for any analysis. This affects visualizations as well as Discover, where those fields no longer allow quick filters (the magnifying-glass icons).

The proposed solution, reindexing every index to be homogeneous with respect to field mappings, is unacceptable with large and numerous indices. It was also unnecessary in 5.4.0 and below, which would only throw a warning or fail silently, so the field remained usable on more current indices.

Downgrading Kibana to 5.4.x provides a workaround.

Steps to reproduce:

  1. Upgrade ELK to 5.5.0
  2. Have indices with different mappings for different index generations (e.g. foo=text in older indices, but foo=keyword in newer ones; see the sketch after these steps)
  3. Refresh index patterns from management tab (Optional)
  4. Attempt to run a previously saved visualization that contains "foo" as a term. OR
  5. Browse to discovery and attempt to filter on "foo".
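
For anyone reproducing this from scratch, here is a minimal sketch of the conflicting-mapping setup, assuming ES 5.x, a single "logs" type, and a "time" date field (index, type, and field names are illustrative):

PUT foo
{
  "mappings": {
    "logs": {
      "properties": {
        "time": { "type": "date" },
        "foo":  { "type": "text" }
      }
    }
  }
}

PUT foo2
{
  "mappings": {
    "logs": {
      "properties": {
        "time": { "type": "date" },
        "foo":  { "type": "keyword" }
      }
    }
  }
}

After indexing a document into each index and creating a foo* index pattern, the foo field shows up as a conflict in Kibana.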

Errors in browser console (if relevant):

Provide logs and/or server output (if relevant):

@ilezcano ilezcano changed the title Kibana 5.5.0 breaks visualizations where field mapping disparity exists. Kibana 5.5.0 breaks visualizations where field type conflict exists. Jul 10, 2017
@seidler2547

Absolutely! This makes Kibana almost completely useless for us. We have about 1TB of log data (1.2 billion entries) in ES and Kibana is now preventing us from using it in a meaningful way - for no apparent reason.

For example, it's also not possible to click the "Filter for value" (magnifying glass with plus sign) for these fields, but of course when I manually copy the field name and value into the search field, it works as expected.
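
For reference, a manual query-bar filter along these lines (Lucene query string syntax; the field name and value are illustrative) still works even though the quick-filter icon is unavailable:

foo:"some value"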

@epixa epixa added bug Fixes for quality problems that affect the customer experience regression labels Jul 14, 2017
@epixa
Contributor

epixa commented Jul 14, 2017

This is the unintended result of us moving to field_caps to determine if a field is searchable or aggregatable, which takes the mappings into account as a whole rather than us determining that for the given time range.

Moving away from field_caps isn't really an option, but we really should continue to support searching/aggregating against fields even in circumstances when field_caps is conservative with its results.

Would it be helpful if you could explicitly override searchable/aggregatable on those fields for your index pattern?

@ilezcano
Author

That sounds like a great alternative. We have to customize index mappings anyway for formatting of numeric data, etc.

@seidler2547

Yes, that would be good. A bulk option would then also be good. We have about 700 fields (I know, too many), but even marking the most important 20 or so would be quite an effort. Ideally there would be an option to only scan the X most recent indices for the field type and use that. I would guess that most of our inconsistent field mappings come from changes we have made in the past, but then again it could happen anytime if, for example, a newer version of ES or LS mapped things differently than before.

@alexfrancoeur alexfrancoeur added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Jul 17, 2017
@epixa
Contributor

epixa commented Jul 18, 2017

One benefit of the configurable searchable/aggregatable approach is that it is still useful even if we also added some sort of fields-only-from-recent-indices feature. I doubt we can implement a performant, scalable feature that definitively answers the question of what you can do with a given field in all circumstances, so being able to override whatever our default behavior is for a given field, based on your internal knowledge of your data, remains useful.

@chrisronline chrisronline self-assigned this Jul 18, 2017
@chrisronline
Contributor

Hey folks,

Thanks for your patience. I'm going to investigate this more and I'll keep this thread updated.

@chrisronline
Contributor

Update on this.

In 5.4 and below for time-based data, Kibana used the field_stats api to expand index patterns into time-constrained ES indices (assuming this option was selected during index pattern creation). For setups with conflicting field types, this meant that if the time window didn't include an index with a conflicting field, everything worked fine.

In 5.5, we deprecated field_stats in favor of field_caps thinking it would be a transparent change for users. Unfortunately, this is not the case. The important difference with field_caps is that it's not possible to expand index patterns into time-constrained ES indices. Because of this, any conflicting field types will cause an error with each search request, regardless of time window.
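
To illustrate, a field_caps request reports capabilities per mapped type across all indices matched by the pattern, with no way to scope it to a time range. A sketch, assuming one index (foo) maps the field as text and another (foo2) maps it as keyword; the response is trimmed and illustrative:

GET foo*/_field_caps?fields=foo

{
  "fields": {
    "foo": {
      "text": {
        "searchable": true,
        "aggregatable": false,
        "indices": [ "foo" ]
      },
      "keyword": {
        "searchable": true,
        "aggregatable": true,
        "indices": [ "foo2" ]
      }
    }
  }
}

Because the conflict is reported for the pattern as a whole, Kibana can no longer tell which indices (and therefore which time ranges) are actually affected.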

There is a recent change within ES that seems to function as a workaround. We've found that performing a search with ?pre_filter_shard_size=1 will ensure that only shards that might contain data are actually queried. Our proposed solution is to enable this for searching via one of:

  • kibana.yml configuration option
  • per index pattern (similar to the Expand index pattern when searching [DEPRECATED] option)
  • global config through Advanced Settings

However, before assuming that will work, we want to get some input from folks on the ES team and understand if this proposed solution will work.

@BigFunger @skearns64 @clintongormley @s1monw

@s1monw

s1monw commented Jul 19, 2017

However, before assuming that will work, we want to get some input from folks on the ES team and understand if this proposed solution will work.

In a nutshell, what you are saying is correct. I personally wouldn't recommend setting pre_filter_shard_size to 1 by default, but maybe let the user specify it? I would want to write a test that simulates what you are doing before I give my thumbs up. Did you actually try the proposed solution?

@chrisronline
Contributor

@s1monw Yes!

I have a scenario set up with two indices - foo and foo2 - both have two documents that are one year apart.

If I run a search on foo* that is time-boxed to only see data from a single index, I get the error:

"reason": {
  "type": "illegal_argument_exception",
  "reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [foo] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
}

But there is no error if I run the same search with ?pre_filter_shard_size=1, and it seems to work as expected based on the _shards part of the output:

"_shards": {
  "total": 10,
  "successful": 10,
  "skipped": 8,
  "failed": 0
},

Here is the full search for clarity (foo is the conflicting field in this case):

GET foo*/_search
{
  "size": 0,
  "query": {
    "range": {
      "time": {
        "gte": 1499880983453,
        "lte": 1500485783453,
        "format": "epoch_millis"
      }
    }
  },
  "aggs": {
    "termAgg": {
      "terms": {
        "field": "foo",
        "size": 2
      }
    }
  }
}
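
The pre-filtered variant of the same request differs only in the query string; the body is unchanged:

GET foo*/_search?pre_filter_shard_size=1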

@s1monw

s1monw commented Jul 19, 2017

fair enough, but it's still a workaround and I think we should not design defaults for this. What is the fix you had in mind with this? Make it optional?

@chrisronline
Contributor

We wouldn't enable it by default for every request. We can go a few different ways in Kibana:

  • kibana.yml configuration option
  • per index pattern (similar to the Expand index pattern when searching [DEPRECATED] option)
  • global config through Advanced Settings

And possibly others that I'm not aware of (paging @skearns64 for more information here)

@s1monw

s1monw commented Jul 19, 2017

sounds good to me

@skearns64

I chatted with @chrisronline a bit, and my favored approach in the short term is to make this an index pattern setting and automatically enable it at index pattern creation and update time if the index pattern contains conflicts. For the first iteration, I suggest leaving this setting out of the UI but ensuring it appears in the index pattern doc, so support could disable it, and we could take the time to build an advanced UI for manual configuration if we need it.

@clintongormley
Contributor

Hi @chrisronline

I'd like to understand the problem more fully, and am happy to get on a call if it would be easier to talk through in person.

The way I see it is as follows: let's say you have two indices, logs-2016 and logs-2017. Field status is a text field in the old index and a keyword field in the new index.

If you aggregate on status with an index pattern of logs*, then you will get results for the new index and a shard failure message for the old index. In other words, you get results that you can display for all of the data that is available, and you can warn that we couldn't get data from logs-2016. It sounds like Kibana is failing the whole visualisation if there are shard level failures?
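
For reference, a partial failure like that surfaces in the search response roughly as follows (illustrative; shard numbers depend on index settings, and the failure entry is trimmed):

"_shards": {
  "total": 10,
  "successful": 5,
  "failed": 5,
  "failures": [
    {
      "shard": 0,
      "index": "logs-2016",
      "reason": {
        "type": "illegal_argument_exception",
        "reason": "Fielddata is disabled on text fields by default. ..."
      }
    }
  ]
}

The aggregation results still contain whatever could be computed from logs-2017.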

If you query just the new index you get results and no failures, and if you query just the old index you get a hard failure.

Adding pre_filter_shard_size=1 will only change this outcome if there is a date filter which excludes the old index, but imagine that you have 5 visualisations and you want to see the data from both 2016 and 2017. The status field type change only affects one of those visualisations, and you'd still like to see the data available from 2017. The shard pre-filtering doesn't help you here. Either you don't see the status visualisation at all, or you have to limit all visualisations to 2017.

Wouldn't it be better to just show the vis and add an annotation that it contains shard failures?

@s1monw

s1monw commented Jul 20, 2017

Wouldn't it be better to just show the vis and add an annotation that it contains shard failures?

I wonder if that works; it depends on which shard we hit first and which one returns the first result to dominate the reduction of results. It might be that the wrong one wins and we fail the wrong shards, depending on who you ask, I guess?! :)

@chrisronline
Contributor

@clintongormley

That definitely makes sense to me.

Kibana actually does still show the visualization with a shard error, but the visualization is just using potentially incomplete data.

I don't think this is a new issue; I'm sure this behavior occurred in 5.4 and below, but the user could get around it by narrowing the time scope of their dashboards/queries. However, with the field_caps change in 5.5, they are no longer able to work around this situation. I think we just need to provide a way to get back to that pre-5.5 state where users can get around this by modifying the time range.

@skearns64 - Does that seem right to you too?

@skearns64

++, I think that captures it @chrisronline

@epixa
Contributor

epixa commented Jul 24, 2017

We do need to return searchable and aggregatable based on whether any of the values are true when there are conflicts, which will unblock people that are using 5.5 along with the existing field_stats behavior.

I don't think we need to solve this problem in a post-field_stats world, though. This issue arises when people have to change their mappings but are at a scale where reindexing their existing data to have the correct mappings isn't easy or practical. At that scale, though, they are going to have far more than 128 shards, which is the default value for pre_filter_shard_size, so the optimization in Elasticsearch will prevent shard failures from firing on otherwise valid time ranges.

If you were searching on a time range that spanned a mapping change and were thus getting incomplete data, then we're going to fire warnings due to the shard failures, which seems like exactly what we should be doing.

If a person has fewer than 128 shards, then it should be a painless reindex to fix their data, which is the right thing to do in any circumstance where it's practical.

@s1monw

s1monw commented Jul 24, 2017

I agree with what @epixa said. I don't think there is much harm in setting pre_filter_shard_size to a low value, yet I still wouldn't go and do it out of the box; if you do so, it might hide other issues that the user and/or we should be aware of. I am not sure if this is common enough to warrant a per-dashboard checkbox or something like that (note, I am not an expert on this kind of stuff by far); maybe it can be as simple as a setting the user can set in Kibana?

@epixa
Contributor

epixa commented Jul 24, 2017

@s1monw Is the default value of pre_filter_shard_size configurable on Elasticsearch globally or only per-query?

@s1monw

s1monw commented Jul 24, 2017 via email

@chrisronline
Contributor

@epixa Our current plan is to sort of "reuse" the existing concept of the isExpandable field on the index pattern type, where we set that appropriately if there are field conflicts detected from the field caps call. Then, if that is set on the index pattern that the query or dashboard is using, we will attach ?pre_filter_shard_size=1 to the request.

However, reading through the latest comments, it'd be nice if we could forgo this code within Kibana and rely on a global ES config to alter this value in case users have fewer than 128 shards and reindexing is not an option.

@epixa
Contributor

epixa commented Jul 24, 2017

I don't really think that Kibana should be making it convenient for people to follow bad practices, and having a relatively small amount of data (<128 shards) with conflicting mappings and refusing to reindex to fix them seems like a Bad Thing to me. If you have enough data that reindexing becomes impractical, then that's completely understandable and the pre_filter_shard_size optimization should prevent shard errors.

If you have <128 shards, it's a good thing that Kibana warns you of shard failures when aggregating on conflicted fields so that you can fix the underlying issue before your data grows to a point where it's no longer practical to resolve.

I don't think we should have any configuration for toggling pre_filter_shard_size=1 in Kibana.

Am I missing anything here? Is there some circumstance I'm not considering where a person has fewer than 128 shards and a conflicting mapping that they want to aggregate on and there is a legitimate reason for not fixing the conflicting mappings?

Edit: I don't think a global configuration in Elasticsearch to address this use-case is really appropriate either.

@chrisronline
Contributor

Honestly, I don't know the scale at which reindexing becomes unrealistic. I didn't assume that 128 correlated to that, but if it does, then I don't think we need to do anything more than update the UI logic to allow conflicted fields to be aggregatable and searchable.

How much work would be involved in making pre_filter_shard_size configurable in ES? @s1monw

@skearns64

@epixa - I think having this as a (preferably hidden, automatically configured) setting is important and useful. Imagine someone with daily indices and moderate indexing rates, where they have 1 shard per day and a 90-day retention period. They discover their mapping error after a few weeks and fix it. But since they don't have 128 shards, every single request they make that includes the conflicting field will return a toast error message about a shard failure. We could encourage them to reindex, but that may or may not be practical/possible on their hardware and their mappings. I don't think it's a good experience, and it is a change from how this was handled in Kibana previously.

The proposed solution here would detect the field conflict and set pre_filter_shard_size=1 on just those index patterns. If the user queries the conflicting field over a time period where no conflict is present, no errors are shown. If they query across a time range where a conflict does exist, they will see the error.

@epixa
Contributor

epixa commented Jul 26, 2017

Thanks for the scenario Steve, that helps
