[Dashboard] If a single visualization fails don't kill the entire dashboard #9747

Closed
stacey-gammon opened this issue Jan 5, 2017 · 6 comments · Fixed by #11337
Labels
enhancement New value added to drive a business result Feature:Dashboard Dashboard related features Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v5.5.0 v6.0.0-alpha2

Comments

@stacey-gammon
Contributor

Kibana cannot generate correct field type information for a field that has no data, because the field_stats API won't return anything for it.

See #9466 and elastic/elasticsearch#22438 for reference.

Beats uses import scripts, so this is not an uncommon situation to end up in, and I've seen a few Discuss tickets related to it.

Aside from fixing the original issue, we can further help the situation by making sure an entire dashboard doesn't fail because of one bad visualization.

To Repro

  • turn off redis output in metricbeat.yml
output.redis:
  # Boolean flag to enable or disable the output module.
  enabled: false
  • run the dashboard import script from Metricbeat
  • start collecting Metricbeat data
  • refresh your field list (so it grabs the field information from the field_stats API)
  • create a dashboard with a visualization that works and a redis visualization
  • save your dashboard
  • re-open it
  • all visualizations will fail, even though only the redis one is invalid
@0asp0

0asp0 commented Feb 13, 2017

I encountered the same problem while rolling out to our production environment. We have some detail fields for errors, but these errors occur rarely, sometimes not once in a whole week.

I exported all saved objects from our dev instance and imported them into the bare prod environment. Even index templates that create empty indexes do not help here.

Even after enabling Filebeat, and with the current day's data in there, our main dashboards are broken because we did not receive our one critical error during this time.

Importing fake data (e.g. with a very old timestamp) is nasty.
Because the other visualizations, which are fine, do not load either, a lot of time is lost, especially on finding the problematic visualization on a complex dashboard with 20 or more panels.

@stacey-gammon
Contributor Author

I looked into how to fix this and I believe the problem is with the courier. When one of the requests sent to callClient in call_client.js fails, it aborts all of the requests.

Specifically in this portion of the code:

    Promise.map(executable, function (req) {
      return Promise.try(req.getFetchParams, void 0, req)
      .then(function (fetchParams) {
        return (req.fetchParams = fetchParams);
      });
    })
    .then(function (reqsFetchParams) {
      return strategy.reqsFetchParamsToBody(reqsFetchParams);
    })

getFetchParams fails, and with Promise.map, if any promise in the array is rejected, or any promise returned by the mapper function is rejected, the returned promise is rejected as well. One failing request therefore takes every other request down with it.
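The failure mode described here can be reproduced outside Kibana. The following is only a minimal sketch: it uses the native `Promise.all` as a stand-in for Bluebird's `Promise.map` (both reject as soon as one mapped promise rejects), and the request objects, their ids, and the helper names are hypothetical, not taken from the courier code.

```javascript
// Hypothetical requests: the second one fails the way a visualization
// on a field with no data might.
const requests = [
  { id: 'cpu', getFetchParams: () => ({ index: 'metricbeat-*' }) },
  { id: 'redis', getFetchParams: () => { throw new Error('field not found'); } },
  { id: 'memory', getFetchParams: () => ({ index: 'metricbeat-*' }) },
];

// Wrap the call so that a synchronous throw also becomes a rejection.
function getParams(req) {
  return Promise.resolve().then(() => req.getFetchParams());
}

// All-or-nothing: one rejection rejects the combined promise, so the
// two healthy requests never get to use their fetch params.
function collectAllOrNothing(reqs) {
  return Promise.all(reqs.map(getParams));
}

// Isolated: each rejection is handled per request, so the combined
// promise fulfills with exactly one entry per request.
function collectIsolated(reqs) {
  return Promise.all(
    reqs.map((req) =>
      getParams(req).then(
        (fetchParams) => ({ id: req.id, ok: true, fetchParams }),
        (error) => ({ id: req.id, ok: false, error })
      )
    )
  );
}
```

With these inputs, `collectAllOrNothing(requests)` rejects with the redis error, while `collectIsolated(requests)` fulfills with three entries, of which only the redis one has `ok: false`.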

@spalger what do you think?

@stacey-gammon stacey-gammon added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Feb 13, 2017
@spalger
Contributor

spalger commented Feb 13, 2017

We should totally be canceling individual requests when they fail here, not all requests.

@stacey-gammon
Contributor Author

I briefly played around with a possible solution of returning null if it failed, and filtering out those null results, but it didn't work (I believe something was still expecting a response and when it didn't get it there was an exception thrown).

I'm not sure how we would communicate which requests passed and which didn't.

Do you have a fix in mind, or are you willing to look into this? I can investigate further but I'm pretty sure it will be more efficient if you take a look. :)
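One possible reason that filtering out nulls breaks things is that the response array no longer lines up with the request array. The sketch below shows one way to keep that alignment. It is not the courier's actual design: the `{ ok, fetchParams, error }` entry shape and the `partition` and `merge` helpers are hypothetical names introduced only for illustration.

```javascript
// Split settled fetch-param results into the ones worth sending and a
// record of which original slot each one came from.
function partition(settled) {
  const sendable = [];
  const slots = []; // slots[i] is the original index of sendable[i]
  settled.forEach((entry, index) => {
    if (entry.ok) {
      sendable.push(entry.fetchParams);
      slots.push(index);
    }
  });
  return { sendable, slots };
}

// Rebuild a full-length result array: real responses go back into their
// original slots, failed requests get an explicit error entry.
function merge(settled, slots, responses) {
  const merged = settled.map((entry) =>
    entry.ok ? undefined : { error: entry.error }
  );
  slots.forEach((originalIndex, i) => {
    merged[originalIndex] = { response: responses[i] };
  });
  return merged;
}
```

The caller would send only `sendable` in the batch, then pass the batch responses to `merge`, so every request still receives exactly one answer: either a response or the error that kept it from being sent.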

@pemontto

pemontto commented Mar 3, 2017

We're in the process of migrating and have different sets of fields populated across different customers, but serve them the same dashboards. In pre-migration testing this issue is impacting every single customer, but not as a result of a single panel.
As a workaround for now, I'm pulling the index-pattern docs from the Kibana index, updating the searchable flags, and then reindexing them. It is a sub-optimal solution (plenty of errors build up in the console). The added bonus is that I set the _type field to index so it can be used in filters.
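The flag-updating step of this workaround might look roughly like the sketch below. It rests on an assumption that is not stated in this thread: that an index-pattern document stores its field list as a JSON string in a `fields` property, with one object per field carrying `name` and `searchable`. The `markSearchable` helper is hypothetical, and fetching and reindexing the document are left out.

```javascript
// Assumption: `source` is the _source of an index-pattern document whose
// `fields` property is a JSON string holding an array of field objects.
function markSearchable(source, fieldNames) {
  const wanted = new Set(fieldNames);
  const fields = JSON.parse(source.fields).map((field) =>
    wanted.has(field.name) ? { ...field, searchable: true } : field
  );
  // Return a patched copy; the original document object is not mutated.
  return { ...source, fields: JSON.stringify(fields) };
}
```

Because the function returns a copy, the original document can be kept around for comparison or rollback before anything is written back.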

@tommikiviniemi-srs

+1.

This is a huge issue for us, as sometimes a heavier chart fails due to a timeout, which in turn fails the entire dashboard. Only the failing chart should be blank while the others should display.

Also, it seems that a dashboard only displays charts once ALL of them have finished loading. It'd be far preferable if each chart became visible as soon as it has loaded (probably the same root cause?).
I think this is a bug and not an enhancement request, as I don't think it's expected behaviour that one failing chart takes down an entire dashboard.
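The behaviour asked for here, each chart appearing as soon as it is ready and a failed chart staying blank on its own, can be sketched as follows. This is only an illustration and not how the dashboard is implemented: the panel objects, their `load` functions, and the `onPanelSettled` callback are hypothetical.

```javascript
// Start every panel's load at once and report each outcome as soon as it
// is known, instead of waiting for the slowest (or a failing) panel.
function loadPanels(panels, onPanelSettled) {
  const tracked = panels.map((panel) =>
    Promise.resolve()
      .then(() => panel.load())
      .then(
        (data) => onPanelSettled({ id: panel.id, status: 'ready', data }),
        (error) => onPanelSettled({ id: panel.id, status: 'failed', error })
      )
  );
  // Fulfills once every panel has reported; a panel failure does not
  // reject it, because each failure is handled per panel above.
  return Promise.all(tracked);
}
```

A renderer would draw a chart on every `ready` event and an error placeholder on every `failed` event, so a timeout in one heavy chart delays nothing else.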
