-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle warnings from task manager without stack traces. #30692
Handle warnings from task manager without stack traces. #30692
Conversation
💔 Build Failed |
Pinging @elastic/kibana-platform |
Swallow stack traces and do not throw errors.
retest |
💚 Build Succeeded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* Handle warnings from task manager without stack traces. * Fix typescript errors. * Catch no living connection errors. Swallow stack traces and do not throw errors. * Should use string substitution instead of concatenation.
* Handle warnings from task manager without stack traces. * Fix typescript errors. * Catch no living connection errors. Swallow stack traces and do not throw errors. * Should use string substitution instead of concatenation.
* Handle warnings from task manager without stack traces. * Fix typescript errors. * Catch no living connection errors. Swallow stack traces and do not throw errors. * Should use string substitution instead of concatenation.
state: { stats: {}, runs: 0 }, | ||
}); | ||
}catch(e) { | ||
server.log(['warning', 'maps'], `Error scheduling telemetry task, received ${e.message}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple things:
- This is using the tag
maps
, and I assume it should be using a tag relevant to task manager. - I don't think swallowing all possible exceptions and treating them as warnings is the right approach here, or pretty much ever. There are specific errors we'd expect to see in the course of normal operation (like Elasticsearch not being available or authentication being rejected), but if something truly exceptional occurs, we'll want it to be treated properly as an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- this is the maps collector, so that part is right
- I agree this shouldn't be swallowed, but on the other hand, I hope the timing errors or connection handling can be handled more by Task Manager
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timing errors should be avoidable, by reworking the way the task manager gets initialized. So that we completely avoid the chicken or the egg problem. But I agree that it would be nice to handle these inside the task manager rather than put this on the consumers of the task manager service.
@@ -99,7 +99,7 @@ export class TaskManager { | |||
}) | |||
.catch((err: Error) => { | |||
// FIXME: check the type of error to make sure it's actually an ES error | |||
logger.warning(err.message); | |||
logger.warning(`PollError ${err.message}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sort of thing should be handled by a logger tag rather than string concatenation. This ensures folks can reliably filter/drill-down alerts based on type.
@njd5475 I know this is already merged, but please address the feedback I just posted in a followup PR. /cc @tsullivan |
Also, please add some tests for this behavior |
Coming back to this after testing a 6.7.0 build. I start Kibana and Elasticsearch at the same time, and it takes Elasticsearch awhile to start all of the shards. There are 2 plugins that try to schedule their own tasks, and they hit a problem that gets somewhat obscured by this change:
How plugins can schedule their own built-in tasks unfortunately seems to be a mystery. Ideally, waiting for the Question for Platform: Should plugins wait on some other event before scheduling built-in tasks? Should they keep retrying on their own? All task scheduling goes through taskManager.schedule, so it would be great for the consumer if Task Manager "managed" the retrying. (2) (1) and helps the Task Manager know that it has all task types registered by this point. An unrecognized task type is a bug, not a race condition |
Summary
This PR adds catches to for the initialization errors that are thrown by the Task Manager. So that the logs are not plagued with stack traces.
Checklist
Use
strikethroughsto remove checklist items you don't feel are applicable to this PR.This was checked for cross-browser compatibility, including a check against IE11Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n supportDocumentation was added for features that require explanation or tutorialsUnit or functional tests were updated or added to match the most common scenariosThis was checked for keyboard-only and screenreader accessibilityzFor maintainers
This was checked for breaking API changes and was labeled appropriatelyThis includes a feature addition or change that requires a release note and was labeled appropriately