Catch @hapi/podium errors #84575

watson · 2020-11-30T19:25:04Z

If an async error occurred within the call to this.events.emit, the error would be very hard to trace. In my case I did not get any meaningful stack trace, and it took me many days to track it down to this location. Adding this catch solves that issue. I think exiting as I do here is OK, as the alternative (not having the catch) would result in an unhandled rejection, which has the same effect. But please let me know if it can be done better 😃
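As a rough illustration of the pattern described above, the sketch below attaches a catch directly to the promise returned by an emit call. The `createEmitter` and `emitWithCatch` names are hypothetical and the emitter is a simplified stand-in, not the @hapi/podium API; the only assumption is that emit returns a promise that rejects when a listener fails.

```javascript
// Hypothetical stand-in for an emitter whose emit() returns a promise.
// It is NOT the @hapi/podium API, only an illustration of the shape.
function createEmitter() {
  const listeners = [];
  return {
    on(listener) {
      listeners.push(listener);
    },
    async emit(data) {
      for (const listener of listeners) {
        await listener(data);
      }
    },
  };
}

// Attach a catch directly to the promise returned by emit(), so a failure
// is reported at the call site instead of surfacing later as an
// unhandled rejection.
function emitWithCatch(emitter, data, onError) {
  return emitter.emit(data).catch(onError);
}

// Usage: a listener that fails asynchronously is reported by the handler.
const emitter = createEmitter();
emitter.on(async () => {
  throw new Error('write failed');
});
emitWithCatch(emitter, { message: 'hello' }, (err) => {
  console.error('An unexpected error occurred while writing to the log:', err.stack);
});
```

The handler is passed in here only so the sketch can run without terminating the process; the PR itself logs the error and calls process.exit(1).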

pgayvallet · 2020-12-01T10:01:14Z

packages/kbn-legacy-logging/src/legacy_logging_server.ts

+        console.error('An unexpected error occurred while writing to the log:', err.stack);
+        process.exit(1);


Now that this is identified, maybe we would like to just log the error and not exit the process? This seems a bit overkill for an error occurring inside logging? @restrry wdyt?

mshustov · 2020-12-01

> This seems a bit overkill for an error occurring inside logging?

Maybe it's an overkill, but it's easy to miss that log message in stderr, so debugging a Kibana crash without logs might be a nightmare. The current PR replicates the current behavior, and it doesn't seem we've ever had any complaints about it. CC @joshdover

watson · 2020-12-01

This commit has been in my working directory for about a week, so some of the details about the original behaviour are unfortunately a little fuzzy. I just tried to recreate the issue I was debugging when I discovered this, and found that under normal circumstances we do get a regular unhandled rejection crash, which can fairly easily be traced to the right location. However, my special circumstance was that the server was shut down before the unhandled rejection event fired, so it was never logged.

I'm fine with either logging and continuing, re-throwing, or using process.exit.
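For comparison, the three options mentioned here could look roughly like the handlers below. This is only a sketch: `createHandlers` and its `log` and `exit` parameters are hypothetical names, injected so the code can run without killing the process.

```javascript
// `log` and `exit` are injectable (hypothetical parameters) so the sketch
// can be exercised without terminating the process.
function createHandlers({
  log = console.error,
  exit = (code) => process.exit(code),
} = {}) {
  const message = 'An unexpected error occurred while writing to the log:';
  return {
    // Option 1: report the error and keep the process running.
    logAndContinue(err) {
      log(message, err.stack);
    },
    // Option 2: re-throw; a promise chain using this handler stays rejected.
    rethrow(err) {
      throw err;
    },
    // Option 3: report the error, then terminate (the approach in this PR).
    logAndExit(err) {
      log(message, err.stack);
      exit(1);
    },
  };
}
```

Note that re-throwing from inside a catch handler leaves the resulting promise rejected, so unless something further up the chain handles it, it ends up on the unhandled rejection path again.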

joshdover · 2020-12-01

> Maybe it's an overkill, but it's easy to miss that log message in stderr, so debugging a Kibana crash without logs might be a nightmare.

I'm probably missing something, but how does crashing the process make debugging this easier for the user?

That said, does the current audit logging implementation go through this mechanism? If so, we may want a crash here if we're unable to write audit log events. Even if not, I also lean towards just leaving this as the current behavior until we remove legacy logging altogether in 8.0 and make the behavior change then.

watson · 2020-12-01

> how does crashing the process make debugging this easier for the user?

I think it's just meant to make it easier for us as developers 😅 The end-user not so much.

Today the process crashes on unhandled rejections because it is considered to be in an unknown state in which it's unsafe to continue:

kibana/src/setup_node_env/exit_on_warning.js

Lines 62 to 72 in 5420177

    // While the above warning listener would also be called on
    // unhandledRejection warnings, we can give a better error message if we
    // handle them separately:
    process.on('unhandledRejection', function (reason) {
      console.error('Unhandled Promise rejection detected:');
      console.error();
      console.error(reason);
      console.error();
      console.error('Terminating process...');
      process.exit(1);
    });
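To see which rejections actually reach a listener like this, here is a minimal sketch. It records the reasons instead of exiting so it is safe to run; `trackUnhandledRejections` is a hypothetical helper name and is not part of Kibana.

```javascript
// Record unhandled rejections instead of exiting, so the sketch can be run
// safely. `trackUnhandledRejections` is a hypothetical helper name.
function trackUnhandledRejections() {
  const reasons = [];
  const listener = (reason) => reasons.push(reason);
  process.on('unhandledRejection', listener);
  return {
    reasons,
    stop: () => process.off('unhandledRejection', listener),
  };
}

const tracker = trackUnhandledRejections();

// Handled: a catch is attached, so no 'unhandledRejection' event is emitted.
Promise.reject(new Error('handled')).catch(() => {});

// Unhandled: nothing ever consumes this rejection.
Promise.reject(new Error('not handled'));

// Give Node a chance to emit the event, then inspect what was recorded.
setImmediate(() => {
  console.log(tracker.reasons.map((reason) => reason.message));
  tracker.stop();
});
```

Only the second rejection is recorded, which is why a catch attached at the emit call site changes where, and whether, the error is reported.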

> does the current audit logging implementation go through this mechanism?

I'm actually not sure. @thomheymann Do you know if audit logging uses the same logger class as the rest of Kibana?

mshustov · 2020-12-02

> I'm probably missing something, but how does crashing the process make debugging this easier for the user?

Kibana could stop logging messages at any time due to an error in the logging system. Should it crash later (for example, due to a lack of disk space), users won't be able to diagnose the problem without logs.

> That said, does the current audit logging implementation go through this mechanism? If so, we may want a crash here if we're unable to write audit log events.

Yes, it does.

> Even if not, I also lean towards just leaving this as the current behavior until we remove legacy logging altogether in 8.0 and make the behavior change then.

Ideally, we should initiate a graceful shutdown for such cases, but right now the core doesn't provide such functionality.

watson · 2020-12-02

@restrry So does that mean that, until we have a graceful shutdown mechanism, the current approach in this PR is OK to merge?

mshustov · 2020-12-03

@watson Yes, I think so

watson · 2020-12-02T22:00:38Z

@elasticmachine merge upstream

watson · 2020-12-02T22:46:25Z

@elasticmachine merge upstream

kibanamachine · 2020-12-03T00:26:17Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request
Commit: 88de16e

Metrics [docs]

✅ unchanged

History

💚 Build #91541 succeeded 82350ed
💚 Build #90854 succeeded aab7e9a

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Catch @hapi/podium errors

aab7e9a

watson added v8.0.0 release_note:skip Skip the PR/issue when compiling release notes v7.11.0 labels Nov 30, 2020

watson self-assigned this Nov 30, 2020

watson requested a review from a team as a code owner November 30, 2020 19:25

pgayvallet reviewed Dec 1, 2020

mshustov approved these changes Dec 1, 2020

Merge branch 'master' into catch-podium-errors

82350ed

Merge branch 'master' into catch-podium-errors

88de16e

watson merged commit 770a005 into elastic:master Dec 3, 2020

watson deleted the catch-podium-errors branch December 3, 2020 08:33

watson mentioned this pull request Dec 3, 2020

[7.x] Catch @hapi/podium errors (#84575) #84867

Merged

watson added a commit that referenced this pull request Dec 3, 2020

Catch @hapi/podium errors (#84575) (#84867)

a4cad77
