-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handles ns to ms conversion for event loop delay metrics #118447
Handles ns to ms conversion for event loop delay metrics #118447
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self review
const now = Date.now(); | ||
|
||
return { | ||
const mockedRawCollectedDataInNs = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mocks the raw data returned from Node's native API in nanoseconds.
return mockedRawCollectedDataInNs; | ||
} | ||
|
||
function createMockMonitorDataMsHistogram( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create mock data in ms returned from the monitor
|
||
return new MockEventLoopDelaysMonitor(); | ||
} | ||
|
||
export const mocked = { | ||
createHistogram: createMockHistogram, | ||
createHistogram: createMockRawNsDataHistogram, // raw data as received from Node.js perf_hooks.monitorEventLoopDelay([options]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
createHistogram
is used to generate unit test input data that the monitor then converts from ns
-> ms
.
// This ensures that the wiring of the `collect` function is correct. | ||
const mockedHistogram = mocked.createHistogram(); | ||
expect(histogramData).toEqual(mockedHistogram); | ||
|
||
expect(histogramData.min).toEqual(nsToMs(mockedHistogram.min)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't come up with an elegant way to test all of this that was still explicit enough for us to glance over and realize there's a unit conversion taking place.
Since the units already threw us once, I'm erring on the side of caution here.
* Converts time metric from ns to ms | ||
**/ | ||
export function nsToMs(metric: number) { | ||
return moment.duration(metric / ONE_MILLISECOND_AS_NANOSECONDS).asMilliseconds(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm torn between using moment
here and just treating the number
as a number
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use the native API whenever possible. I'd remove moment
usage as it doesn't bring any value.
mean, | ||
exceeds, | ||
stddev, | ||
const collectedDataMs: IntervalHistogram = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation ended up being the easiest to read.
@@ -32,7 +32,7 @@ export const MONITOR_EVENT_LOOP_DELAYS_RESET = 24 * 60 * 60 * 1000; | |||
export const MONITOR_EVENT_LOOP_DELAYS_START = 1 * 60 * 1000; | |||
|
|||
/** | |||
* Mean event loop delay threshold for logging a warning. | |||
* Mean event loop delay threshold in ms for logging a warning. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the unit here as it wasn't explicit.
@@ -59,19 +59,19 @@ export const eventLoopDelaysUsageSchema: MakeSchemaFrom<EventLoopDelaysUsageRepo | |||
min: { | |||
type: 'long', | |||
_meta: { | |||
description: 'The minimum recorded event loop delay.', | |||
description: 'The minimum recorded event loop delay in ms.', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding units to telemetry field descriptions helps!
const meanDurationMs = moment | ||
.duration(mean / ONE_MILLISECOND_AS_NANOSECONDS) | ||
.asMilliseconds(); | ||
const { mean: meanMS } = eventLoopDelaysMonitor.collect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The data's already in ms
by now.
@elasticmachine merge upstream |
@elasticmachine merge upstream |
Pinging @elastic/kibana-core (Team:Core) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/** | ||
* Nanosecond to milisecond conversion unit | ||
*/ | ||
export const ONE_MILLISECOND_AS_NANOSECONDS = 1000000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
export const ONE_MILLISECOND_AS_NANOSECONDS = 1000000; | |
export const ONE_MILLISECOND_AS_NANOSECONDS = 1_000_000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely easier to parse by eye. Nice suggestion!
* Converts time metric from ns to ms | ||
**/ | ||
export function nsToMs(metric: number) { | ||
return moment.duration(metric / ONE_MILLISECOND_AS_NANOSECONDS).asMilliseconds(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use the native API whenever possible. I'd remove moment
usage as it doesn't bring any value.
}, | ||
}; | ||
|
||
this.loopMonitor.enable(); | ||
return collectedData; | ||
return collectedDataMs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: not all data in the returned object are in milliseconds.
💚 Build Succeeded
Metrics [docs]
History
To update your PR or re-run it, just comment with: |
) Co-authored-by: Kibana Machine <[email protected]>
) Co-authored-by: Kibana Machine <[email protected]>
…118612) Co-authored-by: Kibana Machine <[email protected]> Co-authored-by: Christiane (Tina) Heiligers <[email protected]>
…118613) Co-authored-by: Kibana Machine <[email protected]> Co-authored-by: Christiane (Tina) Heiligers <[email protected]>
…igrate-away-from-injected-css-js * 'main' of github.com:elastic/kibana: (221 commits) [Reporting] Add log level to config (elastic#118319) [Metrics UI] Skip failing waffle chart color palette test (elastic#118527) [APM] chore: Unify naming of 'apm/scripts/**/*' with snake_case (elastic#118328) skip flaky suites (elastic#99581, elastic#118356, elastic#118474) [Alerting] Initial implementation of alerting task `cancel()` (elastic#114289) chore(NA): creates pkg_npm_types bazel rule (elastic#116465) skip flaky suite (elastic#116892) Bump chromedriver to 95.0.0 (elastic#116724) [Data visualizer] Improve design of expanded rows (elastic#118125) [APM] prefer ECS field names for HTTP and URL (elastic#118485) Update query_debugging_in_development_and_production.md (elastic#118491) [Uptime] adjust Elastic Synthetics integration functional tests (elastic#118163) [kbn/rule-data-utils] add submodules and require public use them (elastic#117963) [DOCS] Refresh APM correlation screenshots (elastic#116723) (elastic#118577) Handles ns to ms conversion for event loop delay metrics (elastic#118447) [Cases] Rename functional tests folder (elastic#118490) dummy commit skip flaky suite (elastic#118593) Improve workpad schema validation (elastic#97838) skip flaky suite (elastic#118584) ... # Conflicts: # src/plugins/dashboard/public/application/embeddable/viewport/dashboard_viewport.tsx
) Co-authored-by: Kibana Machine <[email protected]>
Co-authored-by: Kibana Machine <[email protected]>
Resolves #116778
How this PR fixes the issue:
This PR converts the raw data in core's
EventLoopDelayMonitor
fromns
toms
and reports the data inms
. We source the data from Node's nativemonitorEventLoopDelay
histogram, that is inns
.How was the bug introduced?
Event loop delays collection was moved from the kibana_usage_collection plugin to core. There were several other changes that resulted from this move, including changing the process event_loop_delay to read from the mean of the histogram which uses nanoseconds rather than milliseconds.
Monitoring gets kibana metrics from the status endpoint, which includes core’s metrics.
On setup, monitoring initializes the bulkUploader with metrics from core and gets the kibana stats from there.
How was the metric calculated before?
Prior to this change, we were using
bench.elapsed
from hapi/hoek.Reference for
bench.elapsed
internals: https://github.com/hapijs/hoek/blob/master/lib/bench.js#L19-L22.Screenshots
Actual numbers from these screen shots will differ due to environment. Review relative orders of magnitude between Event loop delay.
7-15 stack deployment (cloud-staging)
![cloud_7_15_stack_event_loop_delay](https://user-images.githubusercontent.com/11909450/141694663-4cd673d0-4bbe-490e-8c5f-02ecb956b2fe.png)
![Cloud_staging_7-16-monitoring-UI](https://user-images.githubusercontent.com/11909450/141694684-02e878a0-df21-4c9e-a96d-22821681c164.png)
![post_fix_local_8-1](https://user-images.githubusercontent.com/11909450/141694737-2a753d6d-a14b-4445-b0eb-3020fdf6d3ba.png)
7-16 stack deployment (cloud-staging)
** This PR (running locally) **
PR deployed on cloud here. Use the same auth creds as running this PR locally.
Checklist
Risk Matrix
event_loop_delay metrics
reported incorrectly.Release Note
Fixes bug in
event_loop_delay
metrics units.