[Telemetry] track event loop utilization #103477
Closed
Monitor event loop utilization
The eventLoopUtilization() method returns an object that contains the cumulative duration of time the event loop has been both idle and active as a high resolution milliseconds timer. The utilization value is the calculated Event Loop Utilization (ELU).
ELU is similar to CPU utilization, except that it only measures event loop statistics and not CPU usage.
It represents the percentage of time the event loop has spent outside the event loop's event provider (e.g. epoll_wait). No other CPU idle time is taken into consideration. The following is an example of how a mostly idle process will have a high ELU.
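A minimal sketch of such a script, assuming Node.js with the `perf_hooks` module. The helper name `measureBlockedUtilization` and the 500 ms child timer are illustrative assumptions, not part of this PR:

```javascript
const { performance } = require('perf_hooks');
const { spawnSync } = require('child_process');

// Hypothetical helper: blocks the event loop on a child process that only
// waits on a timer, then returns the ELU delta for that interval.
function measureBlockedUtilization(sleepMs) {
  const before = performance.eventLoopUtilization();
  // The child does no real work, so the CPU stays mostly idle, yet
  // spawnSync() keeps this process's event loop from proceeding.
  spawnSync(process.execPath, ['-e', `setTimeout(() => {}, ${sleepMs})`]);
  // Passing the earlier reading returns the delta since that reading.
  return performance.eventLoopUtilization(before);
}

// Measure from inside the loop: until bootstrapping has finished on the
// main thread, the method reports zeros.
setImmediate(() => {
  const { idle, active, utilization } = measureBlockedUtilization(500);
  console.log({ idle, active, utilization });
});
```

The returned object carries `idle`, `active`, and `utilization`. Because the whole interval is spent inside one blocked callback, the utilization should come out close to 1.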
In some cases the CPU is mostly idle while running this script, but the event loop is halted. For example, child_process.spawnSync() blocks the event loop from proceeding while the CPU might be idle during that time.
How is this different from event loop delays
The event loop delay histogram tracks delays in the event loop: how long it takes the event loop to complete a full cycle (checking incoming requests, async functions/callbacks, timeouts, etc.) before it can move on to the next lines of code and handle more incoming requests.
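To make the contrast concrete, here is a rough sketch that records both metrics around a single long synchronous task, assuming Node's `perf_hooks.monitorEventLoopDelay()`. The helpers `blockFor` and `sampleDelayAndUtilization` and all timings are hypothetical and chosen only for illustration:

```javascript
const { monitorEventLoopDelay, performance } = require('perf_hooks');

// Hypothetical helper: busy-waits to simulate synchronous work.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

// Hypothetical helper: resolves with the delay histogram summary and the
// ELU measured over the same window.
function sampleDelayAndUtilization() {
  return new Promise((resolve) => {
    const delays = monitorEventLoopDelay({ resolution: 10 });
    delays.enable();
    setImmediate(() => {
      const eluStart = performance.eventLoopUtilization();
      setTimeout(() => {
        blockFor(100); // one long task on an otherwise idle loop
        setTimeout(() => {
          delays.disable();
          const elu = performance.eventLoopUtilization(eluStart);
          resolve({
            maxDelayMs: delays.max / 1e6, // histogram values are in nanoseconds
            meanDelayMs: delays.mean / 1e6,
            utilization: elu.utilization,
          });
        }, 200);
      }, 50);
    });
  });
}

sampleDelayAndUtilization().then((result) => console.log(result));
```

In a run like this the maximum delay reflects the single blocking task, while the utilization stays well below 1 because the loop is idle for most of the window. The two numbers answer different questions.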
So while tracking delays is critical for checking that Kibana servers are running smoothly, tracking ELU gives another insight into the level of utilization the Kibana servers run at. We can also track increasing and decreasing trends across releases and enabled features. It also gives us a window into how important it is to adopt multi-process execution if we have high ELU.
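One way to produce trend data is to sample ELU over fixed windows instead of reading the cumulative value. The sketch below assumes a hypothetical `EluSampler` class and a 5 second reporting interval; it is not the collector added by this PR:

```javascript
const { performance } = require('perf_hooks');

// Hypothetical sampler: reports utilization per window, not since start.
class EluSampler {
  constructor() {
    this.last = performance.eventLoopUtilization();
  }

  // Returns { idle, active, utilization } for the time since the last call.
  sample() {
    const current = performance.eventLoopUtilization();
    // With two arguments the method returns the delta between the readings.
    const delta = performance.eventLoopUtilization(current, this.last);
    this.last = current;
    return delta;
  }
}

const sampler = new EluSampler();
const timer = setInterval(() => {
  const { utilization } = sampler.sample();
  console.log(`ELU over last window: ${(utilization * 100).toFixed(1)}%`);
}, 5000);
timer.unref(); // do not keep the process alive just for reporting
```

Per-window values are easier to compare across releases than a cumulative figure, which flattens out as uptime grows.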