[Reporting] Add logging of CPU usage and memory consumption by Chromium in the reporting. #99109

dokmic · 2021-05-03T19:37:25Z

Summary

Add logging of CPU usage and memory consumption by Chromium in the reporting.

Memory consumption is equal to the JavaScript heap size at the end of the screenshot transaction. That shows the process memory consumption proportionally, and therefore, can be used as historical data. Unfortunately, there is no other reliable way to get the metric because the Memory Domain is an experimental feature.

CPU usage is evaluated by the following formula Process Time / Actual Time / Number of Virtual Cores. Process time and actual time are vectors, so their division will give average CPU usage for a certain range. The ProcessTime can be higher than ActualTime on multicore systems, so it has to be divided by the number of virtual cores to get a relative value in the 0..1 range. This approach takes a constant time O(1) and more accurate than a polling mechanism. There is another potential way to get the CPU usage, but it is not available in the current Chromium build anyway.

Those metrics are being logged in percentage and megabytes accordingly. At the same time, they are exposed to APM as a decimal and a number of bytes for the following formatting in Kibana itself.

The metrics are sent to the APM in the custom context and will be available in transaction.custom.cpu and transaction.custom.memory fields. Please mind that the custom context might be empty when sampling probability is less than 1.

Resolves #79793.

Checklist

Unit or functional tests were updated or added to match the most common scenarios

elasticmachine · 2021-05-03T19:37:27Z

Pinging @elastic/kibana-app-services (Team:AppServices)

tsullivan · 2021-05-04T19:57:06Z

x-pack/plugins/reporting/server/browsers/chromium/driver_factory/metrics.ts

+}
+
+function getCpuUsage(start: NormalizedMetrics, end: NormalizedMetrics) {
+  return (end.ProcessTime - start.ProcessTime) / (end.Timestamp - start.Timestamp) / cpus().length;


What is the difference between x.ProcessTime and x.Timestamp?

ProcessTime is the amount of time for which a CPU was used in the current process (Wikipedia). It is a sum of the time from all the CPU cores that have been used.
Timestamp is just the current time.

Sometimes when the process was in the idle state, the ProcessTime will be less than the actual time the process was running for. Sometimes, when the process uses multiple cores, the ProcessTime will be more than the actual duration.

tsullivan · 2021-05-04T20:12:44Z

x-pack/plugins/reporting/server/browsers/chromium/driver_factory/metrics.ts

+import type { Metrics as PuppeteerMetrics } from 'puppeteer';
+import { cpus } from 'os';
+
+declare module 'puppeteer' {


Was this added because the code returns a different object than is defined by the TS type?

In the puppeter' type definitions they return object for the send method which is too generic and doesn't provide any type safety. The declaration here doesn't change the original signature but adds another override for that method.

graphaelli · 2021-05-04T22:59:11Z

x-pack/plugins/reporting/server/browsers/chromium/driver_factory/index.ts

+              endMetrics
+            );
+
+            apm.setCustomContext({ cpu, memory });


I suspect you'll want to perform range queries, aggregate, etc on this so you'll want to use labels instead - custom context is not indexed while labels are

@graphaelli But labels are equivalent to tags. Do you think it's logically correct to use labels here? It's something we can use aggregation for (like CPU cores), but here we are dealing with metrics.
The data we pass is fully random and distinct. Please correct me if I am wrong, but indexing this will increase the index size. Besides that, I can still filter transactions by the custom context fields.

@dokmic we do want to perform aggregations on the data, which means it has to be indexed. The custom context is not indexed, and we can see the raw data, but in order to get charts of the data, it needs to be indexed. It will increase the index size very negligibly since it's numeric.

tsullivan · 2021-05-05T00:22:23Z

There is another potential way to get the CPU usage, but it is not available in the current Chromium build anyway.

@dokmic I expect this will change after https://github.com/elastic/kibana/issues/90496, but I hope we can merge this PR first

tsullivan · 2021-05-05T00:31:13Z

@dokmic I expected that the changes in this PR would populate cpu and memory numbers somewhere in APM transaction data. Unfortunately, I don't see those numbers anywhere in the APM data yet.

I started a local apm-server and pointed my Kibana dev server to it by adding these lines to the kibana.xpack.dev.yml file:

elastic.apm.active: true
elastic.apm.serverUrl: "http://localhost:8200"

Then I started the dev server using yarn start -- --config=config/kibana.xpack.dev.yml

I ran a few test PNG reports using Sample ECommerce data. In the APM app, I see the transactions that should have the cpu and memory numbers:

The details of those transactions don't have custom Memory or CPU metrics.

I also set up a custom index pattern for apm-*-transaction* and dug through the raw APM data. I'm not seeing the info anywhere.

It doesn't seem like this is related to labels / custom context. I'm not sure if apm.setCustomContext is working when there is not an APM transaction "in scope", if that makes sense. Can you look into it?

dokmic · 2021-05-05T19:43:09Z

@dokmic I expected that the changes in this PR would populate cpu and memory numbers somewhere in APM transaction data. Unfortunately, I don't see those numbers anywhere in the APM data yet.

@tsullivan The custom context might be empty when sampling probability is less than 1. Could you please make sure that transactionSampleRate is 1 in your configuration?

tsullivan · 2021-05-06T18:04:34Z

@dokmic let's also add a metric for the byte size of the PDF that was generated:

--- a/x-pack/plugins/reporting/server/export_types/printable_pdf/lib/generate_pdf.ts
+++ b/x-pack/plugins/reporting/server/export_types/printable_pdf/lib/generate_pdf.ts
@@ -89,7 +89,11 @@ export async function generatePdfObservableFactory(reporting: ReportingCore) {
           tracker.startGetBuffer();
           logger.debug(`Generating PDF Buffer...`);
           buffer = await pdfOutput.getBuffer();
-          logger.debug(`PDF buffer byte length: ${buffer?.byteLength || 0}`);
+          {
+            const byteLength = buffer?.byteLength || 0;
+            logger.debug(`PDF buffer byte length: ${byteLength}`);
+            tracker.setByteLength(byteLength);
+          }
           tracker.endGetBuffer();
         } catch (err) {
           logger.error(`Could not generate the PDF buffer!`);
--- a/x-pack/plugins/reporting/server/export_types/printable_pdf/lib/tracker.ts
+++ b/x-pack/plugins/reporting/server/export_types/printable_pdf/lib/tracker.ts
@@ -8,6 +8,7 @@
 import apm from 'elastic-apm-node';

 interface PdfTracker {
+  setByteLength: (byteLength: number) => void;
   startLayout: () => void;
   endLayout: () => void;
   startScreenshots: () => void;
@@ -77,6 +78,9 @@ export function getTracker(): PdfTracker {
     endGetBuffer() {
       if (apmGetBuffer) apmGetBuffer.end();
     },
+    setByteLength(byteLength: number) {
+      apmTrans?.setLabel('byte_length', byteLength);
+    },
     end() {
       if (apmTrans) apmTrans.end();
     },

And also something similar for PNG.

tsullivan

LGTM

The main goal is to see memory consumption while generating a PDF report, and this PR achieves it.

I see a few issues that are lower priority

On my workstation, our ProcessTime metric is consistently -0.01. The overal calculated CPU utilization is 0. I'm not sure if that means we need to revisit the CPU calculation.
I see the ByteSize, CPU and Memory labels on reports that go through generate_pdf. For PNG reports which go through generate_png, I only see ByteSize, not CPU and Memory. This should eventually be fixed, but I am OK with having this PR merged without that fix.

x-pack/plugins/reporting/server/export_types/png/lib/generate_png.ts

kibanamachine · 2021-05-19T20:58:45Z

💚 Build Succeeded

Metrics [docs]

Unknown metric groups

References to deprecated APIs

id	before	after	diff
`canvas`	29	25	-4
`crossClusterReplication`	8	6	-2
`fleet`	22	20	-2
`globalSearch`	4	2	-2
`indexManagement`	12	7	-5
`infra`	261	149	-112
`lens`	67	45	-22
`licensing`	18	15	-3
`maps`	286	208	-78
`ml`	121	115	-6
`monitoring`	109	56	-53
`stackAlerts`	101	95	-6
total			-295

History

💔 Build #126601 failed 2365ff34c3bcaa3af5548aedab7fdaad8b41dc67
💚 Build #126433 succeeded ab7ff4af918ed8f8d45dbec4e95375f0d13688ab
💔 Build #126262 failed 29fdadc70321ef256127af2fe4ef8fa9c464168f
💚 Build #125624 succeeded 31357a72c1cb955544f18fb56e0642642b7ef23d
💔 Build #125585 failed 6f2dbbd6db8b7077d250cae042f1e285972ddd56

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

…um (elastic#99109) * Add logging of CPU usage by chromium * Add logging of memory consumption by chromium * Add PDF report byte length logging * Add PNG report byte length logging # Conflicts: # x-pack/plugins/reporting/server/browsers/chromium/driver_factory/index.ts

…um (#99109) (#100384) * Add logging of CPU usage by chromium * Add logging of memory consumption by chromium * Add PDF report byte length logging * Add PNG report byte length logging # Conflicts: # x-pack/plugins/reporting/server/browsers/chromium/driver_factory/index.ts

…um (elastic#99109) * Add logging of CPU usage by chromium * Add logging of memory consumption by chromium * Add PDF report byte length logging * Add PNG report byte length logging

dokmic added review (Deprecated) Feature:Reporting Use Reporting:Screenshot, Reporting:CSV, or Reporting:Framework instead v8.0.0 Team:AppServices release_note:skip Skip the PR/issue when compiling release notes v7.14.0 labels May 3, 2021

dokmic requested a review from tsullivan May 3, 2021 19:37

dokmic force-pushed the feature/79793 branch from f993a19 to f9e0ea1 Compare May 4, 2021 08:52

tsullivan reviewed May 4, 2021

View reviewed changes

graphaelli reviewed May 4, 2021

View reviewed changes

dokmic force-pushed the feature/79793 branch 2 times, most recently from 6f2dbbd to 31357a7 Compare May 12, 2021 16:17

tsullivan approved these changes May 12, 2021

View reviewed changes

tsullivan reviewed May 12, 2021

View reviewed changes

x-pack/plugins/reporting/server/export_types/png/lib/generate_png.ts Outdated Show resolved Hide resolved

dokmic force-pushed the feature/79793 branch 2 times, most recently from 29fdadc to ab7ff4a Compare May 18, 2021 18:39

dokmic and others added 6 commits May 19, 2021 21:12

Add logging of CPU usage by chromium

c2c1838

Add logging of memory consumption by chromium

665e37d

Add PDF report byte length logging

920f864

Add PNG report byte length logging

e45faa5

Fix APM transactions in PNG job to gather all metrics in one transaction

3f8ce24

guard access to methods on variables that depend on initialization

4d7f9f2

dokmic force-pushed the feature/79793 branch from 2365ff3 to 4d7f9f2 Compare May 19, 2021 19:12

dokmic merged commit cb0c584 into elastic:master May 20, 2021

dokmic deleted the feature/79793 branch May 20, 2021 09:51

dokmic mentioned this pull request May 20, 2021

[7.x] [Reporting] Add logging of CPU usage and memory consumption by Chromium (#99109) #100384

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Reporting] Add logging of CPU usage and memory consumption by Chromium in the reporting. #99109

[Reporting] Add logging of CPU usage and memory consumption by Chromium in the reporting. #99109

dokmic commented May 3, 2021 •

edited

Loading

elasticmachine commented May 3, 2021

tsullivan May 4, 2021

dokmic May 5, 2021

tsullivan May 4, 2021

dokmic May 5, 2021

graphaelli May 4, 2021

dokmic May 5, 2021

tsullivan May 5, 2021

tsullivan commented May 5, 2021

tsullivan commented May 5, 2021

dokmic commented May 5, 2021

tsullivan commented May 6, 2021

tsullivan left a comment

kibanamachine commented May 19, 2021

References to deprecated APIs

[Reporting] Add logging of CPU usage and memory consumption by Chromium in the reporting. #99109

[Reporting] Add logging of CPU usage and memory consumption by Chromium in the reporting. #99109

Conversation

dokmic commented May 3, 2021 • edited Loading

Summary

Checklist

elasticmachine commented May 3, 2021

tsullivan May 4, 2021

Choose a reason for hiding this comment

dokmic May 5, 2021

Choose a reason for hiding this comment

tsullivan May 4, 2021

Choose a reason for hiding this comment

dokmic May 5, 2021

Choose a reason for hiding this comment

graphaelli May 4, 2021

Choose a reason for hiding this comment

dokmic May 5, 2021

Choose a reason for hiding this comment

tsullivan May 5, 2021

Choose a reason for hiding this comment

tsullivan commented May 5, 2021

tsullivan commented May 5, 2021

dokmic commented May 5, 2021

tsullivan commented May 6, 2021

tsullivan left a comment

Choose a reason for hiding this comment

kibanamachine commented May 19, 2021

💚 Build Succeeded

Metrics [docs]

References to deprecated APIs

History

dokmic commented May 3, 2021 •

edited

Loading