Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delay Single User Journey Finish #129991

Closed
wants to merge 4 commits into from
Closed
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions x-pack/test/performance/services/performance.ts
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,8 @@ apm.start({
secretToken: 'CTs9y3cvcfq13bQqsB',
});

export const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

export interface StepCtx {
page: Page;
kibanaUrl: string;
Expand Down Expand Up @@ -166,6 +168,7 @@ export class PerformanceTestingService extends FtrService {

private async tearDown(page: Page, client: CDPSession, context: BrowserContext) {
if (page) {
await sleep(5000);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think we can simply use await page.waitForTimeout(5000); ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Could you add a comment why it's necessary? With a link to flushInterval config option.
  2. Let's maybe configure flushInterval explicitly to avoid relying on the default value? Having it configured, we can set the delay slightly above the flushInterval value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not using rum agent here and we don't have flushInterval on node agent. Ref: https://www.elastic.co/guide/en/apm/agent/nodejs/current/agent-api.html#apm-flush. But I will add relevant comment to explain why do we use flush

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don’t use the RUM agent in this file, but we can configure flushInterval for the RUM agent on Kibana client-side

env: {
ELASTIC_APM_ACTIVE: process.env.ELASTIC_APM_ACTIVE,
ELASTIC_APM_CONTEXT_PROPAGATION_ONLY: 'false',
ELASTIC_APM_ENVIRONMENT: process.env.CI ? 'ci' : 'development',
ELASTIC_APM_TRANSACTION_SAMPLE_RATE: '1.0',
ELASTIC_APM_SERVER_URL: APM_SERVER_URL,
ELASTIC_APM_SECRET_TOKEN: APM_PUBLIC_TOKEN,
// capture request body for both errors and request transactions
// https://www.elastic.co/guide/en/apm/agent/nodejs/current/configuration.html#capture-body
ELASTIC_APM_CAPTURE_BODY: 'all',
// capture request headers
// https://www.elastic.co/guide/en/apm/agent/nodejs/current/configuration.html#capture-headers
ELASTIC_APM_CAPTURE_HEADERS: true,
// request body with bigger size will be trimmed.
// 300_000 is the default of the APM server.
// for a body with larger size, we might need to reconfigure the APM server to increase the limit.
// https://www.elastic.co/guide/en/apm/agent/nodejs/current/configuration.html#long-field-max-length
ELASTIC_APM_LONG_FIELD_MAX_LENGTH: 300_000,
ELASTIC_APM_GLOBAL_LABELS: Object.entries({
ftrConfig: `x-pack/test/performance/tests/config.playwright`,
performancePhase: process.env.TEST_PERFORMANCE_PHASE,
journeyName: process.env.JOURNEY_NAME,
testJobId,

and then use the same / a bit longer delay in the tearDown step.
Did I miss anything?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. That's still the test runners config and I don't know if we are able to provide env variable to change flush interval.
  2. I also liked that idea however but I'm not sure about this scenario: let's say there are multiple apm events are being sent it's going to close the page when it gets the first response which is not we want ? @mshustov @vigneshshanmugam maybe it doesn't work like that ?

Copy link
Contributor

@mshustov mshustov Apr 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also liked that idea however but I'm not sure about this scenario: let's say there are multiple apm events are being sent it's going to close the page when it gets the first response which is not we want ?

It might happen in theory. We can extend this logic to count the number of outcoming and responded requests.

this.rumRequestsInFlight = []

await page.route('**', async (route, request) => {
   if(isRumReportingRequest(route)) {
   rumRequestsInFlight.push(request.response);
});

function tearDown() {
  // wait for the one more reporting call
  await page.waitForResponse((response) => response.url().includes('/rum/events'));
  // make sure all the requests were finished
  await Promise.all(rumRequestsInFlight);

@vigneshshanmugam is there API to disable metrics collection by the RUM agent and enforce it to report already collected events?

maybe it doesn't work like that ?

@suchcodemuchwow how would you suggest this problem be solved?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's say there are multiple apm events are being sent it's going to close the page when it gets the first response which is not we want ?
Agreed, if there are multiple transactions in the page within the flush interval, we would need extra guard to count for the no of possible APM server calls from the agent.

The better way to fix this would be

  1. (Recommended) - Update the RUM agent to the latest version which makes sure the events queue in the agent is flushed before the users leaves the page and we dont need any workarounds here. elastic/apm-agent-rum-js@2429814
  2. For the time being, I think counting strategy based on the api server calls could work.
  3. (Hacky workaround) - Have not tested it though.
page.evaluate(() => {
  const apmServer = window.elasticApm.serviceFactory.getService('ApmServer')
  apmServer.queue.flush()
});

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Recommended) - Update the RUM agent to the latest version

Opened #130765

apm.flush();
await client.detach();
Comment on lines 171 to 172
Copy link
Member

@dmlemeshko dmlemeshko Apr 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vigneshshanmugam suggested to add
await page.waitForResponse((response) => response.url().includes('/rum/events')); after flush() to make sure the RUM agent response is sent to APM server before browser is killed.

await page.close();
Expand Down