-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delay Single User Journey Finish #129991
Delay Single User Journey Finish #129991
Conversation
…ons are uploaded to apm
@@ -166,6 +168,7 @@ export class PerformanceTestingService extends FtrService { | |||
|
|||
private async tearDown(page: Page, client: CDPSession, context: BrowserContext) { | |||
if (page) { | |||
await sleep(5000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think we can simply use await page.waitForTimeout(5000);
?
@@ -166,6 +168,7 @@ export class PerformanceTestingService extends FtrService { | |||
|
|||
private async tearDown(page: Page, client: CDPSession, context: BrowserContext) { | |||
if (page) { | |||
await sleep(5000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Could you add a comment why it's necessary? With a link to flushInterval config option.
- Let's maybe configure
flushInterval
explicitly to avoid relying on the default value? Having it configured, we can set the delay slightly above theflushInterval
value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not using rum agent here and we don't have flushInterval on node agent. Ref: https://www.elastic.co/guide/en/apm/agent/nodejs/current/agent-api.html#apm-flush. But I will add relevant comment to explain why do we use flush
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don’t use the RUM agent in this file, but we can configure flushInterval
for the RUM agent on Kibana client-side
kibana/x-pack/test/performance/config.playwright.ts
Lines 43 to 65 in 34d5ef8
env: { | |
ELASTIC_APM_ACTIVE: process.env.ELASTIC_APM_ACTIVE, | |
ELASTIC_APM_CONTEXT_PROPAGATION_ONLY: 'false', | |
ELASTIC_APM_ENVIRONMENT: process.env.CI ? 'ci' : 'development', | |
ELASTIC_APM_TRANSACTION_SAMPLE_RATE: '1.0', | |
ELASTIC_APM_SERVER_URL: APM_SERVER_URL, | |
ELASTIC_APM_SECRET_TOKEN: APM_PUBLIC_TOKEN, | |
// capture request body for both errors and request transactions | |
// https://www.elastic.co/guide/en/apm/agent/nodejs/current/configuration.html#capture-body | |
ELASTIC_APM_CAPTURE_BODY: 'all', | |
// capture request headers | |
// https://www.elastic.co/guide/en/apm/agent/nodejs/current/configuration.html#capture-headers | |
ELASTIC_APM_CAPTURE_HEADERS: true, | |
// request body with bigger size will be trimmed. | |
// 300_000 is the default of the APM server. | |
// for a body with larger size, we might need to reconfigure the APM server to increase the limit. | |
// https://www.elastic.co/guide/en/apm/agent/nodejs/current/configuration.html#long-field-max-length | |
ELASTIC_APM_LONG_FIELD_MAX_LENGTH: 300_000, | |
ELASTIC_APM_GLOBAL_LABELS: Object.entries({ | |
ftrConfig: `x-pack/test/performance/tests/config.playwright`, | |
performancePhase: process.env.TEST_PERFORMANCE_PHASE, | |
journeyName: process.env.JOURNEY_NAME, | |
testJobId, |
and then use the same / a bit longer delay in the
tearDown
step.Did I miss anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, I liked Vighesh' proposal from better
https://github.com/elastic/kibana/pull/129991/files/024d9aad25fe7a7c9cb1f63605e3d4d69ef6321f#r853962293
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- That's still the test runners config and I don't know if we are able to provide env variable to change flush interval.
- I also liked that idea however but I'm not sure about this scenario: let's say there are multiple apm events are being sent it's going to close the page when it gets the first response which is not we want ? @mshustov @vigneshshanmugam maybe it doesn't work like that ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also liked that idea however but I'm not sure about this scenario: let's say there are multiple apm events are being sent it's going to close the page when it gets the first response which is not we want ?
It might happen in theory. We can extend this logic to count the number of outcoming and responded requests.
this.rumRequestsInFlight = []
await page.route('**', async (route, request) => {
if(isRumReportingRequest(route)) {
rumRequestsInFlight.push(request.response);
});
function tearDown() {
// wait for the one more reporting call
await page.waitForResponse((response) => response.url().includes('/rum/events'));
// make sure all the requests were finished
await Promise.all(rumRequestsInFlight);
@vigneshshanmugam is there API to disable metrics collection by the RUM agent and enforce it to report already collected events?
maybe it doesn't work like that ?
@suchcodemuchwow how would you suggest this problem be solved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's say there are multiple apm events are being sent it's going to close the page when it gets the first response which is not we want ?
Agreed, if there are multiple transactions in the page within the flush interval, we would need extra guard to count for the no of possible APM server calls from the agent.
The better way to fix this would be
- (Recommended) - Update the RUM agent to the latest version which makes sure the
events queue
in the agent is flushed before the users leaves the page and we dont need any workarounds here. elastic/apm-agent-rum-js@2429814 - For the time being, I think counting strategy based on the api server calls could work.
- (Hacky workaround) - Have not tested it though.
page.evaluate(() => {
const apmServer = window.elasticApm.serviceFactory.getService('ApmServer')
apmServer.queue.flush()
});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Recommended) - Update the RUM agent to the latest version
Opened #130765
💚 Build SucceededMetrics [docs]
History
To update your PR or re-run it, just comment with: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice change, but probably makes sense to have other suggested from APM team in this PR?
apm.flush(); | ||
await client.detach(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vigneshshanmugam suggested to add
await page.waitForResponse((response) => response.url().includes('/rum/events'));
after flush() to make sure the RUM agent response is sent to APM server before browser is killed.
Closing this since we don't need any workaround after updating new rum agent. |
1 similar comment
Closing this since we don't need any workaround after updating new rum agent. |
We have seen some missing events from the single user performance tests and as it's also mentioned in this #129990 issue we want to delay journey finish to guarantee all events reported properly to APM.