Control http-request transactions of APM RUM agent manually #129624

Closed
mshustov opened this issue Apr 6, 2022 · 9 comments
Labels
impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

@mshustov
Contributor

mshustov commented Apr 6, 2022

The Performance WG relies on the APM agent to capture traffic between the Kibana browser app and the Kibana server. While analyzing the APM transaction tree, we noticed that some AJAX requests or static asset loads may be marked as children of other AJAX requests.
[Screenshot: 2022-04-04_14-45-46]
@vigneshshanmugam explained that this is normal behavior for the RUM agent and suggested using manual instrumentation instead of the automatic one.

If there are multiple API calls, in this example /bsearch and /data, the RUM agent creates a transaction with the first API call and groups them together, since they happen without much delay.
The automatic transaction tries to capture most outgoing network requests within a single transaction. If we need to control this, manual instrumentation is required.

@devcorpio suggested disabling automatic instrumentation for AJAX requests with the disableInstrumentations config option.

disableInstrumentations: ['xmlhttprequest', 'fetch']
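A minimal sketch of where that option could live, assuming the agent is configured through an options object passed to init() from @elastic/apm-rum. The helper name buildRumConfig and the serviceName and serverUrl values are placeholders for illustration, not settings confirmed in this thread.

```javascript
// Sketch: build RUM agent options that turn off automatic AJAX instrumentation.
// `buildRumConfig` is a hypothetical helper; `serviceName` and `serverUrl`
// hold placeholder values.
function buildRumConfig(overrides = {}) {
  return {
    serviceName: 'kibana-frontend',
    serverUrl: 'http://localhost:8200',
    // Stop the agent from instrumenting XMLHttpRequest and fetch automatically
    disableInstrumentations: ['xmlhttprequest', 'fetch'],
    ...overrides,
  };
}

// In an app that bundles the agent, the result would be passed to init():
//   import { init as initApm } from '@elastic/apm-rum';
//   const apm = initApm(buildRumConfig());
const config = buildRumConfig();
```

Other automatic instrumentations (for example page-load) are left enabled here, so only AJAX requests would need manual handling.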

To get the correct transaction tree, we need to:

  • opt out of automatic instrumentation for AJAX requests
  • use manual instrumentation for the AJAX requests. @vigneshshanmugam @devcorpio Do you have any examples of this?
@mshustov mshustov added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Apr 6, 2022
@elasticmachine
Contributor

Pinging @elastic/kibana-core (Team:Core)

@devcorpio
Contributor

Hi @mshustov,

In the part of the code where you perform the HTTP requests, you can do something like:

function fetchSomethingSomething(somethingId) {
  // `apm` is the initialized RUM agent; `api.Fetch` stands in for your HTTP client
  const transaction = apm.startTransaction('fetchSomethingSomething', 'http-request')
  const httpSpan = transaction.startSpan('GET(for instance) **yourendpoint**', 'external.http')

  return api.Fetch(
    `**yourendpoint**`
  ).then((resp) => {
    return resp // pass the response on to the caller
  }).catch((err) => {
    apm.captureError(err)
    throw err
  }).finally(() => {
    httpSpan.end() // end span here, on success or failure
    transaction.end() // end transaction here
  })
}

In this link there is another example
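If many call sites need the same pattern, it could be factored into a small wrapper. The following is a sketch only: withHttpTransaction is a hypothetical helper name, and the agent is passed in as a parameter so the snippet stays self-contained. It assumes startTransaction may return undefined (for example when the agent is inactive), which is why optional chaining is used.

```javascript
// Sketch of a reusable wrapper; `withHttpTransaction` is a hypothetical name.
// `apm` is expected to expose startTransaction() and captureError(),
// as the RUM agent does.
async function withHttpTransaction(apm, name, spanName, request) {
  const transaction = apm.startTransaction(name, 'http-request');
  // Guard against an undefined transaction (assumed: agent inactive)
  const span = transaction?.startSpan(spanName, 'external.http');
  try {
    return await request();
  } catch (err) {
    apm.captureError(err);
    throw err; // let the caller handle the failure as before
  } finally {
    // Always close the span and the transaction, on success or failure
    span?.end();
    transaction?.end();
  }
}

// Hypothetical usage:
//   const data = await withHttpTransaction(apm, 'fetchSomething', 'GET /api/something',
//     () => fetch('/api/something').then((r) => r.json()));
```

Ending the span and the transaction in a finally block keeps failed requests from leaving a transaction open.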

By the way, if an HTTP request causes the browser to load a new portion of HTML (which includes resources such as CSS, etc.) and you want the agent to include timing information and spans for it in the transaction, you should create the transaction with the managed flag set to true, and the agent will handle it automatically. We call this a "custom managed transaction"; more info here

Thanks,
Alberto

@lizozom
Contributor

lizozom commented Apr 7, 2022

If we did this, would we still be able to measure page-load and app-load times? Or would static assets be completely excluded?

@mshustov
Contributor Author

mshustov commented Apr 7, 2022

Or would static assets be completely excluded?

Assets loaded by JS (for example, bundle chunks loaded by webpack) will be excluded. We will have to control page-load and app-load manually to keep the relationships.

@lizozom
Contributor

lizozom commented Apr 7, 2022

Actually, we already control the page-load transaction ourselves.
I have to admit that it's a bit odd that we have to disable the entire automatic instrumentation at this point.
Maybe we're trying to use APM in a way that we shouldn't be? Maybe the agent needs to be fine-tuned in some way? 🤔

@vigneshshanmugam
Member

If the intention here is to separate the API calls between client and server into their own transactions, then a manual transaction is the only way to do it, because the RUM agent tries to group all of them under a specific user activity whenever it can.

Or would static assets be completely excluded?

We can capture all the network requests associated with the transaction even when we use manual instrumentation. Manual instrumentation gives us the ability to start and end a transaction at arbitrary points, and also the option to capture the underlying network requests.

const transaction = apm.startTransaction('custom managed', 'custom', { managed: true })

// later 
transaction.end()

Let us know if this works.
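As a rough sketch of how "start now, end later" could be packaged for an app-load measurement: beginAppLoad is a hypothetical helper, the 'app-change' type is borrowed from the example trace discussed in this thread, and the agent is passed in as a parameter. Only startTransaction with the managed option and transaction.end() are used from the agent.

```javascript
// Sketch: start a managed transaction and hand back a function that ends it.
// `beginAppLoad` is a hypothetical helper name.
function beginAppLoad(apm, appName) {
  const transaction = apm.startTransaction(appName, 'app-change', { managed: true });
  let ended = false;
  return function endAppLoad() {
    if (ended) return; // make repeated calls harmless
    ended = true;
    transaction?.end(); // transaction may be undefined if the agent is inactive (assumption)
  };
}

// Hypothetical usage:
//   const endAppLoad = beginAppLoad(apm, 'discover');
//   ... mount the app, wait for it to render ...
//   endAppLoad();
```

Returning a closure keeps the transaction object private to the call site that started it, so it cannot be ended twice by accident.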

@vigneshshanmugam
Member

vigneshshanmugam commented Apr 11, 2022

Had a discussion with Liza; posting back from the Slack thread - https://elastic.slack.com/archives/C017DFNCV5H/p1649692286349239?thread_ts=1649167476.656889&cid=C017DFNCV5H

Are we looking to separate the request transactions even on the client side? What does that give us? Most applications prefer to group everything, so that we can tie together everything the users are doing in a particular session/journey.
If this is a use case that we have missed, we would like to solve it on the agent side.

Before we jump into solutions, let's understand the intention, because we might be overthinking how transactions in the browser and the Kibana server work. In the RUM agent, everything is tied to a specific user activity. "Transaction" is just an APM term used to group those activities, so that we can get away without building a dedicated UI specific to RUM. In the backend agents, these are different entities.

There are multiple phases in the frontend:

  1. The user loads the web page and waits for all the static resources until page load is fired. This is grouped as the page-load transaction.
  2. After page load, if there is no application switch (a routing change in Kibana terms) and there is network activity triggered by a user click event or an API call, the agent creates transactions such as click or http-request.
  3. When the user switches from one app to another (e.g. /discover to /apm), the activity is grouped as route-change, and all the underlying network activity is added as well.

Let's take this example trace - https://kibana-ops-e2e-perf.kb.us-central1.gcp.cloud.es.io:9243/app/apm/services/kibana-frontend/transactions/view?kuery=c6dde9b70e26c84f&rangeFrom=now-7d%2Fd&rangeTo=now&environment=ENVIRONMENT_ALL&transactionName=security_login&transactionType=app-change&comparisonEnabled=true&comparisonType=week&latencyAggregationType=avg&transactionId=c6dde9b70e26c84f&traceId=2e092905071d280d3355258f2be2cb85&flyoutDetailTab=&waterfallItemId=

The grouping is done at the app-change transaction, and the associated calls to the two APIs /api/licensing/info and /app/banners/info are added as children, since they belong to a single user activity that was triggered by landing on the login page.

If the intention is to separate all the outgoing API calls into individual transactions, then we are using the RUM agent in the wrong way for this particular use case.

TLDR: In the client, we need to consider everything as a User Activity/Session instead of transactions or traces.

@lizozom
Contributor

lizozom commented Apr 14, 2022

@mshustov can we close this issue?
I think we can work with @vigneshshanmugam and the ops team to better define what we want to measure, and find a way to build the benchmarks around the data we have.
What do you think?

@lizozom
Contributor

lizozom commented Apr 26, 2022

Closing. We discussed using an approach combining APM traces with #121992 to get the benchmarking we need.

@lizozom lizozom closed this as completed Apr 26, 2022
@exalate-issue-sync exalate-issue-sync bot reopened this Apr 26, 2022
@exalate-issue-sync exalate-issue-sync bot added the impact:needs-assessment Product and/or Engineering needs to evaluate the impact of the change. label Apr 26, 2022