Replicate transaction ignore pattern for spans #1127

Closed
eecarres opened this issue May 3, 2021 · 3 comments

eecarres commented May 3, 2021

Is your feature request related to a problem? Please describe.
We currently use APM for several of our web services. To keep a single stack and observability tool, we also introduced APM into a heavy image-processing pipeline: each job is a transaction, and each stage of the pipeline is a span (so everything is linked and the data makes sense). The problem is the automatic span discovery the agent performs, which we haven't found any way to disable. It produces transactions with a mean of roughly 40K spans, and we can't afford to discard the "important" spans (the ones we capture manually) because that would make our data incomplete and the dashboards would show misleading information. So far we're working around the issue by raising ELASTIC_APM_TRANSACTION_MAX_SPANS to 60K and using ELASTIC_APM_SPAN_FRAMES_MIN_DURATION to reduce the amount of data saved for uninteresting spans. Ideally, though, we would just ignore almost all of the spans, or have a way of selecting which ones we want.
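Roughly, the workaround looks like this when expressed as inline client config instead of the equivalent environment variables (the service name and values are illustrative, not our real ones):

```python
import elasticapm

# Sketch of the current workaround: raise the span cap and only collect
# stack frames for slower spans. Values are illustrative.
client = elasticapm.Client(
    service_name="image-pipeline",       # hypothetical service name
    transaction_max_spans=60000,         # ELASTIC_APM_TRANSACTION_MAX_SPANS
    span_frames_min_duration="100ms",    # ELASTIC_APM_SPAN_FRAMES_MIN_DURATION
)
```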

Describe the solution you'd like
Ideally, a config parameter similar to ELASTIC_APM_TRANSACTIONS_IGNORE_PATTERNS but for spans, or a way of capturing only the spans we define ourselves ("manually captured" with the capture_span method).
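For context, the manual capture we'd like to keep is just elasticapm.capture_span, roughly like this (stage names and span_type are illustrative):

```python
import elasticapm

# Hypothetical pipeline stages; only these manually captured spans
# are the ones we actually care about.
@elasticapm.capture_span("resize-image", span_type="pipeline")
def resize_stage(image):
    ...  # stage implementation elided

def metadata_stage(image):
    # Context-manager form of the same API.
    with elasticapm.capture_span("extract-metadata", span_type="pipeline"):
        ...  # stage implementation elided
```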

Describe alternatives you've considered
We've considered overriding the Client class and hacking it to disable this automatic span collection, but we don't like that idea, so for now we set the maximum number of spans to 60K and regularly clean up the ones we're not interested in.

felixbarny (Member) commented

We're currently thinking about how to make the handling of huge traces more efficient.

> It produces transactions with a mean of roughly 40K spans, and we can't afford to discard the "important" spans (the ones we capture manually) because that would make our data incomplete and the dashboards would show misleading information.

What kind of spans are these? Are these calls to Redis? Or maybe calls to a relational DB? If so, is the statement always the same or do they differ? What's the typical duration of these spans?

We plan to de-duplicate DB spans that have the same statement, and to group fast spans with different statements to the same database. Would that work for you?

The Python agent has a concept of processors. You could implement a processor to drop uninteresting spans based on their type, subtype, or duration.
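A minimal sketch of such a processor, assuming the agent's for_events decorator and SPAN constant; the allow-list of span names is illustrative:

```python
from elasticapm.conf.constants import SPAN
from elasticapm.processors import for_events

# Spans captured manually for the pipeline stages (names illustrative).
KEEP_SPAN_NAMES = {"resize-image", "extract-metadata"}

@for_events(SPAN)
def drop_uninteresting_spans(client, event):
    # Returning a falsy value drops the span; anything else is kept.
    if event.get("name") in KEEP_SPAN_NAMES:
        return event
    return None
```

If you register it by overriding the processors config option, keep the default sanitizing processors in the list as well.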

eecarres (Author) commented May 3, 2021

It's a mixture of web requests to our own servers and direct DB calls. The statements are usually the same, repeated once for each image we need to fetch information about or update. I'd say 95% of them take under one second, but the whole pipeline (the transaction, in our case) can last two days or even more.

The solutions you're working on sound promising! I'd prefer to simply remove all of these spans, but that may not be aligned with your vision. Either way, it should be much better.

And many thanks for pointing me to processors; I'll try implementing one to solve the issue!

basepi (Contributor) commented Jun 10, 2021

I'm going to close this for now, banking on elastic/apm#432 being the best generalized solution for this type of problem. Hopefully you found processors helpful in the meantime!

basepi closed this as completed on Jun 10, 2021