Replicate transaction ignore pattern for spans #1127

eecarres · 2021-05-03T11:13:21Z

Is your feature request related to a problem? Please describe.
We currently use APM for several of our web services. To keep the same stack and observability tool, we also introduced APM into a heavy image processing pipeline. Each job is a transaction, and each stage of the pipeline is a span (so everything is linked and the data makes sense). The problem comes from the agent's "automatic discovery of spans", which we have found no way to disable. This produces transactions with a mean value of 40K spans, and we can't afford to discard the "important" ones (the ones we manually capture) because that makes our data incomplete and the dashboards show fake information. So far we're "solving" the issue by raising ELASTIC_APM_TRANSACTION_MAX_SPANS to 60K and using ELASTIC_APM_SPAN_FRAMES_MIN_DURATION to reduce the amount of information we store for non-interesting spans. Ideally, though, we would ignore almost all of these spans, or have a way to select which ones we want.

Describe the solution you'd like
Ideally, a config parameter similar to ELASTIC_APM_TRANSACTIONS_IGNORE_PATTERNS, but for spans. Or a way of capturing only the spans we define ourselves (the ones "manually captured" using the capture_span method).

Describe alternatives you've considered
We've considered overriding the Client class and "hacking" it to disable this automatic span capture, but we don't like the idea. So for now we set the maximum number of spans to 60K and regularly clean up the spans we are not interested in.

felixbarny · 2021-05-03T13:29:00Z

We're currently thinking about how to make the handling of huge traces more efficient.

produces transactions with a mean value of 40K spans, and we can't afford to discard the "important" ones (the ones we manually capture) because that makes our data incomplete and the dashboards show fake information.

What kind of spans are these? Are these calls to Redis? Or maybe calls to a relational DB? If so, is the statement always the same or do they differ? What's the typical duration of these spans?

We plan to de-duplicate DB spans with the same statement and group fast spans with different statements to the same database. Would that work for you?

The Python agent has a concept of processors. You could implement a processor to drop non-interesting spans, based on their type, subtype, or duration.
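A minimal sketch of such a processor, under stated assumptions: processors are plain functions taking `(client, event)`, the `for_events` decorator restricts them to given event types, and returning a falsy value drops the event. The span type `"pipeline"`, the 1000 ms threshold, and the function name `drop_noisy_spans` are hypothetical, and the `duration` field is assumed to be in milliseconds. The fallback branch only exists so the sketch runs without the agent installed.

```python
try:
    from elasticapm.conf.constants import SPAN
    from elasticapm.processors import for_events
except ImportError:
    # Fallback so the sketch runs without the agent installed. It mimics
    # the decorator by tagging the function with the event types it
    # applies to.
    SPAN = "span"

    def for_events(*event_types):
        def wrap(func):
            func.event_types = set(event_types)
            return func
        return wrap


# Hypothetical values: adjust these to your own pipeline.
KEEP_TYPES = {"pipeline"}   # span types used for manually captured stages
MIN_DURATION_MS = 1000.0    # other spans are kept only if at least this slow


@for_events(SPAN)
def drop_noisy_spans(client, event):
    """Return the span dict to keep it, or None to drop it."""
    # Always keep spans that were captured on purpose.
    if event.get("type") in KEEP_TYPES:
        return event
    # Keep auto-instrumented spans only when they are slow enough to matter
    # (assumes "duration" is expressed in milliseconds).
    if (event.get("duration") or 0) >= MIN_DURATION_MS:
        return event
    # A falsy return value is expected to make the agent drop the event.
    return None
```

To try it, the manually captured stages would need a recognizable type, for example `capture_span("resize", span_type="pipeline")`, and the processor's dotted path would be added to the agent's `PROCESSORS` setting. Setting that list is likely to replace the default processors, so the defaults may need to be listed again. Processors appear to run when an event is sent, so this reduces what is stored but probably does not change how spans count toward ELASTIC_APM_TRANSACTION_MAX_SPANS.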

eecarres · 2021-05-03T14:11:50Z

It's a mixture of web requests to our own servers and direct DB calls. They're usually the same statements, repeated once for every image we need to read information from or update. I'd say 95% take less than one second, but the whole pipeline (the transaction in our case) can last 2 days or even more.

The solutions you're working on seem promising! I would rather just remove all of those spans, but that may not be aligned with your vision. Either way, it should be a big improvement.

And many thanks for pointing me to the processors, will try to implement it to solve the issue!

basepi · 2021-06-10T17:48:31Z

I'm going to close this for now, banking on elastic/apm#432 being the best generalized solution for this type of problem. Hopefully you found processors helpful in the meantime!

github-actions bot added the agent-python label May 3, 2021

basepi closed this as completed Jun 10, 2021
