Filter Processor for Tracing #2310

Closed
cemo opened this issue Dec 20, 2020 · 27 comments
Labels
area:processor enhancement New feature or request help wanted Good issue for contributors to OpenTelemetry Service to pick up priority:p3 Lowest release:after-ga

Comments

@cemo

cemo commented Dec 20, 2020

Is your feature request related to a problem? Please describe.
I would like to drop some of our traces, such as health check requests.

Describe the solution you'd like
I see that filterprocessor only supports metrics:
https://github.com/open-telemetry/opentelemetry-collector/tree/master/processor/filterprocessor
I think the filter processor could also be used to drop some requests. Is that possible?

@dradetsky

I also need this capability. I don't think I can accomplish it via ignore filters in autoinstrumentation plugins, as then I get headless traces from child requests of the parent requests I want to ignore.

@dradetsky

If using this is how one is supposed to implement it, then it's really unclear how I'm supposed to do it. That config seems to require the use of name:, and the include/exclude options control whether the span is included in or excluded from having its name modified, not whether the span is dropped entirely.

@andrewhsu andrewhsu added enhancement New feature or request help wanted Good issue for contributors to OpenTelemetry Service to pick up priority:p3 Lowest release:after-ga spec:trace area:processor and removed feature request labels Jan 6, 2021
@ipaxos

ipaxos commented Jan 28, 2021

Any updates regarding this? I am facing the same issue.

@morigs
Contributor

morigs commented Feb 8, 2021

Is span (not trace) filtering really what you want to achieve? This can lead to patchy traces.
If you want to filter traces, that is a much more difficult task that cannot be performed by one processor.
It is called tail-based sampling. You can get some background here.
In this case you have to use a combination of three processors from the contrib repository: routing, groupbytrace and tailsampling.

@cemo
Author

cemo commented Feb 8, 2021

@morigs Would you please share a sample :) ?

@morigs
Contributor

morigs commented Feb 8, 2021

Tbh I haven't experimented with this yet, so you'll have to try it yourself 😞
Btw I believe it should be better documented, so maybe we should comment on #407 or create a new issue for this case.

@morigs
Contributor

morigs commented Feb 8, 2021

Also, I could be wrong about the groupbytrace requirement.

@albertteoh
Contributor

From this #560 (comment) it seems there may be some use cases for just filtering on spans and so it's worth exploring.

Agree with @morigs that it can lead to holes in traces and that a tail-based sampling approach would be best for filtering out whole traces. However, it doesn't currently support "sampling out" (i.e. dropping) traces, and there are plans to deprecate it.

@jpkrohling
Member

jpkrohling commented Feb 11, 2021

We talked about this during the SIG from yesterday: https://docs.google.com/document/d/1r2JC5MB7GupCE7N32EwGEXs9V_YIsPgoFiLP4VWVMkE/edit?ts=5de972a2#heading=h.791inlvyg4dl

I mentioned a few links to requests from users in the Jaeger community, confirming that there's demand for this feature, but there are some questions to be answered. For instance, when removing spans, should the parent IDs for the child spans be remapped to a new parent? Otherwise, users will see warnings that spans are missing, and it won't be clear to end users what happened.

@puckpuck

I ran into the need for this today, and now have 2 use cases for this.

  1. Dropping health checks, which are typically single span traces.
  2. Routing traces to different exporters that are received on the same receiver. I should be able to achieve this with multiple pipelines, but I need a way to exclude a set of traces from each pipeline. I can ensure all my spans have an attribute(s) to dictate which pipeline they belong to.
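The second use case can be sketched with the OTTL-based span conditions that the filterprocessor gained later (see the v0.66.0 note further down this thread). This is a minimal sketch under stated assumptions, not a confirmed recommendation from the maintainers: the span attribute `pipeline`, its values `team-a` and `team-b`, and the exporter names are all hypothetical, and `error_mode` is assumed to be supported by the release in use. Each pipeline reads from the same receiver and drops every span that does not belong to it.

```yaml
receivers:
  otlp:
    protocols:
      grpc:

processors:
  # A span is DROPPED when any listed condition evaluates to true.
  # A span without the attribute compares as "not equal", so it is dropped too.
  filter/only_team_a:
    error_mode: ignore
    traces:
      span:
        - 'attributes["pipeline"] != "team-a"'   # hypothetical attribute and value
  filter/only_team_b:
    error_mode: ignore
    traces:
      span:
        - 'attributes["pipeline"] != "team-b"'

exporters:
  otlp/team_a:
    endpoint: team-a-backend:4317   # hypothetical endpoint
  otlp/team_b:
    endpoint: team-b-backend:4317   # hypothetical endpoint

service:
  pipelines:
    traces/team_a:
      receivers: [otlp]
      processors: [filter/only_team_a]
      exporters: [otlp/team_a]
    traces/team_b:
      receivers: [otlp]
      processors: [filter/only_team_b]
      exporters: [otlp/team_b]
```

Because the decision is made per span, a trace whose spans carry different `pipeline` values would be split across the two backends.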

I understand wanting to keep traces together, but using this processor requires someone to read the docs. You could clearly state in the docs that it can break up traces and should be used with caution. I would hate for us to delay implementing this because we want to remap parent/child references.

@jpkrohling
Member

> Routing traces to different exporters that are received on the same receiver.

Have you seen the routing processor?
https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/routingprocessor

It might not have the features you need, but it should be easy to extend so that it uses a different source for the from_attribute property.
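For reference, a routing processor configuration looks roughly like the sketch below. The option names (`from_attribute`, `default_exporters`, `table`) are taken from the processor's README; the header name `X-Tenant`, the route values, and the exporter names are hypothetical. At the time of this comment, `from_attribute` is looked up in the incoming request context (for example an HTTP header or gRPC metadata), not in span attributes.

```yaml
processors:
  routing:
    from_attribute: X-Tenant            # hypothetical header name
    default_exporters: [otlp/default]   # used when no table entry matches
    table:
      - value: team-a                   # hypothetical header value
        exporters: [otlp/team_a]
      - value: team-b
        exporters: [otlp/team_b]

exporters:
  otlp/default:
    endpoint: default-backend:4317      # hypothetical endpoints
  otlp/team_a:
    endpoint: team-a-backend:4317
  otlp/team_b:
    endpoint: team-b-backend:4317

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [routing]
      # Every exporter referenced by the routing table must also be listed here.
      exporters: [otlp/default, otlp/team_a, otlp/team_b]
```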

@miry

miry commented Apr 21, 2021

@jpkrohling Do you have an example of how I can use a contrib processor with the main collector?

@jpkrohling
Member

You should use the contrib distribution for that, or build your own distribution based on core + that processor.

https://github.com/open-telemetry/opentelemetry-collector-contrib/releases/tag/v0.24.0

https://github.com/open-telemetry/opentelemetry-collector-builder
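As a rough sketch of the second option, a builder manifest for "core + that processor" might look like the following. The field names are assumed from the builder's README and have changed between releases, so check the README of the builder version in use; the binary name, output path, and the choice of v0.24.0 (taken from the release linked above) are illustrative assumptions.

```yaml
# Hypothetical manifest file, e.g. builder-config.yaml
dist:
  name: otelcol-custom              # hypothetical binary name
  description: "Core collector plus the routing processor"
  otelcol_version: "0.24.0"         # assumed to match the contrib release linked above
  include_core: true                # assumed flag that pulls in the core components
  output_path: ./otelcol-custom     # hypothetical output directory

processors:
  # Go module path and version of the contrib component to add
  - gomod: "github.com/open-telemetry/opentelemetry-collector-contrib/processor/routingprocessor v0.24.0"
```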

@kradalby

Another use case for our organisation is the ability to "enforce" that all traces/spans have an attribute, e.g. an application ID, and to drop everything that does not adhere to the standard. Yes, there might be holes in the beginning, but once everyone adds it, that should not be an issue.
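With the OTTL-based span filtering that the filterprocessor gained later (noted at the end of this thread), this kind of enforcement could be sketched as below. The attribute name `application.id` is hypothetical, and `error_mode` is assumed to be supported by the release in use. The example checks both the resource and the span, since such an ID is often set as a resource attribute.

```yaml
processors:
  filter/require_app_id:
    error_mode: ignore
    traces:
      span:
        # Drop the span only when the ID is missing from BOTH the resource and the span.
        - 'resource.attributes["application.id"] == nil and attributes["application.id"] == nil'

service:
  pipelines:
    traces:
      receivers: [otlp]        # receiver and exporter definitions are assumed to exist elsewhere
      processors: [filter/require_app_id]
      exporters: [otlp]
```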

@Stocco

Stocco commented Jun 11, 2021

We need this feature as well. The spans generated by our LB health checks pollute our view and also increase our monitoring tool bill :'(

@morigs
Contributor

morigs commented Jun 15, 2021

@Stocco in case you're looking for a workaround: some HTTP instrumentation lets you specify a list of exclusions.
For instance, this parameter in opentelemetry-js.

@gongcon

gongcon commented Aug 13, 2021

We need this feature as well, for dropping spans from Envoy, which is out of our control. The routing processor looks at HTTP request headers and is very restricted, e.g. it does not work with batch. Also, it seems to work only at the trace level, not the span level.

@HuBaX

HuBaX commented Oct 4, 2021

Is anyone working on this? If not I'd like to try to implement that.

@cemo
Author

cemo commented Oct 4, 2021

@billg-splunk
Contributor

Just going to leave a note that having a solution for this would be very appreciated. There are various instrumentation libraries that won't be able to accommodate this. Having a method to handle it at the collector level would be extremely welcome.

@william-tran

I have a proposal in open-telemetry/opentelemetry-collector-contrib#7561 that might solve most of these use cases, but generalizes this problem as a routing problem.

@william-tran

@billg-splunk Take a look at open-telemetry/opentelemetry-collector-contrib#7561 (comment) where we can use attributes -> groupbyattrs -> routing. What's already there works fine for filtering out noisy spans. Re-connecting sub-trees would require a lot more effort though.

@politician

politician commented Mar 25, 2022

> @Stocco in case you're looking for a workaround: some HTTP instrumentation allows to specify a list of exclusions. For instance this parameter in opentelemetry-js

Fix: permalink

@AkselAllas

AkselAllas commented Oct 5, 2022

I wound up doing the following, and it worked wonderfully:

    processors:
      tail_sampling:
        policies:
          - name: drop_noisy_traces_url
            type: string_attribute
            string_attribute:
              key: http.target
              values:
                - \/metrics
                - \/actuator*
                - opentelemetry\.proto
                - favicon\.ico
                - \/health
              enabled_regex_matching: true
              invert_match: true
    service:
      pipelines:
        traces/2:
          receivers: [otlp]
          processors: [batch, tail_sampling]
          exporters: [custom]

@gunjan-it-engg

Actually, I'm using the filter processor with my trace generator app, and the processor gives the error below:

Cannot unmarshal the configuration: error reading processor configuration for "filter":1 error decode log : * ' ' has invalid keys : spans

@tuhao1020

> We talked about this during the SIG from yesterday: https://docs.google.com/document/d/1r2JC5MB7GupCE7N32EwGEXs9V_YIsPgoFiLP4VWVMkE/edit?ts=5de972a2#heading=h.791inlvyg4dl
>
> I mentioned a few links to requests from users in the Jaeger community, confirming that there's demand for this feature, but there are some questions to be answered. For instance, when removing spans, should the parent IDs for the child spans be remapped to a new parent? Otherwise, users will see warnings that spans are missing, and it won't be clear to end users what happened.

Can spans that have such warnings be removed by the filter processor now?

hughesjj pushed a commit to hughesjj/opentelemetry-collector that referenced this issue Apr 27, 2023
@TylerHelmuth
Member

As of https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CHANGELOG.md#v0660 filterprocessor allows dropping spans using OTTL.
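For the health check case from the original request, a minimal sketch of such a configuration might look like this. The `traces.span` list and the `IsMatch` function come from the filterprocessor and OTTL documentation; the attribute key `http.target` and the paths are borrowed from the tail_sampling example earlier in this thread, and `error_mode` is assumed to be supported by the release in use (older releases may reject it).

```yaml
processors:
  filter/drop_health_checks:
    error_mode: ignore          # skip conditions that fail to evaluate
    traces:
      span:
        # A span is dropped when ANY of these conditions is true.
        - 'attributes["http.target"] == "/health"'
        - 'IsMatch(attributes["http.target"], "^/actuator")'

service:
  pipelines:
    traces:
      receivers: [otlp]         # receiver and exporter definitions are assumed to exist elsewhere
      processors: [filter/drop_health_checks, batch]
      exporters: [otlp]
```

This drops individual spans, not whole traces, so child spans of a dropped span remain and will show a missing parent, as discussed above.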

Troels51 pushed a commit to Troels51/opentelemetry-collector that referenced this issue Jul 5, 2024