
Filter processor fails to omit http_client_duration metric #7591

Closed
gautam-nutalapati opened this issue Feb 7, 2022 · 5 comments
Labels: bug (Something isn't working)

gautam-nutalapati commented Feb 7, 2022

Describe the bug
I would like to omit the HTTP metrics generated by the Java agent auto-instrumentation.
I am using the filter processor to keep these metric names from being pushed to Prometheus.
The HTTP metrics generated by the Java agent are not filtered as expected.

Steps to reproduce

  • Run a web application using Java agent auto-instrumentation (for testing, I publish a user-defined metric as well)
  • Run the OTel collector with the filter configuration below:
  filter/metrics:
    metrics:
      exclude:
        match_type: regexp
        metric_names:
          - api_.*
          - http_.*

What did you expect to see?

  • Metrics published to Prometheus should not contain any metrics starting with api_ or http_

What did you see instead?
I see the metrics below (buckets removed to reduce clutter):

# HELP http_client_duration The duration of the outbound HTTP request
# TYPE http_client_duration histogram
http_client_duration_bucket{http_flavor="1.1",http_method="GET",http_url="http://localhost:9900/stux/v1/users/97378103842048256",le="5"} 5
http_client_duration_sum{http_flavor="1.1",http_method="GET",http_url="http://localhost:9900/stux/v1/users/97378103842048256"} 7.492469
http_client_duration_count{http_flavor="1.1",http_method="GET",http_url="http://localhost:9900/stux/v1/users/97378103842048256"} 5
http_client_duration_bucket{http_flavor="1.1",http_method="GET",http_status_code="200",http_url="https://idp.vault.dev-local.prisidio.net/.well-known/jwks.json",le="5"} 0
http_client_duration_sum{http_flavor="1.1",http_method="GET",http_status_code="200",http_url="https://idp.vault.dev-local.prisidio.net/.well-known/jwks.json"} 367.222015
http_client_duration_count{http_flavor="1.1",http_method="GET",http_status_code="200",http_url="https://idp.vault.dev-local.prisidio.net/.well-known/jwks.json"} 1
# HELP http_server_active_requests The number of concurrent HTTP requests that are currently in-flight
# TYPE http_server_active_requests gauge
http_server_active_requests{http_flavor="1.1",http_host="localhost:8060",http_method="GET",http_scheme="http"} 0
# HELP http_server_duration The duration of the inbound HTTP request
# TYPE http_server_duration histogram
http_server_duration_bucket{http_flavor="1.1",http_host="localhost:8060",http_method="GET",http_scheme="http",http_status_code="500",le="5"} 0
http_server_duration_sum{http_flavor="1.1",http_host="localhost:8060",http_method="GET",http_scheme="http",http_status_code="500"} 2805.492884
http_server_duration_count{http_flavor="1.1",http_host="localhost:8060",http_method="GET",http_scheme="http",http_status_code="500"} 1

Without this filter, I see the metrics below (note that api_latency is a custom metric I added, which was successfully filtered in the output above):

# HELP api_latency API latency time in milliseconds.
# TYPE api_latency histogram
api_latency_bucket{api_method="GET",api_name="/users/v1/profiles/me",env="dev-local",status_code="500",svc="user-profile-svc",le="5"} 0
api_latency_sum{api_method="GET",api_name="/users/v1/profiles/me",env="dev-local",status_code="500",svc="user-profile-svc"} 2665
api_latency_count{api_method="GET",api_name="/users/v1/profiles/me",env="dev-local",status_code="500",svc="user-profile-svc"} 1
# HELP http_client_duration The duration of the outbound HTTP request
# TYPE http_client_duration histogram
http_client_duration_bucket{http_flavor="1.1",http_method="GET",http_url="http://localhost:9900/stux/v1/users/97378103842048256",le="5"} 5
http_client_duration_sum{http_flavor="1.1",http_method="GET",http_url="http://localhost:9900/stux/v1/users/97378103842048256"} 7.492469
http_client_duration_count{http_flavor="1.1",http_method="GET",http_url="http://localhost:9900/stux/v1/users/97378103842048256"} 5
http_client_duration_bucket{http_flavor="1.1",http_method="GET",http_status_code="200",http_url="https://idp.vault.dev-local.prisidio.net/.well-known/jwks.json",le="5"} 0
http_client_duration_sum{http_flavor="1.1",http_method="GET",http_status_code="200",http_url="https://idp.vault.dev-local.prisidio.net/.well-known/jwks.json"} 367.222015
http_client_duration_count{http_flavor="1.1",http_method="GET",http_status_code="200",http_url="https://idp.vault.dev-local.prisidio.net/.well-known/jwks.json"} 1
# HELP http_server_active_requests The number of concurrent HTTP requests that are currently in-flight
# TYPE http_server_active_requests gauge
http_server_active_requests{http_flavor="1.1",http_host="localhost:8060",http_method="GET",http_scheme="http"} 0
# HELP http_server_duration The duration of the inbound HTTP request
# TYPE http_server_duration histogram
http_server_duration_bucket{http_flavor="1.1",http_host="localhost:8060",http_method="GET",http_scheme="http",http_status_code="500",le="5"} 0
http_server_duration_sum{http_flavor="1.1",http_host="localhost:8060",http_method="GET",http_scheme="http",http_status_code="500"} 2805.492884
http_server_duration_count{http_flavor="1.1",http_host="localhost:8060",http_method="GET",http_scheme="http",http_status_code="500"} 1

What version did you use?
Version:
collector 0.16.0 (later reproduced with 0.43.0)
java agent 1.10.1

What config did you use?
Config:

extensions:
  health_check:
  pprof:
    endpoint: :1777
  zpages:
    endpoint: :55679

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317


processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 50
    spike_limit_percentage: 30
  filter/metrics:
    metrics:
      exclude:
        match_type: regexp
        metric_names:
          - api_.*
          - http_.*
  batch/metrics:
    timeout: 30s
    send_batch_size: 200

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
service:
  extensions: [pprof, zpages, health_check]
  pipelines:
    metrics/2:
      receivers: [otlp]
      processors: [memory_limiter, filter/metrics, batch/metrics]
      exporters: [prometheus]

Environment
OS: macOS Monterey 12.0.1
Compiler: N/A

Additional context
If the filter processor intentionally avoids filtering auto-instrumented HTTP metrics, then this issue is irrelevant.
Related to https://github.com/open-telemetry/opentelemetry-collector/issues/2310

Filter startup logs:

2022-02-07T15:46:42.075Z	info	[email protected]/filter_processor.go:78	Metric filter configured	{"kind": "processor", "name": "filter/metrics", "include match_type": "", "include expressions": [], "include metric names": [], "include metrics with resource attributes": null, "exclude match_type": "strict", "exclude expressions": [], "exclude metric names": ["api_latency", "http_client_duration"], "exclude metrics with resource attributes": null, "checksMetrics": true, "checkResouces": false}
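
For debugging, it can help to see the metric names exactly as the filter processor receives them, before the Prometheus exporter rewrites anything for exposition. A minimal sketch of that, assuming a collector build that ships the logging exporter, is to add it alongside the Prometheus exporter:

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
  logging:
    loglevel: debug   # logs each metric the pipeline exports, including its name

service:
  extensions: [pprof, zpages, health_check]
  pipelines:
    metrics/2:
      receivers: [otlp]
      processors: [memory_limiter, filter/metrics, batch/metrics]
      exporters: [prometheus, logging]
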
gautam-nutalapati added the bug label on Feb 7, 2022

jpkrohling (Member) commented:

collector 0.16.0

2022-02-07T15:46:42.075Z	info	[email protected]/

Something doesn't add up :-)

In any case, I'm pinging the code owner: cc @boostchicken

gautam-nutalapati (Author) commented Feb 7, 2022

Thanks for looking, and nice catch. 0.16.0 was from https://github.com/aws-observability/aws-otel-collector.
I reproduced it with collector version 0.43.0.
Update: I created an opentelemetry-demo-app to reproduce the issue. Run docker-compose up and hit http://localhost:8899/metrics, which shows the http_client_duration metric that should have been filtered based on the config in configs/otel/otel-config.yaml.

boostchicken (Member) commented:

@gautam-nutalapati, can you try this on the latest version and confirm whether this is still an issue?

gautam-nutalapati (Author) commented:

I tried the latest version, and it has been fixed 🎉
Thank you for the fix!!

taisph commented Apr 18, 2023

I just tried adding a similar filter, yet the metrics are not filtered. Could there be a regression?

{
    "caller": "[email protected]/metrics.go:101",
    "exclude expressions": [],
    "exclude match_type": "regexp",
    "exclude metric names": [
        "http_.*"
    ],
    "exclude metrics with resource attributes": null,
    "include expressions": [],
    "include match_type": "",
    "include metric names": [],
    "include metrics with resource attributes": null,
    "kind": "processor",
    "level": "info",
    "msg": "Metric filter configured",
    "name": "filter/metrics",
    "pipeline": "metrics",
    "ts": 1681808701.6006052
}
    processors:
      filter/metrics:
        metrics:
          exclude:
            match_type: regexp
            metric_names:
            - http_.*

    service:
      extensions: [health_check, memory_ballast]
      pipelines:
        metrics:
          receivers: [otlp, opencensus]
          processors: [memory_limiter, filter/metrics, batch/metrics]
          exporters: [prometheus, logging]

curl -s 'http://localhost:8889/metrics' | grep -E '^http_.*' | wc -l
216
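
One possibility worth ruling out here (an assumption on my part, not something verified above): the filter processor matches metric names as they arrive over OTLP, which for HTTP instrumentation are typically dotted (e.g. http.client.duration), while the Prometheus exporter rewrites dots to underscores only at export time. A sketch of an exclusion that covers both spellings:

    processors:
      filter/metrics:
        metrics:
          exclude:
            match_type: regexp
            metric_names:
            - http\..*   # dotted OTLP-style names such as http.client.duration (assumption)
            - http_.*    # underscore form, in case names already arrive this way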
