This issue was moved to a discussion.

[Bug]: All indicators on the Monitor page are empty #4676

Closed
zenvzenv opened this issue Aug 16, 2023 · 5 comments

@zenvzenv

What happened?

I want to use Jaeger's Monitor page to monitor the running status of my application. I set METRICS_STORAGE_TYPE according to the instructions, configured otelcol's SpanMetrics Connector, and set --prometheus.query.support-spanmetrics-connector=true, but it didn't work: all indicators on the Monitor page are empty.

Steps to reproduce

  1. export METRICS_STORAGE_TYPE=prometheus
  2. export PROMETHEUS_SERVER_URL=http://localhost:9090
  3. Set --prometheus.query.support-spanmetrics-connector=true
  4. Configure the spanmetrics connector of otelcol. My otelcol config file looks like this:
    extensions:
      health_check:
      zpages:
        endpoint: 0.0.0.0:55679
    
    receivers:
      otlp:
        protocols:
          grpc:
    
    processors:
      batch:
    
    exporters:
      prometheus:
        endpoint: "0.0.0.0:8889"
        namespace: promexample
    
      jaeger:
        endpoint: 10.20.74.60:14250
        tls:
          insecure: true
    
      prometheusremotewrite:
        endpoint: http://localhost:9090/api/v1/write
        namespace: promremotewriteexample
        target_info:
          enabled: true
        tls:
          insecure: true
        remote_write_queue:
          enabled: false
          num_consumers: 10
    
    connectors:
      spanmetrics:
        namespace: spanmetrics
        histogram:
          explicit:
            buckets: [100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms]
        dimensions:
          - name: http.method
            default: GET
          - name: http.status_code
        exemplars:
          enabled: true
        exclude_dimensions: ['status.code']
        dimensions_cache_size: 1000
        aggregation_temporality: "AGGREGATION_TEMPORALITY_CUMULATIVE"    
        metrics_flush_interval: 15s 
    
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [jaeger, spanmetrics]
    
        metrics:
          receivers: [spanmetrics]
          exporters: [prometheus]
    
      extensions: [health_check, zpages]

Expected behavior

The monitor page displays performance data

Relevant log output

No response

Screenshot

No response

Additional context

No response

Jaeger backend version

1.47.0

SDK

No response

Pipeline

javaagent->otelcol->spanmetrics connector->prometheus

Storage backend

clickhouse

Operating system

linux

Deployment model

CLI

Deployment configs

No response

@zenvzenv zenvzenv added the bug label Aug 16, 2023
@albertteoh
Contributor

Can you see your traces in Jaeger?

Have you tried going through the Troubleshooting steps?

  • I noticed you've configured otel with namespace: spanmetrics which means you'll need to add the --prometheus.query.namespace=spanmetrics flag. I suggest checking the Prometheus server to see if you can find spanmetrics.calls_total and spanmetrics.duration_bucket metrics there.
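
To make the namespace point concrete, here is a minimal sketch of the parts of the reporter's collector config that shape the metric names. It is a fragment, not a complete config (the `jaeger` exporter and `otlp` receiver definitions are omitted). One assumption beyond the thread: the `prometheus` exporter's own `namespace: promexample` setting is expected to add a further prefix on top of the connector's, which would give names that the single `--prometheus.query.namespace=spanmetrics` flag cannot match. The sketch therefore comments it out.

```yaml
# Fragment only: settings that influence the metric names Jaeger will query.
connectors:
  spanmetrics:
    namespace: spanmetrics      # taken from the original config; pairs with
                                # --prometheus.query.namespace=spanmetrics

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    # namespace: promexample    # assumption: leaving this set would add a second
                                # prefix in front of the connector's namespace

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [jaeger, spanmetrics]
    metrics:
      receivers: [spanmetrics]
      exporters: [prometheus]
```

Whichever prefixes end up in use, the metric names visible in the Prometheus server are the ones to compare against the namespace passed to Jaeger.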

@zenvzenv
Author

  1. I am able to see trace data in Jaeger.
  2. I went through the Troubleshooting steps, but they did not resolve the problem.
  • http://all-in-one:14269/metrics works and exposes jaeger_requests_total and jaeger_latency_bucket data.
  • http://jaeger-query:16687/metrics does not work, so the /metrics endpoint has none of the following indicators:
    jaeger_query_requests_total{operation="get_call_rates",result="ok"} 18
    jaeger_query_requests_total{operation="get_error_rates",result="ok"} 18
    jaeger_query_requests_total{operation="get_latencies",result="ok"} 36
    
    jaeger_query_latency_bucket{operation="get_call_rates",result="ok",le="0.005"} 5
    jaeger_query_latency_bucket{operation="get_call_rates",result="ok",le="0.01"} 13
    jaeger_query_latency_bucket{operation="get_call_rates",result="ok",le="0.025"} 18
    
    jaeger_query_latency_bucket{operation="get_error_rates",result="ok",le="0.005"} 7
    jaeger_query_latency_bucket{operation="get_error_rates",result="ok",le="0.01"} 13
    jaeger_query_latency_bucket{operation="get_error_rates",result="ok",le="0.025"} 18
    
    jaeger_query_latency_bucket{operation="get_latencies",result="ok",le="0.005"} 7
    jaeger_query_latency_bucket{operation="get_latencies",result="ok",le="0.01"} 25
    jaeger_query_latency_bucket{operation="get_latencies",result="ok",le="0.025"} 36
    
  3. Prometheus has no latency_bucket or calls_total metrics data.
  4. I added the --prometheus.query.namespace=spanmetrics flag to the Jaeger start command. My command is: SPAN_STROAGE_TYPE=grpc-plugin METRICS_STORAGE_TYPE=prometheus PROMETHEUS_SERVER_URL=http:localhost:9090 PROMETHEUS_QUERY_SUPPORT_SPANMETRICS_CONNECTOR=true jaeger-all-in-one --grpc.stroage.plugin.binary=/path/to/jaeger-clickhouse-linux-amd64 --grpc.stroage.configuration=/path/to/config.yml --prometheus.query.namespace=spanmetrics. Even so, I can't see the spanmetrics.calls_total and spanmetrics.duration_bucket metrics in Prometheus.

@albertteoh
Contributor

Are you running off a docker image?
If you're running the latest tag, please remove this image with docker rmi -f jaegertracing/all-in-one:latest, as you may be using an old cached version.

I also suggest running the docker-compose example: https://github.com/jaegertracing/jaeger/tree/main/docker-compose/monitor#quickstart.

I've just checked, and it is working as expected for me.

That can be your baseline that you can use to build upon for your specific deployment.

@zenvzenv
Author

zenvzenv commented Aug 18, 2023

I ran the docker-compose example successfully, but my production environment does not have Docker, so I can only deploy via the CLI. I deployed following the Docker configuration, but the Monitor page still has no data. Below are my relevant commands and configurations; please help me check whether anything is wrong. Thank you.

  1. prometheus
    • start command: prometheus --config.file=otel-prometheus.yml --web.listen-address="0.0.0.0:9091"
    • otel-prometheus.yml contents:
      global:
        scrape interval: 15s
        evaluation_ interval: 15s
      scrape configs:
        - job _name: 'aggregated -trace-metrics'
          scrape_interval: 25 
          static_configs:
            - targets: ['10.20.78.142:8888']
            - targets: ['10.20.78.142:8889']
  2. jaeger
    • start command:
    export COLLECTOR_OTLP_ENABLE=false
    export METRICS_STORAGE_TYPE=prometheus
    export PROMETHEUS_SERVER_URL=http://10.20.78.142:9091
    export PROMETHEUS_QUERY_SUPPORT_SPANMETRICS_CONNECTOR=true
    nohup jaeger-all-in-one --query.ui-config jaeger-ui.json
    • jaeger-ui.json contents:
    {
      "monitor": {
        "menuEnaled": true
      }
      "dependencies": {
        "menuEnaled": true
      }
    }
  3. otelcol-contrib
    • config.xml contents:
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    
      jaeger:
        protocols:
          thrift_http:
            endpoint: "0.0.0.0:14278"
    
    process:
      batch:
    
    exporters:
      prometheus:
        endpoint: "0.0.0.0:8889"
    
      jaeger:
        endpoint: "10.20.78.142:14250"
        tls:
          insecure: true
    
    connectors:
      spanmetrics:
    
    service:
      pipeline:
        traces:
          receivers: [otlp, jaeger]
          processors: [batch]
          exporters: [spanmetrics, jaeger]
    
        metrics/spanmetrics:
          receivers: [spanmetrics]
          exporters: [prometheus]
  4. I also downloaded and built microsim, and I use the command microsim -j http://localhost:14278/api/traces -d 24h -s 500ms to generate trace data. With it, I can see both trace data and monitor data. It seems that data ingested through port 14278 shows up on the Monitor page, but data ingested through the OTLP protocol (port 4317) does not.
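
For comparison, here is a sketch of how otel-prometheus.yml might look with well-formed keys. It assumes the spaces inside the pasted keys (`scrape interval`, `evaluation_ interval`, `scrape configs`, `job _name`) and the unit-less `scrape_interval: 25` are transcription slips rather than the file's real contents. Prometheus expects underscore-joined key names and durations with a unit. The host and ports are the ones from the comment above. Merging the two targets into one list and dropping the per-job interval are simplifications, not requirements.

```yaml
# Sketch of otel-prometheus.yml with corrected key names (assumptions noted above).
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: aggregated-trace-metrics
    static_configs:
      - targets:
          - '10.20.78.142:8888'   # collector's own telemetry, as in the original
          - '10.20.78.142:8889'   # prometheus exporter endpoint from the otelcol config
```

Separately, the jaeger-ui.json as pasted lacks a comma between the "monitor" and "dependencies" objects and spells the key `menuEnaled`; the Jaeger UI option is `menuEnabled`.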

@albertteoh
Contributor

Okay, so I think we've established that your challenges most likely boil down to configuration.

The reason why I suggest running the stack locally is so you have a chance to understand the configs necessary to get things working for SPM in a simple, self-contained environment that can be quickly brought up and down.

For example, when I copy your OTEL config and paste it into the docker-compose example I can see the following error immediately on startup:

monitor-otel_collector-1  | Error: failed to get config: cannot unmarshal the configuration: 2 error(s) decoding:
monitor-otel_collector-1  |
monitor-otel_collector-1  | * '' has invalid keys: process
monitor-otel_collector-1  | * 'service' has invalid keys: pipeline
monitor-otel_collector-1  | 2023/08/18 11:43:30 collector server run finished with error: failed to get config: cannot unmarshal the configuration: 2 error(s) decoding:
monitor-otel_collector-1  |
monitor-otel_collector-1  | * '' has invalid keys: process
monitor-otel_collector-1  | * 'service' has invalid keys: pipeline
monitor-otel_collector-1 exited with code 1

That's because:

process:
  batch:

should be:

processors:
  batch:

and there may be more configuration issues but at least you can test some of your deployment configs quickly within the local docker-compose setup and be confident they work, then you can deal with any problems relating to deployment and network.
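
Applying both fixes named in the error output (`process` → `processors`, `pipeline` → `pipelines`) to the reporter's config gives roughly the following. This is a sketch that changes only those two keys. Everything else is copied from the config above, and other issues may remain.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
      http:

  jaeger:
    protocols:
      thrift_http:
        endpoint: "0.0.0.0:14278"

processors:          # was `process`, rejected as an invalid top-level key
  batch:

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"

  jaeger:
    endpoint: "10.20.78.142:14250"
    tls:
      insecure: true

connectors:
  spanmetrics:

service:
  pipelines:         # was `pipeline`, rejected as an invalid key under `service`
    traces:
      receivers: [otlp, jaeger]
      processors: [batch]
      exporters: [spanmetrics, jaeger]

    metrics/spanmetrics:
      receivers: [spanmetrics]
      exporters: [prometheus]
```

Starting the collector with this file in the local docker-compose setup is a quick way to check whether any further decoding errors show up.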

In short, we're reducing the problem space down to separate problems to make it easier to troubleshoot:

  • Application configs:
    • OTEL config
    • Prometheus config
    • Jaeger config
  • Deployment configs:
    • Container image config
    • Networking config
    • K8s config
    • etc.

@jaegertracing jaegertracing locked and limited conversation to collaborators Aug 18, 2023
@albertteoh albertteoh converted this issue into discussion #4684 Aug 18, 2023
