Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nginx auth_request does not share trace id of parent request #143

Open
phillipuniverse opened this issue Apr 12, 2022 · 7 comments
Open
Labels
bug Something isn't working

Comments

@phillipuniverse
Copy link
Contributor

phillipuniverse commented Apr 12, 2022

Describe your environment

I am running in Docker and copying in the OpenTelemetry module from https://github.com/nginxinc/nginx-unsupported-modules, specifically this image, sha256:43f2fbf6bd3ca9186e4712e9ab79d36abfa2a5d32624478cf02f46f657208272. This module is built directly from main and was created at 2022-04-12T02:50:18.92379913Z.

I am in the process of converting to OpenTracing to OpenTelemetry for Nginx. Most of this went ok, but it appears that sub requests from auth_request has their own trace_id. I would expect that the trace_id would be the same for the parent and sub request with just different span ids. Below are the files I am using to reproduce the issue.

Steps to reproduce

Dockerfile:

FROM nginx:1.21.6

RUN rm /etc/nginx/conf.d/default.conf

# Built from the package at https://github.com/nginxinc/nginx-unsupported-modules
COPY --from=ghcr.io/nginxinc/amd64/ngx_otel_module:linux-libc-nginx-1.21.6 /usr/lib/nginx/modules/otel_ngx_module.so /usr/lib/nginx/modules/otel_ngx_module.so

ADD localdev.conf /etc/nginx
ADD otel-config.toml /etc/nginx

CMD ["nginx", "-c", "/etc/nginx/localdev.conf", "-g", "daemon off;"]

localdev.conf:

# Available environment variables from https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/exporter.md
env OTEL_EXPORTER_OTLP_ENDPOINT;
env OTEL_RESOURCE_ATTRIBUTES;

load_module /usr/lib/nginx/modules/otel_ngx_module.so;

worker_rlimit_nofile 20000;  # max usable file descriptors, should be about double the worker_connections and less than the OS `nofile` ulimit

events {
    worker_connections 10000;
}

http {
    include /etc/nginx/mime.types;

    # Configuration options at https://github.com/open-telemetry/opentelemetry-cpp-contrib/tree/main/instrumentation/nginx
    # Other examples at https://github.com/nginxinc/bank-of-sirius/tree/master/src/nginx

    opentelemetry_config /etc/nginx/otel-config.toml;

    opentelemetry on;
    opentelemetry_operation_name "$request_method $uri";
    opentelemetry_trust_incoming_spans off;  # Nothing upstream should give us a trace/span
    opentelemetry_attribute http.user_agent $http_user_agent;
    opentelemetry_attribute http.referer $http_referer;
    opentelemetry_attribute net.peer.ip $remote_addr;
    opentelemetry_propagate w3c;

    # Logs all auth_request directives
    log_subrequest on;

    # Prior art for ms precision from https://thatsamguy.com/nginx-iso8601-time-format/
    map $time_iso8601 $time_iso8601_timestamp {
     ~([^+]+) $1;
    }
    map $time_iso8601 $time_iso8601_timezone {
     ~\+([0-9:]+)$ $1;
    }
    map $msec $millisec {
     ~\.([0-9]+)$ $1;
    }

    log_format json_combined escape=json '{ "time_local": "$time_local", '
    '"timestamp": "$time_iso8601_timestamp.$millisec+$time_iso8601_timezone", '
    '"remote_addr": "$remote_addr", '
    '"uri": "$uri", '
    '"request": "$request", '
    '"status": "$status", '
    '"message": "$remote_addr - $request - $status - $request_time - $body_bytes_sent - $http_user_agent", '
    '"forwarded_for": "$http_x_forwarded_for", '
    '"body_bytes_sent": "$body_bytes_sent", '
    '"request_time": "$request_time", '
    '"http_referrer": "$http_referer", '
    '"http_user_agent": "$http_user_agent", '
    '"trace_id": "$opentelemetry_trace_id", '
    '"span_id": "$opentelemetry_span_id" }';

    error_log stderr info;
    access_log /dev/stdout json_combined;

    add_header Server-Timing "traceparent;desc=\"$opentelemetry_context_traceparent\"";

    server {
        set $backend_endpoint http://host.docker.internal:8000;
        set $auth_endpoint http://host.docker.internal:8005;
        set $domain_name *;

        listen 80;

        location /public-gateway-internal-routes/auth/ {
            internal;
            proxy_set_header traceparent $opentelemetry_context_traceparent;
            proxy_redirect off;
            proxy_pass $auth_endpoint/auth/;
        }

        location / {
            auth_request /public-gateway-internal-routes/auth/;

            proxy_set_header traceparent $opentelemetry_context_traceparent;
            proxy_redirect off;
            proxy_pass $backend_endpoint;
        }

    }
}

otel-config.toml:

exporter = "otlp"
processor = "batch"

[processors.batch]
max_queue_size = 2048
schedule_delay_millis = 5000
max_export_batch_size = 512

[service]
name = "gateway" # Opentelemetry resource name

[exporters.otlp]
# Alternatively the OTEL_EXPORTER_OTLP_ENDPOINT environment variable can also be used.
host = "localhost"
port = 4317

Copy these files into a folder and then:

docker build . -t nginxbug && docker run -p 8080:80 nginxbug

Once the server is up, you can curl localhost:8080.

These are examples of the 2 access logs that are emitted:

{ "time_local": "12/Apr/2022:06:08:26 +0000", "timestamp": "2022-04-12T06:08:26.336+00:00", "remote_addr": "172.17.0.1", "uri": "/public-gateway-internal-routes/auth/", "request": "GET / HTTP/1.1", "status": "502", "message": "172.17.0.1 - GET / HTTP/1.1 - 502 - 0.000 - 0 - curl/7.79.1", "forwarded_for": "", "body_bytes_sent": "0", "request_time": "0.000", "http_referrer": "", "http_user_agent": "curl/7.79.1", "trace_id": "7a0b6b7d06ea20df87f2bf13b184d949", "span_id": "d74eaee479f75458" }
{ "time_local": "12/Apr/2022:06:08:26 +0000", "timestamp": "2022-04-12T06:08:26.336+00:00", "remote_addr": "172.17.0.1", "uri": "/", "request": "GET / HTTP/1.1", "status": "500", "message": "172.17.0.1 - GET / HTTP/1.1 - 500 - 0.000 - 177 - curl/7.79.1", "forwarded_for": "", "body_bytes_sent": "177", "request_time": "0.000", "http_referrer": "", "http_user_agent": "curl/7.79.1", "trace_id": "25a831094334f5e595ce0da24dc1c41f", "span_id": "9db43885b76e7722" }
  • /public-gateway-internal-routes/auth/ trace id - 7a0b6b7d06ea20df87f2bf13b184d949
  • / trace id - 25a831094334f5e595ce0da24dc1c41f

What is the expected behavior?

I expected both of these to share the same trace id but with different span ids. This seems like the intended behavior from #94.

What is the actual behavior?

Trace ids are different between the parent and sub request.

Additional context

N/A

@phillipuniverse phillipuniverse added the bug Something isn't working label Apr 12, 2022
@phillipuniverse phillipuniverse changed the title Nginx auth_request does not share parent trace id Nginx auth_request does not share trace id of parent request Apr 12, 2022
@jmadureira
Copy link

@phillipuniverse I came across the same problem and the problem seems to be that the auth_request is evaluated before the opentelemetry_propagate.
You can try to run the following example:

  • configure the otel to be parent_based
  • configure a location on nginx that sets the opentelemetry_propagate then proxy passes the request to your / location.

You should see on your logs 3 requests with the same trace id.

@phillipuniverse
Copy link
Contributor Author

@jmadureira this is probably what's going on, I will let you know when I have a chance to test this.

What you're saying is to move the opentelemetry_propagate inside the routes like:

        location /public-gateway-internal-routes/auth/ {
            internal;
            opentelemetry_propagate w3c;
            proxy_set_header traceparent $opentelemetry_context_traceparent;
            proxy_redirect off;
            proxy_pass $auth_endpoint/auth/;
        }

        location / {
            opentelemetry_propagate w3c;
            auth_request /public-gateway-internal-routes/auth/;

            proxy_set_header traceparent $opentelemetry_context_traceparent;
            proxy_redirect off;
            proxy_pass $backend_endpoint;
        }

@phillipuniverse
Copy link
Contributor Author

phillipuniverse commented Jun 4, 2022

@jmadureira I still can't quite get this to work event with moving in the opentelemetry_propagate inside of a location before the auth_request.

What exactly did you mean by this?

configure the otel to be parent_based

It seems like no matter what, there are different trace_ids output to the logs for the subrequest (from auth_request) and the main request.

@phillipuniverse
Copy link
Contributor Author

@DebajitDas thanks for the clarification in #174 about what that is designed to fix. Is the idea that eventually the nginx plugin goes away in favor of the otel-webserver-module and thus the issue I outlined here becomes moot? Or is the long-term strategy to continue to invest in the main nginx instrumentation?

I'm trying to figure out if I should roll up my sleeves and try to figure out what's going on with these sub requests in the main nginx library or if there is just something obvious I'm missing in my configuration.

@eguzki
Copy link

eguzki commented May 23, 2023

In APIcast we are running into the same issue. Internal requests are not added as new spans for the same trace. Adding Otel instrumentation took place with this PR #1379 leveraging the nginx instrumentation instead of otel-webserver-module.

Any progress on this issue?

@esigo
Copy link
Member

esigo commented May 24, 2023

@eguzki would this help?
kubernetes/ingress-nginx#9811 (comment)

@eguzki
Copy link

eguzki commented May 24, 2023

It helps to understand and validate our current config, but unfortunately does not work as expected. The gateway calls the "auth server" in a post action

post_action @out_of_band_authrep_action;
....

location @out_of_band_authrep_action {
....

And calling the upstream app is done using openresty's ngx.exec.

Interesting to note that the "auth" call replaces parent span and upstream span is never created.

Still looking at it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants