Does otlpreceiver support the HTTP header X-Forwarded-For? #4901

Closed
qingbozhang opened this issue Feb 22, 2022 · 13 comments

@qingbozhang
Contributor

qingbozhang commented Feb 22, 2022

I send trace data from a web app to the otlpreceiver HTTP port 55681. The HTTP request carries an X-Forwarded-For header, but the receiver does not automatically add an http.client_ip attribute to the trace spans.
How can I add the client IP?
version: [v0.44.0](https://github.com/open-telemetry/opentelemetry-collector/releases/tag/v0.44.0)

@jpkrohling
Member

On the receiver configuration, you have to enable the include_metadata attribute:

// IncludeMetadata propagates the client metadata from the incoming requests to the downstream consumers
// Experimental: *NOTE* this option is subject to change or removal in the future.
IncludeMetadata bool `mapstructure:"include_metadata,omitempty"`

This doesn't seem to be documented, would you mind opening a PR documenting it?

@LolaHectorBiancaHewie

LolaHectorBiancaHewie commented Feb 28, 2022

I'm actually having a similar problem. I have enabled include_metadata in the YAML configuration file for the OTLP HTTP protocol (see snippet below), running version 0.45.0 of the collector, but I do not see the http.client_ip attribute being set. I've included a Wireshark snippet where you can see the X-Forwarded-For header from the reverse proxy.

Yaml receiver snippet:

receivers:
  jaeger:
    protocols:
      grpc:
      thrift_binary:
      thrift_compact:
      thrift_http:
  otlp:
    protocols:
      grpc:
        include_metadata: true
      http:
        cors:
          allowed_origins:
            - "http://localhost*"
        include_metadata: true
  zipkin:

Wireshark snippet:

...
Hypertext Transfer Protocol
POST /v1/traces HTTP/1.1\r\n
Connection: Keep-Alive\r\n
Content-Type: application/json\r\n
Accept: application/json\r\n
Accept-Encoding: gzip, deflate, br\r\n
Accept-Language: en-GB,en;q=0.9\r\n
Host: localhost:55681\r\n
Max-Forwards: 10\r\n
Referer: https://some-website/some-app/index.html\r\n
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.3 Safari/605.1.15\r\n
origin: https://********\r\n
X-Original-URL: /open-telemetry/traces\r\n
X-Forwarded-For: 60.21.31.5:62183\r\n
....
JavaScript Object Notation: application/json
Object

@qingbozhang
Contributor Author

I enabled include_metadata, but I can't find http.client_ip in the downstream consumers.
Below is my receivers config.

receivers:
  otlp:
    protocols:
      http:
        include_metadata: true
        cors:
          allowed_origins:
            - "http://localhost*"
          allowed_headers: 
            - "*"

@jpkrohling
Member

Sorry, my information was incomplete. We do need an example, and I'm not sure we have all the pieces at this moment, but you need to enable this and use the attribute processor with a "from_context" setting: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/attributesprocessor#attributes-processor

@LolaHectorBiancaHewie

Thanks, that was the missing link. We didn't need the allowed_headers in the end. Final snippet from the yaml file is below. Thanks for all the help

receivers:
  jaeger:
    protocols:
      grpc:
      thrift_binary:
      thrift_compact:
      thrift_http:
  otlp:
    protocols:
      grpc:
        include_metadata: true
      http:
        cors:
          allowed_origins:
            - "http://localhost*"
        include_metadata: true
  zipkin:
exporters:
  jaeger:
    endpoint: "jaeger-Development:14250"
    tls:
      insecure: true
  logging:
  zipkin:
    endpoint: "http://zipkin:9411/api/v2/spans"
processors:
  attributes/rp:
    actions:
      - key: http.client_id
        from_context: X-Forwarded-For
        action: upsert
  batch:
extensions:
  health_check:
  pprof:
  zpages:
service:
  extensions: [pprof, zpages, health_check]
  pipelines:  
    traces:
      receivers: [otlp, zipkin]
      exporters: [jaeger, logging]
      processors: [attributes/rp, batch]

@jpkrohling
Member

Would you mind adding appropriate documentation here: https://github.com/open-telemetry/opentelemetry-collector/tree/main/config/confighttp

@qingbozhang
Contributor Author

Thanks for all the help. OK, I'll open a PR documenting it.

@LolaHectorBiancaHewie

LolaHectorBiancaHewie commented Mar 2, 2022

So, it occurred to me last night that this solution wasn't really going to work, as the X-Forwarded-For header can hold multiple IP addresses separated by a comma followed by a space (see https://en.wikipedia.org/wiki/X-Forwarded-For) if multiple proxy servers and/or load balancers are in use. To handle this I have changed the solution to the one below, which records the full X-Forwarded-For value but should set http.client_id to only the first IP address from the header.

receivers:

  jaeger:
    protocols:
      grpc:
      thrift_binary:
      thrift_compact:
      thrift_http:

  otlp:
    protocols:
      grpc:
        include_metadata: true
      http:
        cors:
          allowed_origins:
            - "http://localhost*"
        include_metadata: true
  zipkin:

exporters:
  jaeger:
    endpoint: "jaeger-Development:14250"
    tls:
      insecure: true

  logging:
  zipkin:
    endpoint: "http://zipkin:9411/api/v2/spans"

processors:
  attributes/rp:
    actions:
      # Note the X-Forwarded-For header.
      - key: http.x_forwarded_for
        from_context: X-Forwarded-For
        action: upsert
      # Extract the client IP address from X-Forwarded-For. Note: a capture group name cannot contain a period, so the value is stored in the placeholder http_client_ip.
      - key: http.x_forwarded_for
        pattern: ^(?P.*),*
        action: extract
      # Turn the captured client IP address from placeholder into the standard attribute format http.client_ip.
      - key: http.client_ip
        from_attribute: http_client_ip
        action: upsert
      # Remove placeholder http_client_ip.
      - key: http_client_ip
        action: delete
  batch:

extensions:
  health_check:
  pprof:
  zpages:

service:
  extensions: [pprof, zpages, health_check]
  pipelines:  
    traces:
      receivers: [otlp, zipkin]
      exporters: [jaeger, logging]
      processors: [attributes/rp, batch]
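
For anyone who wants to check the "first address only" idea outside the collector, here is a minimal Go sketch of the same logic. It is not how the attributes processor works internally. firstForwardedFor is a hypothetical helper name, and stripping the port is an assumption based on the Wireshark capture above, where the header carried 60.21.31.5:62183.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// firstForwardedFor returns the left-most entry of an X-Forwarded-For value,
// with surrounding whitespace and any trailing ":port" removed.
// Hypothetical helper for illustration; not part of the collector.
func firstForwardedFor(header string) string {
	// Entries are comma separated; the first one is the original client.
	first, _, _ := strings.Cut(header, ",")
	first = strings.TrimSpace(first)
	// Assumption: some proxies append the source port, e.g. "60.21.31.5:62183".
	if host, _, err := net.SplitHostPort(first); err == nil {
		return host
	}
	// No port present (or a bare IPv6 address): return the entry unchanged.
	return first
}

func main() {
	fmt.Println(firstForwardedFor("60.21.31.5:62183"))
	fmt.Println(firstForwardedFor("203.0.113.7, 198.51.100.2, 192.0.2.1"))
}
```

Keep in mind that X-Forwarded-For is set by whoever sends the request, so the first entry is only trustworthy if your own reverse proxy overwrites or sanitises the header.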

@qingbozhang
Contributor Author

That's great, but I prefer to store it in the DB and process it using SQL.

@LolaHectorBiancaHewie

That was something we were debating this morning before we came to this solution. For us, having the http.client_id value stored in the database in a consistent manner, regardless of multiple proxy servers, routers, or load balancers (or even none in simple setups), was more important, because it simplifies the Elasticsearch queries we use for diagnostics and support. But I can certainly see why you would take your suggested approach.

@LolaHectorBiancaHewie

Well, I just noticed that I specified http.client_id and NOT http.client_ip, which is the standard. I corrected the example above, but here it is again:

receivers:

  jaeger:
    protocols:
      grpc:
      thrift_binary:
      thrift_compact:
      thrift_http:

  otlp:
    protocols:
      grpc:
        include_metadata: true
      http:
        cors:
          allowed_origins:
            - "http://localhost*"
        include_metadata: true
  zipkin:

exporters:
  jaeger:
    endpoint: "jaeger-Development:14250"
    tls:
      insecure: true

  logging:
  zipkin:
    endpoint: "http://zipkin:9411/api/v2/spans"

processors:
  attributes/rp:
    actions:
      # Note the X-Forwarded-For header.
      - key: http.x_forwarded_for
        from_context: X-Forwarded-For
        action: upsert
      # Extract the client IP address from X-Forwarded-For. Note: a capture group name cannot contain a period, so the value is stored in the placeholder http_client_ip.
      - key: http.x_forwarded_for
        pattern: ^(?P.*),*
        action: extract
      # Turn the captured client IP address from placeholder into the standard attribute format http.client_ip.
      - key: http.client_ip
        from_attribute: http_client_ip
        action: upsert
      # Remove placeholder http_client_ip.
      - key: http_client_ip
        action: delete
  batch:

extensions:
  health_check:
  pprof:
  zpages:

service:
  extensions: [pprof, zpages, health_check]
  pipelines:  
    traces:
      receivers: [otlp, zipkin]
      exporters: [jaeger, logging]
      processors: [attributes/rp, batch]

@SaschaBrechmannVHV

pattern: ^(?P.*),*

The pattern was wrong. It should be:
pattern: ^(?P<http_client_ip>[\w-.]+).*
The capture group name was missing.

Regards, Sascha

@clintonb

clintonb commented Feb 8, 2024

Here is the correct regex: ^(?P<http_client_ip>[\d+.]+).* for IPv4 addresses
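
A quick way to sanity-check this pattern before putting it in the collector config is to run it through Go's regexp package. This is only a sketch, assuming the attributes processor evaluates patterns with the same RE2 syntax; extractClientIP is a hypothetical helper name, and the sample inputs are made up apart from the 60.21.31.5:62183 value from the Wireshark capture earlier in the thread.

```go
package main

import (
	"fmt"
	"regexp"
)

// clientIPPattern is the IPv4 pattern quoted above, with the named group
// http_client_ip that the extract action uses as the attribute key.
var clientIPPattern = regexp.MustCompile(`^(?P<http_client_ip>[\d+.]+).*`)

// extractClientIP returns the http_client_ip group, or "" when the value
// does not start with digits or dots. Hypothetical helper for illustration.
func extractClientIP(value string) string {
	m := clientIPPattern.FindStringSubmatch(value)
	if m == nil {
		return ""
	}
	return m[clientIPPattern.SubexpIndex("http_client_ip")]
}

func main() {
	samples := []string{
		"60.21.31.5:62183",          // address with a port, as in the capture
		"203.0.113.7, 198.51.100.2", // two hops, comma separated
		"2001:db8::1",               // IPv6: the pattern is IPv4-only
	}
	for _, s := range samples {
		fmt.Printf("%q -> %q\n", s, extractClientIP(s))
	}
}
```

The first two samples yield 60.21.31.5 and 203.0.113.7. The IPv6 sample yields only 2001, which shows why the pattern should be treated as IPv4-only.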
