Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Http Instrumentation] Make it easier to scrub sensitive URI details from output #1791

Open
stephenjust opened this issue Sep 8, 2021 · 7 comments
Labels
comp:instrumentation.http Things related to OpenTelemetry.Instrumentation.Http enhancement New feature or request

Comments

@stephenjust
Copy link

Feature Request

Before opening a feature request against this repo, consider whether the feature
should/could be implemented in the other OpenTelemetry client
libraries
. If so, please open an issue on
opentelemetry-specification

first.

Is your feature request related to a problem?

If so, provide a concise description of the problem.

The HTTP instrumentation package currently writes out the full request path for most requests, the exception being requests that contain Uri.UserInfo. This is not sufficient for many resource types. For example, the Azure Storage SDK's requests show up in logs with a SAS token appended.

For example:
https://mystorage.blob.core.windows.net/container/myfile.txt?sv=2019-07-07&sr=c&sig=placeholder&se=2021-09-08T22%3A03%3A23Z&sp=r&api-version=2019-07-07&

I can also imagine other scenarios where the URI might otherwise contain sensitive information that cannot be logged.

Describe the solution you'd like:

What do you want to happen instead? What is the expected behavior?

It would be helpful if OpenTelemetry had a configurable high-performance URL scrubber for anywhere URLs could be logged,
to make it as easy as possible for developers to sanitize their log output. This would support (minimally) blocking out any specified query string parameters.

Describe alternatives you've considered.

Which alternative solutions or features have you considered?

In our project today, we configure an enricher to run a Regex replace on all URLs we log, but it feels like there is room for improvement on the default behavior.

Additional Context

Add any other context about the feature request here.

This is definitely in the "nice to have" category. There are ways to solve the problem with the current shipped libraries.

@stephenjust stephenjust added the enhancement New feature or request label Sep 8, 2021
@reyang
Copy link
Member

reyang commented Sep 17, 2021

@cijothomas I think this is not an instrumentation feature (e.g. it probably won't make sense for individual instrumentation libraries to do similar things). Is there a way to move this issue to the contrib repo?

@dpk83
Copy link

dpk83 commented Oct 18, 2021

@reyang @cijothomas
This is an ask that we surely need to be able to use the instrumentation libraries. Here are a few options that would be good to consider (let's take an example of AspNetCore instrumentation library

  • Expose options to enable/disable individual properties being logged by the instrumentation library e.g. http_path, httpUrl etc. and/or
  • Provide a way to scrub sensitive parts from the URL/Path and/or
  • Provided the fact that http_route, http_path and http_url carry duplicate information, only have http_route and add parameters as separate columns (just like how structure logging works) and then provide a way for services to specify the fields they want to scrub/redact.

@reyang
Copy link
Member

reyang commented Oct 19, 2021

Couple things to consider:

  1. scrubbing can be done at different places - instrumentation library, BaseProcessor<T> (for Activity and LogRecord), BaseExporter<T>, agent/collector, ingestion.
  • doing this on agent/collector/ingestion avoids the need to implement the same logic across all programming languages, and is probably better for enforcement, however it doesn't prevent credential / privacy data from leaving the process / local box.
  • doing this in SDK exporter makes it possible to use async processing, which won't block the hot-path (e.g. doing scrubbing in an instrumentation library or processor will slow down every Log/Activity call).
  1. scrubbing might introduce data problems if not done carefully.
  • if the data processed by a scrubber is used for metrics pre-agg, it might cause different points being aggregated to the wrong timeseries (unless the scrubber is doing an exact 1:1 mapping).
  • if the data size increased after processed by a scrubber, it might cause additional data loss due to truncation / limit.

@pellared
Copy link
Member

pellared commented Oct 20, 2021

Maybe it would be good to discuss under https://github.com/open-telemetry/opentelemetry-specification?

For security reasons the db.statement was made conditional. Maybe we could use a similar approach for http sem conv?

In general it should be possible to disable (or sanitize if possible) all attributes that can come from "user input" as they may contain sensitive data.

@vishweshbankwar vishweshbankwar transferred this issue from open-telemetry/opentelemetry-dotnet May 14, 2024
@Kielek Kielek added the comp:instrumentation.http Things related to OpenTelemetry.Instrumentation.Http label May 17, 2024
@Kielek
Copy link
Contributor

Kielek commented May 17, 2024

Closing. Latest released brings OTEL_DOTNET_EXPERIMENTAL_HTTPCLIENT_DISABLE_URL_QUERY_REDACTION to redact all data. It can be disabled by mentioned environmental variable.

@Kielek Kielek closed this as completed May 17, 2024
@swythan
Copy link

swythan commented May 20, 2024

Closing. Latest released brings OTEL_DOTNET_EXPERIMENTAL_HTTPCLIENT_DISABLE_URL_QUERY_REDACTION to redact all data. It can be disabled by mentioned environmental variable.

That isn't equivalent.

  1. It defaults the other way. That setting prevents the reaction of all query parameters that was just defaulted to being enabled. (I'll leave the heated discussions to the other threads on that one.)
  2. What I want is to be able to redact only the sensitive query parameters. The original request says configurable so I don't think I'm alone in that.

@Kielek
Copy link
Contributor

Kielek commented May 20, 2024

@swythan, reopening per your request.

@Kielek Kielek reopened this May 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:instrumentation.http Things related to OpenTelemetry.Instrumentation.Http enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

6 participants