DBAPI: Add param to control capturing of db.statement.parameters #1247

Closed

Conversation

stschenk
Contributor

@stschenk stschenk commented Oct 15, 2020

Description

Switched DBAPI so that it no longer captures the parameters of parameterized queries by default. Capturing them is a problem in production environments because they can contain sensitive information such as session tokens and hashed passwords.

Added a parameter to the instrumentation constructor that allows this feature to be turned on.

I am marking this as a breaking change since I am removing the db.statement.parameters span attribute by default and something might depend on it.
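
A minimal sketch of the opt-in behavior described above, assuming a stand-in helper rather than the actual DBAPI instrumentation code. The function name `build_span_attributes` and the way parameters are stringified are assumptions for illustration; only the `db.statement.parameters` attribute name and the `capture_parameters` flag come from this PR.

```python
from typing import Any, Dict, Optional, Sequence


def build_span_attributes(
    statement: str,
    parameters: Optional[Sequence[Any]] = None,
    capture_parameters: bool = False,
) -> Dict[str, str]:
    """Return span attributes for one query (hypothetical helper)."""
    attributes = {"db.statement": statement}
    # Parameters may hold secrets (session tokens, password hashes),
    # so they are recorded only when the caller opts in.
    if capture_parameters and parameters:
        attributes["db.statement.parameters"] = str(tuple(parameters))
    return attributes


query = "SELECT * FROM users WHERE token = %s"
default_attrs = build_span_attributes(query, ("secret-token",))
opt_in_attrs = build_span_attributes(
    query, ("secret-token",), capture_parameters=True
)
```

With the flag left at its default, the statement text is still recorded but the bound values are not.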

Fixes https://github.com/open-telemetry/opentelemetry-python/issues/1215

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Unit Tests

Checklist:

  • [x] Followed the style guidelines of this project
  • [x] Changelogs have been updated
  • [x] Unit tests have been added
  • Documentation has been updated

@stschenk stschenk requested review from a team, owais and lzchen and removed request for a team October 15, 2020 21:40
@NathanielRN
Contributor

Hello! I have a PR to move some files you have in this PR to the Contrib repo, please let me know if this gets merged before the PR in the Contrib repo. Please see https://github.com/open-telemetry/opentelemetry-python-contrib/pulls/87

"testcomponent",
"testtype",
connection_attributes,
capture_parameters=True,
)
mock_connection = async_call(
Contributor

Explicitly check the span attribute existence in this test as well.
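
One way such a check could look, sketched with stand-ins so it runs on its own. `FakeSpan` and `record_query` are hypothetical and are not part of the project's test utilities; the real tests would inspect the span produced by the mocked tracer instead.

```python
import unittest


class FakeSpan:
    """Stand-in for a recorded span; only exposes an attributes dict."""

    def __init__(self, attributes):
        self.attributes = attributes


def record_query(statement, parameters, capture_parameters=False):
    """Hypothetical stand-in for the traced execute call."""
    attributes = {"db.statement": statement}
    if capture_parameters:
        attributes["db.statement.parameters"] = str(parameters)
    return FakeSpan(attributes)


class TestParameterCapture(unittest.TestCase):
    def test_parameters_absent_by_default(self):
        span = record_query("SELECT %s", ("value",))
        self.assertNotIn("db.statement.parameters", span.attributes)

    def test_parameters_present_when_enabled(self):
        span = record_query("SELECT %s", ("value",), capture_parameters=True)
        self.assertEqual(
            span.attributes["db.statement.parameters"], "('value',)"
        )
```

Asserting both the presence and the absence of the attribute guards against the default silently flipping later.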

@@ -199,7 +199,11 @@ def test_span_succeeded(self):
"user": "user",
}
db_integration = AiopgIntegration(
self.tracer, "testcomponent", "testtype", connection_attributes
self.tracer,
Contributor

Aren't there a lot of other instrumentations that use dbapi as well (mysql, psycopg2, etc)? I think tests would need to be added for those as well because this is a breaking change.

Contributor Author

@stschenk stschenk Oct 26, 2020

I would argue that since the other instrumentations are already not testing for "db.statement.parameters", we would not need to modify the tests to do so now.

Also, db.statement.parameters is not even part of the specification: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/database.md#call-level-attributes. Why is it being collected at all?

Contributor

I would argue that since the other instrumentations are already not testing for "db.statement.parameters", we would not need to modify the tests to do so now.

To ensure completeness of the feature, tests should be added for each instrumentation. The original authors should have tested for this, but since your change will affect the behavior of these instrumentations, it would be good to have the tests there as well.

why is it being collected at all?

That is a good question. Perhaps this should be the discussion topic to address first? @codeboten

@@ -211,6 +211,7 @@ def __init__(
connection_attributes=None,
version: str = "",
tracer_provider: typing.Optional[TracerProvider] = None,
capture_parameters: bool = False,
Contributor

How can this be configured by the user? As well as the instrumentations that depend on this?

Contributor Author

Yeah, I did not make this configurable for the user. I initially thought I was going to, but then came across PR #854, which removes the capturing of "db.statement.parameters" for AsyncPGInstrumentor. Since that approach was accepted, I thought it would be fine for DBAPI too.

Would it be better to drop the constructor param and instead introduce a user-settable property?

Thanks!

Contributor

@lzchen lzchen Oct 28, 2020

Having users be able to configure it is fine. The reason the AsyncPGInstrumentor approach works is that the configuration property is exposed to users (they can call the AsyncPGInstrumentor(capture_parameters=True) constructor to instrument). In this case, however, it lives in DatabaseApiIntegration, which users do not interact with (for dbapi they use trace_integration(...), and for something like mysql they use MySQLInstrumentor()).
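
The point can be sketched as follows: the flag only becomes configurable if every user-facing entry point forwards it down to the internal class. All names ending in `Sketch` are hypothetical stand-ins, and their signatures are simplified assumptions rather than the real dbapi or MySQL instrumentation APIs.

```python
class DatabaseApiIntegrationSketch:
    """Stand-in for the internal integration class users never touch."""

    def __init__(self, name, capture_parameters=False):
        self.name = name
        self.capture_parameters = capture_parameters


def trace_integration_sketch(name, capture_parameters=False):
    """Stand-in for a user-facing entry point that forwards the flag."""
    return DatabaseApiIntegrationSketch(
        name, capture_parameters=capture_parameters
    )


class InstrumentorSketch:
    """Stand-in for a library-specific instrumentor built on dbapi."""

    def instrument(self, capture_parameters=False):
        # Without this pass-through, users of the instrumentor could
        # never reach the option on the internal class.
        return trace_integration_sketch(
            "mysql", capture_parameters=capture_parameters
        )


integration = InstrumentorSketch().instrument(capture_parameters=True)
default_integration = InstrumentorSketch().instrument()
```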

Contributor

@lzchen lzchen left a comment

Some comments.

@stschenk
Contributor Author

@lzchen / @owais - I have put together a different, more conservative, backward-compatible PR that lets the user turn off the collection of db.statement.parameters through an env var: #1306. If that one seems fine, I will go ahead and abandon this PR.
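
A rough sketch of what an env-var switch of this kind could look like, assuming capture stays on unless the user disables it. The variable name `EXAMPLE_DBAPI_CAPTURE_PARAMETERS` and the accepted values are made up for illustration; the discussion here does not state what #1306 actually uses.

```python
import os

# Hypothetical variable name; the PR discussion does not state the real one.
ENV_VAR = "EXAMPLE_DBAPI_CAPTURE_PARAMETERS"


def capture_parameters_enabled(environ=os.environ, default=True):
    """Return whether parameter capture is enabled.

    The default is True here to mirror the backward-compatible intent:
    existing behavior is kept unless the user opts out.
    """
    raw = environ.get(ENV_VAR)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes")
```

Passing the environment as an argument keeps the helper easy to test without mutating the real process environment.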

srikanthccv pushed a commit to srikanthccv/opentelemetry-python that referenced this pull request Nov 1, 2020
* feat: adding json over http for collector exporter

* feat: updating readme and adding headers options in config for json over http

* chore: reviews and few small cleanups

* chore: aligning type for headers

* chore: fixing doc

* chore: unifying types for headers

* chore: reviews

* chore: adding validation for headers, and making the types correct this time

* chore: linting

* chore: linting

* chore: fixes after merge

* chore: reviews

* chore: merge branch 'master' into collector_json
@lzchen
Contributor

lzchen commented Nov 2, 2020

@stschenk
How come we can't go with the same mechanism that AsyncPGInstrumentor is using? With the env vars, now we have two different ways of setting the same feature, which is confusing for the user.

@stschenk
Contributor Author

stschenk commented Nov 2, 2020

@lzchen - Good question... I misinterpreted your previous comment about the user needing to be able to configure the capture of db.statement.parameters as meaning there should be an env prop.

I have changed this PR a bit by doing the following:

  • Added the ability to configure the feature through the trace_integration method, with capture off by default for DBAPI.
  • Left the other DB instrumentations as they are; they will continue to capture db.statement.parameters by default.

@@ -127,6 +133,7 @@ def wrap_connect_(
connection_attributes=connection_attributes,
version=version,
tracer_provider=tracer_provider,
capture_parameters=capture_parameters,
)
return db_integration.wrapped_connection(wrapped, args, kwargs)

Contributor

I believe instrument_connection needs to have this parameter as well.

Contributor Author

makes sense.

@@ -98,6 +101,7 @@ def wrap_connect(
connection_attributes: typing.Dict = None,
version: str = "",
tracer_provider: typing.Optional[TracerProvider] = None,
capture_parameters: bool = True,
Contributor

The default should be False for all instrumentations that use dbapi.

Contributor Author

ok

@codeboten codeboten added the instrumentation Related to the instrumentation of third party libraries or frameworks label Nov 5, 2020
@lzchen
Contributor

lzchen commented Nov 5, 2020

@stschenk
We are moving all of our instrumentation code into the contrib repo. Do you mind opening up this PR there and I can close this one?

@stschenk
Contributor Author

stschenk commented Nov 6, 2020

Sure, will do it first thing tomorrow

@stschenk
Contributor Author

stschenk commented Nov 6, 2020

This has been moved to open-telemetry/opentelemetry-python-contrib#156.

I will go ahead and close this PR

@stschenk stschenk closed this Nov 6, 2020
Labels
instrumentation Related to the instrumentation of third party libraries or frameworks
5 participants