Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(scrubbing): Scrub breadcrumb.data.http.query & data.other with default scrubbers #1901

Closed
wants to merge 12 commits into from

Conversation

iker-barriocanal
Copy link
Contributor

@iker-barriocanal iker-barriocanal commented Mar 1, 2023

Ref: #1855

breadcrumb.data.http.query may contain sensitive data, which is currently not scrubbed. This PR makes relay to scrub that field with default data scrubbers.

data.other is also scrubbed to be consistent with the current IP address scrubbing in breadcrumb's data. This should not cause any problems with #1593, where the biggest issue was scrubbing span descriptions (which they are still not scrubbed).

@iker-barriocanal iker-barriocanal self-assigned this Mar 1, 2023
@@ -1123,7 +1123,7 @@ pub struct DataElement {
#[metastructure(pii = "maybe")]
pub http: Annotated<HttpElement>,

#[metastructure(additional_properties, retain = "true", pii = "maybe")]
#[metastructure(additional_properties, retain = "true", pii = "true")]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change doesn't seem to affect the changes made to Span, and is motivated by this test (which fails otherwise):

fn test_ip_stripped() {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The data property in Breadcrumb is also marked as pii == true.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If breadcrumb.data was already pii = true, why do we even need to change its type? Does pii = true not descend into the Value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't dove into the code, but apparently, it does not descend -- the test linked above would have passed on the first attempt. Maybe it acts the other way around, where pii = false means no pii checks on child elements? I'd have to check the code; but I don't think that's an issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified the "span" test, I think we now scrub too much in span.data:

"do_not_scrub": "5105105105105100"

An easy way around this would be to have separate SpanData and BreadcrumbData structs, instead of a common DataElement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change

  1. breadcrumb.data.http.query is scrubbed by default
  2. breadcrumb.data.whatever is scrubbed by default
  3. breadcrumb.data.http.whatever is not scrubbed by default

Is this really the behavior we want? Because if we also want the 3rd case to be scrubbed by default (ie. scrub everything in data), we might as well change the type of breadcrumb.data back to Value. test_breadcrumb_data_object_is_scrubbed and test_breadcrumb_data_string_is_scrubbed seem to pass just fine with that simplified schema.

@iker-barriocanal iker-barriocanal marked this pull request as ready for review March 2, 2023 11:17
@iker-barriocanal iker-barriocanal requested a review from a team March 2, 2023 11:17
@iker-barriocanal iker-barriocanal changed the title feat(scrubbing): Scrub breadcrumb.data.http.query with default scrubbers feat(scrubbing): Scrub breadcrumb.data.http.query & data.other with default scrubbers Mar 7, 2023
@@ -1123,7 +1123,7 @@ pub struct DataElement {
#[metastructure(pii = "maybe")]
pub http: Annotated<HttpElement>,

#[metastructure(additional_properties, retain = "true", pii = "maybe")]
#[metastructure(additional_properties, retain = "true", pii = "true")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change

  1. breadcrumb.data.http.query is scrubbed by default
  2. breadcrumb.data.whatever is scrubbed by default
  3. breadcrumb.data.http.whatever is not scrubbed by default

Is this really the behavior we want? Because if we also want the 3rd case to be scrubbed by default (ie. scrub everything in data), we might as well change the type of breadcrumb.data back to Value. test_breadcrumb_data_object_is_scrubbed and test_breadcrumb_data_string_is_scrubbed seem to pass just fine with that simplified schema.

@iker-barriocanal
Copy link
Contributor Author

Closing, see #1915 (comment).

@iker-barriocanal iker-barriocanal deleted the iker/feat/pii-scrub-breadcrumb-query branch April 7, 2023 08:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants