Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(general): Scrub all fields with IP address #1725

Merged
merged 11 commits into from
Jan 12, 2023
Merged

feat(general): Scrub all fields with IP address #1725

merged 11 commits into from
Jan 12, 2023

Conversation

TBS1996
Copy link
Contributor

@TBS1996 TBS1996 commented Dec 29, 2022

This change will scrub all fields containing an IP address, rather than scrubbing certain fields known to have them

for the unit tests, testing ip scrubbing in sdk and user have been merged as it should scrub all fields.

referencing issue #1693

@TBS1996 TBS1996 marked this pull request as ready for review December 29, 2022 09:45
@TBS1996 TBS1996 requested a review from a team December 29, 2022 09:45
@TBS1996 TBS1996 changed the title Scrub all fields with IP address feat(general): Scrub all fields with IP address Dec 29, 2022
CHANGELOG.md Outdated Show resolved Hide resolved
@@ -39,7 +31,8 @@ pub fn to_pii_config(
}

if datascrubbing_config.scrub_ip_addresses {
applications.insert(KNOWN_IP_FIELDS.clone(), vec!["@anything:remove".to_owned()]);
let wildcard = SelectorSpec::Path(vec![SelectorPathItem::DeepWildcard]);
applications.insert(wildcard, vec!["@ip:replace".to_owned()]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question is do we want to keep this consistent with old behaviour and just remove them or actually switch to replacing all the IPs as you did here.

#[test]
fn test_user_ip_stripped() {
let mut data = Event::from_value(
serde_json::json!({
"user": {
"username": "secret",
"username": "73.133.27.120",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be great to test if the scrubbing works also in , e.g. breadcrumbs - if you add the message with ip inside , as it's done in test

fn test_breadcrumb_message() {

the protocol for breadcrumbs is in
pub struct Breadcrumb {
where you can see which fields are PII

relay-general/src/pii/convert.rs Outdated Show resolved Hide resolved
applications.insert(KNOWN_IP_FIELDS.clone(), vec!["@anything:remove".to_owned()]);

// checks actual contents of all fields and scrubs where there is an IP address
let wildcard = SelectorSpec::Path(vec![SelectorPathItem::DeepWildcard]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of pushing to applications with a custom selector spec for IP scrubbing, I would simply push @ip:replace to applied_rules (see code further below). It will then use the same selector as the other default rules, which I think is what we want.

{
"message": "73.133.27.120",
"data": {
"test_data": "73.133.27.120" // test deep wildcard stripping
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we talked in the meeting, can we also make sure to test that the IPs are scrubbed also from inside the strings, e.g.

"test_data": "contains ip's like 73.133.27.120 but also the text"

which should result in something like

"test_data": "contains ip's like [ip] but also the text"

or make sure this actually can work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would create a separate issue for this, as it would also the touch IP scrubbing for users who configured it manually through advanced data scrubbing.

@TBS1996 TBS1996 merged commit 156c2ae into master Jan 12, 2023
@TBS1996 TBS1996 deleted the feat/scrub_ip branch January 12, 2023 08:50
jan-auer added a commit that referenced this pull request Jan 18, 2023
* master: (35 commits)
  ref(actix): Migrate ProjectUpstream to `relay_system::Service` (#1727)
  feat(general): Add unknown SessionStatus variant (#1736)
  ref: Convert integration tests about dropping transactions to unit tests (#1720)
  release: 0.8.16
  ci: Skip redundant self-hosted E2E on library release (#1755)
  doc(changelog): Add relevant changes to python changelog (#1753)
  feat(profiling): Add profile context (#1748)
  release: 23.1.0
  profiling(fix): use an unpadded base64 encoding (#1749)
  Revert "feat(replays): Enable PII scrubbing for all organizations" (#1747)
  feat: Switch from base64 to data-encoding (#1743)
  instr(replays): Add timer metric to recording processing (#1742)
  feat(replays): Use Annotated struct definition for replay-event parsing (#1582)
  feat(sessions): Retire session duration metric (#1739)
  feat(general): Scrub all fields with IP address (#1725)
  feat(replays): Enable PII scrubbing for all organizations (#1678)
  chore(project): Add backoff mechanism for fetching projects (#1726)
  feat(profiling): Add new measurement units for profiling (#1732)
  chore(toolchain): update rust to 1.66.1 (#1735)
  ref(actix): Migrate server actor to the "service" arch (#1723)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants