Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processor/redaction] Initial implementation #6495

Merged

Conversation

leonsp-ai
Copy link
Contributor

Description: Implement a redaction processor to prevent leaking sensitive or private span attributes or span attribute values for privacy, security, and compliance purposes.

  1. Structure PR: Create the structure of a redaction processor #5274
  2. Implementation PR: this
  3. Metrics PR: TODO
  4. components.go PR: TODO

Link to tracking Issue: #4131

Testing: Test coverage for redaction processor

Documentation: Updates to the redaction processor readme

@leonsp-ai leonsp-ai requested review from a team and djaglowski December 2, 2021 16:12
Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpkrohling have you made an analysis about the difference between this processor and the "attributes" processor? What does this bring more to us? Why do we need two?

Comment on lines 24 to 27
# DryRun mode documents the changes the processor would make without
# deleting your data. You can use it to confirm that the right span
# attributes will be redacted
dry_run: false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be achieved by having a "separate" pipeline with only logging/debugging exporter. We don't have this pattern anywhere else because of that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bogdandrutu That makes sense. I've now removed the dry run mode from the PR

@jpkrohling
Copy link
Member

@jpkrohling have you made an analysis about the difference between this processor and the "attributes" processor?

The original issue has some considerations about the design choices:

AttributesProcessor could be enhanced to include these features. Right now its regex support is limited to splitting an existing attribute into multiple attributes or matching a span by one of its values, but it cannot delete all attributes that do not match a list or mask a value that matches a regex. Because this enhancement is not on the roadmap, it is unlikely to be implemented in AttributesProcessor. It also makes more sense to have all redaction or compliance features in one processor.

The last part is what convinced me that another processor might make sense, even if only for a better user experience.

@leonsp-ai
Copy link
Contributor Author

Is the make -C testbed run-tests failure in loadtest (TestBallastMemory|TestLog10kDPS) relevant? This processor shouldn't be part of the main build yet, and I think that test only concerns itself with the main build

@jpkrohling
Copy link
Member

Test restarted.

@leonsp-ai leonsp-ai requested a review from bogdandrutu December 8, 2021 20:17
@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Dec 16, 2021
@jpkrohling jpkrohling changed the title Implement the redaction processor [processor/redaction] Add the initial implementation Dec 16, 2021
@jpkrohling jpkrohling changed the title [processor/redaction] Add the initial implementation [processor/redaction] Initial implementation Dec 16, 2021
@jpkrohling
Copy link
Member

@bogdandrutu, should this component be accepted?

@github-actions github-actions bot removed the Stale label Dec 17, 2021
@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Dec 31, 2021
@jpkrohling
Copy link
Member

jpkrohling commented Jan 3, 2022

@leonsp-ai, could you please go over #6926 and see if this component has what is listed there? The discussion about whether this component would be accepted is also kind of covered there: if you can find an approver to act as sponsor, this would be accepted.

@jpkrohling jpkrohling removed the Stale label Jan 3, 2022
@leonsp-ai leonsp-ai force-pushed the lp/implement-redaction-processor branch from be007fd to 87f4df1 Compare January 5, 2022 19:02
@leonsp-ai
Copy link
Contributor Author

@jpkrohling Sure, I can do that. Finding a sponsor is tricky, as I am someone new entering the community. I'll try reaching out on the CNCF Slack. For the first PR, I worked with @mx-psi and @dmitryax

Checklist from #6926:

  • TBD: Mention the sponsor in the PR description
  • ✅ A complete config.go, which will show how your component will look like to users
    README.md with some basic information about your component. It doesn't have to be complete, but should contain at least:
    • ✅ Purpose and use-cases
    • ✅ Telemetry data types supported
  • ✅ versions.yaml
  • ✅ CODEOWNERS with at least the sponsor and the component's author: For now I've listed myself, @jpkrohling, @mx-psi , and @dmitryax
  • ✅ internal/components tests
  • TBD: go.mod at the root of the repository and cmd/configschema
    • I think this would make more sense in the last PR rather than the first or second
  • ✅ dependabot
  • ✅ The skeleton of the component, potentially using the helpers
  • ✅ Makefile

@leonsp-ai leonsp-ai force-pushed the lp/implement-redaction-processor branch from d7f4051 to 2a0ce9f Compare January 12, 2022 15:55
@jpkrohling
Copy link
Member

@bogdandrutu, should this be accepted? I think we talked about this before and we would accept this one even though the transform processor (or some of its parts) could be reused in a future implementation.

Copy link
Member

@dmitryax dmitryax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit, otherwise LGTM

# the processor adds to the spans when it redacts or masks other
# attributes. In some contexts a list of redacted attributes leaks
# information, while it is valuable when integrating and testing a new
# configuration. Possible values are `debug`, `info`, and `silent`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should describe what's the expected behavior for each value

@jpkrohling
Copy link
Member

ping @bogdandrutu

@github-actions
Copy link
Contributor

github-actions bot commented Feb 2, 2022

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Feb 2, 2022
@dmitryax dmitryax removed the Stale label Feb 2, 2022
@bogdandrutu
Copy link
Member

@jpkrohling who is the sponsor? who is going to maintain this? Why the "translator" is not good (or this becomes "a function" in that)?

@jpkrohling
Copy link
Member

jpkrohling commented Feb 4, 2022

who is the sponsor? who is going to maintain this?

I thought @mx-psi and @dmitryax were volunteers, but seeing myself on that list as well made me realize that those are not sponsors.

@leonsp-ai, were you able to find a sponsor yet?

Why the "translator" is not good (or this becomes "a function" in that)?

Not sure I get what you meant here. Are you referring to the attributes processor? Or is it about the new transform processor?

@bogdandrutu bogdandrutu dismissed their stale review March 16, 2022 16:05

I trust @dmitryax that this will have a consistent query language as the transform processor

@william-tran
Copy link

@dmitryax I updated the README with your feedback. cc @jpkrohling

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Mar 16, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: leonsp-ai / name: Leo Petrazickis (00cc7f098cf0a330ecd09cd2e5d43c2e9069da6d, e64c91f31ee72159d2372cf89469bb6f6e489711, 0d28b4239910c5d2258511702bf5c884fa787b88, 5baab4fbb8a59a7608338526d7ebcd8880e4fd5a, 2d243e3dd57a0fcc3d5a7426887ba43069ee50af, 39a070ea135a9dbced370df21199b33915b09d7e)
  • ✅ login: william-tran / name: Will Tran (ec9378123371ce5bd94aba53dba1ba6ec23c6651)

@jpkrohling
Copy link
Member

CI seems to be stuck. And it seems that I can't rebase this to get a new CI build, so, you'd need to rebase this yourself. In the process, please replace my entry in the code owners with @MovieStoreGuy (current code owner in the version from the main branch).

@william-tran william-tran force-pushed the lp/implement-redaction-processor branch 2 times, most recently from ee5260d to 3f6ec91 Compare March 18, 2022 12:16
@william-tran
Copy link

Looking back I see @MovieStoreGuy took care of some unowned modules (thank you!), but now that we have some significant contributors to this module I've listed them instead.

CHANGELOG.md Outdated Show resolved Hide resolved
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Mar 22, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: leonsp-ai / name: Leo Petrazickis (149c69d0ee1d31da0238833c84d94a47bd0bc52b, ecc635d33880b5445550ddbe630e13870ee07df2, 7676f4a7185c69772fb5d60e65d8b6c81d67eea8, 6017939f960ec3295b7a459e0973cd1071c64bff, 8844a59e27228b332752c45fe744333388b7990c, e9d65b5cc66989e5fb57b2fbc58522598ea0cf93, 77eb8183acb11edc930b95b2214f62411c0c0ee0)

@leonsp-ai
Copy link
Contributor Author

leonsp-ai commented Mar 22, 2022

Edit: @william-tran I'm addressing the EasyCLA regression by squashing the previously covered commits

Will's moving on to exciting and related new things as of March 18

@leonsp-ai leonsp-ai force-pushed the lp/implement-redaction-processor branch 2 times, most recently from 48feb45 to 058dd40 Compare March 22, 2022 18:52
@jpkrohling
Copy link
Member

@leonsp-ai, I appreciate the sentiment, but could you please do the load test changes in a separate PR? Failures on those tests won't prevent this PR from being merged.

@leonsp-ai leonsp-ai force-pushed the lp/implement-redaction-processor branch from a9d82f4 to edb9313 Compare March 22, 2022 19:51
@leonsp-ai
Copy link
Contributor Author

@jpkrohling Sure. I've moved those changes out into #8789

@leonsp-ai leonsp-ai force-pushed the lp/implement-redaction-processor branch from edb9313 to 31c7bf4 Compare March 22, 2022 21:24
@jpkrohling jpkrohling merged commit 8058c75 into open-telemetry:main Mar 23, 2022
@jpkrohling
Copy link
Member

This has been approved a few times already, merging. If there are anything pending, it can be done in a follow-up PR. Thanks!

@leonsp-ai leonsp-ai deleted the lp/implement-redaction-processor branch March 23, 2022 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants