Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backend TCF Purpose Override Support #4464

Merged
merged 16 commits into from
Dec 2, 2023

Conversation

pattisdr
Copy link
Contributor

@pattisdr pattisdr commented Nov 28, 2023

Closes #PROD-1426

Description Of Changes

Adds global support for Purpose Overrides.

When enabled, a customer can opt to exclude a purpose altogether or specify a legal basis to be set for that purpose globally. This should affect the contents of the TCF Experience. Instead of storing this override at the System/Privacy Declaration level, we calculate it when building the TCF Experience.

Code Changes

  • Adds a new consent override_vendor_purposes config
  • Adds a tcf_purpose_overrides table to store overrides for the 11 Purposes
  • Adds default override records to the database on startup (which don't have an effect until enabled later), and even then there is no effect unless customers actually specify override details
  • Adds purpose hybrid properties to PrivacyDeclaration since we don't store that directly but this is needed for joins
  • Update the base query that is used to build the TCF Experience. Join a subquery that contains the legal basis override. Update the response that this method returns since we also need to return legal basis filters because they touch columns on the subquery. We can't generate them after-the-fact
  • Adds a separate method on the Privacy Declaration for use in returning this in the System response

Steps to Confirm

Pre-Merge Checklist

…ons that will form the basis of the TCF experience.

- Define hybrid property PrivacyDeclaration.overridden_legal_basis_for_processing where class-level version is used for filtering matching_privacy_declarations, so rows are properly filtered before being handed to the TCF Experience builder.
- Add temporary override_vendor_purposes config although this may need to be updated with other work that is going in.
…h purpose. The default configuration does not actually override any purposes or legal bases. These also aren't used at all if the the config variable override_vendor_purposes is False.
Copy link

vercel bot commented Nov 28, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
fides-plus-nightly ⬜️ Ignored (Inspect) Visit Preview Dec 2, 2023 1:52pm

Copy link

cypress bot commented Nov 28, 2023

Passing run #5452 ↗︎

0 4 0 0 Flakiness 0

Details:

Merge 563d52d into fc490d6...
Project: fides Commit: 92136cc9ff ℹ️
Status: Passed Duration: 00:39 💡
Started: Dec 2, 2023 2:04 PM Ended: Dec 2, 2023 2:04 PM

Review all test suite changes for PR #4464 ↗︎

Copy link

codecov bot commented Nov 28, 2023

Codecov Report

Attention: 11 lines in your changes are missing coverage. Please review.

Comparison is base (fc490d6) 87.12% compared to head (563d52d) 87.11%.

Files Patch % Lines
src/fides/api/app_setup.py 11.11% 8 Missing ⚠️
src/fides/config/consent_settings.py 50.00% 1 Missing and 1 partial ⚠️
src/fides/api/util/consent_util.py 85.71% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4464      +/-   ##
==========================================
- Coverage   87.12%   87.11%   -0.02%     
==========================================
  Files         330      331       +1     
  Lines       20410    20478      +68     
  Branches     2629     2639      +10     
==========================================
+ Hits        17782    17839      +57     
- Misses       2163     2172       +9     
- Partials      465      467       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

…ns use async sessions so they can be accessed via existing System API routes.

Current strategy is to stash these on the declaration under a new key since declaring the hybrid_property directly on the Privacy Declaration Schema causes coroutine issues when the PrivacyDeclaration is serialized under "serialize_response"
src/fides/api/app_setup.py Outdated Show resolved Hide resolved
src/fides/api/models/sql_models.py Outdated Show resolved Hide resolved
src/fides/api/models/sql_models.py Outdated Show resolved Hide resolved
src/fides/api/models/sql_models.py Outdated Show resolved Hide resolved
src/fides/api/api/v1/endpoints/system.py Outdated Show resolved Hide resolved
src/fides/api/util/tcf/tcf_experience_contents.py Outdated Show resolved Hide resolved
src/fides/api/util/tcf/tcf_experience_contents.py Outdated Show resolved Hide resolved
src/fides/api/util/tcf/tcf_experience_contents.py Outdated Show resolved Hide resolved
@pattisdr pattisdr changed the title Backend Publisher Purpose Overrides POC Backend Publisher Purpose Overrides Nov 29, 2023
@pattisdr
Copy link
Contributor Author

@adamsachs when you get a chance, could you do an early pass of this on just the approach?

Namely, I add a new hybrid class-level property with all the legal basis override logic that adds the appropriate "case" statement to the Privacy Declaration query used by the TCF Experience.

So as requested, the overrides aren't stored on the System themselves, but the override is still calculated fairly early on in the process so it is used downstream in 1) The TCF Experience 2) Building backend TC strings 3) Calculating "relevant_systems" which are stored for consent reporting when consent is saved.

I like this approach, the main thing I don't like about this PR is the messiness of including the override property in the System response.

It can be tested in this Plus PR https://github.com/ethyca/fidesplus/pull/1240 if interested

Copy link
Contributor

@adamsachs adamsachs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pattisdr broadly, i really like the approach. i'll outline the broad points that i liked, and then the couple of concerns i've got. those aren't meant to be prescriptive, just for consideration!

what i like:

  • the data model for the overrides looks as i'd expect, you've captured the requirements there efficiently and clearly
  • impressive work on the sqlalchemy magic (would it be fair to call it "alchemy"? 😬), especially with some of the async paradigms! i really like how much of this is encapsulated within the model, keeps the logic centralized and easy to parse (or adjust as needed)
  • it's great how few changes are needed to the actual TCF overlay generation code. filtering as early as possible, and putting the computations on the model, keeps us from adding complexity to the overlay generation code itself, and i think that's a big win!

the broad concerns i have:

  • as you've pointed out, performance! i frankly have no idea how all the (composite) hybrid properties affect performance, or really how they get translated into underlying DB queries. this probably deserves its own discussion on the next iteration of the PR -- and i'd like to get a better understanding of the functionality generally - so i won't try to tackle it all here...but generally:
    • i think we can live with some performance hits on the APIs that return System information to be used in the consent mgmt UI, since i think/hope we'll have some paged endpoints to support that UI (need to sync up on that, as a separate matter).
    • i'd focus my initial attention on whether the TCF contents performance is significantly impacted by this update, as that is not an API that we can easily break apart in any way. maybe you've already verified it, or maybe my concern around this is totally misguided because i just don't have a great mental model for how the composite hybrid properties are worked into the db queries in the ORM layer.
  • i've got a slight concern for maintainability purposes that we're overloading the overridden_legal_basis_for_processing property to communicate information about exclusions on the level of a PrivacyDeclaration along with legal basis information. i see there are benefits to how you've implemented this, but (at least in how i'm thinking about it now), i get a bit worried it may be unclear if/when we want to come back to this down the road. some more details in the comments, but would be happy to talk this over separately to get your thoughts.

src/fides/config/consent_settings.py Show resolved Hide resolved
src/fides/api/api/v1/endpoints/system.py Outdated Show resolved Hide resolved
src/fides/api/api/v1/endpoints/system.py Outdated Show resolved Hide resolved
src/fides/api/util/tcf/tcf_experience_contents.py Outdated Show resolved Hide resolved
src/fides/api/models/sql_models.py Outdated Show resolved Hide resolved
src/fides/api/util/tcf/tcf_experience_contents.py Outdated Show resolved Hide resolved
@pattisdr
Copy link
Contributor Author

Thanks very much for your thoughtful comments @adamsachs

…her than relying on hybrid properties.

- Remove all the hybrid properties from the PrivacyDeclaration model except for "purpose" which is still applicable
- Rename/adapt a new method on the PrivacyDeclaration get_publisher_legal_basis_override() for use in supplementing the system response
- For matching privacy declarations return override as "overridden_legal_basis_for_processing"
… filter cannot be defined up front anymore (they need access to the override legal basis subquery), return these filters from the matching privacy declarations method. This method has been renamed to "get_tcf_base_query_and_filters".
- Revert adding overrides to base system endpoints - this will go on another endpoint instead
- Update database annotations
@pattisdr pattisdr changed the title POC Backend Publisher Purpose Overrides Backend TCF Purpose Override Support Nov 30, 2023
@pattisdr pattisdr marked this pull request as ready for review November 30, 2023 23:41
Copy link
Contributor

@adamsachs adamsachs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pattisdr this is also looking great, i like where it's landed! great job on this, the end result is pretty simple given the trickiness involved in fitting this into our existing framework.

just a couple of nits and some noncritical questions that i'd like to understand at some point, but on the whole, i don't think this needs any significant changes!

Comment on lines +561 to +571
Returns the legal basis for processing that factors in global purpose overrides if applicable.

Original legal basis for processing is returned where:
- feature is disabled
- declaration's legal basis is not flexible
- no legal basis override specified

Null is returned where:
- Purpose is excluded (this mimics what we do in the TCF Experience, which causes the purpose to be removed entirely)

Otherwise, we return the override!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very clear and helpful docstring! 🙏

from what i can tell, this is only used in tests - do you see it being used in application code going forward? is this what you imagine we'll end up leveraging for https://ethyca.atlassian.net/browse/PROD-1428?

i do get a little bit concerned about the codepath here being entirely separate from get_legal_basis_override_subquery(), even though they're effectively encoding the same logic. but i realize that it may not be possible to consolidate between the two contexts. even a docstring note that we should keep the logic between these two functions aligned could be helpful to provide a heads up if anyone needs to tweak this down the road?

in general it'd be great to know how you see us supporting https://ethyca.atlassian.net/browse/PROD-1428 - i'm not 100% sure how this will fit in! (but probably me just not seeing what's in front of me 😄)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, left this here in case it was useful for PROD-1428! I haven't had a chance to look at the new endpoint that is relevant for PROD-1428 but I see this going one of two ways!

  1. Use this method for instance-level lookups. For each PrivacyDeclaration, call this method (or something similar) and stash this on the declaration object at runtime, similar to what I was doing in the previous draft of this PR. This assumes that an async session is used in the new endpoint.
  2. Use the get_legal_basis_overide_subquery() for class-level lookups. This subquery could be joined with the system/privacy declaration query for the endpoint.

Before merge, I could dig further, and remove this if we already know that method won't be used.

@@ -514,6 +529,81 @@ def create(
"""Overrides base create to avoid unique check on `name` column"""
return super().create(db=db, data=data, check_name=check_name)

@hybrid_property
def purpose(self) -> Optional[int]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm sort of curious how these get compiled into the generated SQL queries! if you have something handy i'd love to see, but no need to go out of your way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just a CASE statement added to the other SELECT attributes, it's not bad! We can explore adding purpose too on the declaration at some point in the future. Cons are keeping data use/purpose in sync though.

SELECT
  ...
  CASE 
    WHEN (privacydeclaration.data_use = %(data_use_1)s) THEN %(param_1)s 
    WHEN (privacydeclaration.data_use = %(data_use_2)s) THEN %(param_2)s 
    WHEN (privacydeclaration.data_use = %(data_use_3)s) THEN %(param_3)s 
    WHEN (privacydeclaration.data_use = %(data_use_4)s) THEN %(param_4)s 
    WHEN (privacydeclaration.data_use = %(data_use_5)s) THEN %(param_5)s 
    WHEN (privacydeclaration.data_use = %(data_use_6)s) THEN %(param_6)s 
    WHEN (privacydeclaration.data_use = %(data_use_7)s) THEN %(param_7)s 
    WHEN (privacydeclaration.data_use = %(data_use_8)s) THEN %(param_8)s 
    WHEN (privacydeclaration.data_use = %(data_use_9)s) THEN %(param_9)s 
    WHEN (privacydeclaration.data_use = %(data_use_10)s) THEN %(param_10)s 
    WHEN (privacydeclaration.data_use = %(data_use_11)s) THEN %(param_11)s 
    WHEN (privacydeclaration.data_use = %(data_use_12)s) THEN %(param_12)s 
    WHEN (privacydeclaration.data_use = %(data_use_13)s) THEN %(param_13)s 
    WHEN (privacydeclaration.data_use = %(data_use_14)s) THEN %(param_14)s 
  END AS purpose,
  ....

@pattisdr
Copy link
Contributor Author

pattisdr commented Dec 1, 2023

Thank you for review @adamsachs and for your earlier comments that helped shape this!

@pattisdr pattisdr merged commit ed32558 into main Dec 2, 2023
41 of 42 checks passed
@pattisdr pattisdr deleted the PROD-1426_publisher_override_support branch December 2, 2023 14:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants