Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unified Fides Resources Feature #2254

Merged
merged 21 commits into from
Jan 23, 2023
Merged

Unified Fides Resources Feature #2254

merged 21 commits into from
Jan 23, 2023

Conversation

pattisdr
Copy link
Contributor

@pattisdr pattisdr commented Jan 17, 2023

Prerequisite: We need a fideslang release, so we can link to the proper fideslang version here
❗ Before merge:

  • Get up to date with main again and push to this branch to re-trigger CI
  • Double check if downrev needs to be bumped: this branch has several migrations. Bump the first in the series.
  • See if any new saas connectors have been added - their tests will fail because they're using old DatasetConfig.dataset fields
  • Other uses of now-obsolete resources should get flagged via test failures.

❗ If you primarily care about your ctl datasets, migration plan would be to delete your dataset configs prior to upgrading.

Fixes #2262

Code Changes

This feature branch contains the work below from several PR's - they've all been individually reviewed, and I've been keeping this branch up-to-date with main:

  • Unified Fides Resources: Remove several fidesops schemas for DSR's in favor of updated Fideslang schemas #2009
  • Unified Fides Resources: Add new datasetconfig.ctl_dataset_id field to unify fides dataset resources #2046
  • Unified Fides Resources: Remove DatasetConfig.dataset field #2096
  • Unified Fides Resources: Update UI dataset config routes to use new unified routes #2113
  • Unified Fides Resources: Validate request body on crud endpoints on upsert. Validate dataset data categories before save. #2134
  • Unified Fides Resources: Added a dataset dropdown selector when configuring a connector to link an existing dataset to the connector configuration. #2162
  • Unified Fides Resources: Update test env setup and quickstart to use new routes instead of routes we plan to deprecate at some point #2225

Steps to Confirm

  • Run nox -s test_env. Verify in your db that DatasetConfig.dataset column is gone. Instead there's a DatasetConfig.ctl_dataset_id FK to ctl_dataset
fides=# select * from datasetconfig;
                    id                    |          created_at           |          updated_at           |           connection_config_id           |           fides_key           |              ctl_dataset_id              
------------------------------------------+-------------------------------+-------------------------------+------------------------------------------+-------------------------------+------------------------------------------
 dat_153c7bc6-f486-47f7-9acc-b961481e5b55 | 2023-01-17 16:46:47.683558+00 | 2023-01-17 16:46:47.683558+00 | con_d3d5aff5-9365-4b2d-974c-a8a26d83af5c | postgres_example_test_dataset | ctl_8e9491eb-e8b9-4794-a4a1-9ca667393043
 dat_e8d7fb9c-78f6-49d3-be86-05b5ee2b4830 | 2023-01-17 16:46:47.826331+00 | 2023-01-17 16:46:47.826331+00 | con_f6e8acef-1d94-4df3-a61e-c9cae9ce2a8b | mongo_test                    | ctl_1451aa46-3409-4cff-ba16-726aa65a96c4

  • Go the Admin UI > Configure Privacy Requests > Click "Connection Manager" left pane, and then click on Mongo DB Connector three dots > Configure
  • Click on Mongo Dataset configuration tab
  • Verify yaml looks as expected. Note organization_fides_key has been automatically added, and there are fides_meta instead of fidesops_meta keys
  • Go to Privacy Center - submit an access request for [email protected]
  • Return to admin-ui request manager and approve that privacy request
  • Verify results in fides_uploads
  • Experiment with adding and updating more connection configs / dataset configs in the connection manager, including adding saas connectors like mailchimp
    • Note that this operation makes an upsert to ctl Datasets and then a Patch to DatasetConfig to link that Ctl Dataset to the DatasetConfig. Check your Network tab.

Pre-Merge Checklist

Description Of Changes

Merges the unified-fides-resources branch. Big picture, we've been storing Datasets in two separate places with our two separate fides and fidesops repos and this branch consolidates to one location.

The CTL side has been storing this in a ctl_datasets table and the OPS side has been storing in the datasetconfig.dataset table. We move to storing in one place: ctl_datasets, and adding a datasetconfig.ctl_dataset_id FK.

pattisdr and others added 14 commits December 14, 2022 14:24
Co-authored-by: Andrew Jackson <[email protected]>

Update Fides to use the updated Fideslang models which now contain DSR concepts.

- Temporarily pointed fideslang to main.
- Temporarily installed git in the Dockerfiles prior to installing requirements. Revert before merge.
- Updated redis_dataset.yml (which is a dataset describing our redis cache) to not have data_categories at the object level when there are data_categories at the nested field level. This is a constraint introduced from the fidesops-side to keep there from being data category conflicts in nested data structures when we're filtering data on data category before building the DSR package
- Updated the dataset yamls (for Database datasets, not saas datasets to have fides_meta.) Fideslang supports both in the request, but any fidesops_meta supplied is converted to fides_meta. We have both in our yaml files right now.
- Update the db_dataset.yml which maps our fides database to have a fides_meta instead of a fidesctl_meta field on the ctl_datasets table
- Use fideslang FidesKey in favor of FidesOpsKey - FidesKey now allows '>' '<' characters (useful for saas templates) and also bad validation throws a ValueError which is more easily picked up by FastAPI and turned into a ValidationError.
- Remove FidesopsDataset in favor of the new Fideslang Dataset which has the important attributes from FidesopsDataset
- Rename Dataset to be GraphDataset to not get confused with the Fideslang Dataset.
- At the code level - adjust all fidesops_meta variables (at the Dataset, DatasetCollection, and DatasetField levels) to use fides_meta instead. However, leave fidesops_meta keys in the saas-related yaml files for now. If fidesops_meta is supplied, Fideslang converts it to fides_meta for backwards compatibility.
- Remove FidesopsDatasetReference in favor of identical Fideslang FidesDatasetReference
- Add new data category validation for the PATCH /datasetconfig endpoints where we validate against data_categories in the database instead of just the static default taxonomy as users could have added other data categories. 
- Because FidesKey validation errors now throw a FidesValidationError that inherits from ValueError instead of exception, adjust a couple of locations -ctl side where model validation is now throwing a ValidationError.
- Update the ctl_datasets.fidesctl_meta to be named fides_meta
# Conflicts:
#	data/dataset/remote_fides_example_test_dataset.yml
#	requirements.txt
#	src/fides/api/ops/api/v1/endpoints/connection_endpoints.py
#	src/fides/api/ops/api/v1/endpoints/dataset_endpoints.py
#	src/fides/api/ops/api/v1/endpoints/messaging_endpoints.py
#	src/fides/api/ops/api/v1/endpoints/policy_endpoints.py
#	src/fides/api/ops/api/v1/endpoints/policy_webhook_endpoints.py
#	src/fides/api/ops/api/v1/endpoints/saas_config_endpoints.py
#	src/fides/api/ops/api/v1/endpoints/storage_endpoints.py
#	src/fides/api/ops/graph/graph.py
#	src/fides/api/ops/models/datasetconfig.py
#	src/fides/api/ops/models/policy.py
#	src/fides/api/ops/schemas/privacy_request.py
#	src/fides/api/ops/service/connectors/saas/connector_registry_service.py
#	src/fides/api/ops/service/connectors/saas_query_config.py
#	src/fides/api/ops/service/storage/storage_uploader_service.py
#	src/fides/api/ops/task/filter_results.py
#	src/fides/api/ops/task/task_resources.py
#	src/fides/ctl/core/utils.py
#	tests/ops/graph/graph_test_util.py
#	tests/ops/integration_tests/test_execution.py
#	tests/ops/integration_tests/test_integration_email.py
Big picture, start writing to the new location (ctl_dataset) and keep writing to the old location (DatasetConfig.dataset).  Start fetching data from the new location. 

- Adds a dataset migration that copies all DatasetConfig.datasets into new ctl_dataset records, and then links that ctl_dataset as the DatasetConfig.ctl_dataset_id. If there's a conflict with an existing ctl_datasets.fides_key, I error instead of attempting to upsert. The user should manually resolve.
- Added a new API endpoint PATCH {{host}}/connection/{{connection_key}}/datasetconfig that upserts a DatasetConfig and links it to an existing CTL Dataset.
- Update Existing ops PATCH dataset (json and yaml) endpoints to still work.  A raw dataset passed in attempts to upsert both a DatasetConfig and the existing CtlDataset object.  The UI still uses one of these endpoints. 
- Update Creating a saas connector from a template.  When upserting a datasetconfig also upsert a CTL Dataset.
 Lots of test fixtures needed to be changed to create a ctl_dataset before creating a datasetconfig and then linking the two.
…1763] (#2096)

- Remove the DatasetConfig.dataset column
- Throw a 404 if the ctl_dataset_id does not existing when creating a DatasetConfig through the new patch_dataset_configs endpoint
- Add new GET datasetconfig list and detail endpoints that include both the fides_key and nested ctl_dataset in the response.  The DatasetConfig and CTLDataset are different resources and their fides_keys can differ, so both keys need to be in the response.
-  Add a fix to prevent existing upsert from attempting to update the id of an existing resource. Datasets are referenced by DatasetConfigs now, we need them to stay the same.
- Add more validation before linking an existing CTL Dataset to a DatasetConfig
- Update the Ops DatasetConfig Admin UI for parity with existing UI

Co-authored-by: Allison King <[email protected]>
# Conflicts:
#	src/fides/api/ctl/database/crud.py
- Fix crud upsert endpoints so they validate the request against the appropriate pydantic model.
- When creating, updating, or upserting Datasets, validate that the supplied data categories align with those currently in the db. (Note that this has to happen outside of typical Pydantic validation because we need access to a db session)
- Conflicts:
Changelog.md
clients/admin-ui/src/app/store.ts
clients/admin-ui/src/features/datastore-connections/add-connection/forms/YamlEditorForm.tsx

- Fix policy_endpoints - new endpoints were added there, and obsolete FidesOpsKey type annotations added.
- Bump downrev in first unified fides resources migration
# Conflicts:
#	CHANGELOG.md
#	clients/admin-ui/src/app/store.ts
#	clients/admin-ui/src/features/datastore-connections/add-connection/forms/YamlEditorForm.tsx
#	src/fides/api/ops/api/v1/endpoints/policy_endpoints.py
@pattisdr pattisdr changed the title Unified Fides Resources: Linking DatasetConfigs with Ctl Datasets Unified Fides Resources Merge Jan 17, 2023
@pattisdr pattisdr added run unsafe ci checks Runs fides-related CI checks that require sensitive credentials and removed run unsafe ci checks Runs fides-related CI checks that require sensitive credentials labels Jan 17, 2023
@pattisdr pattisdr marked this pull request as ready for review January 17, 2023 17:02
@pattisdr pattisdr requested a review from seanpreston January 17, 2023 17:03
@pattisdr pattisdr changed the title Unified Fides Resources Merge Unified Fides Resources Feature Jan 17, 2023
Copy link
Contributor

@allisonking allisonking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just went through your steps to confirm (thanks for all the details!) and everything worked as expected! 🚀 can we cut fideslang now and then update this PR, or do we need to wait for something else?

@seanpreston
Copy link
Contributor

I've published the fideslang release here

…nstalling git twice in the Dockerfiles so we had it before we installed the requirements.
@pattisdr pattisdr added the do not merge Please don't merge yet, bad things will happen if you do label Jan 17, 2023
@pattisdr
Copy link
Contributor Author

Thanks @seanpreston. Bumped fideslang version but added do not merge label here so we wait for this week's release first.

@pattisdr
Copy link
Contributor Author

Current tests failing on this branch, need to verify if these are on main:

  • saas requests failing with `test_friendbuy_nextgen_erasure_request_task
  • tests/ctl/core/test_system.py::test_scan_system_aws_passes

@pattisdr
Copy link
Contributor Author

Getting this branch up to date now -

# Conflicts:
#	CHANGELOG.md
#	scripts/quickstart.py
#	src/fides/api/ctl/sql_models.py
@pattisdr
Copy link
Contributor Author

Confirmed failing tests are also on main (these are both tests that don't get run as frequently - you need the unsafe labels check)

https://github.com/ethyca/fides/actions/runs/3968715043/jobs/6802446026
https://github.com/ethyca/fides/actions/runs/3968715043/jobs/6802445249

@pattisdr
Copy link
Contributor Author

pattisdr commented Jan 20, 2023

Getting up-to-date again, conflicts with #2249, which had a migration, downrevs need to be bumped too

# Conflicts:
#	CHANGELOG.md
#	src/fides/api/ops/api/v1/endpoints/connection_endpoints.py
#	tests/ops/fixtures/application_fixtures.py

Bump downrev.
@seanpreston
Copy link
Contributor

Hey @pattisdr I gave this some QA this morning, it's looking great!

@pattisdr
Copy link
Contributor Author

@seanpreston thank you!

@pattisdr pattisdr removed the do not merge Please don't merge yet, bad things will happen if you do label Jan 23, 2023
@pattisdr
Copy link
Contributor Author

One last main merge ^

@pattisdr pattisdr merged commit 23d9bbf into main Jan 23, 2023
@pattisdr pattisdr deleted the unified-fides-resources branch January 23, 2023 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run unsafe ci checks Runs fides-related CI checks that require sensitive credentials
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Merge in the Unified Fides Resources feature branch
3 participants