Unified Fides Resources Feature #2254
Co-authored-by: Andrew Jackson <[email protected]>
Update Fides to use the updated Fideslang models, which now contain DSR concepts.
- Temporarily pointed fideslang to main.
- Temporarily installed git in the Dockerfiles prior to installing requirements. Revert before merge.
- Updated redis_dataset.yml (a dataset describing our redis cache) to not have data_categories at the object level when there are data_categories at the nested field level. This is a constraint introduced from the fidesops side to prevent data category conflicts in nested data structures when we filter data on data category before building the DSR package.
- Updated the dataset yamls (for database datasets, not saas datasets) to have fides_meta. Fideslang supports both in the request, but any fidesops_meta supplied is converted to fides_meta. We have both in our yaml files right now.
- Updated db_dataset.yml, which maps our fides database, to have a fides_meta instead of a fidesctl_meta field on the ctl_datasets table.
- Use the fideslang FidesKey in favor of FidesOpsKey.
- FidesKey now allows '>' and '<' characters (useful for saas templates), and bad validation throws a ValueError, which is more easily picked up by FastAPI and turned into a ValidationError.
- Remove FidesopsDataset in favor of the new Fideslang Dataset, which has the important attributes from FidesopsDataset.
- Rename Dataset to GraphDataset so it does not get confused with the Fideslang Dataset.
- At the code level, adjust all fidesops_meta variables (at the Dataset, DatasetCollection, and DatasetField levels) to use fides_meta instead. However, leave fidesops_meta keys in the saas-related yaml files for now. If fidesops_meta is supplied, Fideslang converts it to fides_meta for backwards compatibility.
- Remove FidesopsDatasetReference in favor of the identical Fideslang FidesDatasetReference.
- Add new data category validation for the PATCH /datasetconfig endpoints, where we validate against data_categories in the database instead of just the static default taxonomy, as users could have added other data categories.
- Because FidesKey validation errors now throw a FidesValidationError that inherits from ValueError instead of Exception, adjust a couple of locations on the ctl side where model validation is now throwing a ValidationError.
- Update ctl_datasets.fidesctl_meta to be named fides_meta.
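As a rough illustration of the FidesKey change described above, here is a minimal sketch in plain Python. It is not the Fideslang implementation: the character pattern is an assumption based only on the note that '<' and '>' are now allowed, and the error class is a local stand-in whose single sourced property is that it inherits from ValueError.

```python
import re


class FidesValidationError(ValueError):
    """Local stand-in for the Fideslang error class.

    The only detail taken from the commit message is that the real
    error inherits from ValueError.
    """


# Assumed pattern: letters, digits, '_', '.', '-', plus '<' and '>'.
ALLOWED_KEY = re.compile(r"[a-zA-Z0-9_.<>-]+")


def validate_fides_key(value: str) -> str:
    """Return the key unchanged if every character is allowed, else raise."""
    if not ALLOWED_KEY.fullmatch(value):
        raise FidesValidationError(f"invalid fides key: {value!r}")
    return value
```

Since the stand-in error is a ValueError, a caller that already catches ValueError handles it without extra work, which matches the motivation given in the commit message.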
# Conflicts:
# data/dataset/remote_fides_example_test_dataset.yml
# requirements.txt
# src/fides/api/ops/api/v1/endpoints/connection_endpoints.py
# src/fides/api/ops/api/v1/endpoints/dataset_endpoints.py
# src/fides/api/ops/api/v1/endpoints/messaging_endpoints.py
# src/fides/api/ops/api/v1/endpoints/policy_endpoints.py
# src/fides/api/ops/api/v1/endpoints/policy_webhook_endpoints.py
# src/fides/api/ops/api/v1/endpoints/saas_config_endpoints.py
# src/fides/api/ops/api/v1/endpoints/storage_endpoints.py
# src/fides/api/ops/graph/graph.py
# src/fides/api/ops/models/datasetconfig.py
# src/fides/api/ops/models/policy.py
# src/fides/api/ops/schemas/privacy_request.py
# src/fides/api/ops/service/connectors/saas/connector_registry_service.py
# src/fides/api/ops/service/connectors/saas_query_config.py
# src/fides/api/ops/service/storage/storage_uploader_service.py
# src/fides/api/ops/task/filter_results.py
# src/fides/api/ops/task/task_resources.py
# src/fides/ctl/core/utils.py
# tests/ops/graph/graph_test_util.py
# tests/ops/integration_tests/test_execution.py
# tests/ops/integration_tests/test_integration_email.py
Big picture, start writing to the new location (ctl_dataset) and keep writing to the old location (DatasetConfig.dataset). Start fetching data from the new location.
- Adds a dataset migration that copies all DatasetConfig.datasets into new ctl_dataset records, and then links each ctl_dataset as the DatasetConfig.ctl_dataset_id. If there's a conflict with an existing ctl_datasets.fides_key, I error instead of attempting to upsert. The user should manually resolve.
- Added a new API endpoint PATCH {{host}}/connection/{{connection_key}}/datasetconfig that upserts a DatasetConfig and links it to an existing CTL Dataset.
- Update existing ops PATCH dataset (json and yaml) endpoints to still work. A raw dataset passed in attempts to upsert both a DatasetConfig and the existing CtlDataset object. The UI still uses one of these endpoints.
- Update creating a saas connector from a template. When upserting a datasetconfig, also upsert a CTL Dataset.
Lots of test fixtures needed to be changed to create a ctl_dataset before creating a datasetconfig and then linking the two.
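The copy-and-link migration described above can be sketched roughly as follows. This is an in-memory illustration, not the actual migration: rows are plain dicts, `DatasetKeyConflict` and `copy_datasets_to_ctl` are hypothetical names, and the generated ids are arbitrary UUIDs. Checking every key before writing anything is a choice made for this sketch; a real migration would normally rely on a database transaction for the same effect.

```python
import uuid
from typing import Any, Dict, List


class DatasetKeyConflict(Exception):
    """Hypothetical error raised when a fides_key already exists."""


def copy_datasets_to_ctl(
    dataset_configs: List[Dict[str, Any]],
    ctl_datasets: Dict[str, Dict[str, Any]],
) -> None:
    """Copy each config's embedded dataset into ctl_datasets and link it."""
    # First pass: refuse to continue on any key conflict (no upsert).
    seen = set(ctl_datasets)
    for config in dataset_configs:
        key = config["dataset"]["fides_key"]
        if key in seen:
            raise DatasetKeyConflict(f"ctl dataset {key!r} already exists")
        seen.add(key)

    # Second pass: create the new records and link them by id.
    for config in dataset_configs:
        record = dict(config["dataset"])
        record["id"] = str(uuid.uuid4())
        ctl_datasets[record["fides_key"]] = record
        config["ctl_dataset_id"] = record["id"]
```

The embedded `dataset` value is left in place on each config, mirroring the note that the old location keeps being written to at this stage.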
# Conflicts: # CHANGELOG.md
…1763] (#2096)
- Remove the DatasetConfig.dataset column.
- Throw a 404 if the ctl_dataset_id does not exist when creating a DatasetConfig through the new patch_dataset_configs endpoint.
- Add new GET datasetconfig list and detail endpoints that include both the fides_key and nested ctl_dataset in the response. The DatasetConfig and CTLDataset are different resources and their fides_keys can differ, so both keys need to be in the response.
- Add a fix to prevent the existing upsert from attempting to update the id of an existing resource. Datasets are referenced by DatasetConfigs now, so we need their ids to stay the same.
- Add more validation before linking an existing CTL Dataset to a DatasetConfig.
- Update the Ops DatasetConfig Admin UI for parity with the existing UI.
Co-authored-by: Allison King <[email protected]>
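The upsert fix mentioned above (never changing the id of an existing resource) might look something like this in spirit. The function name and the dict-based table are hypothetical and stand in for the real crud code, which is not shown in this PR text.

```python
import uuid
from typing import Any, Dict


def upsert_preserving_id(
    table: Dict[str, Dict[str, Any]], payload: Dict[str, Any]
) -> Dict[str, Any]:
    """Insert a new row, or update an existing one without touching its id."""
    key = payload["fides_key"]
    existing = table.get(key)
    if existing is None:
        # New resource: accept a supplied id or generate one.
        row = {**payload, "id": payload.get("id") or str(uuid.uuid4())}
        table[key] = row
        return row
    # Existing resource: apply every field except "id", so that
    # foreign keys pointing at this row stay valid.
    existing.update({k: v for k, v in payload.items() if k != "id"})
    return existing
```

Dropping the `id` from the update payload is what keeps a DatasetConfig's ctl_dataset_id reference stable across repeated upserts.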
# Conflicts: # src/fides/api/ctl/database/crud.py
- Fix crud upsert endpoints so they validate the request against the appropriate pydantic model.
- When creating, updating, or upserting Datasets, validate that the supplied data categories align with those currently in the db. (Note that this has to happen outside of typical Pydantic validation because we need access to a db session.)
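To make the data category check above concrete, here is a small sketch that walks a dataset and compares the categories it uses against a set that would, in the real code, be loaded from the database. The dict layout (collections, fields, nested fields, data_categories) follows the dataset yamls discussed in this PR, but the function names are hypothetical and the `known_categories` argument stands in for the db lookup.

```python
from typing import Any, Dict, Iterable, Iterator, Set


def iter_field_categories(fields: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Yield data categories from fields, recursing into nested fields."""
    for field in fields:
        yield from field.get("data_categories") or []
        yield from iter_field_categories(field.get("fields") or [])


def validate_data_categories(
    dataset: Dict[str, Any], known_categories: Set[str]
) -> None:
    """Raise ValueError if the dataset uses a category not in known_categories."""
    used = set(dataset.get("data_categories") or [])
    for collection in dataset.get("collections") or []:
        used.update(collection.get("data_categories") or [])
        used.update(iter_field_categories(collection.get("fields") or []))
    unknown = sorted(used - known_categories)
    if unknown:
        raise ValueError(f"unknown data categories: {unknown}")
```

Passing the known categories in as an argument reflects the point made in the commit message: the check needs data from a db session, so it cannot live inside an ordinary Pydantic validator.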
- Conflicts: Changelog.md, clients/admin-ui/src/app/store.ts, clients/admin-ui/src/features/datastore-connections/add-connection/forms/YamlEditorForm.tsx
- Fix policy_endpoints: new endpoints were added there, along with obsolete FidesOpsKey type annotations.
- Bump downrev in first unified fides resources migration
# Conflicts: # CHANGELOG.md # clients/admin-ui/src/app/store.ts # clients/admin-ui/src/features/datastore-connections/add-connection/forms/YamlEditorForm.tsx # src/fides/api/ops/api/v1/endpoints/policy_endpoints.py
I just went through your steps to confirm (thanks for all the details!) and everything worked as expected! 🚀 can we cut fideslang now and then update this PR, or do we need to wait for something else?
I've published the
…nstalling git twice in the Dockerfiles so we had it before we installed the requirements.
Thanks @seanpreston. Bumped fideslang version but added do not merge label here so we wait for this week's release first.
Current tests failing on this branch, need to verify if these are on main:
Getting this branch up to date now -
# Conflicts: # CHANGELOG.md # scripts/quickstart.py # src/fides/api/ctl/sql_models.py
Confirmed failing tests are also on main (these are both tests that don't get run as frequently - you need the unsafe labels check) https://github.com/ethyca/fides/actions/runs/3968715043/jobs/6802446026
Getting up-to-date again, conflicts with #2249, which had a migration, downrevs need to be bumped too
# Conflicts: # CHANGELOG.md # src/fides/api/ops/api/v1/endpoints/connection_endpoints.py # tests/ops/fixtures/application_fixtures.py Bump downrev.
Hey @pattisdr I gave this some QA this morning, it's looking great!
@seanpreston thank you!
One last main merge ^
❗ Prerequisite: We need a fideslang release, so we can link to the proper fideslang version here
❗ Before merge:
❗ If you primarily care about your ctl datasets, the migration plan would be to delete your dataset configs prior to upgrading.
Fixes #2262
Code Changes
This feature branch contains the work below from several PRs. They've all been individually reviewed, and I've been keeping this branch up-to-date with main:
datasetconfig.ctl_dataset_id field to unify fides dataset resources #2046
DatasetConfig.dataset field #2096
Steps to Confirm
nox -s test_env. Verify in your db that the DatasetConfig.dataset column is gone. Instead there's a DatasetConfig.ctl_dataset_id FK to ctl_dataset
organization_fides_key has been automatically added, and there are fides_meta instead of fidesops_meta keys
[email protected]
fides_uploads
Pre-Merge Checklist
CHANGELOG.md
Description Of Changes
Merges the unified-fides-resources branch. Big picture, we've been storing Datasets in two separate places with our two separate fides and fidesops repos, and this branch consolidates to one location. The CTL side has been storing this in a ctl_datasets table and the OPS side has been storing it in the datasetconfig.dataset column. We move to storing in one place, ctl_datasets, and add a datasetconfig.ctl_dataset_id FK.
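The end state can be pictured with a toy schema. The sketch below uses Python's built-in sqlite3 purely as a stand-in database; apart from ctl_datasets, datasetconfig, fides_key, and ctl_dataset_id, every column and function name is an assumption made for illustration, and the real tables carry many more columns.

```python
import sqlite3


def build_schema() -> sqlite3.Connection:
    """Create an in-memory db with the two tables and the linking FK."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # sqlite leaves FK checks off by default
    conn.executescript(
        """
        CREATE TABLE ctl_datasets (
            id TEXT PRIMARY KEY,
            fides_key TEXT UNIQUE NOT NULL
        );
        CREATE TABLE datasetconfig (
            id TEXT PRIMARY KEY,
            fides_key TEXT NOT NULL,
            ctl_dataset_id TEXT NOT NULL REFERENCES ctl_datasets(id)
        );
        """
    )
    return conn


def fetch_linked_keys(conn: sqlite3.Connection, config_key: str):
    """Return (datasetconfig.fides_key, ctl_datasets.fides_key) or None."""
    return conn.execute(
        """
        SELECT dc.fides_key, ds.fides_key
        FROM datasetconfig AS dc
        JOIN ctl_datasets AS ds ON ds.id = dc.ctl_dataset_id
        WHERE dc.fides_key = ?
        """,
        (config_key,),
    ).fetchone()
```

Returning both keys from the lookup echoes the earlier point that a DatasetConfig and its CTL Dataset are separate resources whose fides_keys can differ.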