HJ-23 Add connection_type to NamespaceMeta #5387
fides Run #10511

| Run property | Value |
|---|---|
| Project | fides |
| Run status | Passed #10511 |
| Run duration | 00m 39s |
| Commit | 6e70867336 ℹ️: Merge 88b77a6f8adaaf38eb579561591f0e67db8efd7c into cb0579859b31e8f3b8c8ef5c232f... |
| Committer | erosselli |

| Test results | Count |
|---|---|
| Failures | 0 |
| Flaky | 0 |
| Pending | 0 |
| Skipped | 0 |
| Passing | 4 |
Codecov Report
All modified and coverable lines are covered by tests ✅

| Coverage Diff | main | #5387 | +/- |
|---|---|---|---|
| Coverage | 85.58% | 85.58% | |
| Files | 379 | 379 | |
| Lines | 23984 | 23989 | +5 |
| Branches | 2623 | 2623 | |
| Hits | 20526 | 20531 | +5 |
| Misses | 2906 | 2906 | |
| Partials | 552 | 552 | |
"""
update ctl_datasets
set fides_meta = jsonb_set(fides_meta::jsonb, '{namespace, connection_type}', '"bigquery"', true)
where fides_meta::jsonb ? 'namespace';
I just want to call out that I am a bit concerned about the performance of this migration: opening the JSON to check whether it has a namespace key seems like it might not scale well.
Okay, tested on 500,000 rows and it took about 7s; on 1,120,000 rows it took 20s. We should be okay.
Thanks for doing that, makes for much higher confidence
It looks perfect to me!
How do we get these back out of the database? Does the app just parse everything in fides_meta already?
)

# If the namespace dict has a database_instance_id, we know it's an RDS MySQL dataset.
op.execute(
I'm missing context from the ticket, sorry, but do we care about any other dialects? We'll fill those in later?
My fidesplus PR https://github.com/ethyca/fidesplus/pull/1673 (not ready for review yet) makes the necessary changes to start storing it for all datasets created through D&D. This migration just takes advantage of the fact that BQ and RDS MySQL were both already using the namespace key, so we have a reliable way to identify existing datasets from these integrations. For other integrations we'll start storing the connection_type from now on, but we don't have a way to populate their values for existing datasets.
@thingscouldbeworse yes, fides_meta is a JSON column so it gets parsed as a dictionary.
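As a rough illustration of that last point, here is a minimal sketch of reading the value back once the column has been deserialized. `get_connection_type` is a hypothetical helper, not something from the fides codebase, and `json.loads` only stands in for whatever the ORM does when it loads the JSON column.

```python
import json
from typing import Any, Dict, Optional


def get_connection_type(fides_meta: Optional[Dict[str, Any]]) -> Optional[str]:
    """Return namespace.connection_type from a parsed fides_meta dict, if present.

    Hypothetical helper for illustration only.
    """
    if not fides_meta:
        return None
    namespace = fides_meta.get("namespace")
    if not isinstance(namespace, dict):
        return None
    return namespace.get("connection_type")


# Stand-in for the JSON column value after the migration has run.
raw = '{"namespace": {"project_id": "some-id", "connection_type": "bigquery"}}'
parsed = json.loads(raw)
print(get_connection_type(parsed))
```

Datasets without a namespace, or with a namespace the migration could not classify, would simply yield `None` here.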
nice 👍
fides Run #10512

| Run property | Value |
|---|---|
| Project | fides |
| Run status | Passed #10512 |
| Run duration | 00m 40s |
| Commit | ca8a0950d3: HJ-23 Add connection_type to NamespaceMeta (#5387) |
| Committer | erosselli |

| Test results | Count |
|---|---|
| Failures | 0 |
| Flaky | 0 |
| Pending | 0 |
| Skipped | 0 |
| Passing | 4 |
Part of HJ-23

Description Of Changes
Migration to add the `connection_type` field to all existing `namespace` dicts in the `fides_meta` field of the `Dataset` model.

Code Changes
- … `connection_type` for existing datasets with a namespace

Steps to Confirm
- … `namespace` key in its `fides_meta` field
- … `namespace` key and value `{ "project_id": "some-id" }` (simulates a BigQuery dataset created through D&D)
- … `namespace` key and value `{ "database_instance_id": "some-id" }` (simulates an RDS MySQL dataset created through D&D)
- … `namespace` key and value `{ "other": "test" }`
- … `connection_type` field with value `bigquery` as part of its `namespace`
- … `connection_type` field with value `rds_mysql` as part of its `namespace`
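
The confirmation steps above can be approximated in plain Python without a database. This is only a sketch under stated assumptions: `infer_connection_type` and `backfill` are hypothetical names, and the rule that a namespace with neither `project_id` nor `database_instance_id` is left untouched is an assumption. The WHERE clause quoted in the review matches any `namespace`, so the real migration may treat the `{ "other": "test" }` case differently.

```python
import copy
from typing import Any, Dict, Optional


def infer_connection_type(namespace: Dict[str, Any]) -> Optional[str]:
    """Guess the integration from the keys present in a namespace dict."""
    if "database_instance_id" in namespace:
        return "rds_mysql"
    if "project_id" in namespace:
        return "bigquery"
    return None  # assumption: unrecognised namespaces are left alone


def backfill(fides_meta: Optional[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return a copy of fides_meta with namespace.connection_type filled in."""
    if not fides_meta or not isinstance(fides_meta.get("namespace"), dict):
        return fides_meta
    updated = copy.deepcopy(fides_meta)
    connection_type = infer_connection_type(updated["namespace"])
    if connection_type is not None:
        updated["namespace"]["connection_type"] = connection_type
    return updated


# The four test datasets described in the steps (fides_meta values only).
datasets = {
    "no_namespace": {},
    "bigquery": {"namespace": {"project_id": "some-id"}},
    "rds_mysql": {"namespace": {"database_instance_id": "some-id"}},
    "other": {"namespace": {"other": "test"}},
}
migrated = {name: backfill(meta) for name, meta in datasets.items()}
for name, meta in migrated.items():
    print(name, meta)
```

The function copies its input rather than mutating it, so the before and after states can be compared side by side, which mirrors checking rows before and after running the migration.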

Pre-Merge Checklist
- … `CHANGELOG.md`
- … `main`
- … `downgrade()` migration is correct and works