Add dashboards to migration progress dashboard #3314

Merged
merged 133 commits into from
Jan 10, 2025
133 commits
de82509
Set LineageAtom.other to dict by default
JCZuurmond Dec 10, 2024
daa0394
Add dashboard progress encoder
JCZuurmond Dec 10, 2024
07f608c
Get table failures from historical table snapshot
JCZuurmond Dec 10, 2024
0aa8b4f
Allow other to be None
JCZuurmond Dec 10, 2024
cbff571
Remove cached properties from dashboard progress encoder
JCZuurmond Dec 11, 2024
00b32db
Add first integration test for dashboard progress encoder
JCZuurmond Dec 11, 2024
c49aa9f
Test dashboard progress encoder without failures
JCZuurmond Dec 11, 2024
932d1b1
Test dashboard failure coming from query problem
JCZuurmond Dec 11, 2024
d930203
Test dashboard failure coming from dfsa
JCZuurmond Dec 11, 2024
3099f81
Rewrite tests to assert on historical rows
JCZuurmond Dec 11, 2024
600d73a
Test used table failure from hive metastore table
JCZuurmond Dec 11, 2024
b9d6b0d
Merge tests
JCZuurmond Dec 11, 2024
3ac499a
Test Table.from_historical
JCZuurmond Dec 11, 2024
eba3a69
Format
JCZuurmond Dec 11, 2024
1ebe6a5
Assert row in integration test
JCZuurmond Dec 11, 2024
8179518
Add default attributes to expected data
JCZuurmond Dec 16, 2024
b6922ce
Add id attributes to dashboard
JCZuurmond Dec 16, 2024
252c978
Add dashboard ownership to GlobalContext
JCZuurmond Dec 16, 2024
7d643a9
Add dashboard progress encoder to runtime context
JCZuurmond Dec 16, 2024
71577c0
Force key word argument in dashboard progress encoder
JCZuurmond Dec 16, 2024
fc63ef2
Update dashboard progress encoder integration test
JCZuurmond Dec 16, 2024
30c19cc
Expect DFSA message to come from query problem
JCZuurmond Dec 16, 2024
d2f0e76
Add from table info to Table
JCZuurmond Dec 16, 2024
d52877e
Isort
JCZuurmond Dec 16, 2024
d983e36
Remove used tables crawler from TableProgressEncoder
JCZuurmond Dec 16, 2024
a38ec1d
Pass failure from used table to dashboard
JCZuurmond Dec 16, 2024
7daf8dd
Persist dashboard migration progress in workflow
JCZuurmond Dec 16, 2024
8e4d2c7
Force run integration test in CI
JCZuurmond Dec 16, 2024
684cb1e
Test Redash ownership is me
JCZuurmond Dec 17, 2024
b0fd8f7
Test ownership of directory
JCZuurmond Dec 17, 2024
af1fa79
Support ownership of directory
JCZuurmond Dec 17, 2024
38d29b2
Test owner of directory
JCZuurmond Dec 17, 2024
51b061d
Test retrieving ownership for invalid path
JCZuurmond Dec 17, 2024
c7c57a6
Handle invalid path
JCZuurmond Dec 17, 2024
b91f135
Add missing type hints
JCZuurmond Dec 17, 2024
67a1007
Test warn about unsupported object type
JCZuurmond Dec 17, 2024
6f2b272
Warn about unsupported object type
JCZuurmond Dec 17, 2024
bf7a9fb
Test Lakeview dashboard ownership is me
JCZuurmond Dec 17, 2024
95e2229
Test getting user name
JCZuurmond Dec 17, 2024
2659aec
Handle resource does not exists
JCZuurmond Dec 17, 2024
d0a5915
Handle None types
JCZuurmond Dec 17, 2024
3ab57ad
Remove legacy test
JCZuurmond Dec 17, 2024
3a15024
Update unit test
JCZuurmond Dec 17, 2024
9d85f2f
Rename me to current_user
JCZuurmond Dec 17, 2024
b6a6a51
The owner of an invalid path should fallback on the workspace admin
JCZuurmond Dec 20, 2024
6b4829f
Improve assert message
JCZuurmond Dec 20, 2024
d82de72
Skip test when running in debug
JCZuurmond Dec 20, 2024
093845f
Copy changes from #3112
JCZuurmond Nov 15, 2024
e45b1b6
Add integration test
JCZuurmond Nov 15, 2024
91672a3
Add dashboard fixture
JCZuurmond Dec 17, 2024
474c485
Append dashboard inventory snapshot
JCZuurmond Dec 17, 2024
412daeb
Revert storing UsedTable, QueryProblem and DFSA snapshots
JCZuurmond Dec 17, 2024
4a074fd
Store QueryProblem into table
JCZuurmond Dec 17, 2024
93123d8
Add TODO
JCZuurmond Dec 17, 2024
3972876
Remove DFSA, QueryProblem and UsedTable queries
JCZuurmond Dec 17, 2024
af2211f
Rename code section to Dashboards
JCZuurmond Dec 17, 2024
11b2138
Update data asset to dashboard references
JCZuurmond Dec 17, 2024
8e625ff
Avoid redefinition
JCZuurmond Dec 17, 2024
018c3cd
Fix partition by
JCZuurmond Dec 18, 2024
1d915f2
Fix check substring
JCZuurmond Dec 18, 2024
9dd40b5
Fix name for code compatability issues
JCZuurmond Dec 18, 2024
480ad8b
Create query for dashboard
JCZuurmond Dec 18, 2024
02ae469
Link query problems with dashboard
JCZuurmond Dec 18, 2024
77442e2
Swap arguments in contains
JCZuurmond Dec 18, 2024
a899f33
Swap failure and dashboard_type columns
JCZuurmond Dec 18, 2024
bfc4cc3
Remove query problems widget
JCZuurmond Dec 18, 2024
dbae8a8
Add dashboard with dfsa
JCZuurmond Dec 18, 2024
f88ab28
Add dashboard that is correct
JCZuurmond Dec 18, 2024
40e028d
Add query problem with dfsa
JCZuurmond Dec 18, 2024
ace2dcc
Add dashboard with Hive table
JCZuurmond Dec 18, 2024
451c7a7
Reuse tables in used tables
JCZuurmond Dec 18, 2024
0f65ee6
Add used table for query
JCZuurmond Dec 18, 2024
2c5a658
Persist used tables in queries
JCZuurmond Dec 18, 2024
6f35290
Fix missing comma in query problem
JCZuurmond Dec 18, 2024
b40f100
Fix reference wrong query id
JCZuurmond Dec 18, 2024
001d825
Swap columns
JCZuurmond Dec 18, 2024
94eb90f
Use non-migrated table
JCZuurmond Dec 18, 2024
b1ca975
Assert None fields
JCZuurmond Dec 18, 2024
33e82b7
Shorten variable name
JCZuurmond Dec 18, 2024
cd0ae18
Move workflow run into for loop
JCZuurmond Dec 18, 2024
4377ed4
Rename table migration statuses
JCZuurmond Dec 18, 2024
2db0232
Rename table migration statuses pending migration
JCZuurmond Dec 18, 2024
73fcf0e
Rename variable
JCZuurmond Dec 18, 2024
1da55ba
Verify right dashboard is chosen
JCZuurmond Dec 18, 2024
a1595f8
Improve asserts in fixtures
JCZuurmond Dec 18, 2024
186e4a5
Fix asserts
JCZuurmond Dec 18, 2024
329a5fb
Add type hint
JCZuurmond Dec 18, 2024
29b5704
Fix reference to variable
JCZuurmond Dec 18, 2024
c4d1e72
Add type hinting
JCZuurmond Dec 18, 2024
6829645
Make query problems dynamic using dashboards
JCZuurmond Dec 18, 2024
a415ee7
Separate dashboard with Hive table out
JCZuurmond Dec 18, 2024
af055f2
Let dashboard reference all Hive tables
JCZuurmond Dec 18, 2024
d140dcd
Use dashboard with Hive table
JCZuurmond Dec 18, 2024
a0b7db4
Handle None attributes
JCZuurmond Dec 18, 2024
ce64297
Reuse tables migrated
JCZuurmond Dec 18, 2024
91ee989
Move job with and without failures to separate fixtures
JCZuurmond Dec 18, 2024
d5247a9
Link job with failures to workflow problem
JCZuurmond Dec 18, 2024
0c37e55
Add docstring
JCZuurmond Dec 18, 2024
345af27
Reuse job wit and without failures
JCZuurmond Dec 18, 2024
9c636ac
Remove redundant UsedTable LineageAtoms
JCZuurmond Dec 18, 2024
1adb5fc
Handle None
JCZuurmond Dec 18, 2024
29e979b
Move dbfs location to separate fixture
JCZuurmond Dec 18, 2024
23f5609
Split used tables
JCZuurmond Dec 18, 2024
494c5e5
Format
JCZuurmond Dec 18, 2024
9ccacf7
Create a catalog and schema for migrated tables
JCZuurmond Dec 18, 2024
edec9fc
Create the tables
JCZuurmond Dec 18, 2024
b92fbdb
Add dashboard with UC tables fixture
JCZuurmond Dec 18, 2024
bf48e27
Add dashboard with UC tables to dashboards
JCZuurmond Dec 18, 2024
426d978
Add used tables to dashboard
JCZuurmond Dec 18, 2024
c361d07
Update table ownership
JCZuurmond Dec 18, 2024
75f42b8
Fix query names in test
JCZuurmond Dec 18, 2024
c89bd0e
Fix number of dashboards pending migration
JCZuurmond Dec 18, 2024
cf890b9
Fix expected row for dashboard pending migration
JCZuurmond Dec 18, 2024
b6b8cfc
Move distinct failure per object type to the bottom
JCZuurmond Dec 18, 2024
df9dedb
Add dashboard to overall progress
JCZuurmond Dec 18, 2024
a5569a9
Add dashboard migration progress counter
JCZuurmond Dec 18, 2024
bc04f60
Move fixture rows around
JCZuurmond Dec 18, 2024
edfe3e5
Sort by failure
JCZuurmond Dec 18, 2024
1facdd2
Test subset of dashboards pending migration
JCZuurmond Dec 18, 2024
afd30df
Rename tests
JCZuurmond Dec 18, 2024
98dc332
Format
JCZuurmond Dec 18, 2024
d99be35
Fix total percentage
JCZuurmond Dec 18, 2024
8ea471c
Add missing field
JCZuurmond Dec 18, 2024
b7df08e
Exclude owner from Redash checks
JCZuurmond Dec 20, 2024
c3026ee
Force commit
JCZuurmond Dec 20, 2024
debcabe
Merge branch 'main' into feat/add-code-migration-to-migration-progres…
JCZuurmond Jan 8, 2025
7043908
Fix order of test rows
JCZuurmond Jan 9, 2025
e9da265
Match job with dfsa
JCZuurmond Jan 9, 2025
af90eda
Match DFSA with Dashboard
JCZuurmond Jan 9, 2025
1a6d37c
Bump job id
JCZuurmond Jan 9, 2025
ba242d8
Merge branch 'main' into feat/add-code-migration-to-migration-progres…
JCZuurmond Jan 10, 2025
8098bc0
Merge branch 'main' into feat/add-code-migration-to-migration-progres…
JCZuurmond Jan 10, 2025
f675a26
Merge branch 'main' into feat/add-code-migration-to-migration-progres…
gueniai Jan 10, 2025
@@ -2,4 +2,4 @@
SELECT
ROUND(100 * try_divide(COUNT_IF(SIZE(failures) = 0), COUNT(*)), 2) AS percentage
FROM ucx_catalog.multiworkspace.objects_snapshot
-WHERE object_type IN ('ClusterInfo', 'Grant', 'JobInfo', 'PipelineInfo', 'PolicyInfo', 'Table', 'Udf')
+WHERE object_type IN ('ClusterInfo', 'Grant', 'Dashboard', 'JobInfo', 'PipelineInfo', 'PolicyInfo', 'Table', 'Udf')
@@ -1,4 +1,4 @@
-/* --title 'Table migration progress (%)' --width 2 */
+/* --title 'Table migration progress (%)' */
SELECT
ROUND(100 * TRY_DIVIDE(COUNT_IF(SIZE(failures) = 0), COUNT(*)), 2) AS percentage
FROM ucx_catalog.multiworkspace.objects_snapshot
@@ -0,0 +1,5 @@
/* --title 'Dashboard progress (%)' */
SELECT
ROUND(100 * TRY_DIVIDE(COUNT_IF(SIZE(failures) = 0), COUNT(*)), 2) AS percentage
FROM ucx_catalog.multiworkspace.objects_snapshot
WHERE object_type = 'Dashboard'
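
For readers less familiar with the SQL idiom, the percentage this widget computes can be sketched in Python. This is an illustration only: `progress_percentage` and the sample rows are hypothetical and not part of UCX. The sketch assumes each snapshot row carries a `failures` list, and it mirrors `TRY_DIVIDE` by returning `None` when there are no rows rather than raising.

```python
from typing import Optional, Sequence


def progress_percentage(failures_per_object: Sequence[Sequence[str]]) -> Optional[float]:
    """Percentage of objects without failures, rounded to two decimals.

    Mirrors ROUND(100 * TRY_DIVIDE(COUNT_IF(SIZE(failures) = 0), COUNT(*)), 2).
    Like TRY_DIVIDE, it yields None instead of failing when there are no rows.
    """
    total = len(failures_per_object)
    if total == 0:
        return None
    migrated = sum(1 for failures in failures_per_object if len(failures) == 0)
    return round(100 * migrated / total, 2)


# Hypothetical sample: three dashboards, one of which still has a failure.
sample = [[], [], ["example failure"]]
print(progress_percentage(sample))
```

An object counts as migrated only when its `failures` list is empty, so a single remaining failure keeps a dashboard out of the numerator.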
@@ -1,6 +1,6 @@
/* --title 'Overview' --description 'Tables and views migration' --width 5 */
WITH migration_statuses AS (
-SELECT *
+SELECT owner, failures
FROM ucx_catalog.multiworkspace.objects_snapshot
WHERE object_type = 'Table'
)
@@ -0,0 +1,8 @@
# Dashboards

This section shows Unity Catalog compatibility issues found while linting dashboards. There are two kinds of changes to
perform:
- Data asset references, i.e. references to Hive metastore tables and views or direct filesystem access (DFSA). These
references should be updated to refer to their Unity Catalog counterparts.
- Linting compatibility issues, e.g. using RDDs or directly accessing the Spark context. These issues should be resolved
by following the instructions stated with the issue.
@@ -0,0 +1,4 @@
/* --title 'Dashboards pending migration' --height 6 */
SELECT COUNT(*) AS count
FROM ucx_catalog.multiworkspace.objects_snapshot
WHERE object_type = 'Dashboard' AND SIZE(failures) > 0
@@ -0,0 +1,23 @@
/*
--title 'Dashboards pending migration'
--width 5
--overrides '{"spec": {
"version": 3,
"widgetType": "bar",
"encodings": {
"x": {"fieldName": "owner", "scale": {"type": "categorical"}, "displayName": "owner"},
"y": {"fieldName": "count", "scale": {"type": "quantitative"}, "displayName": "count"}
}
}}'
*/
WITH owners_with_failures AS (
SELECT owner
FROM ucx_catalog.multiworkspace.objects_snapshot
WHERE object_type = 'Dashboard' AND SIZE(failures) > 0
)

SELECT
owner,
COUNT(1) AS count
FROM owners_with_failures
GROUP BY owner
@@ -0,0 +1,4 @@
/* --title 'Dashboards migrated' --height 6 */
SELECT COUNT(*) AS count
FROM ucx_catalog.multiworkspace.objects_snapshot
WHERE object_type = 'Dashboard' AND SIZE(failures) = 0
@@ -0,0 +1,15 @@
/* --title 'Dashboard pending migration' --width 5 */
WITH migration_statuses AS (
SELECT owner, failures
FROM ucx_catalog.multiworkspace.objects_snapshot
WHERE object_type = 'Dashboard'
)

SELECT
owner,
DOUBLE(CEIL(100 * COUNT_IF(SIZE(failures) = 0) / SUM(COUNT(*)) OVER (PARTITION BY owner), 2)) AS percentage,
COUNT(*) AS total,
COUNT_IF(SIZE(failures) = 0) AS total_migrated,
COUNT_IF(SIZE(failures) > 0) AS total_not_migrated
FROM migration_statuses
GROUP BY owner
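
The per-owner breakdown above can be sketched in Python as follows. This is a hedged illustration, not UCX code: `owner_overview` and the sample owners are hypothetical. It assumes that, after `GROUP BY owner`, `SUM(COUNT(*)) OVER (PARTITION BY owner)` reduces to the owner's own row count, and it reproduces `CEIL(..., 2)` with integer arithmetic to avoid floating-point surprises when rounding up.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def owner_overview(rows: Iterable[Tuple[str, List[str]]]) -> Dict[str, Dict[str, float]]:
    """Summarise migration status per owner from (owner, failures) rows."""
    totals: Dict[str, int] = defaultdict(int)
    migrated: Dict[str, int] = defaultdict(int)
    for owner, failures in rows:
        totals[owner] += 1
        if not failures:
            migrated[owner] += 1

    overview: Dict[str, Dict[str, float]] = {}
    for owner, total in totals.items():
        done = migrated[owner]
        # Ceiling of (100 * done / total) at two decimals, computed on integers:
        # ceil(10000 * done / total) / 100.
        hundredths = (10000 * done + total - 1) // total
        overview[owner] = {
            "percentage": hundredths / 100,
            "total": total,
            "total_migrated": done,
            "total_not_migrated": total - done,
        }
    return overview


# Hypothetical sample rows: (owner, failures)
rows = [("alice", []), ("alice", ["f1"]), ("alice", ["f2"]), ("bob", [])]
print(owner_overview(rows))
```

Because the percentage is rounded up rather than to nearest, an owner with one of three dashboards migrated shows 33.34 rather than 33.33.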
@@ -0,0 +1,35 @@
/*
--title 'Dashboards pending migration'
--width 6
--overrides '{"spec":{
"encodings":{
"columns": [
{"fieldName": "workspace_id", "booleanValues": ["false", "true"], "type": "string", "displayAs": "string", "title": "workspace_id"},
{"fieldName": "owner", "booleanValues": ["false", "true"], "type": "string", "displayAs": "string", "title": "owner"},
{"fieldName": "name", "title": "Name", "type": "string", "displayAs": "link", "linkUrlTemplate": "{{ dashboard_link }}", "linkTextTemplate": "{{ @ }}", "linkTitleTemplate": "{{ @ }}", "linkOpenInNewTab": true, "booleanValues": ["false", "true"]},
{"fieldName": "dashboard_type", "booleanValues": ["false", "true"], "type": "string", "displayAs": "string", "title": "dashboard_type"},
{"fieldName": "failure", "booleanValues": ["false", "true"], "type": "string", "displayAs": "string", "title": "failure"}
]},
"invisibleColumns": [
{"fieldName": "dashboard_link", "title": "dashboard_link", "type": "string", "displayAs": "string", "booleanValues": ["false", "true"]}
]
}}'
*/
SELECT
workspace_id,
owner,
data.name AS name,
CASE
-- Simple heuristic to differentiate between Redash and Lakeview dashboards
WHEN CONTAINS(data.id, '-') THEN 'Redash'
ELSE 'Lakeview'
END AS dashboard_type,
EXPLODE(failures) AS failure,
-- Below are invisible column(s) used in link URL templates
CASE
WHEN CONTAINS(data.id, '-') THEN CONCAT('/sql/dashboards/', data.id)
ELSE CONCAT('/dashboardsv3/', data.id, '/published')
END AS dashboard_link
FROM ucx_catalog.multiworkspace.objects_snapshot
WHERE object_type = 'Dashboard' AND SIZE(failures) > 0
ORDER BY workspace_id, owner, name, failure
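
The dashboard-type heuristic and link construction in the query above can be sketched in Python. This is an illustration under stated assumptions: the function names and sample IDs are hypothetical, and the only rule taken from the query is that an ID containing `-` is treated as Redash and anything else as Lakeview.

```python
from typing import Dict, Iterator, List


def is_redash(dashboard_id: str) -> bool:
    """Same heuristic as the query: an ID containing '-' is treated as Redash."""
    return "-" in dashboard_id


def dashboard_type(dashboard_id: str) -> str:
    return "Redash" if is_redash(dashboard_id) else "Lakeview"


def dashboard_link(dashboard_id: str) -> str:
    """Build the relative link used by the hidden dashboard_link column."""
    if is_redash(dashboard_id):
        return f"/sql/dashboards/{dashboard_id}"
    return f"/dashboardsv3/{dashboard_id}/published"


def pending_rows(dashboard_id: str, name: str, failures: List[str]) -> Iterator[Dict[str, str]]:
    """Yield one row per failure, like EXPLODE(failures) in the query."""
    for failure in failures:
        yield {
            "name": name,
            "dashboard_type": dashboard_type(dashboard_id),
            "failure": failure,
            "dashboard_link": dashboard_link(dashboard_id),
        }


# Hypothetical IDs, chosen only to exercise both branches of the heuristic.
for row in pending_rows("1234-abcd", "Sales", ["failure A", "failure B"]):
    print(row)
print(dashboard_link("01ef0123456789ab"))
```

As the query's own comment says, this is a simple heuristic: it depends on the shape of the IDs rather than on an explicit type field, so it would misclassify a dashboard whose ID format differs from what is assumed here.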
1 change: 1 addition & 0 deletions tests/integration/progress/test_workflows.py
@@ -11,6 +11,7 @@
@retried(on=[NotFound, InvalidParameterValue], timeout=dt.timedelta(minutes=12))
def test_running_real_migration_progress_job(installation_ctx: MockInstallationContext) -> None:
"""Ensure that the migration-progress workflow can complete successfully."""

# Limit the resources crawled by the assessment
source_schema = installation_ctx.make_schema()
installation_ctx.make_table(schema_name=source_schema.name)