fix: do not drop calculated column on metadata sync #11731

villebro · 2020-11-17T23:57:21Z

SUMMARY

PR #10645 introduced a regression that caused calculated columns to be dropped when syncing datasource metadata in the legacy CRUD list. Because saving a dataset in the legacy CRUD view also triggered a metadata sync automatically, simply saving a dataset could lose its calculated columns. The metadata refresh was already triggered on every save in the legacy CRUD view before that PR, but the effect was less noticeable because the previous refresh only added new columns and updated existing types, and it left physical columns that were no longer present in the dataset metadata.

This changes behavior in the following ways:

Metadata is no longer automatically synced when a dataset is saved in the legacy CRUD view.
Calculated columns are no longer dropped when metadata is synced in the legacy CRUD list.
A calculated column is replaced with its physical counterpart if a physical column with the same name has been added to the dataset.

TEST PLAN

Local testing + CI + new tests

ADDITIONAL INFORMATION

villebro · 2020-11-17T23:58:20Z

superset/connectors/base/models.py

+    columns: List["BaseColumn"] = []
+    metrics: List["BaseMetric"] = []


Bycatch: not sure why these weren't typed (did the same for SQLA and Druid)

villebro · 2020-11-18T00:01:14Z

superset/connectors/sqla/models.py

+        self.columns.extend(
+            [col for col in old_columns_by_name.values() if col.expression]
+        )


This ensures that calculated columns are not dropped.
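
For readers who want to see the intended sync behavior end to end, here is a minimal, self-contained sketch. It is not the Superset implementation: `Column` and `sync_columns` are hypothetical stand-ins, and the sketch assumes that a column counts as calculated exactly when its `expression` is non-empty, as the filter in the diff above suggests.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Column:
    """Hypothetical stand-in for a dataset column (not Superset's TableColumn)."""

    column_name: str
    type: str = ""
    expression: str = ""  # assumption: non-empty means "calculated column"


def sync_columns(old_columns: List[Column], physical: Dict[str, str]) -> List[Column]:
    """Merge freshly fetched physical columns (name -> type) into existing metadata."""
    old_by_name: Dict[str, Column] = {c.column_name: c for c in old_columns}
    synced: List[Column] = []
    for name, col_type in physical.items():
        # pop() so that only columns absent from the database remain afterwards
        col = old_by_name.pop(name, None) or Column(column_name=name)
        col.type = col_type
        col.expression = ""  # a physical column with the same name takes over
        synced.append(col)
    # Leftover calculated columns are kept; leftover physical columns are dropped.
    synced.extend(c for c in old_by_name.values() if c.expression)
    return synced
```

With old columns `id`, `gone`, and calculated columns `ratio` and `total`, syncing against a table that now has physical columns `id` and `total` would keep `id`, turn `total` into a physical column, retain the calculated `ratio`, and drop `gone`.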

junlincc · 2020-11-18T00:04:20Z

Thank you for the quick fix 🙏. It's 2am in Finland, please get some good rest!

codecov-io · 2020-11-18T00:06:48Z

Codecov Report

Merging #11731 (18dc486) into master (01d15f5) will decrease coverage by 0.04%.
The diff coverage is 92.30%.

@@            Coverage Diff             @@
##           master   #11731      +/-   ##
==========================================
- Coverage   63.06%   63.02%   -0.05%     
==========================================
  Files         895      897       +2     
  Lines       43340    43444     +104     
  Branches     4015     4015              
==========================================
+ Hits        27334    27381      +47     
- Misses      15828    15885      +57     
  Partials      178      178

Flag	Coverage Δ
javascript	`62.82% <ø> (ø)`
python	`63.14% <92.30%> (-0.07%)`	⬇️

Impacted Files	Coverage Δ
superset/connectors/sqla/views.py	`67.85% <66.66%> (+0.59%)`	⬆️
superset/connectors/base/models.py	`89.78% <100.00%> (ø)`
superset/connectors/druid/models.py	`82.18% <100.00%> (+0.04%)`	⬆️
superset/connectors/sqla/models.py	`90.81% <100.00%> (-0.37%)`	⬇️
superset/db_engine_specs/sqlite.py	`65.62% <0.00%> (-9.38%)`	⬇️
superset/db_engine_specs/presto.py	`73.65% <0.00%> (-8.64%)`	⬇️
superset/utils/celery.py	`82.14% <0.00%> (-3.58%)`	⬇️
superset/examples/world_bank.py	`97.10% <0.00%> (-2.90%)`	⬇️
superset/examples/birth_names.py	`96.51% <0.00%> (-2.33%)`	⬇️
superset/result_set.py	`96.69% <0.00%> (-1.66%)`	⬇️
... and 15 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 01d15f5...18dc486.

villebro · 2020-11-18T00:14:07Z

superset/connectors/sqla/views.py

-        self.post_add(item, flash_message=False)
+        self.post_add(item, flash_message=False, fetch_metadata=False)


I'm not sure why we were syncing metadata on post_update, so I disabled it.

@dpgaspar do you know why we call post_add every time we post_update the dataset? Previously we were calling fetch_metadata and create_table_permissions every time we updated, but after this change we'll only call create_table_permissions. Is this just to make sure any potentially stale perms get synced?

It's there to update the permissions if the dataset's table_name or database changes. That said, when that happens it will leave a stale permission name behind.

OK, then I'll leave this as is, since I believe we won't be hitting post_edit with the new CRUD.
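
The effect of the new flag discussed in this thread can be sketched as follows. This is an illustrative model only: `DatasetView` is a hypothetical stand-in for the legacy CRUD view, the default values of `flash_message` and `fetch_metadata` are assumptions, and the recorded call names simply mirror the steps mentioned in the comments above.

```python
from typing import List, Tuple


class DatasetView:
    """Hypothetical stand-in for the legacy CRUD view (not Superset's actual class)."""

    def __init__(self) -> None:
        # Records which steps ran, so the behavior can be inspected.
        self.calls: List[Tuple[str, str]] = []

    def post_add(
        self, item: str, flash_message: bool = True, fetch_metadata: bool = True
    ) -> None:
        if fetch_metadata:
            self.calls.append(("fetch_metadata", item))
        # Permissions are refreshed regardless of the flags.
        self.calls.append(("create_table_permissions", item))
        if flash_message:
            self.calls.append(("flash", item))

    def post_update(self, item: str) -> None:
        # After this PR: updating no longer triggers a metadata sync.
        self.post_add(item, flash_message=False, fetch_metadata=False)
```

Under these assumptions, `post_update` only refreshes permissions, while an explicit `post_add` still fetches metadata.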

etr2460

This LGTM, although @john-bodley could be a good additional reviewer.

etr2460 · 2020-11-18T00:39:55Z

tests/datasource_tests.py

+from typing import Dict

 from superset import db
-from superset.connectors.sqla.models import SqlaTable
+from superset.connectors.sqla.models import SqlaTable, TableColumn


why did these get added if the rest of the file didn't change?

Oh, thanks for catching that. I implemented the tests here first, but then decided the other test file was the correct place. These imports were left behind from that move.

villebro · 2020-11-18T07:25:40Z

superset/connectors/sqla/models.py

-        old_columns_by_name = {col.column_name: col for col in old_columns}
+        old_columns_by_name: Dict[str, TableColumn] = {
+            col.column_name: col for col in old_columns
+        }


Not sure if it's just my IDE, but PyCharm wasn't able to infer the type without the explicit type hint.

etr2460 · 2020-11-18T16:32:59Z

FYI, John said this looks good, so I'm going to go ahead and merge, since we do our weekly deploy today and it would be really good to have this change in.

superset-github-bot bot added the preset-io label Nov 17, 2020

pull-request-size bot added the size/M label Nov 17, 2020

villebro requested a review from etr2460 November 17, 2020 23:57

villebro force-pushed the villebro/fix-sql-colsync branch from 49cfd30 to b48a38d Compare November 17, 2020 23:59

villebro force-pushed the villebro/fix-sql-colsync branch from b48a38d to 07ee06f Compare November 18, 2020 00:18

etr2460 approved these changes Nov 18, 2020

villebro force-pushed the villebro/fix-sql-colsync branch from 07ee06f to 0d501aa Compare November 18, 2020 07:19

villebro requested review from dpgaspar and john-bodley November 18, 2020 07:32

fix: do not drop calculated column on metadata sync

18dc486

villebro force-pushed the villebro/fix-sql-colsync branch from 0d501aa to 18dc486 Compare November 18, 2020 07:48

etr2460 merged commit 7ae8cd0 into apache:master Nov 18, 2020

villebro deleted the villebro/fix-sql-colsync branch November 18, 2020 16:58

craig-rueda pushed a commit to preset-io/superset that referenced this pull request Nov 18, 2020

fix: do not drop calculated column on metadata sync (apache#11731)

c95345b

(cherry picked from commit 7ae8cd0)

auxten pushed a commit to auxten/incubator-superset that referenced this pull request Nov 20, 2020

fix: do not drop calculated column on metadata sync (apache#11731)

70a4750

mistercrunch added the 🏷️ bot and 🚢 1.0.0 labels Mar 12, 2024
