feat(chart-data): add rowcount, timegrain and column result types #13271

villebro · 2021-02-22T10:22:27Z

SUMMARY

This PR adds some features to /api/v1/chart/data needed for native filters, namely:

support for retrieval of the total row count of a query (needed for calculating page count when doing server side pagination; see related PR here)
support for retrieval of time grains
support for retrieval of datasource columns
support for retrieving data from multiple datasources in the same request (not needed quite yet, but this adds support for the feature as it will come up later)
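
As an illustration of why the total row count is needed for server-side pagination, here is a minimal sketch of the page-count calculation. The helper name `page_count` and the `page_size` parameter are hypothetical and are not part of this PR.

```python
import math


def page_count(total_rows: int, page_size: int) -> int:
    """Return how many pages are needed to show `total_rows` rows.

    Hypothetical helper: it only illustrates how a client could use the
    row count returned by the endpoint.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Round up so that a partially filled last page is still counted.
    return math.ceil(total_rows / page_size)
```

With 101 rows and a page size of 50, the client would need three pages, which it cannot know from a single page of results alone.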

To avoid adding more clutter to QueryContext and to keep query_context.py from growing further, the result actions have been moved to a separate module, superset/common/actions.py. In addition, some general refactoring has been done.

Going forward, I believe we should consider splitting up /api/v1/chart/data into multiple endpoints (e.g. api/v1/chart/data/annotation, api/v1/chart/data/query, api/v1/chart/data/metadata etc.), or moving certain endpoints to GraphQL to make it easier to retrieve nested data in one request.

TEST PLAN

CI + added tests. I have also tested this with the GLOBAL_ASYNC_QUERIES feature flag, and it works fine for all request types.

ADDITIONAL INFORMATION

villebro · 2021-02-22T10:29:39Z

superset/common/query_object.py

+        if self.datasource:
+            cache_dict["datasource"] = self.datasource.uid
+        if self.result_type:
+            cache_dict["result_type"] = self.result_type


These are only added to the dict if defined to avoid invalidating cache keys of currently cached objects.
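
The effect of adding the new fields only when they are set can be sketched as follows. This is a simplified stand-in, not the actual Superset cache key code: the function `cache_key`, the use of JSON plus SHA-256, and the sample values are assumptions made for illustration.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def cache_key(
    base: Dict[str, Any],
    datasource_uid: Optional[str] = None,
    result_type: Optional[str] = None,
) -> str:
    """Hash a cache dict, adding the new fields only when they are set."""
    cache_dict = dict(base)
    # Optional fields are skipped when unset, so the dict (and therefore
    # the hash) is identical to what older code would have produced.
    if datasource_uid:
        cache_dict["datasource"] = datasource_uid
    if result_type:
        cache_dict["result_type"] = result_type
    payload = json.dumps(cache_dict, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

A query object that sets neither field hashes to the same key as before the change, so entries that are already cached remain valid, while setting either field yields a distinct key.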

codecov-io · 2021-02-22T12:01:56Z

Codecov Report

Merging #13271 (9146bb8) into master (e37c2bf) will increase coverage by 15.72%.
The diff coverage is 89.76%.

@@             Coverage Diff             @@
##           master   #13271       +/-   ##
===========================================
+ Coverage   64.23%   79.95%   +15.72%     
===========================================
  Files         971      300      -671     
  Lines       45178    24381    -20797     
  Branches     4129        0     -4129     
===========================================
- Hits        29019    19495     -9524     
+ Misses      16159     4886    -11273

Flag	Coverage Δ
cypress	`?`
python	`79.95% <89.76%> (+12.59%)`	⬆️

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
superset/utils/cache.py	`76.34% <ø> (ø)`
superset/connectors/druid/models.py	`82.06% <50.00%> (-0.08%)`	⬇️
superset/common/query_context.py	`81.52% <75.00%> (-3.28%)`	⬇️
superset/common/query_object.py	`90.27% <81.25%> (-1.27%)`	⬇️
superset/connectors/sqla/models.py	`89.63% <87.50%> (-0.94%)`	⬇️
superset/utils/core.py	`88.45% <92.30%> (+0.18%)`	⬆️
superset/common/actions.py	`95.38% <95.38%> (ø)`
superset/charts/schemas.py	`100.00% <100.00%> (ø)`
superset/sql_validators/postgres.py	`50.00% <0.00%> (-50.00%)`	⬇️
superset/views/database/views.py	`62.69% <0.00%> (-24.88%)`	⬇️
... and 698 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e37c2bf...9146bb8.

dpgaspar

Looks good, left a couple of comments

dpgaspar · 2021-02-22T17:07:46Z

superset/common/query_context.py

@@ -152,13 +152,39 @@ def get_single_payload(
        self, query_obj: QueryObject, force_cached: Optional[bool] = False,
    ) -> Dict[str, Any]:
        """Return results payload for a single quey"""
-        if self.result_type == utils.ChartDataResultType.QUERY:
+        result_type = query_obj.result_type or self.result_type


nit: it could make sense to document the constructor, since not passing result_type means it is either assumed to be FULL or passed on the query_obj
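
The fallback in the diff above can be sketched in isolation. The classes `ResultType`, `QueryObjectSketch` and `QueryContextSketch` are simplified, hypothetical stand-ins for the real ones, and the FULL default follows the reviewer's description rather than the actual constructor.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ResultType(str, Enum):
    """Reduced stand-in for ChartDataResultType."""

    FULL = "full"
    QUERY = "query"


@dataclass
class QueryObjectSketch:
    # None means "inherit the result type from the query context".
    result_type: Optional[ResultType] = None


@dataclass
class QueryContextSketch:
    # Assumed default, per the review comment.
    result_type: ResultType = ResultType.FULL

    def effective_result_type(self, query_obj: QueryObjectSketch) -> ResultType:
        # A result type set on the query object overrides the context-level one.
        return query_obj.result_type or self.result_type
```

A query object without its own result type inherits the one from the context, while one that sets it explicitly overrides the context for that query only.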

dpgaspar

Nice! loved the refactor also

ktmud · 2021-03-03T18:44:00Z

superset/common/query_actions.py

+    ChartDataResultType.QUERY: _get_query,
+    ChartDataResultType.SAMPLES: _get_samples,
+    ChartDataResultType.FULL: _get_full,
+    ChartDataResultType.RESULTS: _get_results,


@villebro Sorry for being late to the party, but is there a reason why is_rowcount is not implemented as another ChartDataResultType?

villebro · 2021-03-03

That's how I first thought about implementing it, but I felt it was better to implement it as a QueryObject property, because it has to be passed to the SQLA model, as it affects the rendered query (I feel QueryObject should map mostly 1-1 to the query dict that's unpacked into get_sqla_query).

ktmud · 2021-03-03

Wouldn't you still have that mapping if you just create a _get_rowcount function that sets QueryObject's is_rowcount to True? I see no difference between setting is_rowcount in a _get_rowcount function and manipulating other fields of QueryObject in _get_samples and _get_query.

The reason why I think a new result type makes more sense is that from the client-side point of view, ResultType.SAMPLES, ResultType.QUERY, and ResultType.ROW_COUNT are mutually exclusive, i.e., you cannot have a QueryObject with both result_type = ResultType.SAMPLES and is_rowcount = True.

villebro · 2021-03-04

I don't mind adding a result type as a convenience method for fetching the row count. However, the preferred method for retrieving row counts should IMO be to set is_rowcount explicitly on QueryObject, as that makes it possible to set ResultType.QUERY on QueryContext and thereby fetch only the queries for all QueryObjects (we don't yet support showing queries for multiple queries, but I hope we can add it soon):

superset/superset-frontend/src/explore/components/DisplayQueryButton.jsx

Line 170 in ca27b00

beforeOpen={() => beforeOpen('query')}
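
The two positions in this thread can be combined in one small sketch: a dispatch map from result type to handler, a hypothetical ROW_COUNT convenience type that simply forces is_rowcount, and an is_rowcount flag on the query object that remains usable with the QUERY type. All names here (`ResultType.ROW_COUNT`, `QueryObjectSketch`, `get_payload`, the placeholder return values) are assumptions for illustration and do not reproduce the code merged in this PR.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Any, Callable, Dict


class ResultType(str, Enum):
    QUERY = "query"
    RESULTS = "results"
    ROW_COUNT = "row_count"  # hypothetical; not defined in this PR


@dataclass(frozen=True)
class QueryObjectSketch:
    is_rowcount: bool = False


def _get_query(query_obj: QueryObjectSketch) -> Dict[str, Any]:
    # Return only a placeholder for the rendered query; nothing is executed.
    return {"query": "COUNT(*)" if query_obj.is_rowcount else "SELECT"}


def _get_results(query_obj: QueryObjectSketch) -> Dict[str, Any]:
    # Placeholder for running the query and returning its data.
    return {"data": "rowcount" if query_obj.is_rowcount else "rows"}


def _get_rowcount(query_obj: QueryObjectSketch) -> Dict[str, Any]:
    # Convenience wrapper: force is_rowcount and reuse the results handler.
    return _get_results(replace(query_obj, is_rowcount=True))


Handler = Callable[[QueryObjectSketch], Dict[str, Any]]

_RESULT_TYPE_FUNCTIONS: Dict[ResultType, Handler] = {
    ResultType.QUERY: _get_query,
    ResultType.RESULTS: _get_results,
    ResultType.ROW_COUNT: _get_rowcount,
}


def get_payload(result_type: ResultType, query_obj: QueryObjectSketch) -> Dict[str, Any]:
    handler = _RESULT_TYPE_FUNCTIONS.get(result_type)
    if handler is None:
        raise ValueError(f"Invalid result type: {result_type}")
    return handler(query_obj)
```

Because is_rowcount lives on the query object, requesting the QUERY result type for a row-count query still returns the row-count query text, which is the use case villebro describes; the ROW_COUNT entry is only a shortcut on top of that.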

superset-github-bot bot added the preset-io label Feb 22, 2021

pull-request-size bot added the size/L label Feb 22, 2021

villebro commented Feb 22, 2021

villebro requested a review from dpgaspar February 22, 2021 11:40

dpgaspar reviewed Feb 22, 2021

junlincc requested review from amitmiran137, suddjian and ktmud February 23, 2021 04:10

junlincc added the dashboard:native-filters Related to the native filters of the Dashboard label Feb 23, 2021

junlincc removed the request for review from ktmud February 23, 2021 04:13

feat(chart-data): add rowcount, timegrain and column result types

7a0a6ce

villebro force-pushed the villebro/new_result_types branch from c875131 to f295794 Compare February 23, 2021 12:16

villebro commented Feb 23, 2021

break out actions from query_context

9146bb8

villebro force-pushed the villebro/new_result_types branch from f295794 to 9146bb8 Compare February 23, 2021 12:49

villebro requested a review from dpgaspar February 23, 2021 14:00

amitmiran137 approved these changes Feb 23, 2021

dpgaspar approved these changes Feb 23, 2021

rename module

1aa52a5

villebro force-pushed the villebro/new_result_types branch from 9d92d93 to 1aa52a5 Compare February 24, 2021 05:07

villebro merged commit 0a00153 into apache:master Feb 24, 2021

villebro deleted the villebro/new_result_types branch February 24, 2021 05:44

ktmud reviewed Mar 3, 2021

mistercrunch added the 🏷️ bot and 🚢 1.2.0 labels Mar 12, 2024
