Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(chart-data): add rowcount, timegrain and column result types #13271

Merged
merged 3 commits into from
Feb 24, 2021

Conversation

villebro
Copy link
Member

@villebro villebro commented Feb 22, 2021

SUMMARY

This PR adds some features to /api/v1/chart/data needed for native filters, namely:

  • support for retrieval of the total row count of a query (needed for calculating page count when doing server side pagination; see related PR here)
  • support for retrieval of time grains
  • support for retrieval of datasource columns
  • support for retrieving data from multiple datasources in the same request (not needed quite yet, but this adds support for the feature as it will come up later)

To avoid adding more clutter to QueryContext, the result actions have been moved to a separate module superset/common/actions.py to avoid having query_context.py grow further in size. In addition, some general refactoring has been done.

Going forward I believe we should consier splitting up /api/v1/chart/data into multiple endpoints (e.g. api/v1/chart/data/annotation, api/v1/chart/data/query, api/v1/chart/data/metadata etc), or consider moving certain endpoints to GraphQL to make it easier to retrieve nested data in one request.

TEST PLAN

CI + added tests. I have also tested this with GLOBAL_ASYNC_QUERIES feature flag, and it works fine for all request types.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

Comment on lines +291 to +295
if self.datasource:
cache_dict["datasource"] = self.datasource.uid
if self.result_type:
cache_dict["result_type"] = self.result_type
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are only added to the dict if defined to avoid invalidating cache keys of currently cached objects.

@villebro villebro requested a review from dpgaspar February 22, 2021 11:40
@codecov-io
Copy link

codecov-io commented Feb 22, 2021

Codecov Report

Merging #13271 (9146bb8) into master (e37c2bf) will increase coverage by 15.72%.
The diff coverage is 89.76%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #13271       +/-   ##
===========================================
+ Coverage   64.23%   79.95%   +15.72%     
===========================================
  Files         971      300      -671     
  Lines       45178    24381    -20797     
  Branches     4129        0     -4129     
===========================================
- Hits        29019    19495     -9524     
+ Misses      16159     4886    -11273     
Flag Coverage Δ
cypress ?
python 79.95% <89.76%> (+12.59%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset/utils/cache.py 76.34% <ø> (ø)
superset/connectors/druid/models.py 82.06% <50.00%> (-0.08%) ⬇️
superset/common/query_context.py 81.52% <75.00%> (-3.28%) ⬇️
superset/common/query_object.py 90.27% <81.25%> (-1.27%) ⬇️
superset/connectors/sqla/models.py 89.63% <87.50%> (-0.94%) ⬇️
superset/utils/core.py 88.45% <92.30%> (+0.18%) ⬆️
superset/common/actions.py 95.38% <95.38%> (ø)
superset/charts/schemas.py 100.00% <100.00%> (ø)
superset/sql_validators/postgres.py 50.00% <0.00%> (-50.00%) ⬇️
superset/views/database/views.py 62.69% <0.00%> (-24.88%) ⬇️
... and 698 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e37c2bf...9146bb8. Read the comment docs.

Copy link
Member

@dpgaspar dpgaspar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, left a couple of comments

superset/common/query_context.py Outdated Show resolved Hide resolved
@@ -152,13 +152,39 @@ def get_single_payload(
self, query_obj: QueryObject, force_cached: Optional[bool] = False,
) -> Dict[str, Any]:
"""Return results payload for a single quey"""
if self.result_type == utils.ChartDataResultType.QUERY:
result_type = query_obj.result_type or self.result_type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could make sense to document the constructor since, not passing result_type assumes it is FULL or passed on the query_obj

superset/common/query_object.py Show resolved Hide resolved
superset/connectors/sqla/models.py Outdated Show resolved Hide resolved
@junlincc junlincc added the dashboard:native-filters Related to the native filters of the Dashboard label Feb 23, 2021
@junlincc junlincc removed the request for review from ktmud February 23, 2021 04:13
@villebro villebro force-pushed the villebro/new_result_types branch from c875131 to f295794 Compare February 23, 2021 12:16
@villebro villebro force-pushed the villebro/new_result_types branch from f295794 to 9146bb8 Compare February 23, 2021 12:49
@villebro villebro requested a review from dpgaspar February 23, 2021 14:00
Copy link
Member

@dpgaspar dpgaspar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! loved the refactor also

@villebro villebro force-pushed the villebro/new_result_types branch from 9d92d93 to 1aa52a5 Compare February 24, 2021 05:07
@villebro villebro merged commit 0a00153 into apache:master Feb 24, 2021
@villebro villebro deleted the villebro/new_result_types branch February 24, 2021 05:44
ChartDataResultType.QUERY: _get_query,
ChartDataResultType.SAMPLES: _get_samples,
ChartDataResultType.FULL: _get_full,
ChartDataResultType.RESULTS: _get_results,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@villebro Sorry for being late to the party, but it there a reason why is_rowcount is not implemented as another ChartDataResultType?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the way I thought about implementing it first, but I felt it better to implement it as a QueryObject property, as it has to be passed to the SQLA model as it affects the rendered query (I feel QueryObject should be mostly 1-1 mapped to the query dict that's unpacked into get_sqla_query).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't you still have that mapping if you just create a _get_rowcount function that sets QueryObjects's is_rowcount to True? I see no difference in setting is_rowcount in a _get_rowcount function than manipulating other fields of QueryObject in _get_samples and _get_query.

The reason why I think a new result type makes more sense is that from the client-side point of view, ResultType.SAMPLES, ResultType.QUERY, and ResultType.ROW_COUNT are mutually exclusive, i.e., you cannot have a QueryObject with both result_type = ResultType.SAMPLES and is_rowcount = True.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind adding a result type as a convenience method for fetching the row count. However, the preferred method for retrieving row counts should IMO be explicitly setting is_rowcount on QueryObject, as it will make it possible to set ResultType.QUERY on QueryContext and by doing so fetch only the queries for all QueryObjects (we don't yet support showing queries for multiple queries, but I hope we can add it soon):

beforeOpen={() => beforeOpen('query')}

@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 1.2.0 labels Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels dashboard:native-filters Related to the native filters of the Dashboard preset-io size/L 🚢 1.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants