Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(dashboard): make loading datasets non-blocking #15699

Merged
merged 4 commits into from
Jul 15, 2021

Conversation

ktmud
Copy link
Member

@ktmud ktmud commented Jul 15, 2021

SUMMARY

This will be one of the many PRs to address perf issues raised in #15518 .

Currently the Dashboard page will wait for both datasets and charts data to start rendering, however the datasets data are only used in filter indicators and filter boxes and should not block us from rendering other charts. In our case, the datasets endpoint is often much slower because we have a giant single-source-of-truth datasource with thousands of metrics and columns.

Unblocking rendering from the datasets endpoint should speed up the page rendering a little bit. In our use case, it's often a matter of a full second.

It's not easy to simply remove datasources from Dashboard state as that would involve letting the filter indicators and filter box fetching datasource metadata themselves.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Before

Dashboard renders after /datasets is loaded:

before-pr.mp4

After

Dashboard renders as soon as /dashboard/[id] and /dashboard/[id]/charts are loaded:

after-pr.mp4

TESTING INSTRUCTIONS

Go to dashboard page, make sure everything still renders correctly.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@@ -43,53 +43,38 @@ export type ControlProps = {
renderTrigger?: boolean;
};

export default class Control extends React.PureComponent<
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor from class component to functional. Not related to this PR but may be useful in the future.

@ktmud ktmud force-pushed the dashboard-async-datasets branch from df6860d to 92fda2d Compare July 15, 2021 01:14
@codecov
Copy link

codecov bot commented Jul 15, 2021

Codecov Report

Merging #15699 (4ec7416) into master (b489cff) will increase coverage by 0.00%.
The diff coverage is 33.82%.

❗ Current head 4ec7416 differs from pull request most recent head f6a090b. Consider uploading reports for the commit f6a090b to get more accurate results
Impacted file tree graph

@@           Coverage Diff           @@
##           master   #15699   +/-   ##
=======================================
  Coverage   76.82%   76.83%           
=======================================
  Files         983      983           
  Lines       51585    51581    -4     
  Branches     6974     6979    +5     
=======================================
  Hits        39632    39632           
+ Misses      11729    11727    -2     
+ Partials      224      222    -2     
Flag Coverage Δ
javascript 71.35% <33.82%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset-frontend/src/chart/Chart.jsx 46.29% <ø> (ø)
...ontend/src/common/hooks/apiResources/dashboards.ts 40.00% <0.00%> (ø)
...set-frontend/src/dashboard/containers/Dashboard.ts 0.00% <0.00%> (ø)
...rontend/src/dashboard/containers/DashboardPage.tsx 0.00% <0.00%> (ø)
...rset-frontend/src/dashboard/actions/datasources.ts 25.00% <30.00%> (ø)
superset-frontend/src/chart/ChartRenderer.jsx 37.97% <50.00%> (+0.31%) ⬆️
...perset-frontend/src/explore/components/Control.tsx 76.47% <71.42%> (+1.47%) ⬆️
...set-frontend/src/dashboard/reducers/datasources.ts 75.00% <75.00%> (ø)
superset-frontend/src/dashboard/actions/hydrate.js 1.70% <100.00%> (ø)
.../src/dashboard/components/gridComponents/Chart.jsx 80.37% <100.00%> (+0.18%) ⬆️
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b489cff...f6a090b. Read the comment docs.

@ktmud ktmud force-pushed the dashboard-async-datasets branch from 21d3e92 to e3ccef1 Compare July 15, 2021 02:28
}

export const FETCH_DATASOURCE_FAILED = 'FETCH_DATASOURCE_FAILED';
export function fetchDatasourceFailed(error, key) {
return { type: FETCH_DATASOURCE_FAILED, error, key };
Copy link
Member Author

@ktmud ktmud Jul 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleaned up FETCH_DATASOURCE_STARTED and FETCH_DATASOURCE_FAILED because there are no corresponding reducers for them.

@@ -19,7 +19,7 @@
import Owner from './Owner';
import Role from './Role';

type Dashboard = {
export interface Dashboard {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this change for? Dashboard doesn't have methods why prefer interface?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interface is always preferred other than in React props.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw i think there's still a bunch of discrepancies across Superset for this. I don't really care one way or another (consistency is better in my mind) but noting that even after this change there will still be a bunch of inconsistencies

@graceguo-supercat
Copy link

this is a reasonable trade-off we can try. Is it possible that filter_box started query before dataset metadata is not ready, will that cause any issue?

@ktmud
Copy link
Member Author

ktmud commented Jul 15, 2021

this is a reasonable trade-off we can try. Is it possible that filter_box started query before dataset metadata is not ready, will that cause any issue?

Time Column and Granularity control in filter boxes may not have select choices. Filter indicators may show inexplicable filter values. But those are expected to be temporary and not very noticeable.

In case datasets loading fails, in previous case, the whole page won't render, while after this change you still have a largely functional Dashboard page.

Copy link
Member

@kgabryje kgabryje left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested with native filters and with datasets hardcoded to null - everything works perfectly fine, including filters and filter indicators. The only difference is that we can't create a new native filter as it requires selecting a dataset; we can use the already existing native filters though. LGTM!

Copy link
Member

@michael-s-molina michael-s-molina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the performance and typescript improvements.

Copy link
Member

@etr2460 etr2460 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few nits, but nothing blocking. thanks for tackling this work so quickly!


// gets the datasets for a dashboard
// important: this endpoint only returns the fields in the dataset
// that are necessary for rendering the given dashboard
export const useDashboardDatasets = (idOrSlug: string | number) =>
useApiV1Resource(`/api/v1/dashboard/${idOrSlug}/datasets`);
useApiV1Resource<Datasource[]>(`/api/v1/dashboard/${idOrSlug}/datasets`);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: perhaps we should rename the Datasource type to Dataset? Or at least alias it for now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many places where the Datasource/Dataset type is defined---including two places in superset-ui. I think we need a separate PR to consolidate those. Although they do have small differences depending on the context.

this.setState({ hovered });
const ControlComponent = typeof type === 'string' ? controlMap[type] : type;
if (!ControlComponent) {
return <>Unknown controlType: {type}</>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need a fragment here? Couldn't this be:

Suggested change
return <>Unknown controlType: {type}</>;
return `Unknown controlType: ${type}`;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

React typing would complain if returning a plain string. Something about expecting a ComponentType to always return ReactElement.

@@ -19,7 +19,7 @@
import Owner from './Owner';
import Role from './Role';

type Dashboard = {
export interface Dashboard {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw i think there's still a bunch of discrepancies across Superset for this. I don't really care one way or another (consistency is better in my mind) but noting that even after this change there will still be a bunch of inconsistencies

@suddjian
Copy link
Member

suddjian commented Jul 15, 2021

Have we verified yet that none of the chart plugins use datasets? If so we should also stop passing datasets to the charts.

If any of the plugins do use datasets, this might break them.

@ktmud
Copy link
Member Author

ktmud commented Jul 15, 2021

Have we verified yet that none of the chart plugins use datasets? If so we should also stop passing datasets to the charts.

Thanks for pointing this out. I did more vetting and realized that some charts do use info on datasource, mostly non-critical info such as column name verboseMap here: https://github.com/apache-superset/superset-ui/blob/61accc6ce9c546331b7407cde21d6ec163fe191d/plugins/legacy-plugin-chart-pivot-table/src/transformProps.js

Fixed a re-render issue now that charts will be updated accordingly once datasets are loaded:

chart-rerender.mp4

Notice the name of the first column. It uses verbose_map and updated after /datasets is fully loaded. Note that I artificially slowed down the /datasets API. For most users, this minor update may not be even noticeable.

@ktmud ktmud merged commit e305f2a into apache:master Jul 15, 2021
@ktmud ktmud deleted the dashboard-async-datasets branch July 15, 2021 22:21
cccs-RyanS pushed a commit to CybercentreCanada/superset that referenced this pull request Dec 17, 2021
QAlexBall pushed a commit to QAlexBall/superset that referenced this pull request Dec 29, 2021
cccs-rc pushed a commit to CybercentreCanada/superset that referenced this pull request Mar 6, 2024
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 1.3.0 labels Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels v1.3 🚢 1.3.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants