Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SQL Lab] Removing display limit #6942

Conversation

john-bodley
Copy link
Member

@john-bodley john-bodley commented Feb 25, 2019

We're run into issues were users in SQL Lab are expecting the number of records to be displayed to equal that of the LIMIT defined either in i) the SQL query, or ii) the UI limit component, however this is truncated to display only the default limit, and there's no mention of this to the user. Note when exporting to CSV the display limit is not applied.

Given that the LIMIT has already been applied during the compute (defaulting to the DEFAULT_SQLLAB_LIMIT if neither the query or UI component have been augmented) it seems prudent to not re-apply any additional limit when displaying the results.

to: @jeffreythewang @kristw @michellethomas @mistercrunch @timifasubaa

@john-bodley john-bodley added !deprecated-label:bug Deprecated label - Use #bug instead v0.30 labels Feb 25, 2019
@mistercrunch
Copy link
Member

There's been some back and forth on this and a history of regressions. I think I see a bad merge conflict between
#4941 and #5866

My intention originally was to provide the administrator with a hard limit on 1- SQL_MAX_ROW how many rows can be exported out of the database and ultimately stored on the results_backend and 2- DISPLAY_MAX_ROW how many rows will be returned to the user's browser.

Now #4941 was created before #5866 and merged after it and appears to deprecate DISPLAY_MAX_ROW in favor of something new named DEFAULT_SQLLAB_LIMIT. That looks wrong to me. Naming-wise DEFAULT_SQLLAB_LIMIT should just be the default in the LIMIT component, and the user should be able to up it up until it reaches the admin-setup DISPLAY_MAX_ROW (which is now gone...).

My point is we probably need all 3 configuration settings here. We need to prevent the user from shooting themselves in the foot and return enough row to crash their browser.

@john-bodley
Copy link
Member Author

john-bodley commented Feb 26, 2019

@mistercrunch thanks for the context. I think there may be a few issues with the DISPLAY_MAX_ROW:

  1. It's not apparent to the user that the display is potentially showing only a preview (there's no verbiage in the UI to inform the user of this). This can be problematic if a user is searching/ordering the preview expecting to see a certain value and/or is mislead regarding the cardinality of a specific column.
  2. It's not apparent that the preview/export to CSV could return different row counts.
  3. How does a user override the display limit, i.e., say they want to see more than the configured DISPLAY_MAX_ROW and don't want to export the results to a CSV file.

I would be concerned that a third configuration setting would surface more confusion. I also feel that #4941 makes it quite explicit that a limit is being applied and I would speculate that the user would expect that number of records (if applicable) rendered.

@john-bodley john-bodley force-pushed the john-bodley--sqllab-remove-display-limit branch from f73f5b8 to e98594a Compare February 26, 2019 06:15
@codecov-io
Copy link

codecov-io commented Feb 26, 2019

Codecov Report

Merging #6942 into master will increase coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6942      +/-   ##
==========================================
+ Coverage   63.98%   63.99%   +0.01%     
==========================================
  Files         423      423              
  Lines       20558    20554       -4     
  Branches     2235     2235              
==========================================
  Hits        13154    13154              
+ Misses       7272     7268       -4     
  Partials      132      132
Impacted Files Coverage Δ
superset/views/core.py 75.12% <ø> (+0.21%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3c2e3d7...e98594a. Read the comment docs.

@john-bodley john-bodley removed the v0.30 label Mar 3, 2019
@john-bodley
Copy link
Member Author

@jeffreythewang et al. do you have any thoughts on this?

@michellethomas
Copy link
Contributor

I agree that whatever limit is getting applied should be shown to the user because we are changing the data that's getting returned. I was debugging this issue when someone surfaced a discrepancy between sqllab results and csv download results, and it's pretty difficult to tell if you are seeing a subset or the full result of your query even if you are aware that there could be a difference.

If you need to have a DISPLAY_MAX_ROW maybe it could be set to none by default.

@john-bodley @mistercrunch

@mistercrunch
Copy link
Member

mistercrunch commented Mar 15, 2019

There's something here displaying a message to the user:
https://github.com/apache/incubator-superset/blob/ba9523c7c4d6e11876455eb0a1bf090e1ddacc7f/superset/assets/src/SqlLab/components/SqlEditor.jsx#L278

I don't have time right this moment to dig in to see whether it still works or if the linkage has been broken with the many PRs in this area. It should show up right next to the query timer as a warning label with a tooltip.

We really need to be careful around letting the user crash their browser as it's tricky to fix. There are some intricate issues possible there:

  • localstorage is too small, in which case refreshing the page will probably work
  • small enough to fit in localstorage, but more than the browser can actually deal with, then you have to go and clear your localstorage in your cookie/browser settings to fix SQL Lab. You also loose all your tabs/queries then

@stale
Copy link

stale bot commented May 14, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

@stale stale bot added the inactive Inactive for >= 30 days label May 14, 2019
@stale stale bot closed this May 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
!deprecated-label:bug Deprecated label - Use #bug instead inactive Inactive for >= 30 days
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants