Warn on imposing max rows and fix for curve #1408

Merged: maximlt merged 4 commits into main from fix_max_rows on Sep 20, 2024

Conversation

@ahuang11 (Collaborator) commented Sep 9, 2024

Closes #1406

Adds a warning when MAX_ROWS is imposed, and checks whether the kind is a curve (line) so that it is not randomly sampled; for that kind, only the head of the data is taken.
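
The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the merged code: `limit_rows` is a hypothetical helper, the standard `warnings` module stands in for the `param` logger used in `hvplot/ui.py`, and warning for every kind (rather than only for lines) is an assumption based on the PR title.

```python
import warnings

import pandas as pd

MAX_ROWS = 10000  # the limit quoted in the review; treated here as an assumption


def limit_rows(df: pd.DataFrame, kind: str, max_rows: int = MAX_ROWS) -> pd.DataFrame:
    """Hypothetical helper: cap the number of rows handed to the plot."""
    if len(df) <= max_rows:
        return df  # nothing to trim, no warning
    warnings.warn(
        f"Plotting only {max_rows} of {len(df)} rows for kind={kind!r}.",
        stacklevel=2,
    )
    if kind == "line":
        # Random sampling shuffles the row order, which scrambles a line
        # plot, so keep the first rows in their original order instead.
        return df.head(max_rows)
    # For other kinds, a random subset without replacement is acceptable.
    return df.sample(n=max_rows)


df = pd.DataFrame({"x": range(25_000), "y": range(25_000)})
subset = limit_rows(df, kind="line")  # warns, then keeps the first MAX_ROWS rows
```

Taking the head keeps a line plot readable at the cost of showing only the start of the data, which is the trade-off discussed in the review below.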

@ahuang11 requested a review from maximlt on September 9, 2024 16:28

codecov bot commented Sep 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.81%. Comparing base (6c96c7e) to head (058bf41).
Report is 27 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1408      +/-   ##
==========================================
+ Coverage   87.39%   88.81%   +1.41%     
==========================================
  Files          50       51       +1     
  Lines        7490     7618     +128     
==========================================
+ Hits         6546     6766     +220     
+ Misses        944      852      -92     


@ahuang11 added this to the 0.11.0 milestone on Sep 10, 2024
@maximlt (Member) left a comment

  • I feel like MAX_ROWS could be increased; 10000 is kind of low, especially considering that HoloViews now uses the WebGL backend by default for Bokeh. How about 100000?
  • Just for discussion for now: do you think we should expose this kind of setting in the explorer? Maybe in an advanced tab?

hvplot/ui.py Outdated
@@ -688,10 +688,22 @@ def _plot(self):
        if len(df) > MAX_ROWS and not (
            self.kind in KINDS['stats'] or kwargs.get('rasterize') or kwargs.get('datashade')
        ):
            df = df.sample(n=MAX_ROWS)
            if self.kind == 'line':
                param.main.param.warning(
Member

How about displaying these warnings in the alert box, in addition to or in place of a programmatic warning?

@hagaishalevaei commented Sep 13, 2024

I suggest doing df = df.sample(n=MAX_ROWS).sort_index(). Otherwise the sampled rows come back in random order and the line plot will not render as it should.
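
A small sketch of this suggestion, assuming made-up data with a default integer index and a tiny illustrative limit (not hvPlot's value): `sample` returns rows in random order, and `sort_index` puts the sampled rows back in their original order.

```python
import pandas as pd

MAX_ROWS = 5  # tiny illustrative limit, not the value used by hvPlot

# Made-up data; x is implicitly the default integer index.
df = pd.DataFrame({"y": [v * v for v in range(20)]})

sampled = df.sample(n=MAX_ROWS, random_state=0)  # rows come back in random order
restored = sampled.sort_index()                  # original row order is restored

print(sampled.index.tolist())
print(restored.index.tolist())
```

This only helps when the plotted x values follow the index order.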

Collaborator Author

I disagree with sort_index; x may not always be the index.

@hagaishalevaei commented Sep 13, 2024

Good point. Is there a way to add sort_index as an option?
Something like df.hvplot.explorer(x='x', y='y', kind='line', sort_index=True),
where the default is sort_index=False?

@ahuang11 (Collaborator, Author) commented Sep 13, 2024

I think it probably makes more sense to do sort_values(selected_x) when sampling for a line plot, but I still think head is better, perhaps with a slider for the sample size.
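
To make the two options concrete, here is a rough comparison on made-up data where x is a regular column and the index does not follow it. `selected_x` is a hypothetical stand-in for whatever x the explorer currently has selected; none of this is taken from the hvPlot source.

```python
import pandas as pd

# Made-up data: x is a column, and the index order does not match x order.
df = pd.DataFrame(
    {"x": [5, 1, 4, 2, 3, 0], "y": [50, 10, 40, 20, 30, 0]},
)
selected_x = "x"  # hypothetical name for the explorer's current x selection
n = 4             # illustrative sample size

# Option 1: sample randomly, then order the sampled rows by the plotted x.
by_x = df.sample(n=n, random_state=1).sort_values(selected_x)

# Option 2: take the first n rows exactly as they are stored.
first_rows = df.head(n)

print(by_x[selected_x].tolist())        # always increasing in x
print(first_rows[selected_x].tolist())  # stored order, no reordering
```

Sorting by the selected x guarantees a monotonic x axis for the sample, while head leaves the data untouched but only covers its beginning.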

hvplot/tests/testui.py Outdated (resolved)
@maximlt merged commit bd3e590 into main on Sep 20, 2024
9 checks passed
@maximlt deleted the fix_max_rows branch on September 20, 2024 14:06
Development

Successfully merging this pull request may close these issues.

hvplot Explorer is giving nuisance Line plot when using over 10,000 points
3 participants