Implement Widget.from_values #7033

Merged
merged 3 commits into from
Jan 21, 2025

Conversation

philippjfr
Member

@philippjfr philippjfr commented Jul 28, 2024

Implements #6071

  • Add docs
  • Update examples
  • Add tests


codecov bot commented Jul 28, 2024

Codecov Report

Attention: Patch coverage is 98.15951% with 3 lines in your changes missing coverage. Please review.

Project coverage is 86.78%. Comparing base (2c51805) to head (1b0dac7).
Report is 5 commits behind head on main.

Files with missing lines     Patch %    Lines
panel/widgets/select.py      94.44%     2 Missing ⚠️
panel/widgets/base.py        96.15%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7033      +/-   ##
==========================================
+ Coverage   86.25%   86.78%   +0.53%     
==========================================
  Files         346      346              
  Lines       52052    52216     +164     
==========================================
+ Hits        44897    45318     +421     
+ Misses       7155     6898     -257     


@jbednar
Member

jbednar commented Aug 14, 2024

This is a good start towards #7003; thanks! A few questions:

  1. What starting value is used by default? Seems like you'd want a midpoint value by default, not just the start value?
  2. For a Pandas Series, it should be cheap to calculate the min, avg, and max in a single pass (e.g. df['column_name'].agg(['min', 'average', 'max'])) to get the start, value, and end; is that a useful special case for a Series?
  3. Range widgets seem like they'd need different behavior, since a range should normally default to the start and end of the range. Will that already happen automatically?
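
To make the three questions above concrete, here is a minimal sketch of how `start`, `end`, and a default `value` might be derived from a collection of values under each policy being discussed. The helper names `single_value_params` and `range_params` and the `default` argument are hypothetical; they are not part of Panel and do not reflect what this PR implements.

```python
def single_value_params(values, default="start"):
    """Derive slider bounds and a default value from raw values.

    Hypothetical helper for illustration only, not Panel's implementation.
    """
    start, end = min(values), max(values)
    if default == "start":
        # Policy 1: default to the lower bound.
        value = start
    elif default == "midpoint":
        # Policy 2: default to the arithmetic midpoint of the bounds.
        value = (start + end) / 2
    else:
        raise ValueError(f"unknown default policy: {default!r}")
    return {"start": start, "end": end, "value": value}


def range_params(values):
    """A range widget would naturally default to the full (start, end) span."""
    start, end = min(values), max(values)
    return {"start": start, "end": end, "value": (start, end)}
```

With `values = [3, 9, 1, 7]`, the "start" policy yields a default of `1`, the "midpoint" policy yields `5.0`, and the range helper yields `(1, 9)`.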

@philippjfr
Member Author

> What starting value is used by default? Seems like you'd want a midpoint value by default, not just the start value?

Honestly, while that's the logic for something like interact I'm not convinced that this is actually a better default.

> For a Pandas Series, it should be cheap to calculate the min, avg, and max in a single pass (e.g. df['column_name'].agg(['min', 'average', 'max'])) to get the start, value, and end; is that a useful special case for a Series?

Afaik, that isn't the case. It's no cheaper than doing the individual queries.

> Range widgets seem like they'd need different behavior, since a range should normally default to the start and end of the range. Will that already happen automatically?

Will take a look.

@jbednar
Member

jbednar commented Jan 21, 2025

> > What starting value is used by default? Seems like you'd want a midpoint value by default, not just the start value?
>
> Honestly, while that's the logic for something like interact I'm not convinced that this is actually a better default.

Hmm. I think I'd typically rather have a middle value mainly because choosing the lower or the upper seems arbitrary, and is also by definition an extreme value. Still, I guess it's not worth a lot of extra work. Plus using a midpoint could be problematic in some cases, e.g. when only a finite enumerable list of values is allowed, without values in between.

> > For a Pandas Series, it should be cheap to calculate the min, avg, and max in a single pass (e.g. df['column_name'].agg(['min', 'average', 'max'])) to get the start, value, and end; is that a useful special case for a Series?
>
> Afaik, that isn't the case. It's no cheaper than doing the individual queries.

Drat. I saw some docs that implied they were performed together, but they didn't actually state that explicitly, and briefly clicking around in the source code suggests that the aggregation is decomposed into a separate call per agg function. Alas! You do need to run min and max to get the range bounds, but I guess running average would be an additional cost.
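
For comparison, here is a minimal sketch of what a true single pass looks like when written by hand: one loop tracks the minimum, the maximum, and a running sum for the mean. The function name `min_mean_max` is hypothetical, and this is plain Python; it makes no claim about how pandas evaluates `agg` internally.

```python
import math


def min_mean_max(values):
    """Return (min, mean, max) using a single loop over the values."""
    lo, hi = math.inf, -math.inf
    total, count = 0.0, 0
    for v in values:
        if v < lo:
            lo = v
        if v > hi:
            hi = v
        total += v  # the mean only costs one extra addition per element
        count += 1
    if count == 0:
        raise ValueError("values must not be empty")
    return lo, total / count, hi
```

In a loop like this the mean is nearly free once min and max are being computed anyway; whether a library realizes that saving depends on how it dispatches each aggregation.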

@philippjfr
Member Author

> Hmm. I think I'd typically rather have a middle value mainly because choosing the lower or the upper seems arbitrary, and is also by definition an extreme value

I can think of quite a few common scenarios where the middle value is arbitrary and the lower bound isn't. One relatively common case is filtering, where you treat the single-value slider as a >= filter, in which case the lower bound will include everything and a middle value will arbitrarily exclude half the data.
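
The filtering scenario can be sketched in a few lines. The sample data and the helper `at_least` are made up for illustration; the slider value simply plays the role of `threshold`.

```python
values = [2, 4, 6, 8, 10, 12]


def at_least(data, threshold):
    """Keep only the entries that pass a `>= threshold` filter."""
    return [v for v in data if v >= threshold]


start, end = min(values), max(values)
midpoint = (start + end) / 2

# Defaulting the slider to the lower bound keeps every row.
kept_at_start = at_least(values, start)

# Defaulting to the midpoint drops everything below it.
kept_at_midpoint = at_least(values, midpoint)
```

Here `kept_at_start` contains all six values, while `kept_at_midpoint` keeps only `[8, 10, 12]`, so the initial view already hides half of this evenly spread data.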

@philippjfr philippjfr merged commit 8d70c4b into main Jan 21, 2025
17 of 18 checks passed
@philippjfr philippjfr deleted the widget_from_values branch January 21, 2025 14:20
@jbednar
Member

jbednar commented Jan 22, 2025

> I can think of quite a few common scenarios where the middle value is arbitrary and the lower bound isn't. One relatively common case is filtering, where you treat the single-value slider as a >= filter, in which case the lower bound will include everything and a middle value will arbitrarily exclude half the data.

I think filtering should normally be done with a range widget (where there is no notion of a middle value), as discussed in #7003 (comment) and #7033 (comment). In fact, I suspect that filtering is the most typical and common use for this functionality, so I'd argue that returning a range widget by default is appropriate.

In that case, a user will need to be explicit about requesting a single-value widget instead, and once they do that they might well be expected to specify explicitly that they want a midpoint default anyway.

@jbednar
Member

jbednar commented Jan 22, 2025

In a side channel, @philippjfr pointed out that this particular code is only called once a concrete widget type has been selected, i.e. it's Widget.from_values (for a particular Widget subclass in any given call), while pn.widget() is the one where classes are inferred. Good point, and so please ignore anything I said about choosing a range widget type, since in this code the type is already known. But that said, once we know we aren't instantiating a range widget, I still hold that a midpoint value is more useful as a default than an arbitrarily extreme value like the minimum value.

Still, I defer to Philipp here, since (a) computing the midpoint isn't free, (b) computing the midpoint can be ambiguous (e.g. should it be the actual midpoint, which might not be a valid value or one that exists in this dataset, or should it be the nearest value actually existing in this dataset?), and (c) Philipp is the one implementing it and using it more than I am. :-)
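
The ambiguity in point (b) can be illustrated with a small sketch. The helper names `arithmetic_midpoint` and `nearest_existing` and the sample list are hypothetical and not taken from Panel; they only show that the two readings of "midpoint" can give different answers.

```python
def arithmetic_midpoint(values):
    """Midpoint of the bounds; may not be one of the given values."""
    return (min(values) + max(values)) / 2


def nearest_existing(values, target):
    """Snap `target` to the closest value that actually occurs.

    On a tie, the first such value in iteration order wins.
    """
    return min(values, key=lambda v: abs(v - target))


allowed = [1, 2, 4, 8, 16]
mid = arithmetic_midpoint(allowed)
snapped = nearest_existing(allowed, mid)
```

For this list the arithmetic midpoint is `8.5`, which is not an allowed value, while snapping to the nearest existing value gives `8`.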

@philippjfr
Member Author

Thanks for summarizing that here so it's recorded.
