IndexSelector: use dynamic options #219
Cool, really nice approach! I actually noticed that dynamic dropdown thing in the documentation the other day as well. Very nifty! Wonder if there are any performance tradeoffs though such that for small datasets (or rather maybe memory usage of But if it is performant enough then we can just default to the dynamic one... |
Probably makes sense to wrap this into an |
so this seems to work (check the feature input component index selector on the whatif tab)... |
hmm, tests seem to fail on ubuntu. May check tomorrow... |
Glad you like it. Why not use the Maybe there's something I'm missing in the code. It's likely that I actually haven't found all index selection locations. The limit of options returned (1000) is probably best to make configurable... a bit like |
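A minimal sketch of what a configurable cap on returned options could look like. The function name `search_index_options` and the parameter `max_options` are hypothetical and are not taken from the PR; the sketch only assumes that options are served as Dash-style `{"label", "value"}` dicts from a callback that receives the dropdown's `search_value`. Returning an empty list for an empty search is also an assumption.

```python
from itertools import islice


def search_index_options(index, search_value, max_options=1000):
    """Return at most `max_options` dropdown options whose value contains `search_value`.

    `index` is any iterable of index labels; `max_options` is a hypothetical
    configurable replacement for a hard-coded limit of 1000.
    """
    if not search_value:
        # Assumption: send nothing to the browser until the user types something.
        return []
    # Generator: labels are only examined until enough matches are found.
    matches = (str(idx) for idx in index if search_value in str(idx))
    return [{"label": m, "value": m} for m in islice(matches, max_options)]
```

In a Dash app this function would be called from a callback that takes the dropdown's `search_value` as input and writes to its `options`, so the full index never has to be serialized into the layout.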
Btw, very pleased with the load time of the application utilising 1M data points. |
Good question, and answer is that I forgot I had already written |
I was looking at the |
Okay, I now extended the IndexSelector to all existing ExplainerComponents, and also added functionality so that when the index is set directly the options simply get set to Since you have ready access to a million row index dataset, could you test performance of this I would like search to be case-insensitive, so if islice is much faster on big datasets, then we could add an |
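One way to check the question above is a small timing harness. This is a sketch under stated assumptions: the names `search_lazy`, `search_eager`, and `compare` are hypothetical, the index is assumed to hold strings, and the synthetic `Passenger_<n>` index only stands in for a real million-row dataset. Both functions do a case-insensitive substring match; the `islice` version can stop as soon as it has enough hits, while the list-comprehension version always scans the whole index. For a query with few or no matches, both end up scanning everything.

```python
from itertools import islice
import timeit


def search_lazy(index, search_value, max_options=1000):
    """Case-insensitive substring search that stops after max_options hits."""
    needle = search_value.lower()
    matches = (idx for idx in index if needle in idx.lower())
    return list(islice(matches, max_options))


def search_eager(index, search_value, max_options=1000):
    """Same result, but scans the whole index before truncating."""
    needle = search_value.lower()
    return [idx for idx in index if needle in idx.lower()][:max_options]


def compare(index, search_value, repeats=3):
    """Return best-of-N wall-clock seconds as (lazy, eager)."""
    lazy = min(timeit.repeat(lambda: search_lazy(index, search_value),
                             number=1, repeat=repeats))
    eager = min(timeit.repeat(lambda: search_eager(index, search_value),
                              number=1, repeat=repeats))
    return lazy, eager


if __name__ == "__main__":
    # Synthetic stand-in for a million-row index.
    index = [f"Passenger_{i}" for i in range(1_000_000)]
    # One query with many matches, one with a single match at the very end.
    for query in ("passenger_1", "passenger_999999"):
        lazy, eager = compare(index, query)
        print(f"{query!r}: islice {lazy:.4f}s, full scan {eager:.4f}s")
```

Running both a common and a rare query matters because the early exit only helps when matches are plentiful near the start of the index.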
bumped sklearn to 1.1 and python to 3.8 and voila tests are passing :) |
Thanks for keeping up the momentum on this and #217.
I'm keen on trying it out, but realistically I might only get a chance to get to it mid next week.
The next bottlenecks for hundreds of thousands of data points are the ROC and PR AUC plots, especially the latter. I haven't had a closer look at the reasons.
Not having the index repeatedly (or at all) in the JSON layout file is a great step forward.
Cheers!
On Mon, 23 May 2022, 20:46 Oege Dijk wrote:
bumped sklearn to 1.1 and python to 3.8 and voila tests are passing :)
|
I have reviewed the changes. I propose some changes to the IndexSelector class from a software-engineering perspective, mainly from a consistency perspective and a bug in the conditions between the layout and the callback creation. There's one difference in the use of Is a purely callback-based option too slow in some cases? If not, I'd suggest keeping only that one. All in all, I'm worried about startup performance (and hitting a timeout with JSON becoming too slow to load) rather than runtime performance. Do we want two different thresholds? I am not too concerned with the CPU performance difference of |
I think some subtle bug may have been introduced with the cleanup, but will have a closer look tomorrow.. thanks again for the work! |
Alright, looks good to me, let's merge it! Thanks again for this awesome initiative! Will give you a shout out on LinkedIn once I release this version... |
just released it as version 0.4.0 |
addresses loading timeouts for large datasets as described in #205
This is rough and ready, keen to get some feedback.