PERF: tab completion with a large index #18587
Comments
One may have a very large index with few distinct values. I would suggest limiting the number of values returned rather than the size of the index. (It seems that the delay is due to the handling of the results rather than the computation of the additions.)

    additions = set([c for c in self._info_axis.get_level_values(0).unique()[:100]
                     if isinstance(c, string_types) and isidentifier(c)])

Anyway, I think I can address this issue in #16326; the topics are quite related.
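As a rough illustration of the suggestion above (cap the number of distinct values, not the index size), here is a minimal standard-library sketch. The function name `dir_additions` and its `limit` parameter are hypothetical stand-ins for illustration; this is not the pandas implementation, and `str.isidentifier()` is used in place of the `isidentifier` helper from the snippet.

```python
def dir_additions(labels, limit=100):
    """Return the identifier-like string labels among the first
    `limit` distinct labels (hypothetical helper, not pandas code)."""
    # dict.fromkeys removes duplicates while preserving first-seen order,
    # playing the role of .unique() in the snippet above.
    distinct = list(dict.fromkeys(labels))[:limit]
    return {c for c in distinct if isinstance(c, str) and c.isidentifier()}


if __name__ == "__main__":
    # A long index with only two distinct values still completes fully.
    labels = ["a", "b"] * 100_000
    print(sorted(dir_additions(labels)))
```

With this approach the length of the index does not decide whether completion is offered; only the number of distinct labels is capped.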
Do we know why
you can use
I don't know exactly, but the slowdown seems to come from the user interface: When I create a large Series (
Yes thanks, that's an awesome new feature.
Was this fixed by #20834? Tab completion on the following seems quick
I would like to add to this issue. My team often works with data sets that have hundreds of columns. The reduction in the number of columns available for tab-completion to 100 has been a hindrance. I am fine with capping the number for the sake of performance, but the choice of 100 seems arbitrary.

Suggestion: Increase the cap on

Analysis

I performed the following benchmarks on tab-completion timings using

Laptop 1
Pandas 0.25.0, modified generics.py
Laptop 2
Intel Core i3-3110M at 2.4 GHz
Pandas 0.25.0, modified generics.py
Even on my 10-year-old laptop, the time for tab-completion with 1000 columns is under 30 ms. Still very responsive.
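A comparison of this kind can be sketched with `timeit`. The snippet below is only an assumption-laden stand-in: it times a hypothetical `dir_additions` helper at several caps rather than the modified `generics.py` used for the measurements above, and it measures only the candidate computation, not whatever the interactive shell does with the results.

```python
import timeit


def dir_additions(labels, limit):
    """Hypothetical stand-in for the capped completion helper."""
    distinct = list(dict.fromkeys(labels))[:limit]
    return {c for c in distinct if isinstance(c, str) and c.isidentifier()}


def time_caps(n_columns=10_000, caps=(100, 1_000, 10_000), repeat=5, number=10):
    """Return {cap: best time per call in milliseconds}."""
    labels = [f"col_{i}" for i in range(n_columns)]
    results = {}
    for cap in caps:
        # Take the minimum over several repeats to reduce timing noise.
        best = min(
            timeit.repeat(lambda: dir_additions(labels, cap),
                          number=number, repeat=repeat)
        )
        results[cap] = best / number * 1e3
    return results


if __name__ == "__main__":
    for cap, ms in time_caps().items():
        print(f"cap={cap:>6}: {ms:.3f} ms per call")
```

Timings will vary by machine, so no expected numbers are given here.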
Additionally, it may be worth issuing a warning to the user when It goes against the philosophy of Python to not let the user know, in my opinion.
from #16326 (comment)

If you have a very large index, _dir_additions (for tab completion) actually takes quite a bit of time.

So what I would do is: if the index has, say, < 100 entries, use the current _dir_additions; otherwise return an empty list! (It's essentially too big to use tab completion for anyhow.) Can you make this change and add an asv for this? (Could be a separate PR as well.)
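The size-gated behaviour proposed here could look roughly like the sketch below. The name `dir_additions_if_small` and the `max_size` parameter are hypothetical, chosen only for illustration, and the sketch returns an empty set rather than a list; the actual change would live in the `_dir_additions` method.

```python
def dir_additions_if_small(labels, max_size=100):
    """Offer completion candidates only when the index is small.

    Hypothetical sketch of the proposal, not the pandas implementation.
    """
    if len(labels) >= max_size:
        # Index considered too large for tab completion: offer nothing.
        return set()
    return {c for c in labels if isinstance(c, str) and c.isidentifier()}


if __name__ == "__main__":
    print(sorted(dir_additions_if_small(["x", "y", "not valid"])))
    print(dir_additions_if_small(["x", "y"] * 1000))
```

Note that this gate depends on the length of the index, so a long index with only a few distinct labels gets no completions at all.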