Add column length checks to expensive operations #1845
Comments
This is my point 😄. Thank you! |
Here is a benchmark of the cost of the check for 20,000 columns (which I think is a reasonable practical maximum):
So, given this, the question is which functions should get this check. |
20,000 columns is definitely an upper bound. I think it should be called if you do
Which will catch things during interactive use.
|
It is tempting to check it also with every call of I would also add:
|
If we do, then we should maybe define: |
Agreed. Mostly this other form will be needed internally only so this should be OK (the key place where we will probably use it is creation of |
I feel like idk though it might be too magic -- people are used to |
Doing checks from |
The working rule I have in my head now for performing this check is: As for
|
I think making |
Good point. So we should have such function internally anyway. |
Sounds reasonable. Though I'm not sure we need a very clear rule, since throwing an error if a data frame is corrupt will always be OK, and missing a check would be OK too. What matters is that we add as many checks as we can without affecting performance. |
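One rough way to judge whether a check "affects performance" is to time it in isolation and compare that with the operation it would guard. The sketch below is only an illustration in Python, not the project's code or its benchmark: `columns_consistent` and `check_cost_seconds` are hypothetical names, and the 20,000-column figure is simply the one quoted earlier in this thread.

```python
import timeit


def columns_consistent(columns):
    """Return True when every column has the same length (hypothetical helper)."""
    lengths = {len(col) for col in columns}
    return len(lengths) <= 1


def check_cost_seconds(ncols=20_000, nrows=10, repeats=100):
    """Average wall-clock seconds for one consistency check over ncols columns."""
    # Assumed layout: a table is just a list of equally long Python lists.
    columns = [[0] * nrows for _ in range(ncols)]
    total = timeit.timeit(lambda: columns_consistent(columns), number=repeats)
    return total / repeats


if __name__ == "__main__":
    print(f"one check over 20,000 columns: {check_cost_seconds():.6f} s")
```

The check only reads each column's length, so its cost grows with the number of columns and not with the number of rows. That is why it is plausible to add it to operations that already touch every row.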
So here is my list of functions that should do the checks. Please add/remove from it and then we can make a PR:
In particular I left out: |
Add |
Fixed by #1887 |
From: #1844 (comment)
In short: checking column lengths is cheap, so we should do it in a bunch of operations.
Even if we think we have patched all the holes that let a user resize columns out of sync,
we can still catch bugs in internal methods this way (defensive programming).
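The idea can be sketched as a guard that runs before an expensive operation. This is a minimal Python illustration under stated assumptions, not the library's implementation: `ColumnTable`, `check_consistency`, and `sort_by` are hypothetical names, and the table is assumed to be a plain mapping from column name to a list of values.

```python
class ColumnTable:
    """Hypothetical columnar table: a dict of name -> list of values."""

    def __init__(self, columns):
        self.columns = dict(columns)

    def check_consistency(self):
        """Raise ValueError if the columns do not all share one length."""
        expected = None
        for name, col in self.columns.items():
            if expected is None:
                expected = len(col)
            elif len(col) != expected:
                raise ValueError(
                    f"column {name!r} has length {len(col)}, expected {expected}"
                )

    def sort_by(self, name):
        """An 'expensive' operation: validate first, then do the work."""
        self.check_consistency()
        col = self.columns[name]
        order = sorted(range(len(col)), key=col.__getitem__)
        return ColumnTable(
            {k: [v[i] for i in order] for k, v in self.columns.items()}
        )


table = ColumnTable({"a": [3, 1, 2], "b": ["x", "y", "z"]})
sorted_table = table.sort_by("a")

# Simulate a column resized out of sync with the others.
table.columns["b"].append("w")
try:
    table.sort_by("a")
except ValueError as err:
    message = str(err)
```

Without the guard, the second `sort_by` call would run to completion and silently drop the extra value in `b`. With the guard, the corruption is reported before any work is done.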