DEPR: accepting Manager objects in DataFrame/Series #52419
@jorisvandenbossche looks like this deprecation would affect pyarrow. Any thoughts?
That would certainly be a problem for pyarrow. There is also #52132, which adds a separate method to construct from a manager. I would assume we would need to do something like that first? @jbrockmendel it would help if you could add some more context in the top post of your PRs (a short explainer, links to relevant PRs), or open an issue about this that can be linked to from the different PRs.
We'd need to have that by the time we enforce the deprecation, yes.
+1 here. I agree that we still need a way for a power user to construct from collections of 2D arrays of homogenized dtypes. So that raises the question: why don't we just do this? We have from_arrays, so why not from_2darrays (or a better name)? This pretty much solves the downstream efficiency problem.
Should this use DeprecationWarning instead of FutureWarning?
lgtm
On main, I'm now seeing:
Yes, pyarrow needs to update their usage.
I think this is way too noisy if every user sees this. I would propose to revert this for now, and only put it back in when there is a released version of pyarrow that doesn't trigger those warnings.
Without this deprecation, will pyarrow ever actually start using public APIs? Isn't the point of using a DeprecationWarning instead of FutureWarning to assure appropriate visibility? If we do revert, it should be just before release, not immediate.
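For readers following the visibility argument, here is a minimal sketch of how the two warning categories differ under CPython's default filters: a DeprecationWarning is shown only when it is attributed to code in `__main__`, while a FutureWarning is shown wherever it is attributed. The helper `is_shown`, the module label `some_library`, and the placeholder filename are hypothetical names for illustration; the filters are rebuilt by hand so the result does not depend on how the interpreter was started.

```python
import warnings


def is_shown(category, module):
    """Return True if a warning of `category` attributed to `module` is displayed."""
    with warnings.catch_warnings(record=True) as caught:
        # Rebuild the relevant part of the default filter list explicitly.
        warnings.resetwarnings()
        warnings.filterwarnings("ignore", category=DeprecationWarning)
        # Added last, so it is checked first: show DeprecationWarning from __main__.
        warnings.filterwarnings(
            "default", category=DeprecationWarning, module="__main__"
        )
        warnings.warn_explicit(
            "this usage is deprecated",
            category,
            filename="example.py",  # placeholder location, not a real file
            lineno=1,
            module=module,  # the module the warning is attributed to
            registry={},
        )
    return len(caught) == 1


# Attributed to the user's own script: visible.
print(is_shown(DeprecationWarning, "__main__"))
# Attributed to a (hypothetical) third-party library: hidden by default.
print(is_shown(DeprecationWarning, "some_library"))
# FutureWarning is visible regardless of where it is attributed.
print(is_shown(FutureWarning, "some_library"))
```

Under these assumptions, whether end users see a DeprecationWarning depends entirely on which module the warning ends up attributed to, which is what the stacklevel controls.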
Yeah, that's true, but for some reason it doesn't seem to be working here. See Richard's snippet above, which I can reproduce on main without any special filterwarnings settings. So we should investigate why this is happening.
I am planning to simply move pyarrow to whatever new private API is needed to achieve the same (I suppose
So it's actually quite simple ;) We need to set the stacklevel correctly for how this is used, i.e. we want to warn if someone passes a manager to a DataFrame themselves, and so for that use case, the stacklevel is always two (this deprecation doesn't live deeply nested where it can be reached in multiple ways, it's simply in PR to fix this -> #55591
I also wasn't aware of this drawback of using cc @pandas-dev/pandas-core see @jorisvandenbossche's comment immediately above.
Yes, we will need to re-evaluate other DeprecationWarnings as well.
After further working on #55591, I discovered that correcting the stacklevel unfortunately does not fix all cases (eg calling Although this usage won't be as widespread as read_parquet, this might still be a reason to revert it temporarily.
Another option is maybe to, initially, set the stacklevel to 1. That would circumvent the cython issue (and avoid the question of reverting temporarily), and ensure the warning isn't seen by pandas/pyarrow users. It would also result in not seeing the warning when doing it manually (i.e. user doing
I implemented the stacklevel=1 idea in my PR at #55591
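To illustrate the stacklevel discussion above, here is a small standard-library sketch showing which frame a warning is attributed to for `stacklevel=1` versus `stacklevel=2`. The functions `construct`, `third_party_reader`, and `attributed_to` are hypothetical stand-ins (for a constructor that emits the deprecation and for a library that calls it); this is not pandas or pyarrow code.

```python
import warnings


def construct(stacklevel):
    # Stand-in for a constructor that deprecates one kind of input.
    warnings.warn(
        "passing a manager is deprecated", DeprecationWarning, stacklevel=stacklevel
    )


def third_party_reader(stacklevel):
    # Stand-in for library code that calls the constructor on the user's behalf.
    construct(stacklevel)


def attributed_to(stacklevel):
    """Return the name of the function whose source line the warning points at."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        third_party_reader(stacklevel)
    lineno = caught[0].lineno
    # The owning function is the last one defined at or before the reported line.
    candidates = [construct, third_party_reader]
    started = [f for f in candidates if f.__code__.co_firstlineno <= lineno]
    owner = max(started, key=lambda f: f.__code__.co_firstlineno)
    return owner.__name__


# stacklevel=1 blames the function that emits the warning itself.
print(attributed_to(1))
# stacklevel=2 blames its direct caller.
print(attributed_to(2))
```

With `stacklevel=1` the warning is attributed to the emitting code itself, so under the default filters a DeprecationWarning stays hidden from end users; with `stacklevel=2` it is attributed to whoever called the constructor directly.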
I'm also seeing this warning hit by subclassing a pandas DataFrame: Lines 648 to 654 in 2d2d67d
It's triggered by the test