Add shim modules with deprecation warnings to ensure backward compat #16140
…bility

The following shim modules are included:
- `pandas.computation`
- `pandas.tools.hashing`
- `pandas.types`

Fixes pandas-dev#16138
My testing of the 0.20.0 release candidate uncovered a large number of breakages in both dask and odo. I completely understand the need for refactoring, but I really don't think it's good practice to simply break non-deprecated APIs without any warning, especially with a library as far down the stack as pandas. I think this is especially true when it's a fairly simple task to provide a shim module with a warning to at least give 3rd parties a chance to make the necessary changes. This PR provides the necessary shims for me to get both dask and odo working.
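For readers unfamiliar with the pattern, a shim module of the kind described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the module names `mypkg_hashing` and `mypkg_core_hashing` and the function `hash_value` are hypothetical, and the modules are built in memory so the example runs on its own.

```python
import sys
import types
import warnings

# Hypothetical stand-in for the relocated module (the "new" location).
new_mod = types.ModuleType("mypkg_core_hashing")
new_mod.hash_value = lambda x: hash(x)
sys.modules[new_mod.__name__] = new_mod

# Source of a shim that would live at the old import path.
# It warns once when imported, then re-exports the relocated names.
SHIM_SOURCE = '''
import warnings

warnings.warn(
    "The mypkg_hashing module is deprecated and will be removed "
    "in a future version.",
    DeprecationWarning,
    stacklevel=2,
)

from mypkg_core_hashing import *
'''


def import_shim():
    """Simulate importing the old module path by executing the shim source."""
    shim = types.ModuleType("mypkg_hashing")
    exec(SHIM_SOURCE, shim.__dict__)
    return shim


# Old call sites keep working, but a DeprecationWarning is emitted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    shim = import_shim()
    value = shim.hash_value(1)
```

The point of the design is that existing imports keep resolving for a release or two while users are told, at import time, that the path is going away.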
absolutely not
How were 3rd party libraries supposed to determine that they were all private modules?
None of these were ever in the top-level exposed or api namespaces, which was announced in 0.19.0. If you are using it, then it is by definition unstable. If packages didn't follow the guidelines, then they are out of luck.
I would take this only for the change in hashing.
@dhirschfeld can you give a sense of how much odo was relying on these subpackages? e.g. a CI run with test failures?
That may be the case, but when the api namespace was announced those modules should have been deprecated so that existing code would give a warning that it was deprecated. You can say that everyone should have known but the reality is that code which worked on No one likes maintaining useless code but if it prevents users code from simply breaking without warning I don't think it's too great of a burden to do so for a short period whilst 3rd party libraries adjust. With this PR pandas complains loudly about deprecated usages:
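On the downstream side, one way to make sure deprecation warnings like these are not missed is to record them, or escalate them to errors, in the test suite. The sketch below uses only the standard `warnings` module; `legacy_import` is a hypothetical stand-in for code that touches a deprecated module, not a pandas function.

```python
import warnings


def legacy_import():
    # Hypothetical stand-in for code that imports a deprecated shim module.
    warnings.warn(
        "The old module is deprecated and will be removed in a future version.",
        DeprecationWarning,
        stacklevel=2,
    )
    return "ok"


def collect_deprecations(func):
    """Run func and return (result, list of DeprecationWarning messages)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = func()
    messages = [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]
    return result, messages


def fails_on_deprecation(func):
    """Return True if func raises once DeprecationWarning is an error."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func()
        except DeprecationWarning:
            return True
    return False
```

Running a CI job with deprecation warnings treated as errors against a release candidate is one way a downstream library could learn about a pending removal before its users do.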
@TomAugspurger - I think the only issue in odo was the use of PR #15998 did provide a shim for that module with a deprecation warning but it also simultaneously broke the previous api hence rendering the shim/warning ineffective This PR effectively does the same thing for the other moved modules. |
At the end of the day, it's your call - I just got a nasty surprise when trying the rc. It might just be dask/odo which were using the private/deprecated modules but with 3rd party code there's really no way to tell. So long as dask gets out a release before pandas maybe it'll be fine... If you are considering providing the shims I can fix up the failing |
The reality is that the exact public API for pandas has been ambiguous. It's great that we're finally clearing this up, but there's no prize for needlessly breaking downstream libraries, and doing this without even a deprecation warning seems needlessly punitive -- especially when there is literally no cost to maintaining this for now. That said, there is no place for deprecation warnings that suggest switching to import modules in pandas.core, the entirety of which is private API. The messages should instruct users to switch to supported public APIs. |
Sure, if there's an appropriate public api they should be pointed to that. I just wasn't clear in pandas where that line is drawn. I can appreciate though that there is an effort to document it and make it clearer - e.g. #16087 |
Indeed, all this effort is to try to make the distinction between public and private functionality clearer, to prevent such problems in the future. We already added some shims (eg for lib, tslib, json, parser, tools.merge, tools.plotting), but based on feedback we can certainly a) expand the shims and b) discuss whether some functionality should be exposed in a public location (eg It might be the case that dask is a bit of a special case in that they used new, private functionality. But it is difficult to be sure about that. I tried using GitHub search to look for repos that currently use private functionality, but GitHub search is almost completely useless for this, as I 'find' all the (many) repos that have vendored the full pandas codebase. So if people know better ways to search for this, all ears!
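For code that is already checked out locally, one alternative to text search is to parse the source and look at the import statements directly. The following is a rough sketch using the standard `ast` module; the function name `find_private_imports` is made up for this example, and it does nothing about the vendored-copy problem with hosted search mentioned above.

```python
import ast


def find_private_imports(source, prefixes):
    """Return sorted (lineno, module) pairs for imports under any prefix."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif (
            isinstance(node, ast.ImportFrom)
            and node.module
            and node.level == 0  # skip relative imports
        ):
            names = [node.module]
        else:
            continue
        for name in names:
            if any(name == p or name.startswith(p + ".") for p in prefixes):
                hits.append((node.lineno, name))
    return sorted(hits)


example_source = (
    "import pandas as pd\n"
    "from pandas.types.common import is_numeric_dtype\n"
    "import pandas.computation.expressions\n"
)
moved_modules = ("pandas.types", "pandas.computation", "pandas.tools.hashing")
hits = find_private_imports(example_source, moved_modules)
```

Here `hits` contains the imports on lines 2 and 3 of the example source. The sketch is deliberately narrow: a statement such as `from pandas import types` would not be matched, and attribute access like `pd.types` is not inspected at all.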
I think Stephan was talking specifically about |
That's fine for |
I think Stephan was talking in general (or at least, that is my point of view). So in general, the deprecation messages should just say that the module is deprecated and will be removed in the future, without any alternative import location. Only for some specific functions (like I added a comment in #13634 (comment) to discuss whether we want to expose some of those publicly (I think
warnings.warn(
    "The pandas.computation module is deprecated and will be removed in a future "
    "version. Please import from the pandas.core.computation module instead.",
So the "Please import from the pandas.core.computation module instead." should be removed IMO.
warnings.warn(
    "The pandas.types module is deprecated and will be removed in a future "
    "version. Please import from the pandas.core.dtypes module instead.",
Here as well, the "Please import from the pandas.core.dtypes module instead." should be removed.
Some of these functions are available in pandas.api.types, but not all of them. But maybe we can point to that in general?
@@ -58,6 +58,7 @@
from pandas.io.api import *
from pandas.util._tester import test
import pandas.testing
from . import computation # TODO: Remove when deprecated shim module is removed
This will already trigger the deprecation warning (so it shows up on just importing pandas), so we need a workaround for this.
This can possibly be solved by using the _DeprecatedModule some lines below.
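The general idea behind such a workaround is to defer the warning from import time to first use. The sketch below shows one way to do that with a small proxy object; it is an assumption-laden illustration, not pandas' actual `_DeprecatedModule`, and the class name `LazyDeprecatedModule` and the path `mypkg.old_json` are hypothetical. The standard `json` module plays the role of the relocated module.

```python
import importlib
import warnings


class LazyDeprecatedModule:
    """Hypothetical proxy that warns on attribute access, not on creation."""

    def __init__(self, deprecated_name, replacement_name):
        self._deprecated_name = deprecated_name
        self._replacement_name = replacement_name

    def __getattr__(self, name):
        # __getattr__ only runs for names not found on the proxy itself,
        # i.e. for the contents of the module being proxied.
        warnings.warn(
            f"The {self._deprecated_name} module is deprecated and will be "
            "removed in a future version.",
            DeprecationWarning,
            stacklevel=2,
        )
        module = importlib.import_module(self._replacement_name)
        return getattr(module, name)


# Creating the proxy is silent, so placing it in a package namespace
# would not warn on a plain import of the package.
old_json = LazyDeprecatedModule("mypkg.old_json", "json")
```

With this arrangement, only code that actually reaches through the old name (for example `old_json.dumps(...)`) sees the warning.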
@jorisvandenbossche - thanks for the review. I can take a look at pushing the suggested fixes in the morning.
What project is using this? This is also a private API. It should not be exposed at all. But I can see a config option to do this instead. Everything else that is listed here I am not inclined to deprecate at all. This is just way too much reaching in by external users. This is the entire point of the announcement in 0.19.0. If there are specific cases I am all ears.
Which means exposing it? But if we want to expose the ability to set the use of numexpr, an option indeed seems like the way to go (if that has no perf penalties). Eg |
@jreback I think |
i will expose use_numexpr as an option |
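As a rough illustration of what exposing a setting as an option while deprecating the old setter can look like, here is a self-contained sketch. The registry `_OPTIONS`, the helper functions, and the key name are all assumptions made for this example and do not reproduce pandas' option machinery.

```python
import warnings

# Hypothetical option registry; the key name is chosen for illustration.
_OPTIONS = {"compute.use_numexpr": True}


def set_option(key, value):
    """Set a registered option, rejecting unknown keys."""
    if key not in _OPTIONS:
        raise KeyError(f"No such option: {key!r}")
    _OPTIONS[key] = bool(value)


def get_option(key):
    """Return the current value of a registered option."""
    return _OPTIONS[key]


def set_use_numexpr(value=True):
    """Deprecated setter kept as a thin wrapper around the option."""
    warnings.warn(
        "set_use_numexpr is deprecated; set the 'compute.use_numexpr' "
        "option instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    set_option("compute.use_numexpr", value)
```

Old callers of the setter keep working and are pointed at a supported entry point, which is the kind of deprecation message argued for earlier in the thread.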
this was already moved (and updated in the docs). http://pandas-docs.github.io/pandas-docs-travis/generated/pandas.api.types.union_categoricals.html?highlight=union_categoricals#pandas.api.types.union_categoricals see right below the Privacy section. |
…ype.union_categoricals closes pandas-dev#16140
* DEPR: allow options for using bottleneck/numexpr
  deprecate pd.computation.expressions.set_use_numexpr()
* DEPR: pandas.types.concat.union_categoricals in favor of pandas.api.type.union_categoricals
  closes pandas-dev#16140