Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: Deprecate Series/Dataframe.to_dense/to_sparse #26684

Merged
merged 37 commits into from
Jun 19, 2019
Merged

DEPR: Deprecate Series/Dataframe.to_dense/to_sparse #26684

merged 37 commits into from
Jun 19, 2019

Conversation

VikramjeetD
Copy link
Contributor

@VikramjeetD VikramjeetD commented Jun 6, 2019

Marks the methods mentioned in #26557 as deprecated, with proper messages.

@codecov
Copy link

codecov bot commented Jun 6, 2019

Codecov Report

Merging #26684 into master will increase coverage by 0.02%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #26684      +/-   ##
==========================================
+ Coverage   91.87%    91.9%   +0.02%     
==========================================
  Files         174      174              
  Lines       50661    50916     +255     
==========================================
+ Hits        46547    46796     +249     
- Misses       4114     4120       +6
Flag Coverage Δ
#multiple 90.45% <100%> (+0.04%) ⬆️
#single 41.79% <36.36%> (-0.08%) ⬇️
Impacted Files Coverage Δ
pandas/core/sparse/frame.py 96.59% <100%> (+0.96%) ⬆️
pandas/core/frame.py 97% <100%> (-0.12%) ⬇️
pandas/core/series.py 93.62% <100%> (ø) ⬆️
pandas/core/sparse/series.py 95.75% <100%> (+0.85%) ⬆️
pandas/io/gbq.py 78.94% <0%> (-10.53%) ⬇️
pandas/core/dtypes/common.py 96.62% <0%> (+0.34%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 891a419...39230d4. Read the comment docs.

@codecov
Copy link

codecov bot commented Jun 6, 2019

Codecov Report

Merging #26684 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #26684      +/-   ##
==========================================
- Coverage   91.87%   91.86%   -0.01%     
==========================================
  Files         180      180              
  Lines       50712    50719       +7     
==========================================
+ Hits        46590    46593       +3     
- Misses       4122     4126       +4
Flag Coverage Δ
#multiple 90.46% <100%> (ø) ⬆️
#single 41.11% <66.66%> (-0.08%) ⬇️
Impacted Files Coverage Δ
pandas/core/groupby/ops.py 96% <100%> (ø) ⬆️
pandas/core/frame.py 96.89% <100%> (-0.12%) ⬇️
pandas/core/generic.py 94.2% <100%> (ø) ⬆️
pandas/core/series.py 93.66% <100%> (+0.01%) ⬆️
pandas/io/gbq.py 88.88% <0%> (-11.12%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update baa77c3...9f888c5. Read the comment docs.

@jorisvandenbossche jorisvandenbossche added Sparse Sparse Data Type Deprecate Functionality to remove in pandas labels Jun 7, 2019
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

Some additional comments:

pandas/core/frame.py Outdated Show resolved Hide resolved
pandas/core/frame.py Outdated Show resolved Hide resolved
pandas/core/series.py Outdated Show resolved Hide resolved
pandas/core/sparse/frame.py Outdated Show resolved Hide resolved
pandas/core/sparse/frame.py Outdated Show resolved Hide resolved
pandas/core/sparse/frame.py Outdated Show resolved Hide resolved
@VikramjeetD
Copy link
Contributor Author

VikramjeetD commented Jun 7, 2019

@jorisvandenbossche Thanks for the review. I committed the stylistic changes that you suggested.

Working on writing the tests. Could you please guide me on where the tests for Series and Dataframe (not sparse) need to be written? Should I create a new file or write it in an existing file.

@@ -1939,6 +1940,9 @@ def to_sparse(self, fill_value=None, kind='block'):
>>> type(sdf) # doctest: +SKIP
<class 'pandas.core.sparse.frame.SparseDataFrame'>
"""
warnings.warn("DataFrame.to_sparse is deprecated and will be removed "
"in a future version", FutureWarning, stacklevel=2)

from pandas.core.sparse.api import SparseDataFrame
return SparseDataFrame(self._series, index=self.index,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wont' this also trigger the SDF warnings? (should this just be changed to create a DF here?

Copy link
Contributor Author

@VikramjeetD VikramjeetD Jun 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean the SDF deprecation warning? If thats the case, yes, but this was requested in the issue that this PR will address ( #26557 ).
Also, this might be helpful if someone has already hit the SDF warning through a different route, as a gentle reminder. Either way we could decide whats best and update accordingly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it already triggers the SparseDataFrame warning. But, it is still good to explicitly deprecate this method as well (in the docs, and by giving a clearer warning message, although the other one will also still be present).

I don't think we should change the behaviour, since we are deprecating it (it would also be a backwards incompatible change, as DataFrame with sparse and SparseDataFrame are not fully interchangeable).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think we can agree to keep both the warnings and resolve this conversation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually i would suppress the SDF
warning here as the user facing to_sparse is already good enough

@pep8speaks
Copy link

pep8speaks commented Jun 7, 2019

Hello @intell1gent! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-06-18 04:51:35 UTC

@VikramjeetD
Copy link
Contributor Author

VikramjeetD commented Jun 7, 2019

...

  • There is an additional to_dense method in generic.py that should also be deprecated.
    ...

@jorisvandenbossche

  • Deprecating to_dense in generic.py breaks some tests for other classes. (I still haven't pushed that part). I could modify those tests but can't figure out how to ignore the FutureWarning. @pytest.mark.filterwarnings('ignore::FutureWarning') doesn't seem to help.
  • The other deprecations that I did raise a lot of warnings in many of the tests (but they pass)

Also, I'll change the code style issues in the next push.

@jorisvandenbossche
Copy link
Member

@intell1gent I would need to look myself to say for sure (but now away for the weekend), but if other tests raise those warnings, then it might be because the methods are still used in certain places in pandas. In such a case, you will need to check those places and see if it can be replaced with something else (or if it was just used in the tests directly, then you can add a filterwarnings decorator).

@VikramjeetD
Copy link
Contributor Author

VikramjeetD commented Jun 8, 2019

@intell1gent I would need to look myself to say for sure (but now away for the weekend), but if other tests raise those warnings, then it might be because the methods are still used in certain places in pandas. In such a case, you will need to check those places and see if it can be replaced with something else (or if it was just used in the tests directly, then you can add a filterwarnings decorator).

Turns out changing UserWarning to Warning, check_stacklevel=False lets them pass.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, I don't want blanket warning filters at all. These are hiding errors.

pandas/core/sparse/frame.py Outdated Show resolved Hide resolved
pandas/core/sparse/series.py Outdated Show resolved Hide resolved
pandas/tests/frame/test_deprecations.py Outdated Show resolved Hide resolved
pandas/tests/generic/test_generic.py Show resolved Hide resolved
@jorisvandenbossche
Copy link
Member

again, I don't want blanket warning filters at all. These are hiding errors.

For the old sparse tests (so not speaking about the groupby tests, there we indeed should certainly not hide the warnings), we decided earlier to not rewrite those tests but just ignore the warnings (until all functionality + tests can be removed). So you will see there are already existing filterwarnings for Sparse. For those cases (again, tests about the old, deprecated SparseDataFrame/Series, I think it is best to also ignore those new warnings).

@jorisvandenbossche
Copy link
Member

we still have to come to an unanimous decision whether to keep the double deprecation warning for Series/DataFrame.to_sparse. None of this will be needed if we deicide to stick with only 1 warning for the main deprecation.

@intell1gent I think it is fine to have the double deprecation. How would you propose to only have a single warning? I personally think we certainly need a warning that mentions to_sparse, so that would mean suppressing the other warning from inside to_sparse?

Update DataFrame/Series.to_dense deprecation message

Co-Authored-By: Joris Van den Bossche <[email protected]>
@jreback
Copy link
Contributor

jreback commented Jun 17, 2019

again, I don't want blanket warning filters at all. These are hiding errors.

For the old sparse tests (so not speaking about the groupby tests, there we indeed should certainly not hide the warnings), we decided earlier to not rewrite those tests but just ignore the warnings (until all functionality + tests can be removed). So you will see there are already existing filterwarnings for Sparse. For those cases (again, tests about the old, deprecated SparseDataFrame/Series, I think it is best to also ignore those new warnings).

right this case is ok, but we need to be really clear on this, e.g. on pandas/tests/sparse should have these types of filters

@jorisvandenbossche
Copy link
Member

Not added by me. Also present on upstream.

@intell1gent for future reference: you are correct that it was not added by you, but in general, the diff that github shows on the PR needs to be correct. So if you noticed unrelated changes in the diff here from master, that means that something went wrong when you updated this branch to the latest master. And it is still up to you to fix that. See my earlier comment on how that could be done.

@jorisvandenbossche
Copy link
Member

right this case is ok, but we need to be really clear on this, e.g. on pandas/tests/sparse should have these types of filters

@jreback yes, that sounds correct, that we should in principle only do this general filterwarnings in the pandas/tests/sparse. However, it seems that with the original SparseDataFrame deprecation, such a general filterwarning was added in several other places as well (@TomAugspurger ? eg the pandas/tests/arrays/sparse/test_arithmetic.py do you remember why the general filterwarnings on Sparse was needed there?)

@jreback
Copy link
Contributor

jreback commented Jun 17, 2019

right this case is ok, but we need to be really clear on this, e.g. on pandas/tests/sparse should have these types of filters

@jreback yes, that sounds correct, that we should in principle only do this general filterwarnings in the pandas/tests/sparse. However, it seems that with the original SparseDataFrame deprecation, such a general filterwarning was added in several other places as well (@TomAugspurger ? eg the pandas/tests/arrays/sparse/test_arithmetic.py do you remember why the general filterwarnings on Sparse was needed there?)

right, let's comment these warning to be clear where they come from if they are not in pandas/tests/sparse (and actually maybe there too)

@VikramjeetD
Copy link
Contributor Author

VikramjeetD commented Jun 17, 2019

we still have to come to an unanimous decision whether to keep the double deprecation warning for Series/DataFrame.to_sparse. None of this will be needed if we deicide to stick with only 1 warning for the main deprecation.

@intell1gent I think it is fine to have the double deprecation. How would you propose to only have a single warning? I personally think we certainly need a warning that mentions to_sparse, so that would mean suppressing the other warning from inside to_sparse?

Yes, I changed to_sparse to ignore the SDF/SS warnings and only show a single warning. The main warning shows in other cases related to SDF/SS.

@jorisvandenbossche jorisvandenbossche changed the title Deprecate Series/Dataframe.to_dense/to_sparse DEPR: Deprecate Series/Dataframe.to_dense/to_sparse Jun 17, 2019
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!

pandas/core/sparse/frame.py Outdated Show resolved Hide resolved
pandas/tests/frame/test_deprecations.py Outdated Show resolved Hide resolved
with tm.assert_produces_warning(FutureWarning,
check_stacklevel=False):
result = df.to_sparse()
tm.assert_frame_equal(result, sparse_df)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you put this in a class where those warnings you want to check for are generally ignored. Does that work? (just wondering)

Maybe we can also put it at the end of the file as a function instead of method, to avoid possible concerns about that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But since the tests are passing, this is probably fine ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually works fine, tm.assert_produces_warning doesn't seem to care about filterwarnings. Remove the warning from to_sparse and this test fails.

I wanted to keep both (test_dense_to_sparse and test_deprecated_dense_to_sparse) close for easy reference, thats all. Open to shifting it to the end though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking, then that sounds good!

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good!

@jreback you're fine with the current filterwarnings? They are only added there were we already had a filterwarning for other sparse warnings as well

cc @TomAugspurger

@jreback
Copy link
Contributor

jreback commented Jun 18, 2019

i will review at some point again

@jreback jreback added this to the 0.25.0 milestone Jun 19, 2019
@jreback jreback merged commit 376a05e into pandas-dev:master Jun 19, 2019
@jreback
Copy link
Contributor

jreback commented Jun 19, 2019

thanks @intell1gent !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas Sparse Sparse Data Type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DEPR: deprecate Series/DataFrame.to_dense/to_sparse
4 participants