Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: fix boolean array skipna=False for .any() and .all() #33284

Merged
merged 13 commits into from
Apr 10, 2020

Conversation

linxiaow
Copy link
Contributor

@linxiaow linxiaow commented Apr 4, 2020

I added "not self._mask.any()" to check if there is no missing value for both .any() and .all() when skipna is False. It looks fine for the example you provided.
any_all

@WillAyd
Copy link
Member

WillAyd commented Apr 4, 2020

Thanks for the PR - does this close a particular issue? Can you add tests?

@linxiaow
Copy link
Contributor Author

linxiaow commented Apr 4, 2020

@WillAyd I think yes, and I intend to close the issue here #33253. It is my first time in open source so I am not familiar with PR.

I am not familiar with testing, but I can have a try. Any suggestions before I get started

@jorisvandenbossche
Copy link
Member

Don't worry about Travis that is failing (that's a general issue, there is already another PR trying to fix that)

@jorisvandenbossche
Copy link
Member

The any/all tests are here:

@pytest.mark.parametrize(
"values, exp_any, exp_all, exp_any_noskip, exp_all_noskip",
[
([True, pd.NA], True, True, True, pd.NA),
([False, pd.NA], False, False, pd.NA, False),
([pd.NA], False, True, pd.NA, pd.NA),
([], False, True, False, True),
],
)
def test_any_all(values, exp_any, exp_all, exp_any_noskip, exp_all_noskip):
# the methods return numpy scalars
exp_any = pd.NA if exp_any is pd.NA else np.bool_(exp_any)
exp_all = pd.NA if exp_all is pd.NA else np.bool_(exp_all)
exp_any_noskip = pd.NA if exp_any_noskip is pd.NA else np.bool_(exp_any_noskip)
exp_all_noskip = pd.NA if exp_all_noskip is pd.NA else np.bool_(exp_all_noskip)
for con in [pd.array, pd.Series]:
a = con(values, dtype="boolean")
assert a.any() is exp_any
assert a.all() is exp_all
assert a.any(skipna=False) is exp_any_noskip
assert a.all(skipna=False) is exp_all_noskip
assert np.any(a.any()) is exp_any
assert np.all(a.all()) is exp_all

so you can add to that test some new cases for the all-True and all-False cases from the issue.

@jreback jreback added Bug Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Apr 5, 2020
@jreback jreback added this to the 1.1 milestone Apr 5, 2020
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls add a whatsnew note in bug fixes Missing section for 1.1

can you also add a test case in panda/tests/reductions/test_reductions.py

def test_all_any_params(self) is the relevant test

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@linxiaow thanks for the updates. I suggested some rewording on the added whatsnew/comment

@linxiaow linxiaow requested a review from jreback April 9, 2020 01:15
@jorisvandenbossche jorisvandenbossche merged commit 20474d5 into pandas-dev:master Apr 10, 2020
@jorisvandenbossche
Copy link
Member

Thanks @linxiaow !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: BooleanArray.any with all False values and skipna=False is buggy
4 participants