Update value_counts to return correct name in pandas 2.0 #9919

Merged on Feb 6, 2023 (3 commits)

@j-bennet (Contributor) commented Feb 4, 2023

Fix for an upstream failure:

2023-02-03T20:50:29.9817830Z FAILED dask/dataframe/tests/test_dataframe.py::test_value_counts_with_normalize_and_dropna - AssertionError: ('proportion', 'count')
2023-02-03T20:50:29.9817961Z assert 'proportion' == 'count'
2023-02-03T20:50:29.9818058Z   - count
2023-02-03T20:50:29.9818135Z   + proportion

In pandas 2.0, the Series returned by value_counts() is no longer named after the original Series: it is named count, or proportion when normalize=True.

See https://pandas.pydata.org/docs/dev/whatsnew/v2.0.0.html#value-counts-sets-the-resulting-name-to-count.
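
For reference, a minimal sketch of the behavior change, written against plain pandas. This is not the code in this PR: the `PANDAS_GE_200` flag and the `expected_value_counts_name` helper are hypothetical names used only for illustration.

```python
import pandas as pd

# Hypothetical version flag for this sketch; pre-release strings such as
# "2.0.0.dev0+..." still start with the major version number.
PANDAS_GE_200 = int(pd.__version__.split(".")[0]) >= 2


def expected_value_counts_name(series, normalize=False):
    """Return the name that Series.value_counts() should give its result."""
    if PANDAS_GE_200:
        # pandas >= 2.0: the name depends only on `normalize`
        return "proportion" if normalize else "count"
    # pandas < 2.0: the result keeps the name of the original Series
    return series.name


s = pd.Series([1, 2, 2, 3, 3, 3], name="x")
for normalize in (True, False):
    result = s.value_counts(normalize=normalize)
    assert result.name == expected_value_counts_name(s, normalize)
```

A test comparing names this way passes on both sides of the pandas 2.0 boundary, which matches the 'proportion' vs 'count' mismatch in the log above.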

Xref #9736.
Xref pandas-dev/pandas#49912.

  • Tests added / passed
  • Passes pre-commit run --all-files

@jrbourbeau (Member) left a comment

Thanks @j-bennet! Just one minor comment, but overall this looks good

@@ -1270,24 +1270,25 @@ def test_value_counts_with_normalize():


@pytest.mark.skipif(not PANDAS_GT_110, reason="dropna implemented in pandas 1.1.0")
def test_value_counts_with_normalize_and_dropna():
@pytest.mark.parametrize("normalize", [True, False])

Thanks for taking the time to expand test coverage here

dask/dataframe/methods.py (outdated, resolved)
Co-authored-by: James Bourbeau <[email protected]>
@jrbourbeau (Member) left a comment

This looks great, thanks @j-bennet

@jrbourbeau changed the title from "In value_counts, resulting series name changed, pandas 2.0 compatibility" to "Update value_counts to return correct name in pandas 2.0" on Feb 6, 2023
@jrbourbeau merged commit c189ac5 into dask:main on Feb 6, 2023
@j-bennet deleted the j-bennet/9736-series-value-counts branch on February 6, 2023 22:41