Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add_suffix and add_prefix for DataFrames and Series #9846

Merged
merged 41 commits into from
Dec 8, 2021

Conversation

mayankanand007
Copy link
Contributor

@mayankanand007 mayankanand007 commented Dec 6, 2021

This PR fixes #9590, by adding add_suffix and add_prefix for cudf.DataFrame and cudf.Series.

To make things concise, we unify the docstrings of these methods in both Series and DataFrame by defining them within IndexedFrame (with a unified docstring and raising NotImplementedError, asking the user to refer to the implementations in Series or DataFrame)

Its preferred to raise NotImplementedError so that if someone later creates another class by inheriting from IndexedFrame, it clarifies that they must reimplement add_suffix and add_prefix

@mayankanand007 mayankanand007 self-assigned this Dec 6, 2021
@github-actions github-actions bot added the Python Affects Python cuDF API. label Dec 6, 2021
@mayankanand007 mayankanand007 added 2 - In Progress Currently a work in progress feature request New feature or request non-breaking Non-breaking change labels Dec 6, 2021
python/cudf/cudf/core/dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/series.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/series.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/series.py Outdated Show resolved Hide resolved
python/cudf/cudf/tests/test_dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/tests/test_dataframe.py Outdated Show resolved Hide resolved
Copy link
Contributor

@brandon-b-miller brandon-b-miller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general comment, I think it might be cleaner to move this up a level to IndexedFrame. That way we can have a unified docstring as in pandas instead of one for series and one for dataframe.

python/cudf/cudf/core/dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/series.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/series.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/series.py Outdated Show resolved Hide resolved
@mayankanand007
Copy link
Contributor Author

As a general comment, I think it might be cleaner to move this up a level to IndexedFrame. That way we can have a unified docstring as in pandas instead of one for series and one for dataframe.

@brandon-b-miller Thanks for the comment/reviews, I was following Dataframe.rename and Series.rename, where I saw two different implementations in each of dataframe.py and series.py. Is there a reason for that to be separate? The additional reasoning I had in mind was, the way add_suffix and add_prefix are implemented on Series vs DataFrame are quite different.

Let me know your thoughts, happy to discuss more on this.

@brandon-b-miller
Copy link
Contributor

As a general comment, I think it might be cleaner to move this up a level to IndexedFrame. That way we can have a unified docstring as in pandas instead of one for series and one for dataframe.

@brandon-b-miller Thanks for the comment/reviews, I was following Dataframe.rename and Series.rename, where I saw two different implementations in each of dataframe.py and series.py. Is there a reason for that to be separate? The additional reasoning I had in mind was, the way add_suffix and add_prefix are implemented on Series vs DataFrame are quite different.

Let me know your thoughts, happy to discuss more on this.

Apologies, I didn't notice that the series version of this appends prefixes/suffixes to the series index and the dataframe one appends to the column names. That makes it hard to make one implementation in IndexedFrame that is general enough for both cases.

Still, I think it might be worth creating IndexedFrame.add_prefix and IndexedFrame.add_suffix even if they raise NotImplementedError, and moving the docstring there.

python/cudf/cudf/core/indexed_frame.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/dataframe.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/indexed_frame.py Outdated Show resolved Hide resolved
@mayankanand007 mayankanand007 added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Dec 8, 2021
@galipremsagar
Copy link
Contributor

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 4579d23 into rapidsai:branch-22.02 Dec 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge feature request New feature or request non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Add_suffix and add_prefix for DataFrames and Series
4 participants