[FEA] Internal python _is_homogeneous property #7067

Closed
brandon-b-miller opened this issue Jan 4, 2021 · 2 comments · Fixed by #8299
Labels
feature request (New feature or request), good first issue (Good for newcomers), Python (Affects Python cuDF API)

Comments

@brandon-b-miller
Contributor

brandon-b-miller commented Jan 4, 2021

Is your feature request related to a problem? Please describe.
It would be useful to have a single clean, performant way to check whether all the columns of a DataFrame share the same dtype, such as a DataFrame property `_is_homogeneous`. This comes up in a lot of places: where we might want to dispatch to a CuPy matrix implementation (transpose, some row-wise reductions I believe, and some other places), as well as many places where we might want to error or warn.

Describe the solution you'd like
A property `cudf.DataFrame._is_homogeneous` that returns True if the dtypes of all columns are equal, and False otherwise. This should be plumbed into the places where we currently use a loop or other means. This might be a good first issue, being (hopefully) straightforward while requiring some fairly broad exploration of the cuDF Python layer.
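
To make the request concrete, here is a minimal sketch of the check itself. It is not cuDF's implementation: `FrameSketch` and `is_homogeneous` are hypothetical names, dtypes are represented by plain strings, and treating a frame with no columns as homogeneous is an assumption.

```python
from typing import Hashable, Iterable, Mapping


def is_homogeneous(dtypes: Iterable[Hashable]) -> bool:
    """Return True if every dtype in `dtypes` is equal.

    Assumption: an empty collection counts as homogeneous.
    """
    # A set keeps one entry per distinct dtype, so matching
    # dtypes collapse to at most one element.
    return len(set(dtypes)) <= 1


class FrameSketch:
    """Hypothetical stand-in for a dataframe: column name -> dtype."""

    def __init__(self, column_dtypes: Mapping[str, Hashable]):
        self._column_dtypes = dict(column_dtypes)

    @property
    def _is_homogeneous(self) -> bool:
        # Single place to ask the question, instead of ad hoc loops.
        return is_homogeneous(self._column_dtypes.values())


same = FrameSketch({"a": "int64", "b": "int64"})
mixed = FrameSketch({"a": "int64", "b": "float64"})
```

With this in place, call sites that currently loop over columns could branch on `frame._is_homogeneous` before choosing a matrix-style code path or raising an error.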

Describe alternatives you've considered
Writing a possibly slightly different loop every time


@brandon-b-miller brandon-b-miller added feature request New feature or request good first issue Good for newcomers Python Affects Python cuDF API. labels Jan 4, 2021
@github-actions

This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.

@shaneding
Contributor

Seems straightforward. If no one is working on this, I can take it on :)

rapids-bot bot pushed a commit that referenced this issue May 24, 2021
This PR closes #7067.
This was implemented by adding the `_is_homogeneous` property to `DataFrame`. Included are appropriate test cases.

Authors:
  - https://github.com/shaneding

Approvers:
  - https://github.com/brandon-b-miller
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #8299