[REVIEW] Misc optimizations in cudf #9203

Merged: 7 commits into rapidsai:branch-21.10 on Sep 14, 2021

@galipremsagar galipremsagar commented Sep 9, 2021

This PR removes some inefficient code paths in cudf and dask_cudf that were identified in internal profiling results.

@galipremsagar galipremsagar added the "2 - In Progress" label Sep 9, 2021
@galipremsagar galipremsagar self-assigned this Sep 9, 2021
@github-actions github-actions bot added the "Python" label Sep 9, 2021
@galipremsagar galipremsagar changed the title from "[WIP] Misc optimizations in cudf" to "[REVIEW] Misc optimizations in cudf" Sep 13, 2021
@galipremsagar galipremsagar marked this pull request as ready for review September 13, 2021 20:27
@galipremsagar galipremsagar requested review from a team as code owners September 13, 2021 20:27
@galipremsagar galipremsagar added the "3 - Ready for Review" and "4 - Needs cuDF (Python) Reviewer" labels and removed the "2 - In Progress" label Sep 13, 2021
@galipremsagar galipremsagar added the "improvement" and "non-breaking" labels Sep 13, 2021

@isVoid isVoid left a comment

lgtm

@vyasr vyasr left a comment

One suggestion for further improvement, but it looks good.

Review thread on python/dask_cudf/dask_cudf/io/parquet.py (outdated, resolved)

codecov bot commented Sep 13, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.10@eb09d14).
The diff coverage is n/a.

❗ Current head 452c3d2 differs from pull request most recent head f11a180. Consider uploading reports for the commit f11a180 to get more accurate results.

@@               Coverage Diff               @@
##             branch-21.10    #9203   +/-   ##
===============================================
  Coverage                ?   10.84%           
===============================================
  Files                   ?      115           
  Lines                   ?    19172           
  Branches                ?        0           
===============================================
  Hits                    ?     2080           
  Misses                  ?    17092           
  Partials                ?        0           

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update eb09d14...f11a180.

@rjzamora rjzamora left a comment

Thanks @galipremsagar - These changes all make sense to me. My only (optional) suggestion is to add the test from #9159 so we can close that PR.

Review thread on python/dask_cudf/dask_cudf/io/parquet.py (resolved)
@galipremsagar galipremsagar added the "5 - Ready to Merge" label and removed the "3 - Ready for Review" and "4 - Needs Dask Reviewer" labels Sep 14, 2021
@galipremsagar commented

@gpucibot merge

@rapids-bot rapids-bot bot merged commit cbc6ebb into rapidsai:branch-21.10 Sep 14, 2021
rapids-bot bot pushed a commit that referenced this pull request Sep 30, 2021
A similar fix for this problem was recently submitted in #9159 and closed in favor of #9203. It seems that the test added in the latter PR was not actually capturing the original problem. However, after [dask#8072](dask/dask#8072) is merged, the new test will certainly start failing.

Authors:
  - Richard (Rick) Zamora (https://github.com/rjzamora)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Benjamin Zaitlen (https://github.com/quasiben)

URL: #9314
@vyasr vyasr added the "dask" label and removed the "dask-cudf" label Feb 23, 2024
Labels: 5 - Ready to Merge, dask, improvement, non-breaking, Python
4 participants