Minor dask_cudf sort_values optimization for single partitions #4000

Merged: 3 commits, Jan 30, 2020


rjzamora
Member

Closes #3873 (although we still want to improve general sorting performance; that work is in progress).

This PR adds a small change that short-circuits the usual batcher-sortnet algorithm when there is only a single partition.

I also modified the behavior of ignore_index, because it appeared to be incorrect (I'm happy to revert this if I'm mistaken).
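
For readers unfamiliar with the idea, the single-partition short circuit can be sketched roughly as follows. This is a minimal, library-free illustration and not the code from this PR: `sort_values_sketch` and `multi_partition_sort` are hypothetical names, partitions are modelled as plain Python lists of rows, and the multi-partition fallback is a simple gather-sort-split stand-in rather than a Batcher sorting network.

```python
from typing import Callable, List, Tuple

Row = Tuple[int, str]
Partition = List[Row]


def multi_partition_sort(
    partitions: List[Partition], key: Callable[[Row], int]
) -> List[Partition]:
    """Hypothetical stand-in for the general multi-partition path.

    NOT a Batcher sorting network: it gathers all rows, sorts them
    globally, and re-splits them into the same number of partitions.
    """
    rows = sorted((row for part in partitions for row in part), key=key)
    n = len(partitions)
    size = -(-len(rows) // n)  # ceiling division
    return [rows[i * size:(i + 1) * size] for i in range(n)]


def sort_values_sketch(
    partitions: List[Partition], key: Callable[[Row], int]
) -> List[Partition]:
    """Sort rows across partitions, short-circuiting the trivial cases."""
    if not partitions:
        return []
    if len(partitions) == 1:
        # Single partition: a local sort is already a global sort,
        # so the multi-partition machinery is skipped entirely.
        return [sorted(partitions[0], key=key)]
    return multi_partition_sort(partitions, key)


single = sort_values_sketch([[(3, "c"), (1, "a"), (2, "b")]], key=lambda r: r[0])
multi = sort_values_sketch(
    [[(3, "c"), (1, "a")], [(2, "b"), (0, "z")]], key=lambda r: r[0]
)
```

The point of the early return is that a one-partition frame needs no inter-partition exchange at all, so any overhead from building the general sorting graph is avoided.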

rjzamora requested a review from a team as a code owner on January 29, 2020, 22:08
codecov bot commented Jan 30, 2020

Codecov Report

No coverage uploaded for pull request base (branch-0.13@fc73d7b).
The diff coverage is 87.5%.

@@              Coverage Diff               @@
##             branch-0.13    #4000   +/-   ##
==============================================
  Coverage               ?   86.67%           
==============================================
  Files                  ?       50           
  Lines                  ?     9724           
  Branches               ?        0           
==============================================
  Hits                   ?     8428           
  Misses                 ?     1296           
  Partials               ?        0
| Impacted Files | Coverage Δ |
|---|---|
| python/dask_cudf/dask_cudf/core.py | 68.4% <87.5%> (ø) |

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update fc73d7b...cf9decc.

kkraus14 added the labels "5 - Ready to Merge" (Testing and reviews complete, ready to merge), "Python" (Affects Python cuDF API), and "dask" (Dask issue) on Jan 30, 2020
kkraus14 merged commit 61c310b into rapidsai:branch-0.13 on Jan 30, 2020
Linked issue: [BUG] Overhead in sorting single partition frames with dask_cudf