Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Example code for blog on new row comparators #13795
Example code for blog on new row comparators #13795
Changes from 33 commits
34b8fb4
9059909
5f3b0da
a15dfef
776b10d
3c2c0ee
fa6f928
d0da110
e1f6c7c
df45de2
c12bc71
496c28a
647604d
345ed77
4fa8e3f
3668dec
8675190
2b2ebdd
3ba32ae
a15e993
391140c
8d8d1ac
2f0f0e8
e113913
f2c42be
27db5c2
9a3c852
a6cf0ee
1948b10
f7e96a9
2bfd415
9e2020d
45d6496
57ccb51
4d6ac25
b3fee13
6cf9f58
02438d1
26ed48c
57d68bd
43c0fc3
9625037
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do count aggregations return
int32_t
orsize_type
? I think it's supposed to besize_type
but not 100% sure.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just a dummy column to force usage of hash aggregation instead of sort aggregation, it does not hold the return values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if I understand. We usually want to rely on hash aggregation anytime we can, since it tends to be faster than sort aggregations. Can you point to the behavior that you're avoiding, and why this is the right solution (as opposed to fixing the aggregation dispatch to use hashing instead of sorting)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cudf/cpp/src/groupby/hash/groupby.cu
Lines 654 to 656 in abc0d41
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't the right solution. It should definitely be fixed at the core of the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't force hash aggregations do we still have at least some? As long as the example is demonstrating the all the comparator functionality that we want I think it's totally fine if it isn't the most performant right now (and we've removed timings anyway). If there are improvements to be made to libcudf, we should open a separate issue for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get it, what do you mean by "at least have some"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's right. If an aggregation only has a sort-based implementation then it will fall back to a sort aggregation.
In this particular case though, a hash-based aggregation is available and it needs to be "forced" due to a limitation.
Here's the issue: #14412
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bdice are you OK with marking this thread as resolved for now? The issue looks like it has a reasonable plan for how to move forward, but I don't think we should block this PR on getting that enabled.