Skip to content
This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Support reduction for more than 2^31 items #589

Conversation

gevtushenko
Copy link
Collaborator

This PR partially addresses the following issue. Supporting larger offsets for ArgMin and ArgMax would have to wait, since it's a bit more involving. There also could be a bit better testing, but it might wait till Catch2 rewrite.

The main purpose of this PR is to unblock original issue by providing an example of what we expect from such a change. The main direction of this modification can be found here.

@gevtushenko gevtushenko requested a review from elstehle November 1, 2022 21:06
@gevtushenko gevtushenko added the type: bug: functional Does not work as intended. label Nov 1, 2022
@gevtushenko gevtushenko added this to the 2.1.0 milestone Nov 1, 2022
@gevtushenko gevtushenko added the P1: should have Necessary, but not critical. label Nov 1, 2022
gevtushenko added a commit to gevtushenko/thrust that referenced this pull request Nov 1, 2022
@gevtushenko gevtushenko requested a review from kaatish November 1, 2022 21:10
Copy link
Collaborator

@elstehle elstehle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gevtushenko gevtushenko force-pushed the enh-main/github/reduce_with_more_than_2_32_items branch from 1790296 to 6153c10 Compare November 13, 2022 11:04
@gevtushenko
Copy link
Collaborator Author

Because ChooseOffsetT leads to unsigned offsets usage the codegen changed (14% increase in kernel size). Benchmarks show about 1% slowdown for reduction due to that. I've changed ChooseOffsetT to provide singed offsets instead, which leads to the same code generated. Also benchmarked radix sort, since it's affected by the change and got up to 4% speedup there.

@gevtushenko gevtushenko force-pushed the enh-main/github/reduce_with_more_than_2_32_items branch from 6153c10 to 5c64430 Compare November 13, 2022 11:39
gevtushenko added a commit to gevtushenko/thrust that referenced this pull request Nov 13, 2022
@gevtushenko gevtushenko force-pushed the enh-main/github/reduce_with_more_than_2_32_items branch from 5c64430 to 8b9f0b0 Compare November 13, 2022 16:21
gevtushenko added a commit to gevtushenko/thrust that referenced this pull request Nov 13, 2022
@gevtushenko gevtushenko merged commit 5ae7439 into NVIDIA:main Nov 14, 2022
@leofang
Copy link
Member

leofang commented Nov 14, 2022

Hi @senior-zero, glad to see this patch landed! Would it be backported to the 1.17.x branch?

@gevtushenko
Copy link
Collaborator Author

Hello @leofang! I completely forgot to reply about backporting this PR. Sorry about that!

I wouldn't like to backport this. Is there something stopping you from using the newer version of CUB. Please let us know if we can help switch to a more recent version of CUB or fix potential issues.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
P1: should have Necessary, but not critical. type: bug: functional Does not work as intended.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants