Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix target counting in strings char-parallel replace #16017

Merged
merged 4 commits into from
Jun 17, 2024

Conversation

davidwendt
Copy link
Contributor

Description

Replace thrust::count_if call across int64 characters to use a custom kernel instead.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@davidwendt davidwendt added bug Something isn't working 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change labels Jun 13, 2024
@davidwendt davidwendt self-assigned this Jun 13, 2024
@davidwendt
Copy link
Contributor Author

No change in performance according to the benchmark

Benchmark                                                      Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------------------------------------
StringReplace/scalar/4096/32/manual_time                    -0.0002         -0.0008             0             0             0             0
StringReplace/scalar/4096/64/manual_time                    +0.0068         +0.0054             0             0             0             0
StringReplace/scalar/4096/128/manual_time                   +0.0081         +0.0067             0             0             0             0
StringReplace/scalar/4096/256/manual_time                   +0.0065         +0.0056             0             0             0             0
StringReplace/scalar/4096/512/manual_time                   -0.0188         -0.0181             0             0             0             0
StringReplace/scalar/4096/1024/manual_time                  -0.0043         -0.0043             0             0             0             0
StringReplace/scalar/4096/2048/manual_time                  -0.0013         -0.0010             1             1             1             1
StringReplace/scalar/4096/4096/manual_time                  -0.0001         -0.0001             1             1             1             1
StringReplace/scalar/4096/8192/manual_time                  +0.0068         +0.0068             2             2             2             2
StringReplace/scalar/32768/32/manual_time                   +0.0067         +0.0068             0             0             0             0
StringReplace/scalar/32768/64/manual_time                   +0.0162         +0.0155             0             0             0             0
StringReplace/scalar/32768/128/manual_time                  +0.0103         +0.0099             0             0             0             0
StringReplace/scalar/32768/256/manual_time                  +0.0140         +0.0135             0             0             0             0
StringReplace/scalar/32768/512/manual_time                  +0.0099         +0.0097             1             1             1             1
StringReplace/scalar/32768/1024/manual_time                 +0.0053         +0.0054             1             1             1             1
StringReplace/scalar/32768/2048/manual_time                 +0.0166         +0.0164             2             2             2             2
StringReplace/scalar/32768/4096/manual_time                 +0.0154         +0.0153             3             3             3             3
StringReplace/scalar/32768/8192/manual_time                 +0.0166         +0.0166             6             6             6             6
StringReplace/scalar/262144/32/manual_time                  +0.0052         +0.0047             0             0             0             0
StringReplace/scalar/262144/64/manual_time                  +0.0091         +0.0080             0             0             0             0
StringReplace/scalar/262144/128/manual_time                 +0.0172         +0.0165             1             1             1             1
StringReplace/scalar/262144/256/manual_time                 +0.0064         +0.0064             2             2             2             2
StringReplace/scalar/262144/512/manual_time                 +0.0121         +0.0121             4             4             4             4
StringReplace/scalar/262144/1024/manual_time                +0.0068         +0.0068             8             8             8             8
StringReplace/scalar/262144/2048/manual_time                -0.0083         -0.0083            22            21            22            21
StringReplace/scalar/262144/4096/manual_time                -0.0056         -0.0055            47            46            47            46
StringReplace/scalar/2097152/32/manual_time                 +0.0104         +0.0099             1             1             1             1
StringReplace/scalar/2097152/64/manual_time                 +0.0102         +0.0100             2             2             2             2
StringReplace/scalar/2097152/128/manual_time                +0.0122         +0.0120             4             4             4             4
StringReplace/scalar/2097152/256/manual_time                +0.0066         +0.0066            13            13            13            13
StringReplace/scalar/2097152/512/manual_time                +0.0161         +0.0161            26            27            26            27
StringReplace/scalar/16777216/32/manual_time                +0.0086         +0.0086             5             5             5             5
StringReplace/scalar/16777216/64/manual_time                +0.0140         +0.0140            13            13            13            13

@github-actions github-actions bot added the CMake CMake build issue label Jun 13, 2024
@vuule
Copy link
Contributor

vuule commented Jun 13, 2024

what's the issue with count_if?

@davidwendt
Copy link
Contributor Author

what's the issue with count_if?

The thrust::count_if available to us only supports a max-int32 range of values. That is, thrust::distance(begin,end) < max-int32
The int64 ranged version is a separate implementation in thrust and which is one of the few we currently compile out to save space in libcudf.so.

@davidwendt davidwendt added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Jun 13, 2024
@davidwendt davidwendt marked this pull request as ready for review June 13, 2024 21:11
@davidwendt davidwendt requested a review from a team as a code owner June 13, 2024 21:11
auto const total = block_reduce(temp_storage).Reduce(count, cub::Sum());

if ((lane_idx == 0) && (total > 0)) {
cuda::atomic_ref<int64_t, cuda::thread_scope_block> ref{*d_output};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of thread_scope_block here is surprising to me. Aren't threads from different blocks updating the same output here? I'd expect that thread_scope_device is required.
I'm probably missing something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds right to me. Fixed in 530de96

Copy link
Contributor

@srinivasyadav18 srinivasyadav18 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@vuule
Copy link
Contributor

vuule commented Jun 17, 2024

/merge

@rapids-bot rapids-bot bot merged commit 56e8442 into rapidsai:branch-24.08 Jun 17, 2024
73 checks passed
@davidwendt davidwendt deleted the fix-replace-count branch June 18, 2024 02:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team bug Something isn't working CMake CMake build issue libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants