Port thrust::transform to use cub::DeviceTransform #2263

Closed
5 of 6 tasks
Tracked by #2404
bernhardmgruber opened this issue Aug 19, 2024 · 0 comments · Fixed by #2389
Labels: thrust (For all items related to Thrust.)

bernhardmgruber commented Aug 19, 2024

Once #2086 is merged, thrust::transform should be ported to use cub::DeviceTransform.

Tasks

  • Design an opt-in for function objects to express they do not require address stability
  • Port thrust::transform to cub::DeviceTransform
  • Ensure zip_iterators passed to thrust::transform are decomposed and optimized
  • Ensure we have enough benchmarks for thrust::transform, including BabelStream
  • Discuss and finalize the design of the address stability opt-in
  • [BUG]: CUB device_transform breaks nvc++ -stdpar #2402
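The address stability opt-in from the first task can be illustrated with a small host-only C++ sketch. It is not the libcu++ design or API from the linked pull request. Every name in it (`copyable_args_fn`, `mark_copyable_args`, `allows_copied_args`, `transform_sketch`) is hypothetical and chosen only for illustration. The idea shown: a function object is assumed to need the original addresses of its arguments unless the caller wraps it, and a transform may hand copies of the elements only to wrapped callables.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical wrapper (not a CCCL type): marks a callable as not depending
// on the memory addresses of its arguments.
template <typename F>
struct copyable_args_fn
{
  F f;

  template <typename... Args>
  auto operator()(Args&&... args) const
    -> decltype(std::declval<const F&>()(std::forward<Args>(args)...))
  {
    return f(std::forward<Args>(args)...);
  }
};

// Hypothetical opt-in helper: the caller promises that copies are fine.
template <typename F>
copyable_args_fn<F> mark_copyable_args(F f)
{
  return copyable_args_fn<F>{std::move(f)};
}

// Trait: false by default, so unmarked callables keep stable addresses.
template <typename F>
struct allows_copied_args : std::false_type {};

template <typename F>
struct allows_copied_args<copyable_args_fn<F>> : std::true_type {};

enum class transform_path { stable_address, copied_arguments };

// Opted in: the element may be staged in a local copy before the call.
template <typename T, typename F>
transform_path transform_impl(const std::vector<T>& in, std::vector<T>& out, F f, std::true_type)
{
  for (std::size_t i = 0; i < in.size(); ++i)
  {
    const T staged = in[i]; // address of staged differs from &in[i]
    out[i] = f(staged);
  }
  return transform_path::copied_arguments;
}

// Not opted in: the callable receives a reference to the original element.
template <typename T, typename F>
transform_path transform_impl(const std::vector<T>& in, std::vector<T>& out, F f, std::false_type)
{
  for (std::size_t i = 0; i < in.size(); ++i)
  {
    out[i] = f(in[i]);
  }
  return transform_path::stable_address;
}

// Dispatches on the trait and reports which path was taken.
template <typename T, typename F>
transform_path transform_sketch(const std::vector<T>& in, std::vector<T>& out, F f)
{
  out.resize(in.size());
  return transform_impl(in, out, std::move(f), typename allows_copied_args<F>::type{});
}
```

A callable that computes an index from `&x - base` would break if it received a staged copy, which is why the default in this sketch is the stable-address path. A plain `x + 1` lambda can safely be passed through `mark_copyable_args`.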
@github-project-automation github-project-automation bot moved this to Todo in CCCL Aug 19, 2024
@bernhardmgruber bernhardmgruber added the thrust For all items related to Thrust. label Aug 19, 2024
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this issue Sep 6, 2024
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious

Fixes: NVIDIA#2263
@bernhardmgruber bernhardmgruber linked a pull request Sep 6, 2024 that will close this issue
7 tasks
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this issue Sep 8, 2024
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious

Fixes: NVIDIA#2263
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Progress in CCCL Sep 8, 2024
@bernhardmgruber bernhardmgruber self-assigned this Sep 9, 2024
@cccl-authenticator-app cccl-authenticator-app bot moved this from In Progress to In Review in CCCL Sep 9, 2024
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this issue Sep 9, 2024
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious

Fixes: NVIDIA#2263
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this issue Sep 10, 2024
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious

Fixes: NVIDIA#2263
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this issue Oct 29, 2024
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious

Fixes: NVIDIA#2263
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this issue Oct 30, 2024
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious

Fixes: NVIDIA#2263
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this issue Nov 4, 2024
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious

Fixes: NVIDIA#2263
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this issue Nov 5, 2024
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious

Fixes: NVIDIA#2263
@cccl-authenticator-app cccl-authenticator-app bot moved this from In Review to In Progress in CCCL Nov 5, 2024
bernhardmgruber added a commit to bernhardmgruber/cccl that referenced this issue Nov 6, 2024
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious

Fixes: NVIDIA#2263
@cccl-authenticator-app cccl-authenticator-app bot moved this from In Progress to In Review in CCCL Nov 6, 2024
@github-project-automation github-project-automation bot moved this from In Review to Done in CCCL Nov 6, 2024
pciolkosz pushed a commit to pciolkosz/cccl that referenced this issue Nov 6, 2024
* Add transform benchmark requiring a stable address
* Make thrust::transform use cub::DeviceTransform
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious
* Optimize prefetch cub::DeviceTransform for small problems

Fixes: NVIDIA#2263
fbusato pushed a commit to fbusato/cccl that referenced this issue Nov 9, 2024
* Add transform benchmark requiring a stable address
* Make thrust::transform use cub::DeviceTransform
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious
* Optimize prefetch cub::DeviceTransform for small problems

Fixes: NVIDIA#2263
pciolkosz added a commit that referenced this issue Nov 11, 2024
* copy pasted sample

* First draft

* Kernel functor and some other things

* Clean up and break up long main function

* Needs launch fix

* Switch to copy_bytes and cleanups

* Missing include

* Add exception print and waive value

* Adjust copy count

* Add license and switch benchmark streams

* Remove a function left as a mistake

* Update copyright date

Co-authored-by: Eric Niebler <[email protected]>

* Setup cudax examples. (#2697)

* Move the sample to new location and fix warning

* build fixes and 0 return code on waive

* Some new MSVC errors

* explicit cast

* Rename enable/disable peer access and separate the sample loop

* Add `cuda::minimum` and `cuda::maximum` (#2681)

* Add cuda::minimum and cuda::maximum

* Various fixes to cub::DeviceTransform (#2709)

* Workaround non-copyable iterators
* Use a named constant for SMEM
* Cast to raw reference 2
* Fix passing non-copy-assignable iterators to transform_kernel via kernel_arg

* Make `thrust::transform` use `cub::DeviceTransform` (#2389)

* Add transform benchmark requiring a stable address
* Make thrust::transform use cub::DeviceTransform
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious
* Optimize prefetch cub::DeviceTransform for small problems

Fixes: #2263

* Ensure that we only use the inline variable trait when it is actually available (#2712)

* Ensure that we only use the inline variable trait when it is actually available

* Use the right define for internal traits

* [CUDAX] Rename memory resource and memory pool from async to device (#2710)

* Rename the type

* Update tests

* Rename async memory pool

* Rename the tests

* Change name in the docs

* Generalise the memory_pool_properties name

* Fix docs

---------

Co-authored-by: Michael Schellenberger Costa <[email protected]>

* Update memory resource name

---------

Co-authored-by: Eric Niebler <[email protected]>
Co-authored-by: Allison Piper <[email protected]>
Co-authored-by: Jacob Faibussowitsch <[email protected]>
Co-authored-by: Bernhard Manfred Gruber <[email protected]>
Co-authored-by: Michael Schellenberger Costa <[email protected]>
fbusato pushed a commit to fbusato/cccl that referenced this issue Nov 12, 2024
* copy pasted sample

* First draft

* Kernel functor and some other things

* Clean up and break up long main function

* Needs launch fix

* Switch to copy_bytes and cleanups

* Missing include

* Add exception print and waive value

* Adjust copy count

* Add license and switch benchmark streams

* Remove a function left as a mistake

* Update copyright date

Co-authored-by: Eric Niebler <[email protected]>

* Setup cudax examples. (NVIDIA#2697)

* Move the sample to new location and fix warning

* build fixes and 0 return code on waive

* Some new MSVC errors

* explicit cast

* Rename enable/disable peer access and separate the sample loop

* Add `cuda::minimum` and `cuda::maximum` (NVIDIA#2681)

* Add cuda::minimum and cuda::maximum

* Various fixes to cub::DeviceTransform (NVIDIA#2709)

* Workaround non-copyable iterators
* Use a named constant for SMEM
* Cast to raw reference 2
* Fix passing non-copy-assignable iterators to transform_kernel via kernel_arg

* Make `thrust::transform` use `cub::DeviceTransform` (NVIDIA#2389)

* Add transform benchmark requiring a stable address
* Make thrust::transform use cub::DeviceTransform
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious
* Optimize prefetch cub::DeviceTransform for small problems

Fixes: NVIDIA#2263

* Ensure that we only use the inline variable trait when it is actually available (NVIDIA#2712)

* Ensure that we only use the inline variable trait when it is actually available

* Use the right define for internal traits

* [CUDAX] Rename memory resource and memory pool from async to device (NVIDIA#2710)

* Rename the type

* Update tests

* Rename async memory pool

* Rename the tests

* Change name in the docs

* Generalise the memory_pool_properties name

* Fix docs

---------

Co-authored-by: Michael Schellenberger Costa <[email protected]>

* Update memory resource name

---------

Co-authored-by: Eric Niebler <[email protected]>
Co-authored-by: Allison Piper <[email protected]>
Co-authored-by: Jacob Faibussowitsch <[email protected]>
Co-authored-by: Bernhard Manfred Gruber <[email protected]>
Co-authored-by: Michael Schellenberger Costa <[email protected]>
fbusato pushed a commit to fbusato/cccl that referenced this issue Jan 9, 2025
* Add transform benchmark requiring a stable address
* Make thrust::transform use cub::DeviceTransform
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious
* Optimize prefetch cub::DeviceTransform for small problems

Fixes: NVIDIA#2263