Port `thrust::transform` to use `cub::DeviceTransform` #2263
Labels: thrust (for all items related to Thrust)
bernhardmgruber added 15 commits to bernhardmgruber/cccl that referenced this issue (Sep 6, 2024 ×3; Sep 8; Sep 9 ×3; Sep 10; Oct 29; Oct 30; Nov 4 ×2; Nov 5 ×2; Nov 6, 2024), each with the message:
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious
Fixes: NVIDIA#2263
pciolkosz pushed a commit to pciolkosz/cccl that referenced this issue Nov 6, 2024
* Add transform benchmark requiring a stable address
* Make thrust::transform use cub::DeviceTransform
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious
* Optimize prefetch cub::DeviceTransform for small problems
Fixes: NVIDIA#2263
fbusato pushed a commit to fbusato/cccl that referenced this issue Nov 9, 2024, with the same message as the pciolkosz/cccl commit of Nov 6, 2024.
pciolkosz added a commit that referenced this issue Nov 11, 2024
* copy pasted sample
* First draft
* Kernel functor and some other things
* Clean up and break up long main function
* Needs launch fix
* Switch to copy_bytes and cleanups
* Missing include
* Add exception print and waive value
* Adjust copy count
* Add license and switch benchmark streams
* Remove a function left as a mistake
* Update copyright date
Co-authored-by: Eric Niebler
* Setup cudax examples. (#2697)
* Move the sample to new location and fix warning
* build fixes and 0 return code on waive
* Some new MSVC errors
* explicit cast
* Rename enable/disable peer access and separate the sample loop
* Add `cuda::minimum` and `cuda::maximum` (#2681)
* Add cuda::minimum and cuda::maximum
* Various fixes to cub::DeviceTransform (#2709)
* Workaround non-copyable iterators
* Use a named constant for SMEM
* Cast to raw reference 2
* Fix passing non-copy-assignable iterators to transform_kernel via kernel_arg
* Make `thrust::transform` use `cub::DeviceTransform` (#2389)
* Add transform benchmark requiring a stable address
* Make thrust::transform use cub::DeviceTransform
* Introduces address stability detection and opt-in in libcu++
* Mark lambdas in Thrust BabelStream benchmark address oblivious
* Optimize prefetch cub::DeviceTransform for small problems
Fixes: #2263
* Ensure that we only use the inline variable trait when it is actually available (#2712)
* Ensure that we only use the inline variable trait when it is actually available
* Use the right define for internal traits
* [CUDAX] Rename memory resource and memory pool from async to device (#2710)
* Rename the type
* Update tests
* Rename async memory pool
* Rename the tests
* Change name in the docs
* Generalise the memory_pool_properties name
* Fix docs
---------
Co-authored-by: Michael Schellenberger Costa
* Update memory resource name
---------
Co-authored-by: Eric Niebler
Co-authored-by: Allison Piper
Co-authored-by: Jacob Faibussowitsch
Co-authored-by: Bernhard Manfred Gruber
Co-authored-by: Michael Schellenberger Costa
fbusato pushed a commit to fbusato/cccl that referenced this issue Nov 12, 2024, with the same message as the Nov 11, 2024 commit above (issue and PR numbers written as NVIDIA#...).
fbusato pushed a commit to fbusato/cccl that referenced this issue Jan 9, 2025, with the same message as the pciolkosz/cccl commit of Nov 6, 2024.
Once #2086 is merged, `thrust::transform` should be ported to use `cub::DeviceTransform`.

Tasks
- […] `thrust::transform` to `cub::DeviceTransform`
- […] `thrust::transform` are decomposed and optimized
- […] `thrust::transform`, including BabelStream
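The commits referencing this issue mention "address stability detection and opt-in" and marking lambdas "address oblivious". As a rough illustration of why such a distinction could matter, here is a minimal host-only C++ sketch. It assumes that a faster transform implementation may stage each input element in a local copy before calling the user's function, which changes the address the function sees. All names here (`address_oblivious`, `mark_address_oblivious`, `is_address_oblivious`, `sketch_transform`) are hypothetical and are not the libcu++, Thrust, or CUB API.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical opt-in wrapper: the user promises that the wrapped callable
// does not depend on the addresses of its arguments.
template <class F>
struct address_oblivious
{
  F f;

  template <class... Args>
  auto operator()(Args&&... args) const
  {
    return f(std::forward<Args>(args)...);
  }
};

template <class F>
address_oblivious<F> mark_address_oblivious(F f)
{
  return {std::move(f)};
}

// Hypothetical detection trait: true only for callables wrapped above.
template <class F>
struct is_address_oblivious : std::false_type {};

template <class F>
struct is_address_oblivious<address_oblivious<F>> : std::true_type {};

// Conservative path: pass a reference to the original element, so the
// address seen by f is the element's real address.
template <class T, class F>
void transform_impl(const std::vector<T>& in, std::vector<T>& out, F f, std::false_type)
{
  for (std::size_t i = 0; i < in.size(); ++i)
  {
    out[i] = f(in[i]);
  }
}

// Staged path: copy the element first (standing in for a load into
// registers or shared memory), then call f on the copy.
template <class T, class F>
void transform_impl(const std::vector<T>& in, std::vector<T>& out, F f, std::true_type)
{
  for (std::size_t i = 0; i < in.size(); ++i)
  {
    const T staged = in[i];
    out[i]         = f(staged);
  }
}

// Dispatch on the trait: only opted-in callables take the staged path.
template <class T, class F>
void sketch_transform(const std::vector<T>& in, std::vector<T>& out, F f)
{
  transform_impl(in, out, f, is_address_oblivious<F>{});
}

// Address-dependent callable: recovers the element index from the argument's
// address, so it is only correct on the conservative path.
inline std::vector<int> demo_address_dependent()
{
  const std::vector<int> a{1, 2, 3, 4};
  const std::vector<int> b{10, 20, 30, 40};
  std::vector<int> out(a.size());
  const int* base = a.data();
  sketch_transform(a, out, [base, &b](const int& x) {
    return x + b[static_cast<std::size_t>(&x - base)];
  });
  return out;
}

// Address-oblivious callable: depends only on the value, so it can opt in.
inline std::vector<int> demo_address_oblivious()
{
  const std::vector<int> a{1, 2, 3, 4};
  std::vector<int> out(a.size());
  sketch_transform(a, out, mark_address_oblivious([](const int& x) { return 2 * x; }));
  return out;
}
```

In this sketch an unmarked callable always takes the conservative path, so the index-from-address idiom in `demo_address_dependent` keeps working, while the value-only lambda in `demo_address_oblivious` is explicitly marked and takes the staged path. Whether the real libraries dispatch this way is not stated in this issue.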