Feature Request: Multiway count_if/copy_if #799

upsj · 2022-03-18T10:28:02Z

I frequently run across the need for an algorithm that can copy multiple disjoint subsets from a vector to another vector (or set of vectors?) based on some kind of classification function that tells me for each element which subset it belongs to, if any. For now, let's just imagine it as a lambda returning a non-negative integer, or -1 if it's not part of any subset.

For a small number of subsets, this could be implemented as a series of copy_if calls; for a larger number, it could be implemented via stable_sort by subset ID followed by finding the end of the -1 subset. But both approaches are far from the single-pass, or at worst two-pass, algorithm that a hand-written implementation would need.
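To make the two-pass idea concrete, here is a minimal sequential sketch in standard C++ (not Thrust or CUB code). The names `multiway_count_if` and `multiway_copy_if`, their signatures, and the policy of silently discarding out-of-range IDs are all assumptions for illustration, not an existing API. The first pass counts elements per subset, a prefix sum turns the counts into offsets, and the second pass scatters each element into its subset's slice of a single contiguous output.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical pass 1: count how many elements fall into each of num_classes
// subsets. classify(x) returns a subset ID, or -1 for "not in any subset".
// Out-of-range IDs are discarded, which is only one possible policy.
template <class ForwardIt, class Classify>
std::vector<std::size_t> multiway_count_if(ForwardIt first, ForwardIt last,
                                           std::size_t num_classes, Classify classify) {
    std::vector<std::size_t> counts(num_classes, 0);
    for (; first != last; ++first) {
        const long long id = classify(*first);
        if (id >= 0 && static_cast<std::size_t>(id) < num_classes) {
            ++counts[static_cast<std::size_t>(id)];
        }
    }
    return counts;
}

// Hypothetical pass 2: write every subset contiguously into `out` (which must
// have room for all classified elements) and return num_classes + 1 offsets,
// so that subset c occupies [offsets[c], offsets[c + 1]) in the output.
template <class ForwardIt, class RandomIt, class Classify>
std::vector<std::size_t> multiway_copy_if(ForwardIt first, ForwardIt last, RandomIt out,
                                          std::size_t num_classes, Classify classify) {
    const auto counts = multiway_count_if(first, last, num_classes, classify);
    std::vector<std::size_t> offsets(num_classes + 1, 0);
    std::partial_sum(counts.begin(), counts.end(), offsets.begin() + 1);

    // One write cursor per subset, starting at that subset's offset.
    std::vector<std::size_t> cursor(offsets.begin(), offsets.end() - 1);
    for (; first != last; ++first) {
        const long long id = classify(*first);
        if (id >= 0 && static_cast<std::size_t>(id) < num_classes) {
            out[cursor[static_cast<std::size_t>(id)]++] = *first;
        }
    }
    // Sanity check: classify must return the same ID in both passes.
    for (std::size_t c = 0; c < num_classes; ++c) {
        assert(cursor[c] == offsets[c + 1]);
    }
    return offsets;
}
```

Because elements are visited in input order, this sketch keeps the relative order within each subset. A parallel version would presumably replace the per-subset cursors with per-block histograms and a scan, as in a radix sort pass.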

I believe a count_if/copy_if pair like this could be really useful for many applications, e.g.

- histogram calculation
- element classification
- generalized partition_copy (if we don't have -1 values)

This could even be used to implement a badly tuned version of radix sort or samplesort yourself.

Interface considerations for this include:

- How to represent the classification? A lambda returning an integer would be straightforward, but that leaves us open to somebody returning out-of-bounds subset IDs (just discard all of those elements?).
- How to represent the output ranges? If the number of classes is known statically, a variadic template parameter or tuple matching the number of output ranges would work. Otherwise, this is related to the RangeOfRanges approach discussed in #676 (Consider support for segmented reductions and sorts specified by count-value representation), or everything is written to a single contiguous output range (like partition_copy), together with an offset array specifying the subset sizes.
- How to return the output sizes/iterators from the multiway copy_if? This could be a tuple of individual iterators for the statically known size, or a range of ranges for the dynamic case.
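For the statically known case, one possible shape of the interface is sketched below in plain C++17. This is only an illustration of the "variadic output iterators in, tuple of end iterators out" option; the name `multiway_copy_if_static`, the helper `dispatch`, the usage function `split_small_large`, and the choice to skip out-of-range IDs are all hypothetical and not part of Thrust or CUB.

```cpp
#include <cstddef>
#include <iterator>
#include <tuple>
#include <utility>
#include <vector>

namespace detail {
// Write `value` through the id-th output iterator and advance it.
// The fold expression compares id against every index I at runtime.
template <class T, class Tuple, std::size_t... I>
void dispatch(std::size_t id, const T& value, Tuple& outs, std::index_sequence<I...>) {
    ((id == I ? void(*std::get<I>(outs)++ = value) : void()), ...);
}
}  // namespace detail

// Hypothetical interface: one output iterator per subset, number of subsets
// fixed at compile time. Returns the advanced (end) iterator of every output.
// Elements classified as -1 or as an out-of-range ID are skipped.
template <class InputIt, class Classify, class... OutputIts>
std::tuple<OutputIts...> multiway_copy_if_static(InputIt first, InputIt last,
                                                 Classify classify, OutputIts... outs) {
    std::tuple<OutputIts...> cursors(outs...);
    for (; first != last; ++first) {
        const long long id = classify(*first);
        if (id < 0 || static_cast<std::size_t>(id) >= sizeof...(OutputIts)) continue;
        detail::dispatch(static_cast<std::size_t>(id), *first, cursors,
                         std::index_sequence_for<OutputIts...>{});
    }
    return cursors;
}

// Usage: split non-negative values into "small" (< 10) and "large"; negative
// values belong to no subset and are dropped.
inline std::pair<std::vector<int>, std::vector<int>>
split_small_large(const std::vector<int>& in) {
    std::vector<int> small, large;
    multiway_copy_if_static(
        in.begin(), in.end(),
        [](int x) { return x < 0 ? -1 : (x < 10 ? 0 : 1); },
        std::back_inserter(small), std::back_inserter(large));
    return {small, large};
}
```

With two outputs and no -1 results, this degenerates to something close to partition_copy, which matches the "generalized partition_copy" use case listed above. The dynamic case would need a different return type, such as the offset array or a range of ranges.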

alliepiper · 2022-03-21T18:32:16Z

This sounds somewhat related to NVIDIA/cub#297.

jrhemstad added the thrust label Feb 22, 2023

jarmak-nv assigned ericniebler Feb 23, 2023

github-project-automation bot added this to CCCL Nov 8, 2023

jarmak-nv transferred this issue from NVIDIA/thrust Nov 8, 2023

github-project-automation bot moved this to Todo in CCCL Nov 8, 2023
