I frequently run across the need for an algorithm that can copy multiple disjoint subsets from a vector to another vector (or set of vectors?) based on some kind of classification function that tells me for each element which subset it belongs to, if any. For now, let's just imagine it as a lambda returning a non-negative integer, or -1 if it's not part of any subset.
For a small number of subsets, this could be implemented as a series of copy_if calls, one per subset. For a larger number of subsets, it could be implemented via stable_sort by subset ID, followed by finding the end of the -1 subset. But both of those approaches are far from the single-pass, or at worst two-pass, approach that a hand-written implementation would need.
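To make the cost of the first workaround concrete, here is a minimal sketch using only the sequential C++ standard library. The name `multi_copy_if_naive`, its signature, and the choice of one `std::vector` per subset are assumptions made for illustration, not an existing or proposed API. Each subset costs one full pass over the input, so k subsets cost k passes.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper (not an existing API): runs one std::copy_if pass per
// subset. `classify` returns the subset ID of an element, or -1 for "none".
template <class T, class Classify>
std::vector<std::vector<T>> multi_copy_if_naive(const std::vector<T>& in,
                                                int num_classes,
                                                Classify classify) {
    std::vector<std::vector<T>> out(num_classes);
    for (int c = 0; c < num_classes; ++c) {
        // Full pass over the input for every subset ID c.
        std::copy_if(in.begin(), in.end(), std::back_inserter(out[c]),
                     [&](const T& x) { return classify(x) == c; });
    }
    return out;
}
```

Elements classified as -1 (or any ID outside `[0, num_classes)`) never match a pass and are simply dropped. The relative order of elements within each subset is preserved.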
I believe a multi-way count_if/copy_if pair like this could be really useful for many applications, e.g.:
- histogram calculation
- element classification
- generalized partition_copy (if we don't have -1 values)
You could even use this to implement a (badly tuned) version of radix sort or samplesort yourself.
Interface considerations for this include:
- How to represent the classification? A lambda returning an integer would be straightforward, but that leaves us open to somebody returning out-of-bounds subset IDs (just discard all of those elements?).
- How to represent the output ranges? If the number of classes is known statically, a variadic template parameter or tuple matching the number of output ranges would work. Otherwise, this would be related to the RangeOfRanges approach discussed in "Consider support for segmented reductions and sorts specified by count-value representation" (#676), or everything could just be written to a single contiguous output range (like partition_copy), together with an offset array specifying the subset sizes.
- How to return the output sizes/iterators from the multi-way copy_if? This could be a tuple of individual iterators for the statically known size, or a range of ranges for the dynamic case.
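One way to picture the two-pass hand-written approach with the single-contiguous-output option is the sequential sketch below. Everything named here (`MultiCopyResult`, `multi_copy_if`, the offsets layout, and the policy of silently discarding out-of-bounds IDs) is an assumption chosen for illustration, not a proposed interface. It also assumes `T` is default-constructible and that `classify` is cheap and pure, since it is called twice per element.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type: all kept elements in one contiguous buffer;
// subset c occupies values[offsets[c]] .. values[offsets[c + 1]).
template <class T>
struct MultiCopyResult {
    std::vector<T> values;
    std::vector<std::size_t> offsets;  // num_classes + 1 entries
};

// Hypothetical two-pass multi-way copy_if. `classify` returns a subset ID;
// -1 and any ID outside [0, num_classes) cause the element to be discarded.
template <class T, class Classify>
MultiCopyResult<T> multi_copy_if(const std::vector<T>& in,
                                 std::size_t num_classes,
                                 Classify classify) {
    auto is_valid = [&](long long id) {
        return id >= 0 && static_cast<std::size_t>(id) < num_classes;
    };

    MultiCopyResult<T> result;
    result.offsets.assign(num_classes + 1, 0);

    // Pass 1 (the "count_if" half): histogram of subset sizes, stored
    // shifted by one slot so the prefix sum below yields start offsets.
    for (const T& x : in) {
        const long long id = classify(x);
        if (is_valid(id)) ++result.offsets[static_cast<std::size_t>(id) + 1];
    }
    for (std::size_t c = 0; c < num_classes; ++c) {
        result.offsets[c + 1] += result.offsets[c];
    }

    // Pass 2 (the "copy_if" half): write each kept element at the next free
    // position of its subset. Input order within a subset is preserved.
    result.values.resize(result.offsets[num_classes]);
    std::vector<std::size_t> cursor(result.offsets.begin(),
                                    result.offsets.end() - 1);
    for (const T& x : in) {
        const long long id = classify(x);
        if (is_valid(id)) result.values[cursor[static_cast<std::size_t>(id)]++] = x;
    }
    return result;
}
```

The first pass on its own is the histogram use case, and without any discarded elements the result is a stable multi-way partition_copy. A statically sized variant could return a tuple of iterators instead of the offsets array.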