Skip to content
This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

transform_output_iterator (and transform_input_output_iterator) unusable with reduce_by_key due to missing default constructor #1804

Closed
harrism opened this issue Oct 5, 2022 · 5 comments · Fixed by #1805

Comments

@harrism
Copy link
Contributor

harrism commented Oct 5, 2022

I have been unable to use a thrust::transform_output_iterator as the output iterator for thrust::reduce_by_key. After much head scratching and digging, I found the cause:

Unlike all other Thrust iterators, transform_output_iterator and transform_input_output_iterator have no default constructor. And reduce_by_key constructs an empty pair or iterators for its result:

pair<KeysOutputIt, ValuesOutputIt> result{};

If one of the types in the pair is a transform_output_iterator, compilation will fail with: error: no instance of constructor "thrust::transform_output_iterator<UnaryFunction, OutputIterator>::transform_output_iterator [with UnaryFunction = ...]" matches the argument list.

Here's a repro: https://godbolt.org/z/P98YYh1x1

We have found two ways to fix this. The simple way is to add the default constructor. Here's a demo of that. This has a namespace thrust_fix which has its own transform_output_iterator. https://godbolt.org/z/rK3Ec73bz


Another way is to change the scary self-referential macro THRUST_INDEX_TYPE_DISPATCH so that it doesn't require the "Initialize Then Modify" antipattern. Its approach of assigning to the result requires pre-declaring that result, which means auto can't be used.

#define THRUST_INDEX_TYPE_DISPATCH(status, call, count, arguments) \
if (count <= thrust::detail::integer_traits<thrust::detail::int32_t>::const_max) { \
auto THRUST_PP_CAT2(count, _fixed) = static_cast<thrust::detail::int32_t>(count); \
status = call arguments; \
} \
else { \
auto THRUST_PP_CAT2(count, _fixed) = static_cast<thrust::detail::int64_t>(count); \
status = call arguments; \
}

(Not to mention that requiring the caller to inject the renamed count_fixed variable into the arguments list is uuuugly.)

This could be done differently using an IILE:

#define THRUST_INDEX_TYPE_DISPATCH_NEW(call, count, arguments)                          \
  [&]() {                                                                               \
    if (count > thrust::detail::integer_traits<thrust::detail::int32_t>::const_max) {  \
      auto THRUST_PP_CAT2(count, _fixed) = static_cast<thrust::detail::int32_t>(count); \
      return call arguments;                                                          \
    } else {                                                                            \
      auto THRUST_PP_CAT2(count, _fixed) = static_cast<thrust::detail::int64_t>(count); \
      return call arguments;
    }                                                                                   \
  }();

This means we no longer have to pre-declare a pair of iterators in reduce_by_key. We just return directly:

return THRUST_INDEX_TYPE_DISPATCH(reduce_by_key_dispatch,
                                    num_items,
                                    (policy,
                                     keys_first,
                                     num_items_fixed,
                                     values_first,
                                     keys_output,
                                     values_output,
                                     equality_op,
                                     reduction_op));

There may be limitations of the lambda capture by reference used here that I haven't thought about, and this would impact a lot of Thrust code. So I recommend just adding the default constructors. :)

CC @isVoid

@jrhemstad
Copy link
Collaborator

I think you're right that the more fundamental problem is that reduce_by_key should not be attempting to default construct an instance of the output iterator.

There's no requirement on the output_iterator concept to be default constructible, so this is a problem in the algorithm.

Furthermore, making transform_output_iterator default constructible smells off to me. What would the semantics be for an iterator adaptor that doesn't have an iterator to adapt?

@harrism
Copy link
Contributor Author

harrism commented Oct 6, 2022

Furthermore, making transform_output_iterator default constructible smells off to me.

Upon reflection, it smells OK to me. Here's why.

What would the semantics be for an iterator adaptor that doesn't have an iterator to adapt?

I think the semantics are well defined as long as the adapted types and member data types are default constructible.

transform_output_iterator is the same as transform_iterator in this regard: As long as the adapted iterator type and the unary function type are default constructible, the semantics are clear.

I would argue it's not uncommon for functors to be default constructible. Let's look at whether it's common for Thrust iterators to be default constructible.

  • constant_iterator: default constructible (as long as value_type is default constructible)
  • counting_iterator: default constructible (as long as value_type is default constructible)
  • discard_iterator: default constructible
  • permutation_iterator: default constructible (as long as the ElementIterator is default constructible)
  • reverse_iterator: default constructible (as long as the adapted iterator type is default constructible)
  • zip_iterator: default constructible (as long as the iterator tuple is default constructible)
  • transform_iterator`: default constructible (as long as Iterator and AdaptableUnaryFunction are default constructible)
  • transform_input_output_iterator: not yet default constructible, but the semantics seem valid as long as Iterator and AdaptableUnaryFunction are default constructible
  • transform_output_iterator: not yet default constructible, but the semantics seem valid as long as Iterator and AdaptableUnaryFunction are default constructible

@jrhemstad
Copy link
Collaborator

I think the semantics are well defined as long as the adapted types and member data types are default constructible.

Maybe, but this feels wrong:

#include <thrust/iterator/transform_iterator.h>
#include <thrust/functional.h>

int main(){
  thrust::transform_iterator< thrust::identity<int>, int*> it{}; // int* is a valid iterator 
  int i = *it; // segfault
}

constant/counting/discard_iterator all make sense to be default constructible, but the rest are all iterator adaptors where it feels weird to be able to construct them without an iterator to adapt (even if the type of the adapted iterator is default constructible).

fwiw, the choice to make it default constructible is orthogonal to the fact that it is definitely a bug in reduce_by_key to expect the output iterator to be default constructible. There are valid output iterators that are not and should not be default constructible where this would break.

@harrism
Copy link
Contributor Author

harrism commented Oct 6, 2022

Agree on the second point. But iterators aside, a pointer is default constructible. That doesn't mean it's ok to dereference a default constructed pointer! User error.

@harrism
Copy link
Contributor Author

harrism commented Oct 11, 2022

Filed #1812 to cover the reduce_by_key bug.

@gevtushenko gevtushenko moved this to Done in CCCL Dec 2, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
Archived in project
2 participants