Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Cannot use cudf::test::iterators::nulls_at () in .cu test file #17432

Closed
ttnghia opened this issue Nov 24, 2024 · 10 comments · Fixed by #17529
Closed

[BUG] Cannot use cudf::test::iterators::nulls_at () in .cu test file #17432

ttnghia opened this issue Nov 24, 2024 · 10 comments · Fixed by #17529
Assignees
Labels
bug Something isn't working

Comments

@ttnghia
Copy link
Contributor

ttnghia commented Nov 24, 2024

When using cudf::test::iterators::nulls_at () in .cu test file, I got this error:

error #20011-D: calling a __host__ function("std::vector<int, ::std::allocator<int> > ::vector(const ::std::vector<int, ::std::allocator<int> > &)") 
from a __host__ __device__ function("auto ::cudf::test::iterators::nulls_at< ::__gnu_cxx::__normal_iterator<const int *,     ::std::vector<int, ::std::allocator<int> > > > (T1, T1)::[lambda(T1) (instance 1)]::[lambda(T1) (instance 1)]") 
  is not allowed

Notice that the implementation of that function is:

template <typename Iter>
[[maybe_unused]] static auto nulls_at(Iter index_start, Iter index_end)
{
  using index_type = typename std::iterator_traits<Iter>::value_type;

  return cudf::detail::make_counting_transform_iterator(
    0, [indices = std::vector<index_type>{index_start, index_end}](auto i) {
      return std::find(indices.cbegin(), indices.cend(), i) == indices.cend();
    });
}

which initializes std::vector<index_type> upon creating the lambda, causing that issue.

@ttnghia ttnghia added the bug Something isn't working label Nov 24, 2024
@davidwendt
Copy link
Contributor

Can you provide an example .cu implementation that fails?
Changing one of our .cpp files containing nulls_at() did not fail to compile for me.

@ttnghia
Copy link
Contributor Author

ttnghia commented Nov 25, 2024

Yes, it is used in many cpp file but not any .cu file. I'm trying to use once and got that error immediately. It is just used by the similar way as in other .cpp way, I can generate a simple example later if needed.

@davidwendt
Copy link
Contributor

Yes, it is used in many cpp file but not any .cu file. I'm trying to use once and got that error immediately. It is just used by the similar way as in other .cpp way, I can generate a simple example later if needed.

Yes, can you provide an example .cu implementation that fails?
Changing one of our .cpp files containing nulls_at() did not fail to compile for me.

@ttnghia
Copy link
Contributor Author

ttnghia commented Nov 25, 2024

Now is the interesting part.

Below is my example code:

#include <cudf_test/base_fixture.hpp>
#include <cudf_test/column_wrapper.hpp>
#include <cudf_test/debug_utilities.hpp>
#include <cudf_test/iterator_utilities.hpp>

using int32s_col = cudf::test::fixed_width_column_wrapper<int32_t>;

struct Test : cudf::test::BaseFixture {};

TEST_F(Test, SimpleInput)
{
  auto const vals = int32s_col{{0, 1, 2}, cudf::test::iterators::nulls_at({1, 2})};
  cudf::test::print(vals);
}

With this code alone put in a .cu file, I can compile without any issues. But if I add the same code into my other WIP .cu file then I can't compile it.

I'll check to see if other existing .cu files can trigger this issue.

@ttnghia
Copy link
Contributor Author

ttnghia commented Nov 25, 2024

Alright, here is a reproducible code:


#include <cudf_test/base_fixture.hpp>
#include <cudf_test/column_wrapper.hpp>
#include <cudf_test/debug_utilities.hpp>
#include <cudf_test/iterator_utilities.hpp>
#include <cudf_test/type_lists.hpp>

#include <cudf/column/column_device_view.cuh>
#include <cudf/utilities/type_dispatcher.hpp>

#include <rmm/exec_policy.hpp>

#include <thrust/iterator/counting_iterator.h>
#include <thrust/transform_reduce.h>

template <typename InputType, typename OutputType>
struct transform_fn {
  cudf::column_device_view values;
  OutputType __device__ operator()(cudf::size_type idx) const
  {
    if (values.is_null(idx)) { return OutputType{0}; }
    auto const val = static_cast<OutputType>(values.element<InputType>(idx));
    return val * val;
  }
};

// For faster compile times, we only support a few input/output types.
template <typename T>
static constexpr bool is_valid_input_t()
{
  return std::is_same_v<T, double> || std::is_same_v<T, int32_t>;
}

// For faster compile times, we only support a few input/output types.
template <typename T>
static constexpr bool is_valid_output_t()
{
  return std::is_same_v<T, int64_t>;
}

struct reduce_fn {
  template <typename InputType,
            typename OutputType,
            typename... Args,
            CUDF_ENABLE_IF(!is_valid_input_t<InputType>() || !is_valid_output_t<OutputType>())>
  int operator()(Args...) const
  {
    CUDF_FAIL("Unsupported input type.");
  }

  template <typename InputType,
            typename OutputType,
            CUDF_ENABLE_IF(is_valid_input_t<InputType>() && is_valid_output_t<OutputType>())>
  int operator()(cudf::column_view const& values,
                 cudf::data_type output_dtype,
                 rmm::cuda_stream_view stream,
                 rmm::device_async_resource_ref mr) const
  {
    auto const values_dv_ptr = cudf::column_device_view::create(values, stream);
    auto const result =
      thrust::transform_reduce(rmm::exec_policy(stream),
                               thrust::make_counting_iterator(0),
                               thrust::make_counting_iterator(values.size()),
                               transform_fn<InputType, OutputType>{*values_dv_ptr},
                               static_cast<OutputType>(0),
                               thrust::plus<>{});

    return static_cast<int>(result);
  }
};

struct my_struct {
  my_struct() = default;

  [[nodiscard]] int operator()(cudf::column_view const& input,
                               cudf::data_type output_dtype,
                               rmm::cuda_stream_view stream,
                               rmm::device_async_resource_ref mr) const
  {
    return cudf::double_type_dispatcher(
      input.type(), output_dtype, reduce_fn{}, input, output_dtype, stream, mr);
  }
};

using int32s_col = cudf::test::fixed_width_column_wrapper<int32_t>;

struct Test : cudf::test::BaseFixture {};

TEST_F(Test, SimpleInput)
{
  auto const obj  = std::make_unique<my_struct>();
  auto const vals = int32s_col{{0, 1, 2}, cudf::test::iterators::nulls_at({1, 2})};
  cudf::test::print(vals);
}

@ttnghia
Copy link
Contributor Author

ttnghia commented Nov 25, 2024

Note that if I remove double_type_dispatcher then the issue is gone. I.e., replacing this

    return cudf::double_type_dispatcher(
      input.type(), output_dtype, reduce_fn{}, input, output_dtype, stream, mr);

by this:

    return reduce_fn{}.operator()<int, int>(input, output_dtype, stream, mr);

Is that due to double_type_dispatcher has CUDF_HOST_DEVICE in its operator()?

@davidwendt
Copy link
Contributor

Sorry, I'm not seeing where the double_type_dispatcher ends up calling nulls_at().
What is this test for? Are you creating input data or expected output data with this?
Either way, it may be better to just to hardcode the input or expected values.

@ttnghia
Copy link
Contributor Author

ttnghia commented Nov 25, 2024

Sorry, I'm not seeing where the double_type_dispatcher ends up calling nulls_at().

That is the point. There seems not any relation between them, but that is what the compiler does to me.

What is this test for? Are you creating input data or expected output data with this?
Either way, it may be better to just to hardcode the input or expected values.

The example above does not test anything. It just reproduces the compile issue that I mentioned.

@davidwendt
Copy link
Contributor

Sorry, I'm not seeing where the double_type_dispatcher ends up calling nulls_at().

That is the point. There seems not any relation between them, but that is what the compiler does to me.

Oh wow! So I know gtests generate code very strangely.
The problem is with the compiler and resolving kernel function names.
If you wrap the transform_fn definition (at least) through the my_struct definition in an anonymous namespace you may get around the issue. For whatever reason, the compiler is linking in some other functors here.

@ttnghia
Copy link
Contributor Author

ttnghia commented Nov 25, 2024

If you wrap the transform_fn definition (at least) through the my_struct definition in an anonymous namespace you may get around the issue.

Yes, this workaround works 👍 . Thanks for that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants