SYCL spec example throws an instance of 'sycl::_V1::invalid_parameter_error' #7568

Closed
wangzy0327 opened this issue Nov 29, 2022 · 7 comments

@wangzy0327 commented Nov 29, 2022

Describe the bug
terminate called after throwing an instance of 'sycl::_V1::invalid_parameter_error'
what(): Index out of range -30 (PI_ERROR_INVALID_VALUE)

To Reproduce
Please describe the steps to reproduce the behavior:

This is my example code:

my example code
#include <sycl/sycl.hpp>
#include <iostream>
#include <random>

using namespace sycl;

constexpr size_t N = 10;

class RandomFiller {
 public:
  RandomFiller(accessor<int> ptr)
      : ptr_ { ptr } {
    std::random_device hwRand;
    std::uniform_int_distribution<> r { 1, 100 };
    randomNum_ = r(hwRand);
  }
  void operator()(item<1> item) const { ptr_[item.get_id()] = get_random(); }
  int get_random() const { return randomNum_; }

 private:
  accessor<int> ptr_;
  int randomNum_;
};

void workFunction(buffer<int, 1>& b, queue& q, const range<1> r) {
  q.submit([&](handler& cgh) {
    accessor ptr { b, cgh };
    RandomFiller filler { ptr };

    cgh.parallel_for(r, filler);
  });
}

int main(){
    queue myQueue;
    auto myContext = myQueue.get_context();
    auto myDev = myQueue.get_device();

    range<1> r { N };

    buffer<int, 1> a { r };

    auto exeBundle = get_kernel_bundle<bundle_state::executable>(myContext);
    
    std::vector<kernel_id> execKernelIds = exeBundle.get_kernel_ids();
    std::cout<<"executable kernel size : "<<execKernelIds.size()<<std::endl;
    for(auto kernel_id : execKernelIds){
        std::cout<<"exec kernel id name : "<<kernel_id.get_name()<<std::endl;
        kernel myKernel = exeBundle.get_kernel<bundle_state::executable>(kernel_id);
        std::cout<<"exec kernel id info function_name : "<<myKernel.get_info<info::kernel::function_name>()<<std::endl;
        std::cout<<"exec kernel id info reference count : "<<myKernel.get_info<info::kernel::reference_count>()<<std::endl;
        std::cout<<"exec kernel id info num_args : "<<myKernel.get_info<info::kernel::num_args>()<<std::endl;
        std::cout<<"exec kernel id info attributes : "<<myKernel.get_info<info::kernel::attributes>()<<std::endl;
        std::cout<<std::endl;
    }

    workFunction(a,myQueue,r);

    myQueue.wait();

    host_accessor rand {a,read_only};
    std::cout << std::endl << "Result:" << std::endl;
    for (size_t i = 0; i < N; i++) {
        std::cout<<"random array r["<<i<<"] = "<<r[i]<<std::endl;
        // std::cout<<"B B["<<i<<"] = "<<B[i]<<std::endl;
        // std::cout<<"C C["<<i<<"] = "<<C[i]<<std::endl;
        // Compare the result to the analytic value
        if (r[i] <= 0 || r[i] >= 100) {
            std::cout << "Wrong value " << r[i] << " on element " << i << " " << std::endl;
            exit(-1);
        }
    }
    std::cout << "Good Random distribution !" << std::endl;
    return 0;
}

CMakeLists.txt
cmake_minimum_required(VERSION 2.8.12)

set(DPCPP_HOME "/home/wzy/sycl_workspace")
set(DPCPP_SYCL_HOME "${DPCPP_HOME}/build-cuda-2022-09-debug")
# set(DPCPP_SYCL_HOME "${DPCPP_HOME}/build-cuda-2022-09")

set(CMAKE_C_COMPILER "${DPCPP_SYCL_HOME}/bin/clang")
set(CMAKE_CXX_COMPILER "${DPCPP_SYCL_HOME}/bin/clang++")
set(CMAKE_CXX_STANDARD 17)

project(function-objects)


include_directories("${DPCPP_SYCL_HOME}/include/sycl")
include_directories("${DPCPP_SYCL_HOME}/include")

message(STATUS "dpcpp_home : ${DPCPP_HOME}")
message(STATUS "dpcpp_cuda_sycl_home : ${DPCPP_SYCL_HOME}")


message(STATUS "find library path : ${DPCPP_SYCL_HOME}/lib")
set(CMAKE_BUILD_RPATH "${DPCPP_SYCL_HOME}/lib;${CMAKE_BUILD_RPATH}")
message(STATUS "cmake build rpath : ${CMAKE_BUILD_RPATH}")


set(CMAKE_BUILD_TYPE "Debug")
# set(CMAKE_BUILD_TYPE "Release")
set(CMAKE_CXX_FLAGS "-fsycl -fsycl-targets=nvptx64-nvidia-cuda")
set(CMAKE_CXX_FLAGS_DEBUG "$ENV{CXXFLAGS} -O0 -Wall -g -ggdb -std=c++17")
set(CMAKE_CXX_FLAGS_RELEASE "$ENV{CXXFLAGS} -O3 -Wall -std=c++17")


link_directories("${DPCPP_SYCL_HOME}/lib")

aux_source_directory(. DIR_SRCS)
add_executable(function-objects ${DIR_SRCS})
target_include_directories(function-objects PRIVATE "${DPCPP_SYCL_HOME}/include/sycl")
target_include_directories(function-objects PRIVATE "${DPCPP_SYCL_HOME}/include")
target_link_libraries(function-objects PRIVATE sycl)

[screenshot attached in the original issue]

Environment (please complete the following information):

  • OS: Linux
  • Target device and vendor: Nvidia GPU
  • DPC++ version: Clang++ 16 (2022-09)
  • Dependencies version: CUDA 11.2

Additional context

There are some other problems.

1. What does myKernel.get_info<info::kernel::num_args>() mean? Does it refer to the arguments of the parallel_for function object's operator()?
2. Why does the number of kernels from the kernel_bundle not equal the number of kernels submitted to the queue?

@wangzy0327 added the "bug" label Nov 29, 2022
@npmiller (Contributor) commented:

The crash here is because in the verification loop you're using r, which is the sycl::range, when I think you mean to use rand, the host_accessor. Changing that in this part makes the sample work:

    host_accessor rand {a,read_only};
    std::cout << std::endl << "Result:" << std::endl;
    for (size_t i = 0; i < N; i++) {
        std::cout<<"random array rand["<<i<<"] = "<<rand[i]<<std::endl;
        // std::cout<<"B B["<<i<<"] = "<<B[i]<<std::endl;
        // std::cout<<"C C["<<i<<"] = "<<C[i]<<std::endl;
        // Compare the result to the analytic value
        if (rand[i] <= 0 || rand[i] >= 100) {
            std::cout << "Wrong value " << rand[i] << " on element " << i << " " << std::endl;
            exit(-1);
        }
    }

@MrSidims removed the "bug" label Nov 29, 2022
@wangzy0327 (Author) commented:

There are some other problems.

1. What does myKernel.get_info<info::kernel::num_args>() mean? Does it refer to the arguments of the parallel_for function object's operator()?
2. Why does the number of kernels from the kernel_bundle not equal the number of kernels submitted to the queue?

@AlexeySachkov (Contributor) commented:

Hi @wangzy0327,

1. What does myKernel.get_info<info::kernel::num_args>() mean? Does it refer to the arguments of the parallel_for function object's operator()?

Yes, that query returns the number of arguments your kernel has. When a kernel is defined as a named function object, all of its member variables are turned into kernel arguments; see 4.12.4. Rules for parameter passing to kernels:

If the kernel is a named function object, the operator() member function (or other member functions that it calls) may reference member variables inside the same named function object. Any such member variables become parameters to the kernel. If the kernel is a lambda function, any variables captured by the lambda become parameters to the kernel.
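
For illustration, here is a minimal sketch (the names Scale and submit_both are made up for this example) of how members of a named function object and captures of a lambda both become kernel parameters:

#include <sycl/sycl.hpp>

// Minimal sketch: both submissions below end up with the same kernel
// parameters (one accessor plus one int). Member variables of a named
// function object and captured variables of a lambda are both turned into
// kernel arguments. Note that a single accessor may be lowered to more than
// one backend argument, so info::kernel::num_args can exceed the number of
// C++ members/captures.
struct Scale {
  sycl::accessor<int> data_;  // becomes a kernel argument
  int factor_;                // becomes a kernel argument
  void operator()(sycl::item<1> it) const { data_[it.get_id()] *= factor_; }
};

void submit_both(sycl::queue& q, sycl::buffer<int, 1>& buf, int factor) {
  // Named function object: its member variables are the kernel parameters.
  q.submit([&](sycl::handler& cgh) {
    sycl::accessor data{buf, cgh};
    cgh.parallel_for(buf.get_range(), Scale{data, factor});
  });
  // Lambda: the captures 'data' and 'factor' are the kernel parameters.
  q.submit([&](sycl::handler& cgh) {
    sycl::accessor data{buf, cgh};
    cgh.parallel_for(buf.get_range(),
                     [=](sycl::item<1> it) { data[it.get_id()] *= factor; });
  });
}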

2. Why does the number of kernels from the kernel_bundle not equal the number of kernels submitted to the queue?

Could you please clarify what you mean by "queue submit kernel size" here?

@wangzy0327 (Author) commented Nov 29, 2022:

2. Why does the number of kernels from the kernel_bundle not equal the number of kernels submitted to the queue?

For the following example:

function_object.cc

#include <sycl/sycl.hpp>
#include <iostream>
#include <random>

using namespace sycl;

constexpr size_t N = 10;

class RandomFiller {
 public:
  RandomFiller(accessor<int> ptr)
      : ptr_ { ptr } {
    std::random_device hwRand;
    std::uniform_int_distribution<> r { 1, 100 };
    randomNum_ = r(hwRand);
  }
  void operator()(item<1> item) const { ptr_[item.get_id()] = get_random(); }
  int get_random() const { return randomNum_; }

 private:
  accessor<int> ptr_;
  int randomNum_;
};

void workFunction(buffer<int, 1>& b, queue& q, const range<1> r) {
  q.submit([&](handler& cgh) {
    accessor ptr { b, cgh };
    RandomFiller filler { ptr };

    cgh.parallel_for(r, filler);
  });
}

int main(){
    queue myQueue;
    auto myContext = myQueue.get_context();
    auto myDev = myQueue.get_device();

    range<1> r { N };

    buffer<int, 1> a { r };

    auto exeBundle = get_kernel_bundle<bundle_state::executable>(myContext);
    
    std::vector<kernel_id> execKernelIds = exeBundle.get_kernel_ids();
    std::cout<<"executable kernel size : "<<execKernelIds.size()<<std::endl;
    std::cout<<std::endl;
    for(auto kernel_id : execKernelIds){
        std::cout<<"exec kernel id name : "<<kernel_id.get_name()<<std::endl;
        kernel myKernel = exeBundle.get_kernel<bundle_state::executable>(kernel_id);
        std::cout<<"exec kernel id info function_name : "<<myKernel.get_info<info::kernel::function_name>()<<std::endl;
        std::cout<<"exec kernel id info reference count : "<<myKernel.get_info<info::kernel::reference_count>()<<std::endl;
        std::cout<<"exec kernel id info num_args : "<<myKernel.get_info<info::kernel::num_args>()<<std::endl;
        std::cout<<"exec kernel id info attributes : "<<myKernel.get_info<info::kernel::attributes>()<<std::endl;
        std::cout<<std::endl;
    }

    kernel myKernel = exeBundle.get_kernel<bundle_state::executable>(execKernelIds[0]);

    workFunction(a,myQueue,r);
    // myQueue.submit([&](handler& cgh){
    //     accessor ptr = {a,cgh};
    //     cgh.set_args(ptr);
    //     cgh.parallel_for(r,myKernel);
    // });

    myQueue.wait();

    host_accessor rand {a,read_only};
    std::cout << std::endl << "Result:" << std::endl;
    for (size_t i = 0; i < N; i++) {
        std::cout<<"random array rand["<<i<<"] = "<<rand[i]<<std::endl;
        if (rand[i] <= 0 || rand[i] >= 100) {
            std::cout << "Wrong value " << rand[i] << " on element " << i << " " << std::endl;
            exit(-1);
        }
    }
    std::cout << "Good Random distribution !" << std::endl;
    return 0;
}

In the example code, only one kernel function is submitted to the queue, but kernel_bundle get_kernel_ids() returns 2 kernels. In the printed result, there are kernel_id: _ZTS12RandomFiller and kernel_id: _ZTSN4sycl3_V16detail18RoundedRangeKernelINS0_4itemILi1ELb1EEELi1E12RandomFillerEE.

[screenshot attached in the original issue]

@AlexeySachkov (Contributor) commented:

In the example code, only one kernel function is submitted to the queue, but kernel_bundle get_kernel_ids() returns 2 kernels. In the printed result, there are kernel_id: _ZTS12RandomFiller and kernel_id: _ZTSN4sycl3_V16detail18RoundedRangeKernelINS0_4itemILi1ELb1EEELi1E12RandomFillerEE.

The second kernel is implicitly created by our implementation. It is part of the "range rounding" feature (see #2703): when the user passes a range into parallel_for, the implementation is responsible for choosing the work-group size of the kernel. However, if the range contains prime numbers, the implementation would have to use a work-group size of 1, which hurts performance. To improve such cases, we generate a second kernel and launch it with an increased global size, so we are able to select a bigger work-group size and get better performance.
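
Conceptually, the generated kernel works roughly like the sketch below (RoundedRangeWrapper and launch_rounded are illustrative names for this sketch, not the actual DPC++ internals):

#include <sycl/sycl.hpp>

// Sketch of the "range rounding" idea: launch a rounded-up global range and
// wrap the user kernel so the padding work-items do nothing. The real
// implementation also reconstructs the item passed to the user kernel so that
// it reports the original range; this sketch glosses over that detail.
template <typename UserKernel>
struct RoundedRangeWrapper {
  sycl::range<1> userRange_;  // the range the user actually asked for
  UserKernel userKernel_;

  void operator()(sycl::item<1> it) const {
    if (it.get_id(0) < userRange_[0])  // guard against the padding work-items
      userKernel_(it);
  }
};

template <typename UserKernel>
void launch_rounded(sycl::queue& q, sycl::range<1> userRange, UserKernel k) {
  constexpr size_t multiple = 32;  // illustrative rounding factor
  const size_t rounded = ((userRange[0] + multiple - 1) / multiple) * multiple;
  q.submit([&](sycl::handler& cgh) {
    // A larger, nicely divisible global size lets the runtime pick a bigger work-group size.
    cgh.parallel_for(sycl::range<1>{rounded},
                     RoundedRangeWrapper<UserKernel>{userRange, k});
  });
}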

@wangzy0327 (Author) commented:

If I want to invoke the kernel via online compilation, how do I invoke the kernel and set_args in this example? Can you help me?

void parallel_for(range<1> NumWorkItems, kernel Kernel)

@AlexeySachkov (Contributor) commented:

If I want to invoke the kernel via online compilation, how do I invoke the kernel and set_args in this example?

You can find such an example in the spec; see 4.11.15.1. Controlling the timing of online compilation.

Essentially, the only change you need is to call handler::use_kernel_bundle within the queue::submit lambda in order to instruct the runtime to use your kernel bundle (which may already be in the executable state) instead of creating one implicitly on the fly.

You do not need to pass a kernel object to parallel_for for that; your parallel_for should look as usual, and the runtime will do all the necessary steps under the hood automatically.
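
For example, here is a minimal sketch based on workFunction and the RandomFiller class from the reproducer above; the new part is the use_kernel_bundle call (the function name workFunctionWithBundle is made up for this sketch):

void workFunctionWithBundle(sycl::buffer<int, 1>& b, sycl::queue& q, const sycl::range<1> r,
                            const sycl::kernel_bundle<sycl::bundle_state::executable>& exeBundle) {
  q.submit([&](sycl::handler& cgh) {
    // Tell the runtime to use this already-built bundle instead of creating one on the fly.
    cgh.use_kernel_bundle(exeBundle);
    sycl::accessor ptr{b, cgh};
    // The parallel_for is unchanged; no kernel object or set_args is needed.
    cgh.parallel_for(r, RandomFiller{ptr});
  });
}

The bundle passed here must be associated with the same context as the queue and contain the RandomFiller kernel; otherwise the submission will fail.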
