[PERF] Remove stream sync in concatenate for better pipelining #17172
Labels
improvement
Improvement / enhancement to an existing function
libcudf
Affects libcudf (C++/CUDA) code.
Performance
Performance related issue
The concatenate function uses `thrust::copy` in its implementation but throws away the return value from `thrust::copy`. Since `thrust::copy` needs to return an iterator, it leads to an unnecessary stream sync.
cudf/cpp/src/copying/concatenate.cu
Line 310 in 2de545b
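As a rough illustration of the kind of change this asks for, the sketch below concatenates several buffers using only stream-ordered copies and synchronizes once, at the point where the host actually needs the data. It is not the cudf implementation and not necessarily the fix that was adopted. `check` and `concat_on_stream` are hypothetical names, and the sketch assumes trivially copyable `int` elements, so each per-input copy can be a plain `cudaMemcpyAsync`.

```cuda
#include <cuda_runtime.h>

#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not a cudf API): turn CUDA error codes into exceptions.
inline void check(cudaError_t err)
{
  if (err != cudaSuccess) { throw std::runtime_error(cudaGetErrorString(err)); }
}

// Hypothetical stand-in for a fixed-width concatenate. Every copy is enqueued on
// one stream; the host waits only once, just before reading the result.
// Note: this sketch does not release resources if a CUDA call fails.
std::vector<int> concat_on_stream(std::vector<std::vector<int>> const& inputs)
{
  std::size_t total = 0;
  for (auto const& in : inputs) { total += in.size(); }
  std::vector<int> host_out(total);
  if (total == 0) { return host_out; }

  cudaStream_t stream;
  check(cudaStreamCreate(&stream));

  int* d_in  = nullptr;  // staging area holding the inputs on the device
  int* d_out = nullptr;  // concatenated output on the device
  check(cudaMalloc(&d_in, total * sizeof(int)));
  check(cudaMalloc(&d_out, total * sizeof(int)));

  std::size_t offset = 0;
  for (auto const& in : inputs) {
    if (in.empty()) { continue; }
    std::size_t const bytes = in.size() * sizeof(int);
    // Upload this input, then copy it into its slot in the output.
    // Both operations are ordered on `stream`; no stream synchronization
    // is requested between inputs.
    check(cudaMemcpyAsync(d_in + offset, in.data(), bytes, cudaMemcpyHostToDevice, stream));
    check(cudaMemcpyAsync(d_out + offset, d_in + offset, bytes, cudaMemcpyDeviceToDevice, stream));
    offset += in.size();
  }

  // Single synchronization point: the host needs the result here.
  check(cudaMemcpyAsync(
    host_out.data(), d_out, total * sizeof(int), cudaMemcpyDeviceToHost, stream));
  check(cudaStreamSynchronize(stream));

  check(cudaFree(d_in));
  check(cudaFree(d_out));
  check(cudaStreamDestroy(stream));
  return host_out;
}
```

Calling `concat_on_stream({{1, 2}, {3}})` should give back the three values in order. The point of the sketch is that the per-input device-to-device copies never ask the host to wait on the stream, so work from other callers can keep being enqueued while they run.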