Unexpected behaviour when return type is specified for transform iterator. #1299
Comments
Does this work when you change the Device lambdas are very unreliable when used with generic algorithms like Thrust. See the discussion in #779. I suspect the lambda is the problem since this works when the return type is specified instead of deduced. |
@allisonvacanti Thanks for the reply. I will try some workarounds like using functor and host device attributes.
Actually, it works when return type is not specified. |
Ah, I see now, I didn't look closely enough before. This bug just got much more interesting! Related, I'm planning to remove Thrust's current scan implementation in the near future and just switch to CUB's |
@allisonvacanti Update: Actually it doesn't work either way. When the return type is not specified, it's actually an invalid device function. I checked it with
#include <cassert>
#include <cstddef>
#include <thrust/execution_policy.h>
#include <thrust/scan.h>
#include <thrust/iterator/transform_iterator.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/discard_iterator.h>

// Functor equivalent of the lambda used in the first test case.
struct KeyIter {
  size_t size;
  size_t __host__ __device__ operator()(size_t idx) {
    assert(idx < size);
    return idx;
  }
};

void TestScan() {
  // Larger than the maximum signed 32-bit value.
  size_t size = 2150602529;
  {
    // Case 1: transform iterator built from a __host__ __device__ lambda.
    auto key_iter = thrust::make_transform_iterator(
        thrust::make_counting_iterator<size_t>(0ul),
        [=] __host__ __device__(size_t idx) {
          assert(idx < size);
          return idx;
        });
    auto end_it = key_iter + size;
    thrust::inclusive_scan(thrust::device, key_iter, end_it,
                           thrust::make_discard_iterator(),
                           [] __device__(auto a, auto b) { return b; });
  }
  {
    // Case 2: transform iterator built from the KeyIter functor.
    auto key_iter = thrust::make_transform_iterator(
        thrust::make_counting_iterator<size_t>(0ul),
        KeyIter{size});
    auto end_it = key_iter + size;
    thrust::inclusive_scan(thrust::device, key_iter, end_it,
                           thrust::make_discard_iterator(),
                           [] __device__(auto a, auto b) { return b; });
  }
}

int main() {
  TestScan();
} |
Not sure how it works, but isn't the Should I open an issue in cub for switching to |
No. CUB is consciously using 32 bits there; just switching all the public interfaces to use 64-bit indices causes a perf regression of about 10%-15%, according to a quick benchmark I did some time ago. CUB has a secondary interface that allows specifying the index type explicitly; in fact you can see it invoked on line 168. As Allison mentioned, we're not using CUB in scans right now, but scan itself should've been fixed by 1d16811... |
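For context on why index width comes up in this report: the element count in the test program, 2150602529, is larger than the maximum signed 32-bit value (2147483647) while still being below 2^32. The host-only C++ sketch below only checks that arithmetic. The names `kNumItems`, `exceeds_int32`, and `overshoot_int32` are made up for illustration and are not part of Thrust or CUB, and the sketch does not claim to show where the scan actually goes wrong.

```cpp
#include <cstdint>
#include <limits>

// Element count taken from the test program in this thread.
constexpr std::uint64_t kNumItems = 2150602529ull;

// Largest value a signed 32-bit index can represent (2147483647).
constexpr std::uint64_t kInt32Max =
    static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());

// Hypothetical helper: true when `n` cannot be stored in a signed 32-bit index.
constexpr bool exceeds_int32(std::uint64_t n) {
  return n > kInt32Max;
}

// Hypothetical helper: how far `n` is past the signed 32-bit maximum
// (0 when it still fits).
constexpr std::uint64_t overshoot_int32(std::uint64_t n) {
  return exceeds_int32(n) ? n - kInt32Max : 0;
}

// Compile-time check that the test size needs more than a signed 32-bit index.
static_assert(exceeds_int32(kNumItems),
              "test size does not fit in a signed 32-bit index");
```

With these definitions, `overshoot_int32(kNumItems)` is 3118882, so any code path that counts items with a signed 32-bit integer cannot represent the end of this range.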
@griwes Thanks for the clarification. |
I've started working on refactoring Thrust to use CUB's scans directly: I still need to fix some issues and do more testing, but it looks like this will fix your issue. When I compile your test programs here against that branch and replace the |
Closing as a duplicate since the fundamental issues here are tracked in other bugs:
|
Also, NVIDIA/cccl#744 is tracking the 32-bit indexing issues. |
Platform
Reproduce
The following snippet is an example of creating a thrust::transform_iterator with and without specifying the return type. When the return type is not specified (the default), the iterator works correctly. But if we supply the return type explicitly, Thrust's scan generates out-of-bounds iterators.
The original comment is posted at #967 (comment)