[SYCL][CUDA][LIT] Fix base address get and subbuffer LIT test #1913

Merged
15 changes: 9 additions & 6 deletions sycl/plugins/cuda/pi_cuda.cpp
@@ -951,12 +951,15 @@ pi_result cuda_piDeviceGetInfo(pi_device device, pi_device_info param_name,
                    size_t{4000u});
   }
   case PI_DEVICE_INFO_MEM_BASE_ADDR_ALIGN: {
-    // TODO: is this config consistent across all NVIDIA GPUs?
-    // "The minimum value is the size (in bits) of the largest OpenCL built-in
-    // data type supported by the device"
-    // Hard coded to value returned by clinfo for OpenCL 1.2 CUDA | GeForce GTX
-    // 1060 3GB
-    return getInfo(param_value_size, param_value, param_value_size_ret, 4096u);
+    int mem_base_addr_align = 0;
+    cl::sycl::detail::pi::assertion(
+        cuDeviceGetAttribute(&mem_base_addr_align,
+                             CU_DEVICE_ATTRIBUTE_TEXTURE_ALIGNMENT,
+                             device->get()) == CUDA_SUCCESS);
Contributor

Przemek and I discussed this offline: the value reported by the driver matches the previously hard-coded one. This alignment is probably required so that images, e.g., CL_MEM_OBJECT_IMAGE1D_BUFFER, can be created from sub-buffers.

+    // Multiply by 8 as clGetDeviceInfo returns this value in bits
+    mem_base_addr_align *= 8;
+    return getInfo(param_value_size, param_value, param_value_size_ret,
+                   mem_base_addr_align);
   }
   case PI_DEVICE_INFO_HALF_FP_CONFIG: {
     // TODO: is this config consistent across all NVIDIA GPUs?
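
The unit conversion in the hunk above can be sketched in isolation. This is only an illustration, not the plugin's code: the helper names are hypothetical, and the 512-byte texture alignment is an assumption chosen because it reproduces the previously hard-coded 4096 bits. The offset check mirrors the OpenCL rule that a sub-buffer origin must be aligned to CL_DEVICE_MEM_BASE_ADDR_ALIGN, which is expressed in bits.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: convert an alignment reported in bytes (as the CUDA
// driver does for CU_DEVICE_ATTRIBUTE_TEXTURE_ALIGNMENT) to bits, the unit
// used for the base address alignment query.
constexpr std::size_t alignBytesToBits(std::size_t alignBytes) {
  return alignBytes * 8;
}

// Hypothetical helper: true if a sub-buffer offset given in bytes is a
// multiple of an alignment given in bits.
constexpr bool isSubBufferOffsetAligned(std::size_t offsetBytes,
                                        std::size_t alignBits) {
  return alignBits != 0 && (offsetBytes * 8) % alignBits == 0;
}

// Assumed value: 512 bytes, which corresponds to the old hard-coded 4096 bits.
static_assert(alignBytesToBits(512) == 4096, "512 bytes is 4096 bits");

void checkAlignmentExample() {
  const std::size_t alignBits = alignBytesToBits(512);
  assert(isSubBufferOffsetAligned(1024, alignBits));  // 8192 bits, aligned
  assert(!isSubBufferOffsetAligned(128, alignBits));  // 1024 bits, misaligned
}
```

Calling checkAlignmentExample() from a test driver exercises both the aligned and the misaligned case.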
9 changes: 3 additions & 6 deletions sycl/test/basic_tests/buffer/subbuffer.cpp
@@ -1,6 +1,3 @@
-// XFAIL: cuda
Contributor

This makes me happy!

-// TODO: Fix CUDA implementation.
-//
 // RUN: %clangxx -fsycl -fsycl-targets=%sycl_triple %s -o %t.out
 // RUN: env SYCL_DEVICE_TYPE=HOST %t.out
 // RUN: %CPU_RUN_PLACEHOLDER %t.out
@@ -112,12 +109,12 @@ void check1DSubBuffer(cl::sycl::queue &q) {
     assert(false && "Exception was caught");
   }
 
-  for (int i = offset; i < subbuf_size; ++i)
-    assert(vec[i] == (i > 34 ? i * 10 : i * -10) &&
+  for (int i = offset; i < offset + subbuf_size; ++i)
+    assert(vec[i] == (i < offset + offset_inside_subbuf ? i * 10 : i * -10) &&
            "Invalid result in 1d sub buffer");
 
   for (int i = 0; i < subbuf_size; ++i)
-    assert(vec2[i] == (i < 3 ? (32 + i) : (32 + i) * -1) &&
+    assert(vec2[i] == (i < 3 ? (offset + i) : (offset + i) * -1) &&
            "Invalid result in 1d sub buffer");
 }
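
The effect of the loop-bound change above can be illustrated with a small host-only sketch. It does not use SYCL, and the function names and sample values are hypothetical: an offset of 32 is suggested by the literal the diff replaces, while a sub-buffer size of 10 and an inner offset of 3 are assumptions. Under those values the original bound `i < subbuf_size` never enters the loop, so the first assertion was never evaluated.

```cpp
#include <cassert>
#include <vector>

// Iterations performed by the original bound: i runs from offset to subbufSize.
int iterationsWithOldBound(int offset, int subbufSize) {
  int count = 0;
  for (int i = offset; i < subbufSize; ++i)
    ++count;
  return count;
}

// Iterations performed by the corrected bound: i covers the whole sub-buffer.
int iterationsWithNewBound(int offset, int subbufSize) {
  int count = 0;
  for (int i = offset; i < offset + subbufSize; ++i)
    ++count;
  return count;
}

// Illustrative host data shaped like the values the corrected assertion
// expects; this stands in for the result buffer and is not the test's kernel.
std::vector<int> makeSampleData(int total, int offset, int subbufSize,
                                int offsetInside) {
  std::vector<int> vec(total, 0);
  for (int i = offset; i < offset + subbufSize; ++i)
    vec[i] = (i < offset + offsetInside) ? i * 10 : i * -10;
  return vec;
}

// Same predicate as the corrected assertion, returning a bool instead of
// aborting so that a mismatch can be observed.
bool subBufferRangeMatches(const std::vector<int> &vec, int offset,
                           int subbufSize, int offsetInside) {
  for (int i = offset; i < offset + subbufSize; ++i)
    if (vec[i] != (i < offset + offsetInside ? i * 10 : i * -10))
      return false;
  return true;
}
```

With the corrected bound, corrupting any element inside the sub-buffer range makes the check fail, which the original loop could not detect under the assumed values.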
