[SYCL] Implement SYCL part of sycl_ext_oneapi_prefetch #11458
sycl/test/extensions/prefetch.cpp
Outdated
char data[] = {0, 1, 2, 3};

// CHECK: [[PREFETCH_STR:@.*]] = private unnamed_addr addrspace(1) constant [19 x i8] c"sycl-prefetch-hint\00", section "llvm.metadata"
Are these manually written? If so, do we have a utility script to generate them automatically? If yes, one might need to disable instrumentation/use -O1 to get a more readable IR.
sycl/test/extensions/prefetch.cpp
Outdated
namespace syclex = sycl::ext::oneapi::experimental;
sycl::queue q;
void *dataPtr = &data;
q.parallel_for(1, [=](sycl::id<1> idx) {
I think this can be single_task. I'd also want to see an E2E test with this used in non-uniform control flow (I don't think the spec prohibits that).
Moved to single task - 7afe1c9. I'm going to add E2E tests a bit later, when the whole feature (incl. the llvm-spirv translator part) is done. Not sure what you mean by "non-uniform control". Could you please explain a bit?
if (id.get_global_id(0) % 3 == 0)
  syclex::prefetch(p);
Now that you've added joint_prefetch that just delegates to the per-WI one, I'm even more concerned about the non-uniform control flow scenario...
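To make the concern concrete, here is a minimal host-side sketch in plain C++ (not SYCL device code) of which work-items would reach a prefetch guarded by the `% 3 == 0` condition above. The helper names `workItemsReachingPrefetch` and `isUniformCall` are hypothetical and exist only for illustration. The point is that a group-wide operation that merely forwards to the per-work-item call would be issued by only a subset of the group under such a guard.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: local ids of the work-items that would execute the
// prefetch call when it is guarded by `id % 3 == 0`.
inline std::vector<std::size_t> workItemsReachingPrefetch(std::size_t groupSize) {
  std::vector<std::size_t> ids;
  for (std::size_t id = 0; id < groupSize; ++id) {
    if (id % 3 == 0) // the divergent guard from the review comment
      ids.push_back(id);
  }
  return ids;
}

// Hypothetical helper: the call is uniform only if every work-item in the
// group reaches it.
inline bool isUniformCall(std::size_t groupSize) {
  return workItemsReachingPrefetch(groupSize).size() == groupSize;
}
```

For a group of 8 work-items only ids 0, 3 and 6 reach the call, which is the scenario an E2E test would presumably need to cover.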
auto DecorIt = SpirvDecorMap.find(*Property.first);
// Leave these annotations as is. They will be processed by SPIRVWriter.
if (first == "sycl-prefetch-hint" || first == "sycl-prefetch-hint-nt") {
  return false;
This doesn't feel right to me, but it's outside of SYCL RT...
Looks like currently it's the only way to transform these properties into spirv decorations for this pointer. I tried to create spirv.Decorations metadata instead, but the compiler eliminates that metadata when optimization flags are used.
The annotation may also be eliminated, but that is much less likely.
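The pass-through rule discussed here can be sketched in isolation. This is a standalone illustration under stated assumptions, not the actual CompileTimePropertiesPass code: `isPrefetchHintProperty` and `propertiesLeftForWriter` are hypothetical names, and plain `std::string` stands in for whatever string type the pass really uses. Only the two property names come from the diff above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical predicate: true for the two annotations that the pass leaves
// untouched so that SPIRVWriter can turn them into decorations later.
inline bool isPrefetchHintProperty(const std::string &name) {
  return name == "sycl-prefetch-hint" || name == "sycl-prefetch-hint-nt";
}

// Hypothetical helper: from a list of property names, keep only those that
// stay as annotations for the translator.
inline std::vector<std::string>
propertiesLeftForWriter(const std::vector<std::string> &names) {
  std::vector<std::string> kept;
  for (const std::string &name : names) {
    if (isPrefetchHintProperty(name))
      kept.push_back(name);
  }
  return kept;
}
```

Keeping the check in one predicate would make it easy to extend if more annotations need the same treatment, though whether that fits the pass's structure is a judgment call for the authors.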
prefetch(
    accessor<DataT, Dimensions, AccessMode, target::device, IsPlaceholder> acc,
    size_t offset, size_t count, Properties properties = {}) {
  prefetch((void *)&acc[offset], count * sizeof(DataT), properties);
Do we know for sure that count elements are consecutive in memory?
Good point. @Pennycook I'm not sure how it's intended to work in the case of an N-dim offset. Should we call the __spirv_ocl_prefetch spirv instruction several times for different memory segments in such a case, or should there be some constraint?
We struggled a bit with multi-dimensional prefetches (see the issues).
Where we landed (for now) is that the block being prefetched is assumed to be contiguous, and only the offset itself is multi-dimensional. It's effectively a shorthand to avoid computing the linear offset from the start of the buffer. Note that the specification says for the multi-dimensional cases:
Effects: Equivalent to prefetch((void*) &acc[offset], sizeof(DataT), properties).
Effects: Equivalent to prefetch((void*) &acc[offset], count, properties).
If somebody requests a multi-dimensional prefetch later, we can describe it with a range parameter in place of a size_t count, and implement it the way you suggested (by calling the instruction multiple times).
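The two interpretations can be compared with a small host-side sketch. It assumes a 2-D buffer with row-major linearization, and every name in it (`Index2`, `Segment`, `linearOffset`, `contiguousPrefetch`, `blockPrefetchSegments`) is hypothetical. Segments are expressed as (first element index, element count) rather than as real prefetch calls.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Index2 = std::array<std::size_t, 2>;            // {row, column}
using Segment = std::pair<std::size_t, std::size_t>;  // {first element, count}

// Row-major linear index of a 2-D offset, i.e. what &acc[offset] points at.
inline std::size_t linearOffset(Index2 offset, Index2 extents) {
  assert(offset[0] < extents[0] && offset[1] < extents[1]);
  return offset[0] * extents[1] + offset[1];
}

// Current semantics: the offset is multi-dimensional, but the prefetched
// block is one contiguous run of `count` elements.
inline Segment contiguousPrefetch(Index2 offset, Index2 extents,
                                  std::size_t count) {
  const std::size_t first = linearOffset(offset, extents);
  assert(first + count <= extents[0] * extents[1]);
  return {first, count};
}

// Possible later semantics with a range in place of count: one contiguous
// segment per row of the block, each needing its own prefetch instruction.
inline std::vector<Segment> blockPrefetchSegments(Index2 offset, Index2 extents,
                                                  Index2 block) {
  assert(offset[0] + block[0] <= extents[0] &&
         offset[1] + block[1] <= extents[1]);
  std::vector<Segment> segments;
  for (std::size_t row = 0; row < block[0]; ++row) {
    segments.push_back(
        {linearOffset({offset[0] + row, offset[1]}, extents), block[1]});
  }
  return segments;
}
```

With extents {4, 8} and offset {1, 2}, the contiguous form starts at element 10, while a {2, 3} block would split into segments starting at elements 10 and 18.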
@intel/dpcpp-tools-reviewers, please take a look. The compiler change is non-functional, so it should be a no-brainer.
SYCLLowerIR changes look good to me, as they are NFC. Just a quick request about a variable name. Also, I just noticed there is no test added.
Thanks
The pre-commit failure doesn't seem related; it is covered by #11549.
The first part: intel#11458 Adjust the CompileTimePropertiesPass so it can convert new properties into spirv decorations.
SYCL part: intel/llvm#11458 intel/llvm#11597 Handle new properties and decorate prefetch's arg.
SYCL part: #11458 #11597 Handle new properties and decorate prefetch's arg. Original commit: KhronosGroup/SPIRV-LLVM-Translator@a76f24e
Spec: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_prefetch.asciidoc
The joint_prefetch functions will be done in a separate patch.
This implementation also requires a patch for the llvm-spirv translator. SPIRVWriter should handle these new annotations and create the appropriate decorations in the spv representation.