
[SYCL][CUDA] Minor fixes required to run BabelStream benchmarks on CUDA #1543

Closed
Ruyk wants to merge 13 commits into sycl from fix-default-selector-cuda

Conversation

Ruyk
Contributor

@Ruyk Ruyk commented Apr 17, 2020

Minor fixes to the CUDA backend to get BabelStream (https://github.com/UoB-HPC/BabelStream) running, each one in an individual commit:

  • Fixed: default selection for CUDA devices in presence of the NVIDIA OpenCL platform
  • Fixed: Missing return event on Unmap
  • Throw error in queue constructor when backend is not correct (currently segfaults)

@Ruyk Ruyk requested review from kbobrovs, smaslov-intel and a team as code owners April 17, 2020 15:20
@Ruyk Ruyk self-assigned this Apr 17, 2020
@Ruyk Ruyk added the cuda CUDA back-end label Apr 17, 2020
sycl/plugins/cuda/pi_cuda.cpp
@@ -101,7 +101,8 @@ static RT::PiProgram createBinaryProgram(const ContextImplPtr Context,
   Plugin.call<PiApiKind::piPlatformGetInfo>(Platform, PI_PLATFORM_INFO_NAME,
                                             PlatformName.size(), PlatformName.data(), nullptr);
   if (PlatformNameSize > 0u &&
-      std::strncmp(PlatformName.data(), "NVIDIA CUDA", PlatformNameSize) == 0) {
+      std::strncmp(PlatformName.data(), "NVIDIA CUDA BACKEND",
Contributor

There are 3 occurrences of this literal string; can you turn it into a macro at least?

Contributor

Shouldn't lines 91-108 become a run-time check that the actively selected BE (Plugin.getBackend()) is CUDA?

Contributor Author

There are 3 occurrences of this literal string; can you turn it into a macro at least?

Yes, but I am unsure where to define a macro that is shared between the SYCL RT and the PI CUDA plugin without creating a dependency from the SYCL RT on a header in the CUDA plugin.

Maybe I can just add it to the common pi header as a CUDA extension.
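A minimal sketch of that idea, assuming the constant lives in the common PI header (the macro and helper names below are illustrative, not the actual code):

#include <cstddef>
#include <cstring>

// Hypothetical shared definition: lives in the common PI header so both the
// SYCL RT and the PI CUDA plugin can see it without the RT depending on a
// CUDA plugin header. The plugin reports this string as its platform name.
#define __SYCL_PI_CUDA_PLATFORM_NAME "NVIDIA CUDA BACKEND"

// Runtime-side check comparing against the shared constant instead of
// repeating the string literal in three places.
static bool isCudaPlatformName(const char *Name, std::size_t Size) {
  return Size > 0u &&
         std::strncmp(Name, __SYCL_PI_CUDA_PLATFORM_NAME, Size) == 0;
}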

Contributor Author

Shouldn't lines 91-108 become a run-time check that the actively selected BE (Plugin.getBackend()) is CUDA?

Yes, but that should be in a separate patch, I think.

Contributor

Yes, but that should be in a separate patch, I think.

Why? If you do so then you don't need many of the changes you are doing in this PR.

Contributor Author

I've changed it to use the is_cuda check from platform. Plugin.getBackend() is not yet on the sycl branch.

Contributor

getBackend() is in now; please use it instead.

sycl/source/detail/queue_impl.hpp Outdated
  // If using PI_CUDA, don't accept a non-CUDA device
- if (platformVersion.find("CUDA") == std::string::npos &&
-     backend == "PI_CUDA") {
+ if (HasCudaString && HasOpenCLString && backend == "PI_CUDA") {
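For context, those flags are presumably computed from the platform version string, mirroring the HasOpenCL/HasCUDA checks quoted later in this thread:

// Presumed definitions (not shown in this hunk); platformVersion is the
// platform version string already queried by the surrounding selector code.
const bool HasCudaString = (platformVersion.find("CUDA") != std::string::npos);
const bool HasOpenCLString =
    (platformVersion.find("OpenCL") != std::string::npos);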
Contributor

This entire code fragment (lines 37-58) is in conflict with #1522, where SYCL_BE is used as the preferred backend, not as a forced one. For example, if the target device type is CPU, which the CUDA PI plugin does not support, and the user specified SYCL_BE=PI_CUDA, then with this code no device will be found and an exception thrown, while with that code it will use the plugin where the CPU is supported, i.e. OpenCL. I think both semantics are useful and we should probably fork SYCL_BE into SYCL_PI_FORCE & SYCL_PI_PREFER.

Contributor Author

The only thing that line 51 does is that, if a platform has both the CUDA string and the OpenCL string, it is rejected if, and only if, the desired backend is PI_CUDA. This prevents the selection of the NVIDIA OpenCL platform when the user wants to use the PI CUDA backend. Everything else should work in the same way. I don't think there is a need to fork the env. variables at this point.

Contributor

Sorry, I misread the code. It is not rejecting the non-CUDA devices under SYCL_BE=PI_CUDA as I initially thought. It still has to be merged properly with the code coming in #1522 (tagging @againull). Also, it should use platform.getBackend() instead of checking the platform name.

Contributor Author

@Ruyk Ruyk Apr 23, 2020

Should I wait until #1522 and #1490 are merged? I can't use platform.getBackend() until then.
Alternatively, I can use is_cuda until the other PRs are there.

Contributor Author

I've rebased this on top of the mentioned patches, so it uses the plugin getter and the backend.

@bjoernknafla
Contributor

With the map/unmap fix you should see the following LIT test XPASS (which fails LIT): sycl/test/scheduler/DataMovement.cpp

To make it pass, please remove the // XFAIL: cuda and the comment line // TODO: Fix accidential error return when unmapping read-only memory objects. from it.

Implemented a comparison operator for the plugin class in SYCL RT:
Two plugins are equal if the pointer to their string is the same.

plugin constructor marked explicit to avoid accidental implicit
conversions.

Signed-off-by: Ruyman Reyes <[email protected]>
@Ruyk Ruyk force-pushed the fix-default-selector-cuda branch from 8e8fa6f to 16c2b65 Compare April 23, 2020 10:18
Constructor of the SYCL queue throws an exception if the
device passed in is from a different backend than the context
that is associated with the queue.

Signed-off-by: Ruyman Reyes <[email protected]>
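A standalone sketch of the described behaviour (the backend enum and the exception type here are stand-ins; the real check lives inside the queue implementation and throws a SYCL exception):

#include <stdexcept>

// Stand-in for the SYCL backend enumeration, for this sketch only.
enum class backend { host, opencl, cuda };

// Reject a device whose backend differs from the context's backend up front,
// instead of letting a later PI call crash.
void checkQueueBackends(backend ContextBackend, backend DeviceBackend) {
  if (ContextBackend != DeviceBackend)
    throw std::runtime_error(
        "Queue cannot be constructed: the device does not belong to the "
        "same backend as the context.");
}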
@Ruyk Ruyk force-pushed the fix-default-selector-cuda branch from 16c2b65 to 195414f Compare April 23, 2020 11:44
romanovvlad
romanovvlad previously approved these changes Apr 23, 2020
Contributor

@romanovvlad romanovvlad left a comment

LGTM. Please do not force-push next time; use merge.

Fixes previous incorrect usage of version

Signed-off-by: Ruyman Reyes <[email protected]>
/// \ingroup sycl_pi
///
inline bool operator==(const plugin &lhs, const plugin &rhs) {
  return (lhs.getPiPlugin().PluginVersion == rhs.getPiPlugin().PluginVersion);
Contributor

Should this just check that the MBackend (getBackend()) of the plugins is the same?
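For illustration, the comparison the reviewer suggests could look roughly like this (a standalone sketch; the real plugin class and its getBackend() accessor live in the SYCL RT):

// Standalone sketch: compare plugins by backend rather than by the
// PluginVersion string.
enum class backend { host, opencl, cuda };

struct plugin {
  backend MBackend;
  backend getBackend() const { return MBackend; }
};

inline bool operator==(const plugin &lhs, const plugin &rhs) {
  return lhs.getBackend() == rhs.getBackend();
}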

@@ -101,7 +101,8 @@ static RT::PiProgram createBinaryProgram(const ContextImplPtr Context,
   Plugin.call<PiApiKind::piPlatformGetInfo>(Platform, PI_PLATFORM_INFO_NAME,
                                             PlatformName.size(), PlatformName.data(), nullptr);
   if (PlatformNameSize > 0u &&
-      std::strncmp(PlatformName.data(), "NVIDIA CUDA", PlatformNameSize) == 0) {
+      std::strncmp(PlatformName.data(), "NVIDIA CUDA BACKEND",
Contributor

getBackend() is in now; please use it instead.

The NVIDIA OpenCL platform is problematic for NVIDIA CUDA backend users
and for DPC++ users overall, since it doesn't work straight away and is
typically preferred during device selection on the OpenCL backend.

This patch removes the CUDA OpenCL platform from the device selection,
and prevents it from being used in the lit testing.

Signed-off-by: Ruyman Reyes <[email protected]>
Defines two plugins as equal if their backend types are the same.

Signed-off-by: Ruyman Reyes <[email protected]>
"Type is not the same");

// If no preference, assume OpenCL and reject CUDA backend
if (BackendType == backend::cuda && !BackendPref) {
Contributor

This always disqualifies CUDA devices when SYCL_BE is not set, right...?

Contributor Author

Yes, OpenCL is the preferred path...

} else if (!BackendPref)
return false;

// If using PI_CUDA, don't accept a non-CUDA device
Contributor

Couldn't we just do:

if (BackendType != *BackendPref)
    return true;

CUDA OpenCL platform is ignored on get_devices as suggested by feedback

Signed-off-by: Ruyman Reyes <[email protected]>
@hiaselhans
Contributor

@Ruyk I tried your patch and I think there hasn't yet been a clear decision on how it's supposed to work.

  1. I read an argument some time ago by @bader that every device should be shown in case someone just uses SYCL as a library to fetch a device list. That would speak against completely hiding the NVIDIA OpenCL device. Is there a consensus on whether devices should be completely hidden or just not selected?
  2. With your patch it is absolutely necessary to have SYCL_BE=PI_CUDA set in order to run on NVIDIA devices. There is no way for a sane user (developer) of this library to choose a CUDA device in their device_selector at runtime; the only way is the environment variable. Imagine someone having two queues, one using the NVIDIA backend and the other one using an OpenCL device. How would that be possible at all?
  3. We are now introducing extra logic to hide the NVIDIA OpenCL device. In fact, the OpenCL implementation doesn't support OpenCL < 2 and there are plenty of those devices out there. So why don't we treat them the same? Why all the backend-specific logic all over the place? We could have a PI call to the backend asking whether a device is supported or not.

Just my 2 cents, but I'm opinionated on this and need to yell it. Sorry :)

@Ruyk
Contributor Author

Ruyk commented Apr 30, 2020

@Ruyk I tried your patch and I think there hasn't yet been a clear decision on how it's supposed to work.

  1. I read an argument some time ago by @bader that every device should be shown in case someone just uses SYCL as a library to fetch a device list. That would speak against completely hiding the NVIDIA OpenCL device. Is there a consensus on whether devices should be completely hidden or just not selected?

The NVIDIA OpenCL platform is unusable with DPC++. We put some effort on our side into making it work initially in DPC++ and also in ComputeCpp. The reality is it creates more problems than it solves, especially if you have a proper CUDA backend. The moment the NVIDIA OpenCL platform is available on the system (e.g., via the ICD loader), it becomes the default device selected in most implementations (it's a GPU, so its score is higher than CPU and host). However, the default options for compilation won't work: you need to pass the specific triple for the OpenCL PTX variant. Even with that triple passed in, there are still problems on the compiler side and the code generated is not always correct for SYCL.

Our current thinking is that, unless someone puts effort into supporting the NVIDIA OpenCL platform on DPC++, it is better to disable it altogether and simplify using SYCL applications on CUDA as much as we can.

  2. With your patch it is absolutely necessary to have SYCL_BE=PI_CUDA set in order to run on NVIDIA devices. There is no way for a sane user (developer) of this library to choose a CUDA device in their device_selector at runtime; the only way is the environment variable. Imagine someone having two queues, one using the NVIDIA backend and the other one using an OpenCL device. How would that be possible at all?

That is still possible. Selecting the PI_CUDA backend doesn't prevent other devices running on the OpenCL one; it only ensures the selection of the CUDA backend is possible. It's a preferred backend when available.
Note however that there are driver issues that prevent using more than one binary format, so this will still fail at runtime.

Hopefully once the CUDA backend is no longer experimental and we have a clearer interface for multiple backends in the next SYCL specification, the environment variable will not be necessary.

  3. We are now introducing extra logic to hide the NVIDIA OpenCL device. In fact, the OpenCL implementation doesn't support OpenCL < 2 and there are plenty of those devices out there. So why don't we treat them the same? Why all the backend-specific logic all over the place? We could have a PI call to the backend asking whether a device is supported or not.

There are other OpenCL 1.2 implementations that support DPC++; POCL and ComputeAorta (Codeplay's implementation) are some of them. They can consume SPIR-V and don't create conflicts on the system. When deploying either of the two, they don't pollute the include or library paths in a way that forces them to become the default.

I could be convinced to filter for SPIR-V support, but I think that could be problematic for OpenCL platforms that work with AOT compilation, since they may not report SPIR-V and expect only AOT.

Just my 2 cents, but I'm opinionated on this and need to yell it. Sorry :)

No need to be sorry! It's good to have an open discussion about this :-)

@Ruyk
Contributor Author

Ruyk commented Apr 30, 2020

  2. With your patch it is absolutely necessary to have SYCL_BE=PI_CUDA set in order to run on NVIDIA devices. There is no way for a sane user (developer) of this library to choose a CUDA device in their device_selector at runtime; the only way is the environment variable. Imagine someone having two queues, one using the NVIDIA backend and the other one using an OpenCL device. How would that be possible at all?

That is still possible. Selecting the PI_CUDA backend doesn't prevent other devices running on the OpenCL one; it only ensures the selection of the CUDA backend is possible. It's a preferred backend when available.
Note however that there are driver issues that prevent using more than one binary format, so this will still fail at runtime.

Actually you are correct, something went wrong in my latest merge: this line https://github.com/intel/llvm/pull/1543/files#diff-8822f406c48f60f8a03ef19ad896145fR207 is not used, and this https://github.com/intel/llvm/pull/1543/files#diff-8822f406c48f60f8a03ef19ad896145fR213 shouldn't be there...

I'll fix it

* Removes NVIDIA OpenCL from the available list of platforms
* CUDA backend is available only if SYCL_BE=PI_CUDA is set

Signed-off-by: Ruyman Reyes <[email protected]>
@Ruyk Ruyk force-pushed the fix-default-selector-cuda branch from b7b7b98 to ad3951c Compare April 30, 2020 12:23
@Ruyk
Contributor Author

Ruyk commented Apr 30, 2020

I've fixed the bad merge, code is a bit cleaner now.

@hiaselhans
Contributor

I'm at ad3951cb31899cdac307b12ef13114ba64064d71 now. Latest, I think...

The NVIDIA OpenCL platform is unusable with DPC++. We put some effort on our side into making it work initially in DPC++ and also in ComputeCpp. The reality is it creates more problems than it solves, especially if you have a proper CUDA backend. The moment the NVIDIA OpenCL platform is available on the system (e.g., via the ICD loader), it becomes the default device selected in most implementations (it's a GPU, so its score is higher than CPU and host). However, the default options for compilation won't work: you need to pass the specific triple for the OpenCL PTX variant. Even with that triple passed in, there are still problems on the compiler side and the code generated is not always correct for SYCL.

Our current thinking is that, unless someone puts effort into supporting the NVIDIA OpenCL platform on DPC++, it is better to disable it altogether and simplify using SYCL applications on CUDA as much as we can.

I have no problem with not supporting NVIDIA's OpenCL implementation. I guess it's a mess, but so are other vendors' implementations; amdgpu-pro is stuck on 1.x as well.

That is still possible. Selecting the PI_CUDA backend doesn't prevent other devices running on the OpenCL one; it only ensures the selection of the CUDA backend is possible. It's a preferred backend when available.
Note however that there are driver issues that prevent using more than one binary format, so this will still fail at runtime.

From reading the code I was pretty sure that's not the case, especially with what's after line 220:
// If no preference, assume OpenCL and reject CUDA
I still tried:

Hopefully once the CUDA backend is no longer experimental and we have a clearer interface for multiple backends in the next SYCL specification, the environment variable will not be necessary.

I hope so too :)
But with your patch we move one step away from making SYCL_BE optional.
In the sycl branch I can just select the CUDA device and all is fine. I tried with your patch, and without setting the env var the CUDA device is completely hidden from the selector. On another machine with just an Intel i7 and SYCL_BE=PI_CUDA, that OpenCL device is completely hidden too.

There are other OpenCL 1.2 implementations that support DPC++; POCL and ComputeAorta (Codeplay's implementation) are some of them. They can consume SPIR-V and don't create conflicts on the system. When deploying either of the two, they don't pollute the include or library paths in a way that forces them to become the default.

I could be convinced to filter for SPIR-V support, but I think that could be problematic for OpenCL platforms that work with AOT compilation, since they may not report SPIR-V and expect only AOT.

Right now there's this code:

C.get_platform().get_info<info::platform::version>() >= "2.1")

This seems to throw when the device only supports OpenCL < 2.1.
Does it work in its current state with POCL or ComputeAorta?

I think, after all, it should be the backend that decides whether a device is supported by it or not.
In my dreams we would just introduce a PI call bool PiSupportsDevice() and call it in device_selector, then just rate unsupported devices with -1.
The above code could end up in that call in the OpenCL plugin...

It's just my personal preference on how to treat this env variable, but it's probably a quite fundamental decision that has to be taken at some point...?
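A rough sketch of that idea (PiSupportsDevice is not an existing PI entry point; all names below are hypothetical):

// Hypothetical only: each backend plugin answers whether it can actually run
// a given device, and the device selector rates unsupported devices with -1
// so they are never picked, with no backend-specific string matching in the
// runtime.
struct device {        // stand-in for sycl::device in this sketch
  bool pluginCanRunIt; // what the owning plugin would report
};

bool PiSupportsDevice(const device &Dev) { // hypothetical PI call
  return Dev.pluginCanRunIt;
}

int scoreDevice(const device &Dev) {
  return PiSupportsDevice(Dev) ? 1 : -1;
}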

@bader
Contributor

bader commented Apr 30, 2020

Interesting discussion.
One observation: we already have a mechanism to "disable" unsupported devices using "SYCL_DEVICE_ALLOWLIST". If I understand it correctly you can provide a configuration file with some regular expression to filter out unsupported OpenCL implementations. @Ruyk, will this work for you?

@Ruyk
Contributor Author

Ruyk commented May 1, 2020

@bader

Interesting discussion.
One observation: we already have a mechanism to "disable" unsupported devices using "SYCL_DEVICE_ALLOWLIST". If I understand it correctly you can provide a configuration file with some regular expression to filter out unsupported OpenCL implementations. @Ruyk, will this work for you?

I've seen that mechanism, but what would be needed is the opposite, a "SYCL_DEVICE_DISABLELIST" where you can add the NVIDIA OpenCL backend so it's not visible in device selection.
Still, a user wanting to use the CUDA backend would need to always pass the SYCL_DEVICE_DISABLELIST to prevent the OpenCL platform from being used in the default selection. Even when writing a custom device selector, since the user does not have access to the backend types, they need to make sure they select the CUDA backend and not the CUDA OpenCL platform.

The simplest option for end users is to disable the NVIDIA OpenCL platform and prefer the CUDA backend.

I think a separate discussion is whether we enable the CUDA backend by default or require the SYCL_BE flag to enable it.

@hiaselhans

But with your patch we move one step away from making SYCL_BE optional.
In the sycl branch I can just select the CUDA device and all is fine. I tried with your patch, and without setting the env var the CUDA device is completely hidden from the selector. On another machine with just an Intel i7 and SYCL_BE=PI_CUDA, that OpenCL device is completely hidden too.

The patch should still enable OpenCL devices even with SYCL_BE=PI_CUDA, but I'll run some more tests to make sure that's the case. There may be logic elsewhere in the SYCL RT that uses SYCL_BE that I have missed.

One of the reasons I lean towards having people "activate" the CUDA backend via an env variable is to ensure no accidental usage occurs. We have various reports of people using default selection who get "Invalid binary" or other errors because the CUDA backend is selected above other devices in the system. Unless we always build for all binary targets, or there is a clear SYCL API to select backends, users will have to modify their SYCL 1.2.1 codes with custom selectors to avoid choosing the CUDA backend by accident.

An example of this problem is the MultiDevice lit test (https://github.com/intel/llvm/blob/sycl/sycl/test/scheduler/MultipleDevices.cpp).
The test itself is well-written SYCL 1.2.1 code. A function takes two queues, submits work to both of them and does some checks. It even prints "Good computation" at the end for better user experience.
Now, the test will always fail if you have more than one active backend.
The test is built for one target (SPIR or PTX), so only one of the binaries is available, but it requires two devices to run, so once there are two backends, only one will have the right binary.

Furthermore, the test uses standard SYCL selectors:

host_selector hostSelector;
cpu_selector CPUSelector;
gpu_selector GPUSelector;

Those selectors are perfectly valid selectors, but the GPU selector will select the CUDA backend by default.

So that is perfectly valid SYCL code which, once the CUDA backend is enabled by default, fails with an "invalid device type".
If we only enable the CUDA backend when SYCL_BE is there, then that perfectly valid SYCL 1.2.1 code still works.
Obviously, this is a test and we could just modify it to handle the CUDA backend somehow. Then comes the second problem: there is no SYCL API to select a backend. You need to query the device and platform information to ensure you don't pick the CUDA backend.
If we disable the NVIDIA OpenCL CUDA platform then, well, the user can reject any device that has the CUDA BACKEND string in the platform name, as in the sketch below. But then, a valid SYCL code needs to be modified to cater to implementation-specific behavior.
This will get more complicated if other backends are added, because then users must write device selectors that exclude any other incoming backend.
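For illustration, such a selector could look roughly like this (a sketch against the SYCL 1.2.1 selector API; the exact string matched is whatever the CUDA plugin reports as its platform name):

#include <CL/sycl.hpp>
#include <string>

// Sketch of the implementation-specific selector users would be forced to
// write: prefer GPUs, but reject any device whose platform name marks it as
// the CUDA backend.
class non_cuda_gpu_selector : public cl::sycl::device_selector {
public:
  int operator()(const cl::sycl::device &Dev) const override {
    const std::string Platform =
        Dev.get_platform().get_info<cl::sycl::info::platform::name>();
    if (Platform.find("CUDA BACKEND") != std::string::npos)
      return -1; // never select the CUDA backend device
    return Dev.is_gpu() ? 100 : 1;
  }
};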

I see a couple of options moving forward (all of them assuming we ban the NVIDIA OpenCL platform):

  1. We disable the CUDA backend by default and require passing SYCL_BE to enable it (this should not prevent using CUDA and OpenCL devices together; I'll take a look at my patch again). Once SYCL 2020 is there, its mechanism for device selection with backends can be used to deal with this problem properly. On SYCL 1.2.1, the default is OpenCL, and CUDA is an opt-in experience.
  2. DPC++ builds for both SPIR-V and PTX by default, so all binaries are always available. Then, SYCL_BE is only used to prevent using the CUDA backend in some cases. This has effects on binary sizes and compilation times, and probably building and deployment implications (e.g., CUDA dependencies are always needed when building DPC++, as BUILD_PI_CUDA would have to be enabled). There is work to get to this point, since currently the clang driver has several issues with multiple backends. This makes CUDA an opt-out experience. It is unclear how to scale this to other backends that may have their own binary formats.
  3. We enable both CUDA and OpenCL by default, as it is now, and let users deal with the consequences. When a system has a CUDA platform and a default selector, some SYCL applications will fail if they haven't been built with PTX and SPIR-V support. Note then that an application built and tested on a system with SPIR-V devices will suddenly fail if run on a machine that exposes an NVIDIA platform. CUDA is a silently-always-on option.

I see good and bad points on all of them.
To me (3) is a user-experience problem that would lead to more bug reports, but facilitates the usage of the CUDA backend and the experience of advanced users.
(1) is simpler to implement at the moment, but CUDA users may find it difficult to use the CUDA backend.
(2) is the simplest default user experience, but building times can easily double. This has implications for testing times, among others.

I think we need opinions of a few people (@bader , @keryell , @bjoernknafla , @romanovvlad , @pvchupin to list a few) before we make a decision on what the default experience for DPC++ (or clang sycl) should be.

@bader bader requested a review from romanovvlad May 7, 2020 10:19
@bader
Contributor

bader commented May 7, 2020

I think we need opinions of a few people (@bader , @keryell , @bjoernknafla , @romanovvlad , @pvchupin to list a few) before we make a decision on what the default experience for DPC++ (or clang sycl) should be.

Agree. We need a separate discussion on what the default logic for device selection should be and how it can be configured by the user (e.g., define the SYCL_DEVICE_TYPE and SYCL_BE impact).

@hiaselhans, @smaslov-intel, are you okay to commit this PR and address your questions separately, if they are not addressed already?

@Ruyk, it's better to send multiple fixes in separate PRs. A long discussion on a single item blocks merging of the whole PR, i.e. the other N - 1 of N fixes.

@hiaselhans
Contributor

I think I am the last one who should actually decide on such a matter. I just wanted to point out that this PR actually DOES change the behaviour by introducing that isDeviceInvalidForBe function.

In the upstream/sycl branch, setting SYCL_BE already hides all devices other than the selected backend. I guess it's all we ever need for those lit tests mentioned.

With an Intel OpenCL device on upstream/sycl:

> SYCL_BE=PI_CUDA ./sycl
SYCL host device (/1.2): -1
terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  No device of requested type available. -1 (CL_DEVICE_NOT_FOUND)
sycl.sh: line 21: 361549 Aborted             (core dumped) SYCL_BE=PI_CUDA $BUILD_PATH/$FILE
> ./sycl
Intel(R) Gen9 HD Graphics NEO (Intel(R) Corporation/OpenCL 2.1 NEO ): 10
gfx804 (Advanced Micro Devices, Inc./OpenCL 1.2 AMD-APP (2906.7)): -1
SYCL host device (/1.2): -1
The results are correct!

Before this PR:

  • SYCL_BE empty -> cuda and opencl backend (and devices) are active
  • SYCL_BE=PI_CUDA -> only cuda backend
  • SYCL_BE=PI_OPENCL -> only opencl backend

After this PR:

  • SYCL_BE empty -> only opencl backend
  • SYCL_BE=PI_CUDA -> only cuda backend
  • SYCL_BE=PI_OPENCL -> only opencl backend

So we remove the possibility of having more than one backend active, by requiring the CUDA plugin to be actively enabled (which deactivates OpenCL).
From those 30 lines of isDeviceInvalidForBe I think that:

  • we already have 80% of the logic in place (activating only one backend with SYCL_BE),
  • 10% of it is requiring to manually activate the CUDA backend at runtime (which I don't like), and another
  • 10% is to disable OpenCL devices with 'OpenCL' and 'CUDA' in their caption (which I also don't like).
    In a way there's no difference between that 'gfx804' device that I have and the "opencl cuda" device. They are both unsupported. One shows up, one doesn't. It's a bit inconsistent and incoherent in my eyes.

const bool HasOpenCL = (platformVersion.find("OpenCL") != std::string::npos);
const bool HasCUDA = (platformVersion.find("CUDA") != std::string::npos);

backend *PrefBackend = detail::SYCLConfig<detail::SYCL_BE>::get();
Contributor

Suggested change
backend *PrefBackend = detail::SYCLConfig<detail::SYCL_BE>::get();

Let's move this line after rejecting the NVIDIA OpenCL implementation.

// Reject the NVIDIA OpenCL implementation
if (DeviceBackend == backend::opencl && HasCUDA && HasOpenCL)
return true;

Contributor

@bader bader May 7, 2020

Suggested change
backend *PrefBackend = detail::SYCLConfig<detail::SYCL_BE>::get();

Comment on lines +220 to +231
  // If no preference, assume OpenCL and reject CUDA
  if (DeviceBackend == backend::cuda && !PrefBackend) {
    return true;
  } else if (!PrefBackend)
    return false;

  // If using PI_OPENCL, reject the CUDA backend
  if (DeviceBackend == backend::cuda && *PrefBackend == backend::opencl)
    return true;

  return false;
}
Contributor

Suggested change
-  // If no preference, assume OpenCL and reject CUDA
-  if (DeviceBackend == backend::cuda && !PrefBackend) {
-    return true;
-  } else if (!PrefBackend)
-    return false;
-
-  // If using PI_OPENCL, reject the CUDA backend
-  if (DeviceBackend == backend::cuda && *PrefBackend == backend::opencl)
-    return true;
-
-  return false;
-}
+  // If no preference, assume OpenCL and reject CUDA
+  if (DeviceBackend == backend::cuda &&
+      (!PrefBackend || *PrefBackend == backend::opencl))
+    return true;
+  return false;
+}

Contributor

Minor fixes to the CUDA backend to get BabelStream (https://github.com/UoB-HPC/BabelStream) running, each one in an individual commit:

  • Fixed: default selection for CUDA devices in presence of the NVIDIA OpenCL platform
  • Fixed: Missing return event on Unmap
  • Throw error in queue constructor when backend is not correct (currently segfaults)

From the PR description, this piece of code seems to be "not required to run BabelStream benchmarks on CUDA", and removing the NVIDIA OpenCL implementation is enough. Right?

If so, @Ruyk, can we move this part to a separate PR to make @hiaselhans happy?

Contributor Author

Sure, I'll split out the changes.

Contributor Author

"not required to run BabelStream benchmarks on CUDA" and removing the NVIDIA OpenCL implementation is enough. Right?

Actually, BabelStream doesn't use a device selector but a device list, so the NVIDIA OpenCL platform still appears. There are other workarounds I can do, so not a problem.

Contributor

If so, @Ruyk, can we move this part to a separate PR to make @hiaselhans happy?

thanks @bader 😄
well, maybe it's me being a dickhead with this.

// If SYCL_BE is set then skip platforms which doesn't have specified

just saying we're re-implementing things already implemented somewhere else...

@Ruyk
Contributor Author

Ruyk commented May 8, 2020

I am closing this one and submitting separate ones shortly.

@Ruyk Ruyk closed this May 8, 2020
bb-sycl pushed a commit that referenced this pull request Apr 18, 2023
Move the LLVM components to LINK_COMPONENTS because the DEPENDS list has the same semantics as add_dependencies(). In this
case it doesn't include the LLVM components when calling the linker.

It's an almost complete revert of #1543

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@cf5a5a4
Labels
cuda CUDA back-end
Projects
None yet
Development


6 participants