[SYCL][CUDA] Initial CUDA backend support #1091
clang/lib/Driver/ToolChains/Cuda.cpp (outdated)
@@ -61,7 +61,7 @@ static CudaVersion ParseCudaVersionFile(llvm::StringRef V) {
     return CudaVersion::CUDA_92;
   if (Major == 10 && Minor == 0)
     return CudaVersion::CUDA_100;
-  if (Major == 10 && Minor == 1)
+  if (Major == 10 && (Minor == 1 || Minor == 2))
Could you please put some comment with explanations why it is okay to map 10.2
to CUDA_101
?
ToT Clang lags behind on released CUDA versions.
We are hoping upstream clang catches up with 10.2 support; this change allows using it in the meantime. Their release notes and our testing show 10.2 can work as if it were 10.1, with no regressions. Otherwise, the default installation of many people simply fails to compile any SYCL for CUDA application.
Suggested change:
-if (Major == 10 && (Minor == 1 || Minor == 2))
+// Enable CUDA 10.2 toolkit acting as 10.1 until tip clang catches up
+if (Major == 10 && (Minor == 1 || Minor == 2))
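The mapping under discussion can be sketched as a standalone function (the enum and names here are illustrative stand-ins, not the actual clang code):

```cpp
#include <cassert>

// Illustrative sketch only: map a parsed CUDA version to an enum,
// deliberately folding 10.2 into the 10.1 enumerator until upstream
// clang grows a CUDA_102 value. Names mirror, but are not, clang's.
enum class CudaVersion { UNKNOWN, CUDA_100, CUDA_101 };

CudaVersion mapCudaVersion(int Major, int Minor) {
  if (Major == 10 && Minor == 0)
    return CudaVersion::CUDA_100;
  if (Major == 10 && (Minor == 1 || Minor == 2))
    return CudaVersion::CUDA_101; // 10.2 treated as 10.1 for now
  return CudaVersion::UNKNOWN;
}
```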
Actually, the upstream llvm patch llvm/llvm-project@12fefee (https://reviews.llvm.org/D73231) is a better fix, so we won't need this change once it lands in the intel/llvm fork.
if (!getenv("DISABLE_INFER_AS"))
  Builder.defineMacro("__SYCL_ENABLE_INFER_AS__", "1");
In presence of #893, this change is not needed anymore
llvm/CMakeLists.txt (outdated)
@@ -462,6 +462,10 @@ if( LLVM_USE_PERF )
   endif( NOT CMAKE_SYSTEM_NAME MATCHES "Linux" )
 endif( LLVM_USE_PERF )

+option(SYCL_BUILD_PI_CUDA
I believe this should be in llvm/sycl/CMakeLists.txt. Is there any reason it cannot be put there?
We ifdef SYCL NVPTX support in and out of the clang driver based on this variable, which is why we put it in llvm/CMakeLists.txt rather than in llvm/sycl/CMakeLists.txt.
The name of the option is confusing in this case. PI has nothing to do with the driver.
Why do we need to ifdef out SYCL NVPTX support in the clang driver?
We don't need to, but it helps prevent users accidentally compiling kernels to the wrong target for the plugins they have available.
Are you able to compile kernels for PTX w/o CUDA toolchain?
If so, we could use this for testing the SYCL for CUDA compiler using cross-compilation on Intel HW.
I believe clang will fail to produce PTX kernels without a CUDA toolchain installed.
We could enable support for the SYCL CUDA toolchain when the NVPTX target is added, so it has the same requirements as the CUDA support in llvm. I suggest we do that in a separate patch.
Note, the code added under ifdef is not covered by CI system, so it might be broken.
It seems to me that in order to enable compilation for NVPTX target we need just a few changes. Am I wrong?
sycl/source/CMakeLists.txt (outdated)
@@ -4,6 +4,7 @@
 #cmake_policy(SET CMP0057 NEW)
 #include(AddLLVM)
sycl/source/cg.cpp (outdated)
}

} // sycl
Suggested change:
-} // sycl
+} // cl
//
//===----------------------------------------------------------------------===//

namespace cl {
Suggested change:
-namespace cl {
+#include <CL/sycl/detail/defines.hpp>
+__SYCL_INLINE namespace cl {
should fix the build error
[ 2%] Building CXX object tools/sycl/plugins/cuda/CMakeFiles/pi_cuda.dir/pi_cuda.cpp.o
In file included from /data/user/fwyzard/sycl/llvm/sycl/include/CL/sycl/detail/pi.hpp:13,
from /data/user/fwyzard/sycl/llvm/sycl/plugins/cuda/pi_cuda.cpp:10:
/data/user/fwyzard/sycl/llvm/sycl/include/CL/sycl/detail/common.hpp:25:25: error: inline namespace must be specified at initial definition
__SYCL_INLINE namespace cl {
^~
In file included from /data/user/fwyzard/sycl/llvm/sycl/plugins/cuda/pi_cuda.cpp:9:
/data/user/fwyzard/sycl/llvm/sycl/include/CL/sycl/backend/cuda.hpp:9:11: note: 'cl' defined here
namespace cl {
^~
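The diagnostic comes from a core C++ rule: a namespace must be marked inline at its very first definition; a later re-opening cannot add inline after the fact. A minimal standalone illustration (names are mine, not the SYCL headers'):

```cpp
// A namespace must be `inline` at its first definition. The SYCL build
// error above happens because one header defines plain `namespace cl`
// before another tries `__SYCL_INLINE namespace cl` (inline) later.
inline namespace v1 { // first definition: marked inline -- OK
int answer() { return 42; }
}

inline namespace v1 { // re-opening with `inline` repeated is fine
int twice() { return answer() * 2; }
}
// Swapping the order (plain definition first, inline second) would
// produce "inline namespace must be specified at initial definition".
```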
0c22091
to
31fd1bb
Compare
@@ -0,0 +1,479 @@
+//===-- pi_cuda.hpp - CUDA Plugin -----------------------------------------===//
Can you move this file to plugins/cuda/? It is a better place for it.
sycl/source/context.cpp (outdated)
@@ -23,25 +23,31 @@

 __SYCL_INLINE namespace cl {
 namespace sycl {
 context::context(const async_handler &AsyncHandler)
     : context(default_selector().select_device(), AsyncHandler) {}
+context::context(const async_handler &AsyncHandler, bool usePrimaryContext)
Nit:
-context::context(const async_handler &AsyncHandler, bool usePrimaryContext)
+context::context(const async_handler &AsyncHandler, bool UsePrimaryContext)
Same applies to the rest of the file.
@@ -190,7 +193,6 @@ add_subdirectory( source )
 # SYCL toolchain builds all components: compiler, libraries, headers, etc.
 add_custom_target( sycl-toolchain
   DEPENDS ${SYCL_RT_LIBS}
-  pi_opencl
Why remove OpenCL support?
The pi_opencl dependency is added to sycl-toolchain in the OpenCL plugin cmake file sycl/plugins/opencl/CMakeLists.txt in 884459b#diff-edef34e019c51a57812a69f7a111afdcR22.
sycl/include/CL/sycl/detail/cg.hpp (outdated)
std::function<void(cl::sycl::interop_handler)> MFunc;

public:
  InteropTask(std::function<void(cl::sycl::interop_handler)> Func)
Suggested change:
-InteropTask(std::function<void(cl::sycl::interop_handler)> Func)
+InteropTask(function_class<void(cl::sycl::interop_handler)> Func)
@@ -48,7 +48,7 @@ class queue_impl {
             QueueOrder Order, const property_list &PropList)
     : queue_impl(Device,
                  detail::getSyclObjImpl(
-                     context(createSyclObjFromImpl<device>(Device))),
+                     context(createSyclObjFromImpl<device>(Device), {}, true)),
Nit: {} and true look like "magic" values here. Can you please add comments?
@@ -77,7 +77,7 @@ class Scheduler {
   // releaseHostAccessor is called.
   // Returns an event which indicates when these nodes are completed and host
   // accessor is ready for using.
-  EventImplPtr addHostAccessor(Requirement *Req);
+  EventImplPtr addHostAccessor(Requirement *Req, const bool destructor = false);
Nit:
-EventImplPtr addHostAccessor(Requirement *Req, const bool destructor = false);
+EventImplPtr addHostAccessor(Requirement *Req, const bool Destructor = false);
sycl/include/CL/sycl/handler.hpp (outdated)
@@ -771,6 +771,15 @@ class handler {
 #endif
   }

+  // Similar to single_task, but passed lambda will be executed on host
Nit: Can you please make a doxygen comment out of this?
sycl/source/CMakeLists.txt (outdated)
@@ -80,6 +81,7 @@ set(SYCL_SOURCES
   "detail/usm/usm_dispatch.cpp"
   "detail/usm/usm_impl.cpp"
   "detail/util.cpp"
+  "cg.cpp"
The corresponding header file resides in CL/sycl/detail/. Have you considered putting cg.cpp in source/detail?
sycl/source/detail/accessor_impl.cpp (outdated)
@@ -14,8 +14,9 @@ namespace sycl {
 namespace detail {

 AccessorImplHost::~AccessorImplHost() {
-  if (MBlockedCmd)
+  if (MBlockedCmd) {
AFAIK, it's common practice in LLVM not to wrap a single-line statement in curly braces.
sycl/include/CL/sycl/context.hpp (outdated)
/// @param useCUDAPrimaryContext is a bool determining whether to use the
/// primary context in the CUDA backend.
explicit context(const async_handler &AsyncHandler = {},
                 bool useCUDAPrimaryContext = false);
Nit:
-                 bool useCUDAPrimaryContext = false);
+                 bool UseCUDAPrimaryContext = false);
Could you please also split this PR into several commits? Even if you created just two commits (the libclc changes and the rest), it would be easier to review.
Thanks for the feedback. We are working on fixing some lit-testing regressions and rebasing on top of the latest changes.
@@ -80,6 +80,9 @@
 #cmakedefine01 CLANG_ENABLE_OBJC_REWRITER
 #cmakedefine01 CLANG_ENABLE_STATIC_ANALYZER

+/* Define if we have SYCL PI CUDA support */
+#cmakedefine SYCL_HAVE_PI_CUDA ${SYCL_HAVE_PI_CUDA}
Suggested change:
-#cmakedefine SYCL_HAVE_PI_CUDA ${SYCL_HAVE_PI_CUDA}
+#cmakedefine01 SYCL_HAVE_PI_CUDA
According to the docs it should do the same
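For context (my illustration, not part of the PR): #cmakedefine emits either a bare #define or a /* #undef */ comment, while #cmakedefine01 always emits a #define with value 0 or 1, so the macro can be tested with #if as well as #ifdef. A compilable sketch of the consumer side:

```cpp
// Simulate the header configure_file() would generate from
// `#cmakedefine01 SYCL_HAVE_PI_CUDA` when the CMake option is ON:
#define SYCL_HAVE_PI_CUDA 1

// With the 01 form the macro is always defined (0 or 1), so plain
// `#if` works and a misspelled name can be caught with -Wundef,
// instead of silently evaluating to 0.
#if SYCL_HAVE_PI_CUDA
bool cudaSupportCompiledIn() { return true; }
#else
bool cudaSupportCompiledIn() { return false; }
#endif
```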
Do we really need this define? Can we have "SYCL PI CUDA support" unconditionally?
If we were to have PI CUDA support unconditionally, the CUDA toolchain would always be required for compilation. We decided to make it optional to allow people who only use the OpenCL plugin to build the project without a CUDA toolchain on their system.
According to my understanding, we need the CUDA toolchain only to build the CUDA plugin.
Could you clarify why we should require the CUDA toolchain to build the driver?
https://llvm.org/docs/CompileCudaWithLLVM.html doesn't seem to require a custom driver.
You are correct, we only need the CUDA toolchain for building the plugin, but we limit the valid SYCL triples in the clang driver based on whether PI CUDA support is available:
https://github.com/intel/llvm/pull/1091/files#diff-beaf25b0cdf8830dd4ea165404b00671R618
static bool isValidSYCLTriple(llvm::Triple T) {
#ifdef SYCL_HAVE_PI_CUDA
// NVPTX is valid for SYCL.
if (T.isNVPTX())
return true;
#endif
// Check for invalid SYCL device triple values.
// Non-SPIR arch.
if (!T.isSPIR())
return false;
// SPIR arch, but has invalid SubArch for AOT.
StringRef A(T.getArchName());
if (T.getSubArch() == llvm::Triple::NoSubArch &&
((T.getArch() == llvm::Triple::spir && !A.equals("spir")) ||
(T.getArch() == llvm::Triple::spir64 && !A.equals("spir64"))))
return false;
return true;
}
We can remove this limitation though, and always allow nvptx triples for compilation, regardless of whether the CUDA plugin is available.
+1 for removing.
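With the #ifdef dropped as proposed, the check above can be sketched over plain arch-name strings (a hypothetical simplification, not clang's llvm::Triple API):

```cpp
#include <string>

// Hypothetical simplification of isValidSYCLTriple with the
// SYCL_HAVE_PI_CUDA guard removed: NVPTX targets are always accepted,
// otherwise only spir / spir64 are valid SYCL device architectures.
bool isValidSYCLArch(const std::string &Arch) {
  if (Arch.rfind("nvptx", 0) == 0) // matches "nvptx" and "nvptx64"
    return true;
  return Arch == "spir" || Arch == "spir64";
}
```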
// The translator tries to convert these functions to the same name as the
// SPIR-V built-in for another function but with a different signature,
// which results in an attempt to re-define a function.
"read_imagei",  // conflicts with `read_imagef`
"read_imageui", // conflicts with `read_imagef`
"read_imageh",  // conflicts with `read_imagef`
BTW, this should be fixed by KhronosGroup/SPIRV-LLVM-Translator#408
@@ -288,11 +288,31 @@ class OCL20ToSPIRV : public ModulePass, public InstVisitor<OCL20ToSPIRV> {
   Module *M;
   LLVMContext *Ctx;
   unsigned CLVer;  /// OpenCL version as major*10+minor
+  unsigned CLLang; /// OpenCL language, see `spv::SourceLanguage`.
Please note that this repo contains a copy of KhronosGroup/SPIRV-LLVM-Translator, which is updated from time to time, usually by simply copy-pasting the whole directory. So there is a high chance that these changes will be accidentally overwritten.
Please at least extract these changes to a separate commit, so they can be re-applied at each sync, or (much better) commit them directly to KhronosGroup/SPIRV-LLVM-Translator.
We will try to commit these directly to the Khronos repository, but will need the changes here until it is merged on the Khronos repository and pulled back down for the CUDA backend to work.
For now the changes to llvm-spirv are separated into 09b3b2e
(force-pushed from ee44fa9 to 5f5fa8c)
The codeplaysoftware:cuda-dev branch fails to build if configured with
The intel:sycl branch does build fine.
buildbot/configure.py (outdated)
"-DCMAKE_BUILD_TYPE={}".format(args.build_type),
"-DLLVM_ENABLE_ASSERTIONS={}".format(llvm_enable_assertions),
"-DLLVM_TARGETS_TO_BUILD={}".format(llvm_targets_to_build),
"-DLLVM_EXTERNAL_PROJECTS=sycl;llvm-spirv",
The opencl-aot project is missing.
Is this needed for the overall project as well, or just for testing? It's not mentioned in the Getting Started Guide.
@Ruyk opencl-aot is an optional tool to enable AOT compilation for SYCL. It is not a requirement for end users, but it is being tested in CI.
buildbot/configure.py (outdated)
icd_loader_lib = ''
icd_loader_lib = os.path.join(args.obj_dir, "OpenCL-ICD-Loader", "build")
llvm_targets_to_build = 'X86'
llvm_enable_projects = 'clang;llvm-spirv;sycl'
The opencl-aot project is missing.
//
//===----------------------------------------------------------------------===//

#define __SPIRV_FUNCTION __spirv_ocl_fmod
I don't see a definition of the __spirv_ocl_fmod function. Is it a bug?
There may not be a definition for __spirv_ocl_fmod yet. We haven't finished implementing all of the builtins, but have some internal patches with additional builtins in progress to raise later.
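The #define __SPIRV_FUNCTION pattern suggests a shared template file that stamps out one wrapper per builtin. A rough standalone sketch of that macro technique (the forwarding target and macro names are invented for illustration):

```cpp
#include <cmath>

// Invented sketch of the libclc-style pattern: a small driver file
// defines __SPIRV_FUNCTION (and, here, a native implementation), then
// a generic body stamps out the wrapper under that name.
#define __SPIRV_FUNCTION __spirv_ocl_fmod
#define NATIVE_IMPL(x, y) std::fmod((x), (y))

double __SPIRV_FUNCTION(double x, double y) { return NATIVE_IMPL(x, y); }
```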
(force-pushed from 5f5fa8c to dedaaec)
We are currently working on fixing the default device selection. The current implementation does not take into account the multiple backends, and causes several lit tests to fail. We should have a fix soon.
(force-pushed from ea6e850 to 683faf0)
We've pushed some more fixes for the CUDA backend but are still finding issues relating to the new plugin interface introduced in #1030. We've found that the new plugin interface is unstable when multiple plugins are available. Though we've fixed some issues related to mismatching devices/platforms/plugins when doing info queries and device selection, we are still finding device selection and some of the plugin internals unstable. As a result we are still seeing some test failures. The most notable is basic_tests/queue.cpp, which visibly fails for us when trying to use the OpenCL or CUDA plugins. We also found a number of tests failing silently by catching exceptions from the SYCL runtime, but not erroring after they occur.

We've added a new target, check-sycl-cuda, for specifically attempting to test with CUDA devices by passing SYCL_BE=PI_CUDA. We've also added a matching flag to check-sycl to ensure a path to test OpenCL devices. Without SYCL_BE it is nondeterministic which device is chosen by the default_selector used by many tests when multiple plugins are available. We have also been testing without SYCL_BUILD_PI_CUDA enabled or CUDA available on the system, with success.

For now we can either
To enable building/testing of the CUDA backend you will need to add SYCL_BUILD_PI_CUDA=ON to the buildbot CMake invocation and check-sycl-cuda as a test target to the buildbots.
That sounds weird. Isn't the
I'd prefer it this way, and work further to stabilize the plugins mechanism (tagging @garimagu).
(force-pushed from 44af06f to 3cf8625)
Let's test for regressions.
Remove approve to avoid unintentional merge.
Signed-off-by: Alexander Johnston <[email protected]>
Synchronise the CUDA backend with the general SYCL changes from #1121. Signed-off-by: Andrea Bocci <[email protected]>
Signed-off-by: Alexander Johnston <[email protected]>
To ensure that the check-sycl targets test OpenCL devices, pass SYCL_BE=PI_OPENCL. This mirrors the check-sycl-cuda target which passes SYCL_BE=PI_CUDA. Without this it is nondeterministic which device is tested by check-sycl. Signed-off-by: Alexander Johnston <[email protected]>
Removes PI_CUDA specific code paths and tests from clang, opting to always enable them. Signed-off-by: Alexander Johnston <[email protected]>
Signed-off-by: Alexander Johnston <[email protected]>
Fix platform string comparison for CUDA platform detection. Fix device info platform query so that it uses the device's plugin, rather than the GlobalPlugin. Signed-off-by: Alexander Johnston <[email protected]>
Signed-off-by: Alexander Johnston <[email protected]>
(force-pushed from b7ed055 to cdab838)
Unexpected Passing Tests (3):
  SYCL :: basic_tests/image.cpp
  SYCL :: basic_tests/stream/stream.cpp
  SYCL :: hier_par/hier_par_basic.cpp
Let's remove XFAIL from these tests.
Are these failing on NV HW only?
Fix minor test and build configuration issues introduced in the development of the CUDA backend. Signed-off-by: Alexander Johnston <[email protected]>
Start testing.
@Alexander-Johnston, I'm going to do a more thorough code review post-commit. This PR is too big to review in one shot.
…ages_docs * origin/sycl: (1092 commits)
  [CI] Add clang-format checker to pre-commit checks (intel#1163)
  [SYCL][CUDA] Initial CUDA backend support (intel#1091)
  [USM] Align OpenCL USM extension header with the specification (intel#1162)
  [SYCL][NFC] Fix unreferenced variable warning (intel#1158)
  [SYCL] Fix __spirv_GroupBroadcast overloads (intel#1152)
  [SYCL] Add llvm/Demangle link dependency for llvm-no-spir-kernel (intel#1156)
  [SYCL] LowerWGScope pass should not be skipped when -O0 is used
  [SYCL][Doc][USM] Add refactored pointer and device queries to USM spec (intel#1118)
  [SYCL] Update the kernel parameter rule to is-trivially-copy-construc… (intel#1144)
  [SYCL] Move internal headers to source dir (intel#1136)
  [SYCL] Forbid declaration of non-const static variables inside kernels (intel#1141)
  [SYCL][NFC] Remove idle space (intel#1148)
  [SYCL] Improve the error mechanism of llvm-no-spir-kernel (intel#1068)
  [SYCL] Added CTS test config (intel#1063)
  [SYCL] Implement check-sycl-deploy target (intel#1142)
  [SYCL] Preserve original message and code of kernel/program build result (intel#1108)
  [SYCL] Fix LIT after LLVM change in community
  Translate LLVM's cmpxchg instruction to SPIR-V
  Add volatile qualifier for atom_ builtins
  Fix -Wunused-variable warnings
  ...
cl_mem MemArg = (cl_mem)AllocaCmd->getMemAllocation();
Plugin.call<PiApiKind::piKernelSetArg>(Kernel, Arg.MIndex,
                                       sizeof(cl_mem), &MemArg);
Plugin.call<PiApiKind::piKernelSetArg>(Kernel, Arg.MIndex,
@bader @Alexander-Johnston Was this extra API added by mistake?
Is there a reason for adding it twice?
FYI @rbegam .
  MKernelProgramCache.setContextPtr(this);
}

context_impl::context_impl(const vector_class<cl::sycl::device> Devices,
-                           async_handler AsyncHandler)
+                           async_handler AsyncHandler, bool UseCUDAPrimaryContext)
What is the purpose of this variable? It is unused.
The flag -nostdlib added in "cf4d4e366a2 libclc: Compile with -nostdlib" was moved to the file AddLibclc.cmake, which was added in: "7a9a4251f57 [SYCL][CUDA] Initial CUDA backend support (intel#1091)" CONFLICT (content): Merge conflict in libclc/CMakeLists.txt
PR #1091 has refactored the logic of prepare_builtins into utils/CMakeLists.txt. We don't need to duplicate the logic in CMakeLists.txt now. Before we upstream utils/CMakeLists.txt, we should just remove the duplicate code.
Initial support for the CUDA backend, bringing roughly 40% of SYCL 1.2.1 CTS conformance on devices that support CUDA 10.2.