Merge pull request #540 from argonne-lcf/TApplencourt-patch-1
Update sycl-aurora.md
TApplencourt authored Nov 8, 2024
2 parents 0f6f4b6 + 1913088 commit 2993d1c
Showing 1 changed file with 59 additions and 1 deletion.
60 changes: 59 additions & 1 deletion docs/aurora/programming-models/sycl-aurora.md
@@ -1 +1,59 @@
SYCL on Aurora
# SYCL on Aurora

## Overview

SYCL is an open, royalty-free, cross-platform abstraction layer that lets code for heterogeneous and offload processors be written in modern ISO C++. It provides APIs and abstractions to find the devices (CPUs, GPUs, FPGAs, etc.) on which code can be executed and to manage data resources and code execution on those devices.

The specification can be found here: https://registry.khronos.org/SYCL/specs/sycl-2020/

## Setting the environment to use SYCL on Aurora

The Intel oneAPI Programming Environment is the main environment on Aurora, and it includes SYCL support. The oneAPI module is loaded by default in your environment:

```
$ module list
Currently Loaded Modules:
  1) gcc-runtime/12.2.0-267awrk   5) gcc/12.2.0                             9) libfabric/1.20.1
  2) gmp/6.2.1-yctcuid            6) intel_compute_runtime/release/996.26  10) cray-pals/1.4.0
  3) mpfr/4.2.1-fhgnwe7           7) oneapi/eng-compiler/2024.07.30.002    11) cray-libpals/1.4.0
  4) mpc/1.3.1-ygprpb4            8) mpich/icc-all-pmix-gpu/20240717
```

## Building on Aurora

Pass the `-fsycl` flag to the compiler, for example `icpx -fsycl`.
For CMake, use `find_package(IntelSYCL REQUIRED)`; run `cat $CMPLR_ROOT/lib/cmake/IntelSYCL/IntelSYCLConfig.cmake` for more details.

## Example

```
$ cat hello_sycl.cpp
#include <iostream>
#include <vector>
#include <sycl/sycl.hpp>

int main(int argc, char **argv) {
  const size_t global_range = 10;
  // Default queue: the runtime selects the device
  sycl::queue Q;
  // Queue introspection: print the name of the selected device
  std::cout << "Running on " << Q.get_device().get_info<sycl::info::device::name>() << std::endl;
  // Allocate device memory
  int *A = sycl::malloc_device<int>(global_range, Q);
  // Launch a kernel that writes to the device memory, and block until it completes
  Q.parallel_for(global_range, [=](auto id) { A[id] = id; }).wait();
  // Allocate host memory
  std::vector<int> A_host(global_range);
  // Copy the device memory to the host, and block until the copy completes
  Q.copy(A, A_host.data(), global_range).wait();
  // Free device memory
  sycl::free(A, Q);
  // Print the values copied back from the device
  for (size_t i = 0; i < global_range; i++)
    std::cout << "A_host[ " << i << " ] = " << A_host[i] << std::endl;
  return 0;
}
$ icpx -fsycl hello_sycl.cpp
$ ./a.out
```

More examples can be found here: https://github.com/argonne-lcf/sycltrain/tree/master/9_sycl_of_hell
