Merge branch 'makefile_gpu' into 'develop'

GPU makefile

See merge request smilei/smilei!136
SmileiPIC · Feb 29, 2024 · 692fedd
2 parents 2da02f0 + 2a8c0fb
commit 692fedd

Showing 14 changed files with 308 additions and 387 deletions.
diff --git a/doc/Sphinx/Overview/releases.rst b/doc/Sphinx/Overview/releases.rst
@@ -23,10 +23,14 @@ You can find older, `unsupported versions here <https://github.com/SmileiPIC/Smi
 Changes made in the repository (not released)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
+* GPU:
+
+  * Compilation simplified and better documented.
+
 * Happi:
 
   * In ``Scalar``, it is now possible to make an operation on scalars such as ``"Uelm+Ukin"``.
-    The list of available scalars can be obtained from ``getScalars()``.
+  * The list of available scalars can be obtained from ``getScalars()``.
   * New arguments ``xoffset`` and ``yoffset`` to shift plot coordinates.
   * New argument ``timestep_indices`` as an alternative to ``timesteps``.
   * Changed coordinate reference for 2D probe in 3D or AM geometry

diff --git a/doc/Sphinx/Use/installation.rst b/doc/Sphinx/Use/installation.rst
@@ -1,39 +1,61 @@
 Install
 -------
 
-Before installing :program:`Smilei`, you need to install a few dependencies:
+Installing Smilei requires several steps:
 
-* A C++11 compiler, optionally implementing openMP version > 4.5
-  (gcc users: v6.0 or newer recommended)
-* an MPI library (by default a version supporting ``MPI_THREAD_MULTIPLE``
-  is required: v4.0 or newer recommended)
-* an HDF5 library compatible with your versions of C++ and MPI
-* Python 2.7 or Python 3+ (with header files)
-
-Optional dependencies are:
-
-* Git
-* Python modules: sphinx, h5py, numpy, matplotlib, pint
-* ffmpeg
-* CUDA for NVIDIA GPUs or HIP-SYCL for AMD GPUs (it is recommended to use the already installed software stack and the support team of a supercomputer you have access to). 
+#. Install compilers and libraries that Smilei needs (*dependencies*)
+#. Download Smilei
+#. Set up your environment (*environment variables*)
+#. Compile
 
 ----
 
 Install the dependencies
 ^^^^^^^^^^^^^^^^^^^^^^^^
 
+The **necessary** dependencies are:
+
+* A C++11 compiler, optionally implementing OpenMP version > 4.5.
+* An MPI library (by default a version supporting ``MPI_THREAD_MULTIPLE``).
+  IntelMPI or OpenMPI are recommended.
+* The **parallel** HDF5 library compiled with your versions of C++ and MPI.
+* Python 3+ with header files.
+
+When compiling on GPU:
+
+* The C++ compiler must be GPU-aware (typically ``nvc++`` for NVIDIA or ``clang`` for AMD)
+* A CUDA or HIP compiler is necessary (typically ``nvcc`` for NVIDIA or ``hipcc`` for AMD)
+
+Optional dependencies are:
+
+* `Git <https://git-scm.com/>`_ for version control
+* Python modules for post-processing: sphinx, h5py, numpy, matplotlib, pint
+* `FFmpeg <https://ffmpeg.org/>`_ for converting animations to videos
+
 There are various ways to install all dependencies, depending on the platform:
 
 * :doc:`On MacOs<install_macos>`
 * :doc:`On Linux<install_linux>`
 * :doc:`On a supercomputer<install_supercomputer>`
 
-The command ``make help`` can give you some information about your environment.
-
 If you have successfully installed these dependencies on other platforms,
 please :doc:`contact us </Overview/partners>` and share!
 
 
+----
+
+Download the Smilei source
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Clone the latest :program:`Smilei` version from GitHub:
+
+.. code-block:: bash
+
+  cd /path/of/your/choice/
+  git clone https://github.com/SmileiPIC/Smilei.git
+
+If you prefer a direct download, see :ref:`here <latestVersion>`.
+
 ----
 
 Setup environment variables for compilation
@@ -43,44 +65,34 @@ Several environment variables may be required, depending on your setup.
 
 * ``SMILEICXX``: the MPI-C++ compiler.
   Defaults to ``mpicxx``.
-* ``HDF5_ROOT_DIR``: the folder for the HDF5 library.
+* ``HDF5_ROOT_DIR``: the folder of the HDF5 library.
   Defaults to ``$HDF5_ROOT``.
 * ``BUILD_DIR``: the folder where the compilation should occur.
   Defaults to ``./build``.
 * ``PYTHONEXE``: the python executable to use in smilei.
   Defaults to ``python``.
+* ``CXXFLAGS``: flags for the C++ compiler.
+* ``LDFLAGS``: flags for the linker.
+* ``GPU_COMPILER``: the compiler for CUDA or HIP (typically ``nvcc`` or ``hipcc``).
+  Defaults to ``$CC``.
+* ``GPU_COMPILER_FLAGS``: flags for ``$GPU_COMPILER``.
 
-The usual ``CXXFLAGS`` and ``LDFLAGS`` can also be used to pass other
-arguments to the compiler and linker.
-
+The command ``make help`` can give you some information about your environment.
 
 ----
 
 .. _compile:
 
-Download and compile
+Compile Smilei
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-#. Clone the latest :program:`Smilei` version from Github:
+In a terminal, go to the folder where you downloaded :program:`Smilei` and use the command:
 
-   .. code-block:: bash
-    
-     cd /path/of/your/choice/
-     git clone https://github.com/SmileiPIC/Smilei.git
-    
-   If you do not have ``git``, you can dowload a tarball :ref:`here <latestVersion>`
-   and extract it in a new folder.
-
-#. In a terminal, go to that location and compile:
-
-   .. code-block:: bash
+.. code-block:: bash
 
-     cd Smilei
-     make
-   
-   If the compilation is successful, you should now have a new ``smilei`` executable.
+  make
 
-#. The next step is to :doc:`write a namelist <namelist>`.
+If the compilation is successful, you should now have a new ``smilei`` executable.
 
 ----
 
@@ -91,31 +103,29 @@ Advanced compilation options
 
 .. code-block:: bash
 
-  make -j 4
+  make -j 4  # Compiles with 4 parallel jobs
 
 .. rubric:: Compilation configuration with keyword "config"
 
 .. code-block:: bash
 
-  make config=debug                        # With debugging output (slow execution)
-  make config=noopenmp                     # Without OpenMP support
-  make config=no_mpi_tm                    # Without a MPI library which supports MPI_THREAD_MULTIPLE
-  make config=scalasca                     # For the Scalasca profiler
-  make config=advisor                      # For Intel Advisor
-  make config=vtune                        # For Intel Vtune
-  make config=inspector                    # For Intel Inspector
-  make config=detailed_timers              # More detailed timers, but somewhat slower execution
-  make config="gpu_nvidia noopenmp"        # For Nvidia GPU acceleration
-  make config="gpu_amd"                    # For AMD GPU acceleration
+  make config=noopenmp        # Without OpenMP support
+  make config=no_mpi_tm       # Without an MPI library that supports MPI_THREAD_MULTIPLE
+  make config=gpu_nvidia      # For Nvidia GPU acceleration
+  make config=gpu_amd         # For AMD GPU acceleration
+  make config=debug           # With debugging output (slow execution)
+  make config=scalasca        # For the Scalasca profiler
+  make config=advisor         # For Intel Advisor
+  make config=vtune           # For Intel Vtune
+  make config=inspector       # For Intel Inspector
+  make config=detailed_timers # More detailed timers, but somewhat slower execution
 
 It is possible to combine arguments above within quotes, for instance:
 
 .. code-block:: bash
 
   make config="debug noopenmp" # With debugging output, without OpenMP
 
-However, some arguments may not be compatible, e.g. ``noopenmp`` and ``omptasks``. 
-
 .. rubric:: Obtain some information about the compilation
 
 .. code-block:: bash
@@ -140,46 +150,45 @@ executed before compilation. If you successfully write such a file for
 a common supercomputer, please share it with developers so that it can
 be included in the next release of :program:`Smilei`.
 
+----
 
-.. rubric:: Compilation for GPU accelerated nodes:
-
-As each supercomputer has a different environnment to compile for GPUs and since the nvhpc + CUDA/ cray + HIP modules evolve quickly, a machine file is required for the compilation.
-Several machine files are already available as an example in smilei/scripts/compile_tools/machine/ ; such as: jean_zay_gpu_V100, jean_zay_gpu_A100, adastra, ruche_gpu2.
-
-Typically we need it to specify ACCELERATOR_GPU_FLAGS += -ta=tesla:cc80 for nvhpc <23.4 and ACCELERATOR_GPU_FLAGS += -gpu=cc80 -acc for the more recent versions of nvhpc.
-
-.. code-block:: bash
-
-	make -j 12 machine="jean_zay_gpu_A100" config="gpu_nvidia noopenmp verbose" # for Nvidia GPU
-	make -j 12 machine="adastra" config="gpu_amd" 			            # for AMD GPU
-
+Compilation for GPU accelerated nodes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-Furthermore, here are 2 examples of known working ennvironments, first for AMD GPUs, second for Nvidia GPUs:
+On GPU, two compilers are used: a C++ compiler for the main code
+(defined by the variable ``$SMILEICXX``) and a compiler for 
+``.cu`` CUDA files (defined by the variable ``$GPU_COMPILER``).
+For NVIDIA, it is recommended to use the ``nvhpc`` software kit
+which includes the compilers ``nvc++`` and ``nvcc``.
+For AMD, the equivalent ``ROCm`` software kit includes ``clang`` and ``hipcc``.
 
-.. code-block:: bash
+Generally, several flags must be supplied to these compilers in order
+to properly target your system architecture. They must
+be supplied in ``$CXXFLAGS`` and ``$GPU_COMPILER_FLAGS``.
+Please refer to the system administrators to find available compilers
+and the required flags for your machine, as well as the commands
+needed to load the correct environment.
 
-	module purge
-	module load craype-accel-amd-gfx90a craype-x86-trento
-	module load PrgEnv-cray/8.3.3
-	module load cpe/23.02
-	module load cray-mpich/8.1.24 cray-hdf5-parallel/1.12.2.1 cray-python/3.9.13.1
-	module load amd-mixed/5.2.3
+The compilation of Smilei must include a special ``config`` keyword equal to either
+``gpu_nvidia`` or ``gpu_amd``.
+Two examples are provided as guidance:
 
 .. code-block:: bash
 
-	module purge
-	module load anaconda-py3/2020.11  # python is fine as well if you can pip install the required modules
-	module load nvidia-compilers/23.1
-	module load cuda/11.2
-	module load openmpi/4.1.1-cuda
-	module load hdf5/1.12.0-mpi-cuda
-	# For HDF5, note that module show can give you the right path
-	export HDF5_ROOT_DIR=/DIRECTORY_NAME/hdf5/1.12.0/pgi-20.4-HASH/
+  make -j 12 machine="jean_zay_gpu_A100" config="gpu_nvidia" # example for Nvidia GPU
+  make -j 12 machine="adastra" config="gpu_amd"              # example for AMD GPU
 
-Note: 
+In these cases, the environment variables were included in *machine files* that
+you can find in ``scripts/compile_tools/machine/``.
+Typically, they contain ``CXXFLAGS += -ta=tesla:cc80`` for ``nvhpc`` <23.4 and
+``CXXFLAGS += -gpu=cc80 -acc`` for more recent versions of ``nvhpc``.
 
-* we are aware of issues with CUDA >12.0, fixes are being tested but are not deployed yet. We recommend CUDA 11.x at the moment.
-* The hdf5 module should be compiled with the nvidia/cray compiler ; openmpi as well, but depending on the nvhpc module it might not be needed as it can be included in the nvhpc module 
+.. warning::
+
+  * We are aware of issues with CUDA >12.0; fixes are being tested but are not deployed yet.
+    We recommend CUDA 11.x at the moment.
+  * The HDF5 module should be compiled with the NVIDIA or Cray compiler.
+    The same applies to OpenMPI, unless the ``nvhpc`` module already bundles it.
 
 ----
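
As a rough illustration of the GPU section in the diff above, the sketch below picks the OpenACC flags quoted in the documentation from an `nvhpc` version and prepares the environment variables before compilation. The helper name `nvhpc_acc_flags`, the version `23.1`, and the use of `mpicxx` as a wrapper around `nvc++` are assumptions made for this example and are not part of Smilei. The real values should come from a machine file or from your system administrators. The script only prints the `make` command so that it can be inspected before being run.

```shell
#!/bin/sh
# Hypothetical helper: print the OpenACC flags for an A100 (cc80) given an
# nvhpc version written as MAJOR.MINOR, following the rule quoted in the docs
# ("-ta=tesla:cc80" before 23.4, "-gpu=cc80 -acc" for more recent versions).
nvhpc_acc_flags() {
    major=${1%%.*}   # text before the first dot
    minor=${1#*.}    # text after the first dot
    if [ "$major" -lt 23 ] || { [ "$major" -eq 23 ] && [ "$minor" -lt 4 ]; }; then
        echo "-ta=tesla:cc80"
    else
        echo "-gpu=cc80 -acc"
    fi
}

# Assumed values for illustration only; adapt them to your machine.
NVHPC_VERSION="23.1"
export SMILEICXX="mpicxx"      # MPI-C++ compiler (assumed to wrap nvc++)
export GPU_COMPILER="nvcc"     # compiler for the .cu files
export CXXFLAGS="$(nvhpc_acc_flags "$NVHPC_VERSION")"

# Print the compilation command instead of running it.
echo "make -j 4 config=gpu_nvidia"
```

For AMD GPUs, the same pattern would apply with `GPU_COMPILER="hipcc"` and `config=gpu_amd`, with flags suited to the target architecture.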