Merge pull request #183 from sigdml/dev
docs(README): deploy slurm cluster in a namespace aligned with the cluster name
asteny authored Nov 18, 2024
2 parents 4ba09fc + ce5ed98 commit 2a20b12
Showing 2 changed files with 3 additions and 2 deletions.
3 changes: 2 additions & 1 deletion README.md
@@ -156,7 +156,8 @@ In general, you need to follow these steps:
2. Install the [NVIDIA GPU Operator](https://github.com/NVIDIA/gpu-operator).
3. If you use InfiniBand, install the [NVIDIA Network Operator](https://github.com/Mellanox/network-operator).
4. Install Soperator by applying the [soperator](helm/soperator) Helm chart.
-5. Create a Slurm cluster by applying the [slurm-cluster](helm/slurm-cluster) Helm chart.
+5. Create a Slurm cluster in a namespace with the same name as the slurm cluster by
+applying the [slurm-cluster](helm/slurm-cluster) Helm chart.
6. Wait until the `slurm.nebius.ai/SlurmCluster` resource becomes `Available`.

[//]: # (TODO: Refer to Helm OCI images instead of file directories when the repo is open)
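
Step 5 of the updated README can be sketched as a Helm invocation. This is a minimal sketch, not the project's documented command: the cluster name `my-slurm` and the helper `slurm_cluster_install_cmd` are hypothetical, using the cluster name as the Helm release name is an assumption, and the chart path assumes the command runs from the repository root. Only the standard `--namespace` and `--create-namespace` flags of `helm install` are used. The helper prints the command rather than running it, so it can be reviewed first.

```shell
#!/bin/bash
# Hypothetical helper: print a helm command that installs the slurm-cluster
# chart into a namespace named after the cluster.
slurm_cluster_install_cmd() {
  local cluster_name="$1"
  printf 'helm install %s helm/slurm-cluster --namespace %s --create-namespace\n' \
    "${cluster_name}" "${cluster_name}"
}

# Hypothetical cluster name, used here only for illustration.
slurm_cluster_install_cmd "my-slurm"
```

Whether the chart also needs the cluster name set in its values is not stated in this commit, so check the chart's values before applying.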
2 changes: 1 addition & 1 deletion images/common/scripts/complement_jail.sh
@@ -2,7 +2,7 @@

# Complement jaildir by bind-mounting virtual filesystems, users, and NVIDIA binaries from the host filesystem

-set -x # Print actual command when before
+set -x # Print actual command before executing it
set -e # Exit immediately if any command returns a non-zero error code

usage() { echo "usage: ${0} -j <path_to_jail_dir> -u <path_to_upper_jail_dir> [-w] [-h]" >&2; exit 1; }
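
The effect of the two `set` options in this hunk can be seen in a small standalone sketch. It is not part of `complement_jail.sh`; the variable names `demo_output` and `demo_status` are hypothetical. The demo runs in a child `bash` so that its early exit does not terminate the calling shell.

```shell
#!/bin/bash
# Run a tiny script in a child shell and capture both stdout and stderr.
demo_output="$(bash -c '
set -x   # trace: each command is written to stderr before it runs
set -e   # abort on the first command that returns non-zero
echo "step 1"
false    # fails, so the child script stops here
echo "step 2"
' 2>&1)" || demo_status=$?

printf '%s\n' "${demo_output}"
echo "exit status: ${demo_status:-0}"
```

The captured output contains the traced commands and `step 1`, but never `step 2`, because `set -e` ends the child script as soon as `false` fails.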