generated from AgnostiqHQ/covalent-executor-template
1 parent f8f6ffc, commit 8e78962
Showing 7 changed files with 177 additions and 7 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
# Ignore the public key for ssh | ||
*.pub |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
FROM turuncu/slurm:latest | ||
|
||
RUN apt update && apt install openssh-server vim less wget sudo -y | ||
|
||
# Create a user "slurmuser" and group "slurmgroup" | ||
RUN groupadd slurmgroup && useradd -ms /bin/bash -g slurmgroup slurmuser | ||
RUN echo 'slurmuser:root123' | chpasswd | ||
|
||
# Create the .ssh directory in slurmuser's home | ||
RUN mkdir -p /home/slurmuser/.ssh | ||
|
||
# Copy the ssh public key into the authorized_keys file. slurm_test.pub is the public key file generated with ssh-keygen in the build directory. | ||
COPY slurm_test.pub /home/slurmuser/.ssh/authorized_keys | ||
|
||
# Copy the test.job file to the home directory | ||
COPY test.job /home/slurmuser/test.job | ||
|
||
# Copy covalent install file to the home directory | ||
COPY covalent_install.sh /home/slurmuser/covalent_install.sh | ||
|
||
# Run the covalent install file | ||
RUN chmod +x /home/slurmuser/covalent_install.sh && /home/slurmuser/covalent_install.sh | ||
|
||
# Change ownership and permissions of the key file. | ||
RUN chown slurmuser:slurmgroup /home/slurmuser/.ssh/authorized_keys && chmod 600 /home/slurmuser/.ssh/authorized_keys | ||
|
||
# Start SSH service | ||
RUN service ssh start | ||
|
||
# Expose port 22 and run sshd in the foreground | ||
EXPOSE 22 | ||
CMD ["/usr/sbin/sshd","-D"] |
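The Dockerfile restricts `authorized_keys` to mode 600 inside the container. The OpenSSH client applies a similar rule on the host side and refuses a private key that group or others can access. The following is a minimal sketch, not part of this commit, of a pre-flight check for the `slurm_test` private key; the function name `key_is_private` is hypothetical.

```python
import os
import stat


def key_is_private(path: str) -> bool:
    """Return True when the file grants no permissions to group or others."""
    # Keep only the permission bits (e.g. 0o600) from the full st_mode value.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Any group or other bit set (mode & 0o077) means the key is too open.
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

For example, `key_is_private("./slurm_test")` could be called before constructing the executor; if it returns `False`, `chmod 600 slurm_test` tightens the permissions.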
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,83 @@ | ||
# Testing with a Slurm docker container | ||
|
||
## Prerequisites | ||
|
||
Ensure you have docker installed and running on your system. You can check this by running `docker ps` and confirming that it prints a (possibly empty) list of containers rather than an error. | ||
|
||
We will be using the [turuncu/slurm](https://hub.docker.com/r/turuncu/slurm) image from Docker Hub. You can pull this image by running: | ||
|
||
```bash | ||
docker pull turuncu/slurm | ||
``` | ||
|
||
Also make sure your current directory is `tests/docker_tests` for all the commands mentioned below. | ||
|
||
## Building the image | ||
|
||
### Generating a keypair | ||
|
||
We need to generate a keypair to allow the executor to ssh into the container. To do this run: | ||
|
||
```bash | ||
ssh-keygen -t ed25519 -f slurm_test -N '' | ||
``` | ||
|
||
This generates a private key `slurm_test` and a public key `slurm_test.pub` in the current directory. The executor uses this keypair to ssh into the container. | ||
|
||
The name of the key is important as it is used in the `Dockerfile` to copy the public key into the container. | ||
|
||
### Running docker build | ||
|
||
The `Dockerfile` adds some setup on top of the base image so that the executor can ssh into the container. Make sure you are in the right directory (`tests/docker_tests`) and run: | ||
|
||
```bash | ||
docker build -t slurm-image . | ||
``` | ||
|
||
This will build the image and tag it as `slurm-image`. | ||
|
||
## Running the container | ||
|
||
To run the container, run: | ||
|
||
```bash | ||
docker run -d -p 22:22 --name slurm-container slurm-image | ||
``` | ||
|
||
This will run the container in the background with name `slurm-container` and map port 22 of the container to port 22 of the host machine. This will allow us to ssh into the container. | ||
|
||
### Changing the permissions of the slurm config | ||
|
||
We need to change the permissions of the slurm config file so that `slurmuser` can read it. To do this run: | ||
|
||
```bash | ||
docker exec slurm-container chmod +r /etc/slurm/slurm.conf | ||
``` | ||
|
||
## (Optional) Try running a basic slurm job | ||
|
||
To test that the container is working, we can try running a basic slurm job. To do this, ssh into the container by running: | ||
|
||
```bash | ||
ssh -i slurm_test slurmuser@localhost | ||
``` | ||
|
||
Then inside the container, run: | ||
|
||
```bash | ||
sbatch test.job | ||
``` | ||
|
||
This submits the test job to the slurm scheduler and creates two new files in the current directory (which should be `/home/slurmuser`): `test_<job-id>.out` and `test_<job-id>.err`. The `.out` file should contain the job's stdout ("Hello World"), and the `.err` file should contain its stderr (the path of the `python` executable, which `test.job` redirects to stderr). | ||
|
||
## Running the tests | ||
|
||
Now that we have everything set up, use your favourite workflow and assign the executor as `@ct.electron(executor=slurm_executor)` for any of the electrons, where the `slurm_executor` is defined as: | ||
|
||
```python | ||
from covalent_slurm_plugin import SlurmExecutor | ||
|
||
slurm_executor = SlurmExecutor(username="slurmuser", address="localhost", ssh_key_file="./slurm_test", conda_env="covalent", ignore_versions=True) | ||
``` | ||
|
||
You can mark `ignore_versions` as `False` (which is the default) if you want to make sure the same versions of python, covalent, and cloudpickle are used in the slurm job as on your local machine. |
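Before dispatching a workflow, it can help to confirm that something is actually answering SSH on the mapped port. The sketch below is not part of this commit; it assumes the server sends its identification string (a line starting with `SSH-`) right after the connection opens, as OpenSSH does, and the helper name `read_ssh_banner` is hypothetical.

```python
import socket
from typing import Optional


def read_ssh_banner(host: str = "localhost", port: int = 22,
                    timeout: float = 3.0) -> Optional[str]:
    """Return the server's SSH identification string, or None if unavailable."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            # Read the first chunk the server sends after the connection opens.
            data = sock.recv(256)
    except OSError:
        # Connection refused, timed out, or otherwise unreachable.
        return None
    banner = data.decode("ascii", errors="replace").strip()
    # Anything that does not look like an SSH greeting is treated as a failure.
    return banner if banner.startswith("SSH-") else None
```

A return value of `None` from `read_ssh_banner()` suggests the container is not running or port 22 is not mapped; a banner string shows which SSH server is listening on that port.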
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
#!/bin/bash | ||
|
||
set -eu -o pipefail | ||
export HOME=/home/slurmuser | ||
|
||
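# Remove the "case $- in ... esac" guard so that non-interactive shells also read the rest of .bashrc | ||
sed -i '/^case \$-.*/,+3d' /home/slurmuser/.bashrc | ||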
cd $HOME | ||
|
||
MINICONDA_EXE="Miniconda3-py38_23.3.1-0-Linux-x86_64.sh" | ||
wget https://repo.anaconda.com/miniconda/$MINICONDA_EXE | ||
chmod +x $MINICONDA_EXE | ||
./$MINICONDA_EXE -b -p $HOME/miniconda3 | ||
rm $MINICONDA_EXE | ||
|
||
export PATH=$HOME/miniconda3/bin:$PATH | ||
eval "$(conda shell.bash hook)" | ||
conda init bash | ||
|
||
conda create -n covalent python=3.10 -y | ||
echo "conda activate covalent" >> $HOME/.bashrc | ||
|
||
chown -R slurmuser:slurmgroup $HOME/{.cache,.conda,miniconda3} | ||
conda run -n covalent python -m pip install covalent |
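The script pins Python 3.10 in the `covalent` environment and installs whichever covalent release pip resolves at build time. When `ignore_versions` is left at `False`, the local environment has to match what ends up in the container. The sketch below is not part of this commit and does not reproduce the plugin's own check; it only collects the local versions so they can be compared by hand. The function name `local_versions` is hypothetical.

```python
import sys
from importlib import metadata
from typing import Dict, Optional, Sequence


def local_versions(
    packages: Sequence[str] = ("covalent", "cloudpickle"),
) -> Dict[str, Optional[str]]:
    """Collect the local Python version and the versions of the given packages."""
    versions: Dict[str, Optional[str]] = {
        "python": ".".join(str(part) for part in sys.version_info[:3])
    }
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the local environment.
            versions[name] = None
    return versions
```

The result can be compared against the container by running, for instance, `conda run -n covalent python --version` and `conda run -n covalent python -m pip list` inside it.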
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
#!/bin/bash | ||
# | ||
#SBATCH --job-name=test | ||
#SBATCH --nodes=1 | ||
#SBATCH --ntasks=1 | ||
##SBATCH --mem=1G | ||
##SBATCH --partition=debug | ||
#SBATCH --time=00:10:00 | ||
#SBATCH --output=%x_%j.out | ||
#SBATCH --error=%x_%j.err | ||
|
||
echo "Hello World" | ||
echo "$(which python)" 1>&2 |
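The `--output` and `--error` lines use sbatch filename patterns: `%x` stands for the job name and `%j` for the job ID, which is why a job named `test` produces `test_<job-id>.out` and `test_<job-id>.err`. The sketch below is not part of this commit; it shows how the file names could be predicted from the `Submitted batch job <id>` line that sbatch prints. It handles only these two symbols, and both function names are hypothetical.

```python
import re


def parse_sbatch_job_id(sbatch_stdout: str) -> int:
    """Extract the job ID from sbatch's 'Submitted batch job <id>' message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_stdout)
    if match is None:
        raise ValueError(f"no job ID found in: {sbatch_stdout!r}")
    return int(match.group(1))


def expand_slurm_filename(pattern: str, job_name: str, job_id: int) -> str:
    """Expand %x (job name) and %j (job ID) in a filename pattern."""
    values = {"x": job_name, "j": str(job_id)}
    # A single pass, so text inside the job name is never expanded again.
    return re.sub(r"%([xj])", lambda m: values[m.group(1)], pattern)
```

With the job name `test` from `test.job`, `expand_slurm_filename("%x_%j.out", "test", parse_sbatch_job_id(output))` gives the name of the stdout file for a captured sbatch `output` string.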