Add support for OCI containers #292
Open
cartalla wants to merge 1 commit into main from 291-feature-add-support-for-rootless-oci-containers
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,109 @@ | ||
# Containers | ||
|
||
Slurm supports [running jobs in unprivileged OCI containers](https://slurm.schedmd.com/containers.html). | ||
OCI is the [Open Container Initiative](https://opencontainers.org/), an open governance structure with the purpose of creating open industry standards around container formats and runtimes. | ||
|
||
This page documents how to add OCI support to your EDA Slurm cluster. | ||
Note that most EDA tools are not containerized: some won't run in a container at all, and others may run in a container, but not correctly. | ||
I recommend consulting your EDA vendor and following their guidance. | ||
|
||
I've seen two main motivations for using containers for EDA tools. | ||
The first is that orchestration tools like Kubernetes and AWS Batch require jobs to run in containers. | ||
The second is to have more flexibility in managing the runtime environment of the tools. | ||
Since the EDA tools themselves aren't containerized, the container is usually used to manage file system mounts and packages that are used by the tools. | ||
If new packages are required by a new tool, then it is easy to update and distribute a new version of the container. | ||
|
||
## Compute node configuration | ||
|
||
The compute node must be configured to use an unprivileged container runtime. | ||
We'll show how to install and configure rootless Docker. | ||
|
||
The following directions have been automated in the [creation of a custom EDA compute node AMI](custom-amis.md). | ||
|
||
First, [install the latest Docker from the Docker yum repo](https://docs.docker.com/engine/install/rhel/). | ||
|
||
Next, [configure Docker to run rootless](https://docs.docker.com/engine/security/rootless/). | ||
|
||
Configure subuid and subgid. | ||
|
||
Each user that will run Docker must have an entry in `/etc/subuid` and `/etc/subgid`. | ||
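
A minimal sketch of how you might check for those entries, assuming the usual `name:start:count` line format (for example `alice:100000:65536`, where `alice` and the range are made up for illustration). The helper names `has_subid_entry` and `check_subids` are hypothetical and are not part of Docker or Slurm.

```shell
#!/bin/bash
# Sketch only: has_subid_entry and check_subids are hypothetical helper names.

# Succeeds if FILE has a line of the form USER:START:COUNT for the given user.
has_subid_entry() {
    local user=$1 file=$2
    awk -F: -v u="$user" \
        '$1 == u && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ { found = 1 } END { exit !found }' \
        "$file"
}

# Prints one status line per file passed after the user name.
check_subids() {
    local user=$1; shift
    local file
    for file in "$@"; do
        if has_subid_entry "$user" "$file"; then
            echo "$file: ok"
        else
            echo "$file: missing entry for $user"
        fi
    done
}
```

On a compute node this could be called as `check_subids "$USER" /etc/subuid /etc/subgid`. The sketch only matches entries keyed by user name, and it does not check the size of the range; the rootless Docker documentation asks for at least 65,536 subordinate IDs per user.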
|
||
## Per user configuration | ||
|
||
Each user must configure Docker to store images in a non-NFS location. | ||
|
||
`~/.config/docker/daemon.json`: | ||
|
||
``` | ||
{ | ||
"data-root": "/var/tmp/${USER}/containers/storage" | ||
} | ||
``` | ||
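
As far as I know, Docker reads `daemon.json` as plain JSON and does not expand shell variables, so `${USER}` in the file above stands for the actual user name and has to be substituted when the file is written. A minimal sketch of one way to generate the file, assuming the same path layout; the function name `write_docker_daemon_json` and its optional second argument are hypothetical.

```shell
#!/bin/bash
# Sketch only: write_docker_daemon_json is a hypothetical helper name.
# Writes daemon.json with the user name expanded by the shell and
# prints the resulting data-root path.
write_docker_daemon_json() {
    local user=$1
    # Assumed default location for the rootless Docker per-user config.
    local config_dir=${2:-$HOME/.config/docker}
    local data_root="/var/tmp/${user}/containers/storage"
    mkdir -p "$config_dir"
    cat > "$config_dir/daemon.json" <<EOF
{
    "data-root": "${data_root}"
}
EOF
    echo "$data_root"
}
```

A user could run `write_docker_daemon_json "$USER"` once on a node and then restart their rootless Docker daemon so that it picks up the new setting.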
|
||
## Create OCI Bundle | ||
|
||
Each container requires an [OCI bundle](https://slurm.schedmd.com/containers.html#bundle). | ||
|
||
The bundle directories can be stored on NFS and shared between users. | ||
For example, you could create an oci-bundles directory on your shared file system. | ||
|
||
The following shows how to create an Ubuntu bundle. | ||
You can do this as root with the Docker service running, but it is better to | ||
use rootless Docker. | ||
|
||
``` | ||
export OCI_BUNDLES_DIR=~/oci-bundles | ||
export IMAGE_NAME=ubuntu | ||
export BUNDLE_NAME=ubuntu | ||
mkdir -p $OCI_BUNDLES_DIR | ||
cd $OCI_BUNDLES_DIR | ||
mkdir -p $BUNDLE_NAME | ||
cd $BUNDLE_NAME | ||
docker pull $IMAGE_NAME | ||
docker export $(docker create $IMAGE_NAME) > $BUNDLE_NAME.tar | ||
mkdir rootfs | ||
tar -C rootfs -xf $BUNDLE_NAME.tar | ||
runc spec --rootless | ||
runc run containerid | ||
``` | ||
|
||
The same process works for Rocky Linux 8. | ||
|
||
``` | ||
export OCI_BUNDLES_DIR=~/oci-bundles | ||
export IMAGE_NAME=rockylinux:8 | ||
export BUNDLE_NAME=rockylinux8 | ||
mkdir -p $OCI_BUNDLES_DIR | ||
cd $OCI_BUNDLES_DIR | ||
mkdir -p $BUNDLE_NAME | ||
cd $BUNDLE_NAME | ||
docker pull $IMAGE_NAME | ||
docker export $(docker create $IMAGE_NAME) > $BUNDLE_NAME.tar | ||
mkdir rootfs | ||
tar -C rootfs -xf $BUNDLE_NAME.tar | ||
runc spec --rootless | ||
runc run containerid2 | ||
``` | ||
|
||
## Test the bundle locally | ||
|
||
``` | ||
export OCI_BUNDLES_DIR=~/oci-bundles | ||
export BUNDLE_NAME=rockylinux8 | ||
cd $OCI_BUNDLES_DIR/$BUNDLE_NAME | ||
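# runc spec refuses to overwrite an existing config.json, so only create it if it is missing | ||
[ -e config.json ] || runc spec --rootless | ||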
runc run containerid2 | ||
``` | ||
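
Before running a bundle, it can help to confirm that the directory contains the two pieces produced by the steps above: the `rootfs` directory and the `config.json` written by `runc spec`. A minimal sketch, where `check_oci_bundle` is a hypothetical helper name and the check covers only the presence of those two entries, not their contents.

```shell
#!/bin/bash
# Sketch only: check_oci_bundle is a hypothetical helper name.
# Returns 0 if BUNDLE has a rootfs directory and a config.json file,
# otherwise prints what is missing and returns 1.
check_oci_bundle() {
    local bundle=$1
    local rc=0
    if [ ! -d "$bundle/rootfs" ]; then
        echo "missing rootfs directory: $bundle/rootfs"
        rc=1
    fi
    if [ ! -f "$bundle/config.json" ]; then
        echo "missing config.json: $bundle/config.json"
        rc=1
    fi
    return $rc
}
```

For example, `check_oci_bundle ~/oci-bundles/rockylinux8 && echo "bundle looks complete"` could be run before `runc run` or before submitting a job that uses the bundle.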
|
||
## Run a bundle on Slurm | ||
|
||
``` | ||
export OCI_BUNDLES_DIR=~/oci-bundles | ||
export BUNDLE_NAME=rockylinux8 | ||
|
||
srun -p interactive --container $OCI_BUNDLES_DIR/$BUNDLE_NAME --pty hostname | ||
|
||
srun -p interactive --container $OCI_BUNDLES_DIR/$BUNDLE_NAME --pty bash | ||
|
||
sbatch -p interactive --container $OCI_BUNDLES_DIR/$BUNDLE_NAME --wrap hostname | ||
``` |
86 changes: 86 additions & 0 deletions
source/resources/parallel-cluster/config/bin/configure-rootless-docker.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,86 @@ | ||
#!/bin/bash -ex | ||
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. | ||
# SPDX-License-Identifier: MIT-0 | ||
|
||
# Configure rootless docker for user. | ||
# The slurm config directory must exist | ||
|
||
script=$0 | ||
script_name=$(basename $script) | ||
|
||
# Jinja2 template variables | ||
assets_bucket={{assets_bucket}} | ||
assets_base_key={{assets_base_key}} | ||
export AWS_DEFAULT_REGION={{Region}} | ||
ClusterName={{ClusterName}} | ||
ErrorSnsTopicArn={{ErrorSnsTopicArn}} | ||
playbooks_s3_url={{playbooks_s3_url}} | ||
|
||
# Notify user of errors | ||
function on_exit { | ||
rc=$? | ||
set +e | ||
if [[ $rc -ne 0 ]] && [[ ":$ErrorSnsTopicArn" != ":" ]]; then | ||
tmpfile=$(mktemp) | ||
echo "See log files for more info: | ||
/var/lib/amazon/toe/TOE_* | ||
grep PCImageBuilderEDA /var/log/messages | less" > $tmpfile | ||
aws --region $AWS_DEFAULT_REGION sns publish --topic-arn $ErrorSnsTopicArn --subject "${ClusterName} configure-rootless-docker.sh failed" --message file://$tmpfile | ||
rm $tmpfile | ||
fi | ||
} | ||
trap on_exit EXIT | ||
|
||
# Redirect all IO to /var/log/messages and then echo to stderr | ||
exec 1> >(logger -s -t configure-rootless-docker) 2>&1 | ||
|
||
# Install ansible | ||
if ! yum list installed ansible &> /dev/null; then | ||
yum install -y ansible || amazon-linux-extras install -y ansible2 | ||
fi | ||
|
||
external_login_node_config_dir=/opt/slurm/${ClusterName}/config | ||
if [ -e $external_login_node_config_dir ]; then | ||
config_dir=$external_login_node_config_dir | ||
else | ||
config_dir=/opt/slurm/config | ||
fi | ||
config_bin_dir=$config_dir/bin | ||
ANSIBLE_PATH=$config_dir/ansible | ||
PLAYBOOKS_PATH=$ANSIBLE_PATH/playbooks | ||
PLAYBOOKS_ZIP_PATH=$ANSIBLE_PATH/playbooks.zip | ||
|
||
if ! [ -e $external_login_node_config_dir ]; then | ||
mkdir -p $config_bin_dir | ||
|
||
ansible_head_node_vars_yml_s3_url="s3://$assets_bucket/$assets_base_key/config/ansible/ansible_head_node_vars.yml" | ||
ansible_compute_node_vars_yml_s3_url="s3://$assets_bucket/$assets_base_key/config/ansible/ansible_compute_node_vars.yml" | ||
ansible_external_login_node_vars_yml_s3_url="s3://$assets_bucket/$assets_base_key/config/ansible/ansible_external_login_node_vars.yml" | ||
|
||
# Download ansible playbooks | ||
aws s3 cp $playbooks_s3_url ${PLAYBOOKS_ZIP_PATH}.new | ||
if ! [ -e $PLAYBOOKS_ZIP_PATH ] || ! diff -q $PLAYBOOKS_ZIP_PATH ${PLAYBOOKS_ZIP_PATH}.new; then | ||
mv $PLAYBOOKS_ZIP_PATH.new $PLAYBOOKS_ZIP_PATH | ||
rm -rf $PLAYBOOKS_PATH | ||
mkdir -p $PLAYBOOKS_PATH | ||
pushd $PLAYBOOKS_PATH | ||
yum -y install unzip | ||
unzip $PLAYBOOKS_ZIP_PATH | ||
chmod -R 0700 $ANSIBLE_PATH | ||
popd | ||
fi | ||
|
||
aws s3 cp $ansible_head_node_vars_yml_s3_url /opt/slurm/config/ansible/ansible_head_node_vars.yml | ||
|
||
aws s3 cp $ansible_compute_node_vars_yml_s3_url /opt/slurm/config/ansible/ansible_compute_node_vars.yml | ||
|
||
aws s3 cp $ansible_external_login_node_vars_yml_s3_url /opt/slurm/config/ansible/ansible_external_login_node_vars.yml | ||
fi | ||
|
||
pushd $PLAYBOOKS_PATH | ||
|
||
ansible-playbook $PLAYBOOKS_PATH/configure-rootless-docker.yml \ | ||
-i inventories/local.yml \ | ||
-e @$ANSIBLE_PATH/ansible_external_login_node_vars.yml | ||
|
||
popd |
91 changes: 91 additions & 0 deletions
source/resources/parallel-cluster/config/bin/install-rootless-docker.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
#!/bin/bash -ex | ||
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. | ||
# SPDX-License-Identifier: MIT-0 | ||
|
||
# This script calls an ansible playbook that installs rootless docker on a compute node. | ||
# It has 2 different use cases: | ||
# * To build ParallelCluster AMIs | ||
# * To install docker on VDIs or other login nodes using a ParallelCluster. | ||
# The location of the config directory is different for those 2 use cases. | ||
# For an AMI build, the config directory and scripts will not exist and must be downloaded from S3. | ||
# For a login node, the playbooks and scripts will already exist. | ||
|
||
script=$0 | ||
script_name=$(basename $script) | ||
|
||
# Jinja2 template variables | ||
assets_bucket={{assets_bucket}} | ||
assets_base_key={{assets_base_key}} | ||
export AWS_DEFAULT_REGION={{Region}} | ||
ClusterName={{ClusterName}} | ||
ErrorSnsTopicArn={{ErrorSnsTopicArn}} | ||
playbooks_s3_url={{playbooks_s3_url}} | ||
|
||
# Notify user of errors | ||
function on_exit { | ||
rc=$? | ||
set +e | ||
if [[ $rc -ne 0 ]] && [[ ":$ErrorSnsTopicArn" != ":" ]]; then | ||
tmpfile=$(mktemp) | ||
echo "See log files for more info: | ||
/var/lib/amazon/toe/TOE_* | ||
grep PCImageBuilderEDA /var/log/messages | less" > $tmpfile | ||
aws --region $AWS_DEFAULT_REGION sns publish --topic-arn $ErrorSnsTopicArn --subject "${ClusterName} install-rootless-docker.sh failed" --message file://$tmpfile | ||
rm $tmpfile | ||
fi | ||
} | ||
trap on_exit EXIT | ||
|
||
# Redirect all IO to /var/log/messages and then echo to stderr | ||
exec 1> >(logger -s -t install-rootless-docker) 2>&1 | ||
|
||
# Install ansible | ||
if ! yum list installed ansible &> /dev/null; then | ||
yum install -y ansible || amazon-linux-extras install -y ansible2 | ||
fi | ||
|
||
external_login_node_config_dir=/opt/slurm/${ClusterName}/config | ||
if [ -e $external_login_node_config_dir ]; then | ||
config_dir=$external_login_node_config_dir | ||
else | ||
config_dir=/opt/slurm/config | ||
fi | ||
config_bin_dir=$config_dir/bin | ||
ANSIBLE_PATH=$config_dir/ansible | ||
PLAYBOOKS_PATH=$ANSIBLE_PATH/playbooks | ||
PLAYBOOKS_ZIP_PATH=$ANSIBLE_PATH/playbooks.zip | ||
|
||
if ! [ -e $external_login_node_config_dir ]; then | ||
mkdir -p $config_bin_dir | ||
|
||
ansible_head_node_vars_yml_s3_url="s3://$assets_bucket/$assets_base_key/config/ansible/ansible_head_node_vars.yml" | ||
ansible_compute_node_vars_yml_s3_url="s3://$assets_bucket/$assets_base_key/config/ansible/ansible_compute_node_vars.yml" | ||
ansible_external_login_node_vars_yml_s3_url="s3://$assets_bucket/$assets_base_key/config/ansible/ansible_external_login_node_vars.yml" | ||
|
||
# Download ansible playbooks | ||
aws s3 cp $playbooks_s3_url ${PLAYBOOKS_ZIP_PATH}.new | ||
if ! [ -e $PLAYBOOKS_ZIP_PATH ] || ! diff -q $PLAYBOOKS_ZIP_PATH ${PLAYBOOKS_ZIP_PATH}.new; then | ||
mv $PLAYBOOKS_ZIP_PATH.new $PLAYBOOKS_ZIP_PATH | ||
rm -rf $PLAYBOOKS_PATH | ||
mkdir -p $PLAYBOOKS_PATH | ||
pushd $PLAYBOOKS_PATH | ||
yum -y install unzip | ||
unzip $PLAYBOOKS_ZIP_PATH | ||
chmod -R 0700 $ANSIBLE_PATH | ||
popd | ||
fi | ||
|
||
aws s3 cp $ansible_head_node_vars_yml_s3_url /opt/slurm/config/ansible/ansible_head_node_vars.yml | ||
|
||
aws s3 cp $ansible_compute_node_vars_yml_s3_url /opt/slurm/config/ansible/ansible_compute_node_vars.yml | ||
|
||
aws s3 cp $ansible_external_login_node_vars_yml_s3_url /opt/slurm/config/ansible/ansible_external_login_node_vars.yml | ||
fi | ||
|
||
pushd $PLAYBOOKS_PATH | ||
|
||
ansible-playbook $PLAYBOOKS_PATH/install-rootless-docker.yml \ | ||
-i inventories/local.yml \ | ||
-e @$ANSIBLE_PATH/ansible_compute_node_vars.yml | ||
|
||
popd |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,3 +10,4 @@ | |
- security_updates | ||
- bug_fixes | ||
- ParallelClusterComputeNode | ||
- install-rootless-docker |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
--- | ||
- name: Configure rootless docker for user | ||
hosts: | ||
- ExternalLoginNode | ||
become_user: root | ||
become: yes | ||
roles: | ||
- configure-rootless-docker |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
--- | ||
- name: Install rootless docker for OCI containers | ||
hosts: | ||
- ParallelClusterComputeNode | ||
become_user: root | ||
become: yes | ||
roles: | ||
- install-rootless-docker |
6 changes: 6 additions & 0 deletions
source/resources/playbooks/roles/ParallelClusterHeadNode/files/opt/slurm/etc/oci.conf
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
EnvExclude="^(SLURM_CONF|SLURM_CONF_SERVER)=" | ||
RunTimeEnvExclude="^(SLURM_CONF|SLURM_CONF_SERVER)=" | ||
RunTimeQuery="runc --rootless=true --root=/run/user/%U/ state %n.%u.%j.%s.%t" | ||
RunTimeKill="runc --rootless=true --root=/run/user/%U/ kill -a %n.%u.%j.%s.%t" | ||
RunTimeDelete="runc --rootless=true --root=/run/user/%U/ delete --force %n.%u.%j.%s.%t" | ||
RunTimeRun="runc --rootless=true --root=/run/user/%U/ run %n.%u.%j.%s.%t -b |
Finish docs