AWS
Rob Nagler edited this page Nov 20, 2019 · 35 revisions
From console:
- CentOS 7 (x86_64) - with Updates HVM
- Root partition: 10g, encrypted
- Security group: public-ssh
- Launch
- Use existing key
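For repeatable launches, the console steps above could be approximated with the AWS CLI. This is only a sketch and not part of the original procedure: the function name `launch_gpu_host` is hypothetical, the AMI ID, instance type, and key-pair name are placeholders supplied by the caller, and `/dev/sda1` as the root device name is an assumption that should be checked against the chosen AMI.

```shell
# Hypothetical helper: launch one instance matching the console settings above.
# Arguments (all placeholders supplied by the caller):
#   $1 AMI ID of "CentOS 7 (x86_64) - with Updates HVM" in your region
#   $2 instance type
#   $3 name of an existing key pair
launch_gpu_host() {
    local ami=$1 type=$2 key=$3
    aws ec2 run-instances \
        --image-id "$ami" \
        --instance-type "$type" \
        --key-name "$key" \
        --security-groups public-ssh \
        --block-device-mappings \
            'DeviceName=/dev/sda1,Ebs={VolumeSize=10,Encrypted=true}' \
        --count 1
}
```

A call would look like `launch_gpu_host "$ami_id" "$instance_type" "$key_name"`. Nothing runs until the function is invoked.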
Once booted, get public and private IPs:
- Add IPs to named
- Set up host with
rsconf_db.components: [ docker ]
run /srv/rsconf/aws-init.sh <ip>
Proceed with post installation instructions.
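The public and private IPs needed for named and for `aws-init.sh` can also be read from the command line instead of the console. A minimal sketch, assuming the AWS CLI is installed and configured; the helper name `instance_ips` and the instance ID passed to it are hypothetical.

```shell
# Hypothetical helper: print the public and private IP (tab-separated)
# of the instance whose ID is given as $1.
instance_ips() {
    aws ec2 describe-instances \
        --instance-ids "$1" \
        --query 'Reservations[0].Instances[0].[PublicIpAddress,PrivateIpAddress]' \
        --output text
}

# Usage (instance ID is a placeholder):
#   instance_ips "$instance_id"
```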
# kernel-devel in the repo always matches the latest kernel, so update first
yum update -y
yum install -y kernel-devel
# if yum update installed a new kernel, reboot before continuing
reboot
yum remove $(rpm -qa | grep ^kernel-3 | grep -v $(uname -r))
yum install -y https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-10.1.243-1.x86_64.rpm
# yum-plugin-nvidia has to be installed manually by URL, because the automatic download fails with:
# http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/yum-plugin-nvidia-0.5-1.el7.noarch.rpm: [Errno -1] Package does not match intended download.
yum install -y http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/yum-plugin-nvidia-0.5-1.el7.noarch.rpm
yum install -y cuda-drivers
nvidia-smi
curl -s -L https://nvidia.github.io/nvidia-container-runtime/centos7/nvidia-container-runtime.repo \
| install -m 444 /dev/stdin /etc/yum.repos.d/nvidia-container-runtime.repo
yum install -y nvidia-container-runtime
systemctl restart docker
docker run -it --gpus=all --net=host --rm tensorflow/tensorflow:latest-gpu python -c 'import tensorflow; print(tensorflow.config.experimental.list_physical_devices("GPU"))'
<snip>
Found device 0 with properties:
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:00:1e.0
<snip>
See https://github.com/NVIDIA/nvidia-docker/wiki
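The old-kernel cleanup in the host steps above works by filtering the installed package list against the running kernel's release. A minimal sketch of that filter as a standalone function, which only prints the packages that would be removed. The name `old_kernel_packages` is hypothetical, and `grep -F` is used here (unlike the original one-liner) so the dots in the release string are matched literally.

```shell
# Hypothetical helper: read package names on stdin (as printed by `rpm -qa`)
# and print the kernel-3* packages that do not belong to the release in $1.
old_kernel_packages() {
    grep '^kernel-3' | grep -v -F -- "$1"
}

# Usage on the host (prints only, removes nothing):
#   rpm -qa | old_kernel_packages "$(uname -r)"
```

Reviewing the printed list before passing it to `yum remove` guards against removing the running kernel by mistake.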
docker run -i --name=gpu -u root radiasoft/beamsim-jupyter:prod bash <<'EOF'
rpm -i https://developer.download.nvidia.com/compute/cuda/repos/fedora29/x86_64/cuda-repo-fedora29-10.1.243-1.x86_64.rpm
dnf install -y kmodtool kernel-devel
dnf install -y cuda-drivers
EOF
docker commit --change 'USER vagrant' --change 'CMD ["/home/vagrant/.radia-run/tini", "--", "/home/vagrant/.radia-run/start"]' gpu gpu
docker rm gpu
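To confirm that the `--change` options took effect on the committed image, its configuration can be inspected. A sketch, assuming the image is tagged `gpu` as above; the helper name `image_user_and_cmd` is hypothetical.

```shell
# Hypothetical helper: print the default user and the default command
# (as JSON) recorded in the configuration of the image named in $1.
image_user_and_cmd() {
    docker inspect --format '{{.Config.User}} {{json .Config.Cmd}}' "$1"
}

# Usage:
#   image_user_and_cmd gpu
```

For the image built above, the output should show `vagrant` as the user, followed by the tini start command.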
To compute the key's MD5 fingerprint, export the public half of the ssh key in PEM format, convert it to DER, and then compute the md5:
ssh-keygen -e -m PEM -f ~/.ssh/id_rsa | openssl rsa -RSAPublicKey_in -outform DER | openssl md5 -c
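The same pipeline can be wrapped in a small function so it works for any key file and prints only the fingerprint itself. This is a sketch: the name `key_md5_fingerprint` is hypothetical, and the trailing `awk` is an addition that strips the `(stdin)=` style prefix that `openssl md5` prints.

```shell
# Hypothetical helper: print only the colon-separated MD5 of the DER-encoded
# public key derived from the RSA key file given as $1.
key_md5_fingerprint() {
    ssh-keygen -e -m PEM -f "$1" \
        | openssl rsa -RSAPublicKey_in -outform DER 2>/dev/null \
        | openssl md5 -c \
        | awk '{print $NF}'
}

# Usage:
#   key_md5_fingerprint ~/.ssh/id_rsa
```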