Commit

chore: Bumpenvs for NGC+ images (#8975)
* chore: Bumpenvs for NGC+ images

* linting

* this didn't need to be in here

* update release notes

* release notes

* oops

* rst is hard

* Update ngc-images.rst

* run rstfmt

---------

Co-authored-by: Tara <[email protected]>
MikhailKardash and tara-det-ai authored Mar 11, 2024
1 parent 1a35e5d commit cc2e9b4
Showing 28 changed files with 204 additions and 163 deletions.
8 changes: 4 additions & 4 deletions .circleci/real_config.yml
@@ -37,7 +37,7 @@ parameters:
# be referenced by --ee testing.
default-pt-gpu-image:
type: string
default: determinedai/environments:cuda-11.3-pytorch-1.12-gpu-2196775
default: determinedai/environments:cuda-11.3-pytorch-1.12-gpu-03ae7d7
# Some python, go, and react dependencies are cached by circleci via `save_cache`/`restore_cache`.
# If the dependencies stay the same, but the circleci code that would produce them is changed,
# it may be necessary to invalidate the cache by incrementing this value.
@@ -223,7 +223,7 @@ commands:
- when:
condition: <<parameters.tf2>>
steps:
- run: docker pull determinedai/environments:py-3.9-pytorch-1.12-tf-2.11-cpu-2196775
- run: docker pull determinedai/environments:py-3.9-pytorch-1.12-tf-2.11-cpu-03ae7d7

login-docker:
parameters:
@@ -1858,7 +1858,7 @@ jobs:

test-unit-harness-pytorch2-gpu:
docker:
- image: determinedai/environments:cuda-11.8-pytorch-2.0-gpu-2196775
- image: determinedai/environments:cuda-11.8-pytorch-2.0-gpu-03ae7d7
resource_class: determined-ai/container-runner-gpu
steps:
- run: mkdir -p ~/.ssh && ssh-keyscan github.com >> ~/.ssh/known_hosts
@@ -1879,7 +1879,7 @@ jobs:

test-unit-harness-pytorch2-cpu:
docker:
- image: determinedai/environments:py-3.10-pytorch-2.0-cpu-2196775
- image: determinedai/environments:py-3.10-pytorch-2.0-cpu-03ae7d7
steps:
- run: mkdir -p ~/.ssh && ssh-keyscan github.com >> ~/.ssh/known_hosts
- checkout
18 changes: 18 additions & 0 deletions docs/release-notes/ngc-images.rst
@@ -0,0 +1,18 @@
:orphan:

**New Features**

- Include early-access NVIDIA NGC-based images in our environment offerings. These images are
accessible from `pytorch-ngc <https://hub.docker.com/r/determinedai/pytorch-ngc>`_ or
`tensorflow-ngc <https://hub.docker.com/r/determinedai/tensorflow-ngc>`_. By downloading and
using these images, users acknowledge and agree to the terms and conditions of all third-party
software licenses contained within, including the `NVIDIA Deep Learning Container License
<https://developer.download.nvidia.com/licenses/NVIDIA_Deep_Learning_Container_License.pdf>`__.
Users can build their own images from a specified NGC container version by using the
``build-pytorch-ngc`` or ``build-tensorflow-ngc`` workflows located in our environments
``MakeFile`` in the `environments repository <https://github.com/determined-ai/environments>`_.

**Improvements**

- Eliminate TensorFlow 2.8 images from our offerings. Default TensorFlow 2.11 images remain
available for TensorFlow users.
4 changes: 2 additions & 2 deletions docs/setup-cluster/deploy-cluster/slurm/singularity.rst
@@ -26,9 +26,9 @@ by default in this version of Determined are described below.
+-------------+--------------------------------------------------------------------------+
| Environment | File Name |
+=============+==========================================================================+
| CPUs | ``determinedai/environments:py-3.9-pytorch-1.12-tf-2.11-cpu-2196775`` |
| CPUs | ``determinedai/environments:py-3.9-pytorch-1.12-tf-2.11-cpu-03ae7d7`` |
+-------------+--------------------------------------------------------------------------+
| NVIDIA GPUs | ``determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-2196775`` |
| NVIDIA GPUs | ``determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-03ae7d7`` |
+-------------+--------------------------------------------------------------------------+
| AMD GPUs | ``determinedai/environments:rocm-5.0-pytorch-1.10-tf-2.7-rocm-622d512`` |
+-------------+--------------------------------------------------------------------------+
4 changes: 2 additions & 2 deletions docs/setup-cluster/slurm/singularity.rst
@@ -26,9 +26,9 @@ by default in this version of Determined are described below.
+-------------+--------------------------------------------------------------------------+
| Environment | File Name |
+=============+==========================================================================+
| CPUs | ``determinedai/environments:py-3.9-pytorch-1.12-tf-2.11-cpu-2196775`` |
| CPUs | ``determinedai/environments:py-3.9-pytorch-1.12-tf-2.11-cpu-03ae7d7`` |
+-------------+--------------------------------------------------------------------------+
| NVIDIA GPUs | ``determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-2196775`` |
| NVIDIA GPUs | ``determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-03ae7d7`` |
+-------------+--------------------------------------------------------------------------+
| AMD GPUs | ``determinedai/environments:rocm-5.0-pytorch-1.10-tf-2.7-rocm-622d512`` |
+-------------+--------------------------------------------------------------------------+
2 changes: 1 addition & 1 deletion docs/setup-cluster/slurm/slurm-requirements.rst
@@ -510,7 +510,7 @@ platform. There may be additional per-user configuration that is required.

.. code:: bash
image=determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-2196775
image=determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-03ae7d7
cd /shared/enroot/images
enroot import docker://$image
enroot create /shared/enroot/images/${image//[\/:]/\+}.sqsh
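
The `${image//[\/:]/\+}` expansion in the snippet above replaces every `/` and `:` in the image reference with `+`, which gives the `.sqsh` file name passed to `enroot create`. A minimal Python sketch of the same substitution follows, for illustration only; the helper name `sqsh_name` is hypothetical and is not part of Determined or enroot.

```python
import re


def sqsh_name(image: str) -> str:
    """Mirror the bash expansion ${image//[\\/:]/\\+} and append ".sqsh".

    Hypothetical helper: every "/" and ":" in the image reference
    becomes "+", as in the shell snippet from the docs.
    """
    return re.sub(r"[/:]", "+", image) + ".sqsh"


if __name__ == "__main__":
    image = "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-03ae7d7"
    print(sqsh_name(image))
    # prints: determinedai+environments+cuda-11.3-pytorch-1.12-tf-2.11-gpu-03ae7d7.sqsh
```

Because the tag is part of the file name, bumping the image hash also changes the name of the `.sqsh` file.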
12 changes: 6 additions & 6 deletions e2e_tests/tests/config.py
@@ -14,12 +14,12 @@
MAX_TRIAL_BUILD_SECS = 90


DEFAULT_TF2_CPU_IMAGE = "determinedai/environments:py-3.9-pytorch-1.12-tf-2.11-cpu-2196775"
DEFAULT_TF2_GPU_IMAGE = "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-2196775"
DEFAULT_PT_CPU_IMAGE = "determinedai/environments:py-3.9-pytorch-1.12-cpu-2196775"
DEFAULT_PT_GPU_IMAGE = "determinedai/environments:cuda-11.3-pytorch-1.12-gpu-2196775"
DEFAULT_PT2_CPU_IMAGE = "determinedai/environments:py-3.10-pytorch-2.0-cpu-2196775"
DEFAULT_PT2_GPU_IMAGE = "determinedai/environments:cuda-11.8-pytorch-2.0-gpu-2196775"
DEFAULT_TF2_CPU_IMAGE = "determinedai/environments:py-3.9-pytorch-1.12-tf-2.11-cpu-03ae7d7"
DEFAULT_TF2_GPU_IMAGE = "determinedai/environments:cuda-11.3-pytorch-1.12-tf-2.11-gpu-03ae7d7"
DEFAULT_PT_CPU_IMAGE = "determinedai/environments:py-3.9-pytorch-1.12-cpu-03ae7d7"
DEFAULT_PT_GPU_IMAGE = "determinedai/environments:cuda-11.3-pytorch-1.12-gpu-03ae7d7"
DEFAULT_PT2_CPU_IMAGE = "determinedai/environments:py-3.10-pytorch-2.0-cpu-03ae7d7"
DEFAULT_PT2_GPU_IMAGE = "determinedai/environments:cuda-11.8-pytorch-2.0-gpu-03ae7d7"

TF2_CPU_IMAGE = os.environ.get("TF2_CPU_IMAGE") or DEFAULT_TF2_CPU_IMAGE
TF2_GPU_IMAGE = os.environ.get("TF2_GPU_IMAGE") or DEFAULT_TF2_GPU_IMAGE
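
The `os.environ.get(...) or DEFAULT_...` pattern above falls back to the default both when the variable is unset and when it is set to an empty string. A small sketch of that behavior follows, assuming a hypothetical helper `resolve_image` that takes the environment as an argument so it can be exercised without touching the real process environment.

```python
import os
from typing import Mapping, Optional

DEFAULT_TF2_CPU_IMAGE = "determinedai/environments:py-3.9-pytorch-1.12-tf-2.11-cpu-03ae7d7"


def resolve_image(var: str, default: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return env[var] if it is set and non-empty, otherwise the default.

    Hypothetical helper: it reproduces `os.environ.get(var) or default`,
    so an empty string is treated the same as an unset variable.
    """
    env = os.environ if env is None else env
    return env.get(var) or default


if __name__ == "__main__":
    # Unset and empty both yield the default image.
    print(resolve_image("TF2_CPU_IMAGE", DEFAULT_TF2_CPU_IMAGE, {}))
    print(resolve_image("TF2_CPU_IMAGE", DEFAULT_TF2_CPU_IMAGE, {"TF2_CPU_IMAGE": ""}))
    # A non-empty override wins ("example/custom:tag" is a made-up value).
    print(resolve_image("TF2_CPU_IMAGE", DEFAULT_TF2_CPU_IMAGE, {"TF2_CPU_IMAGE": "example/custom:tag"}))
```

With this pattern, only the `DEFAULT_*` constants need to change when the environment hash is bumped; any override supplied through the environment is left alone.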
20 changes: 10 additions & 10 deletions harness/determined/deploy/aws/templates/efs.yaml
@@ -3,35 +3,35 @@ Mappings:
RegionMap:
ap-northeast-1:
Master: ami-00910ef9457f0df47
Agent: ami-0a06577b7707632f8
Agent: ami-0581f861a00dab473
# TODO(DET-4258) Uncomment these when we fully support all P3 regions.
# ap-northeast-2:
# Master: ami-035e3e44dc41db6a2
# Agent: ami-00d8faa0b84ddbd88
# Agent: ami-0ba8919c43d1d8cab
# ap-southeast-1:
# Master: ami-0fd1ee6c8b656f020
# Agent: ami-0435fcad25753a1a6
# Agent: ami-08190d9f352e9157f
# ap-southeast-2:
# Master: ami-0b62ecd3babd1c548
# Agent: ami-0067e868a8d4f8a1e
# Agent: ami-0af8a0dc1582a1f3b
eu-central-1:
Master: ami-0abbe417ed83c0b29
Agent: ami-0ec870b46499de494
Agent: ami-09d66f5a074e983ca
eu-west-1:
Master: ami-0e3f7dd2dc743e48a
Agent: ami-013c92fb90c7a7971
Agent: ami-027d2d1f92b4ba863
# eu-west-2:
# Master: ami-0d78429fb6af30994
# Agent: ami-0a5b2c87970d66d5b
# Agent: ami-0c1f3c5203afb4a8f
us-east-1:
Master: ami-0172070f66a8ebe63
Agent: ami-061ff7cfe905dfbd5
Agent: ami-053c295dd390f2a40
us-east-2:
Master: ami-0bafa3699418551cd
Agent: ami-072947196f5aaa3a5
Agent: ami-07e0b9bf1ea2852d8
us-west-2:
Master: ami-0ceeab680f529cc36
Agent: ami-0d9494a6bae62f03e
Agent: ami-0f74a44ed4ce75025

Parameters:
VpcCIDR:
20 changes: 10 additions & 10 deletions harness/determined/deploy/aws/templates/fsx.yaml
@@ -3,35 +3,35 @@ Mappings:
RegionMap:
ap-northeast-1:
Master: ami-00910ef9457f0df47
Agent: ami-0a06577b7707632f8
Agent: ami-0581f861a00dab473
# TODO(DET-4258) Uncomment these when we fully support all P3 regions.
# ap-northeast-2:
# Master: ami-035e3e44dc41db6a2
# Agent: ami-00d8faa0b84ddbd88
# Agent: ami-0ba8919c43d1d8cab
# ap-southeast-1:
# Master: ami-0fd1ee6c8b656f020
# Agent: ami-0435fcad25753a1a6
# Agent: ami-08190d9f352e9157f
# ap-southeast-2:
# Master: ami-0b62ecd3babd1c548
# Agent: ami-0067e868a8d4f8a1e
# Agent: ami-0af8a0dc1582a1f3b
eu-central-1:
Master: ami-0abbe417ed83c0b29
Agent: ami-0ec870b46499de494
Agent: ami-09d66f5a074e983ca
eu-west-1:
Master: ami-0e3f7dd2dc743e48a
Agent: ami-013c92fb90c7a7971
Agent: ami-027d2d1f92b4ba863
# eu-west-2:
# Master: ami-0d78429fb6af30994
# Agent: ami-0a5b2c87970d66d5b
# Agent: ami-0c1f3c5203afb4a8f
us-east-1:
Master: ami-0172070f66a8ebe63
Agent: ami-061ff7cfe905dfbd5
Agent: ami-053c295dd390f2a40
us-east-2:
Master: ami-0bafa3699418551cd
Agent: ami-072947196f5aaa3a5
Agent: ami-07e0b9bf1ea2852d8
us-west-2:
Master: ami-0ceeab680f529cc36
Agent: ami-0d9494a6bae62f03e
Agent: ami-0f74a44ed4ce75025

Parameters:
VpcCIDR:
4 changes: 2 additions & 2 deletions harness/determined/deploy/aws/templates/govcloud.yaml
@@ -5,10 +5,10 @@ Mappings:
RegionMap:
us-gov-east-1:
Master: ami-04ef693ebcf519dc3
Agent: ami-0b20a31b3607df583
Agent: ami-0f8b824c4dd2a429d
us-gov-west-1:
Master: ami-08bd15d820a3c087e
Agent: ami-09e52c95db3076b2a
Agent: ami-05ab84798a30661d3
Parameters:
Keypair:
Type: AWS::EC2::KeyPair::KeyName
20 changes: 10 additions & 10 deletions harness/determined/deploy/aws/templates/lore.yaml
@@ -3,35 +3,35 @@ Mappings:
RegionMap:
ap-northeast-1:
Master: ami-00910ef9457f0df47
Agent: ami-0a06577b7707632f8
Agent: ami-0581f861a00dab473
# TODO(DET-4258) Uncomment these when we fully support all P3 regions.
# ap-northeast-2:
# Master: ami-035e3e44dc41db6a2
# Agent: ami-00d8faa0b84ddbd88
# Agent: ami-0ba8919c43d1d8cab
# ap-southeast-1:
# Master: ami-0fd1ee6c8b656f020
# Agent: ami-0435fcad25753a1a6
# Agent: ami-08190d9f352e9157f
# ap-southeast-2:
# Master: ami-0b62ecd3babd1c548
# Agent: ami-0067e868a8d4f8a1e
# Agent: ami-0af8a0dc1582a1f3b
eu-central-1:
Master: ami-0abbe417ed83c0b29
Agent: ami-0ec870b46499de494
Agent: ami-09d66f5a074e983ca
eu-west-1:
Master: ami-0e3f7dd2dc743e48a
Agent: ami-013c92fb90c7a7971
Agent: ami-027d2d1f92b4ba863
# eu-west-2:
# Master: ami-0d78429fb6af30994
# Agent: ami-0a5b2c87970d66d5b
# Agent: ami-0c1f3c5203afb4a8f
us-east-1:
Master: ami-0172070f66a8ebe63
Agent: ami-061ff7cfe905dfbd5
Agent: ami-053c295dd390f2a40
us-east-2:
Master: ami-0bafa3699418551cd
Agent: ami-072947196f5aaa3a5
Agent: ami-07e0b9bf1ea2852d8
us-west-2:
Master: ami-0ceeab680f529cc36
Agent: ami-0d9494a6bae62f03e
Agent: ami-0f74a44ed4ce75025

Parameters:
VpcCIDR:
20 changes: 10 additions & 10 deletions harness/determined/deploy/aws/templates/secure.yaml
@@ -4,44 +4,44 @@ Mappings:
RegionMap:
ap-northeast-1:
Master: ami-00910ef9457f0df47
Agent: ami-0a06577b7707632f8
Agent: ami-0581f861a00dab473
Bastion: ami-00910ef9457f0df47
# TODO(DET-4258) Uncomment these when we fully support all P3 regions.
# ap-northeast-2:
# Master: ami-035e3e44dc41db6a2
# Agent: ami-00d8faa0b84ddbd88
# Agent: ami-0ba8919c43d1d8cab
# Bastion: ami-035e3e44dc41db6a2
# ap-southeast-1:
# Master: ami-0fd1ee6c8b656f020
# Agent: ami-0435fcad25753a1a6
# Agent: ami-08190d9f352e9157f
# Bastion: ami-0fd1ee6c8b656f020
# ap-southeast-2:
# Master: ami-0b62ecd3babd1c548
# Agent: ami-0067e868a8d4f8a1e
# Agent: ami-0af8a0dc1582a1f3b
# Bastion: ami-0b62ecd3babd1c548
eu-central-1:
Master: ami-0abbe417ed83c0b29
Agent: ami-0ec870b46499de494
Agent: ami-09d66f5a074e983ca
Bastion: ami-0abbe417ed83c0b29
eu-west-1:
Master: ami-0e3f7dd2dc743e48a
Agent: ami-013c92fb90c7a7971
Agent: ami-027d2d1f92b4ba863
Bastion: ami-0e3f7dd2dc743e48a
# eu-west-2:
# Master: ami-0d78429fb6af30994
# Agent: ami-0a5b2c87970d66d5b
# Agent: ami-0c1f3c5203afb4a8f
# Bastion: ami-0d78429fb6af30994
us-east-1:
Master: ami-0172070f66a8ebe63
Agent: ami-061ff7cfe905dfbd5
Agent: ami-053c295dd390f2a40
Bastion: ami-0172070f66a8ebe63
us-east-2:
Master: ami-0bafa3699418551cd
Agent: ami-072947196f5aaa3a5
Agent: ami-07e0b9bf1ea2852d8
Bastion: ami-0bafa3699418551cd
us-west-2:
Master: ami-0ceeab680f529cc36
Agent: ami-0d9494a6bae62f03e
Agent: ami-0f74a44ed4ce75025
Bastion: ami-0ceeab680f529cc36

Parameters:
20 changes: 10 additions & 10 deletions harness/determined/deploy/aws/templates/simple-rds.yaml
@@ -5,35 +5,35 @@ Mappings:
RegionMap:
ap-northeast-1:
Master: ami-00910ef9457f0df47
Agent: ami-0a06577b7707632f8
Agent: ami-0581f861a00dab473
# TODO(DET-4258) Uncomment these when we fully support all P3 regions.
# ap-northeast-2:
# Master: ami-035e3e44dc41db6a2
# Agent: ami-00d8faa0b84ddbd88
# Agent: ami-0ba8919c43d1d8cab
# ap-southeast-1:
# Master: ami-0fd1ee6c8b656f020
# Agent: ami-0435fcad25753a1a6
# Agent: ami-08190d9f352e9157f
# ap-southeast-2:
# Master: ami-0b62ecd3babd1c548
# Agent: ami-0067e868a8d4f8a1e
# Agent: ami-0af8a0dc1582a1f3b
eu-central-1:
Master: ami-0abbe417ed83c0b29
Agent: ami-0ec870b46499de494
Agent: ami-09d66f5a074e983ca
eu-west-1:
Master: ami-0e3f7dd2dc743e48a
Agent: ami-013c92fb90c7a7971
Agent: ami-027d2d1f92b4ba863
# eu-west-2:
# Master: ami-0d78429fb6af30994
# Agent: ami-0a5b2c87970d66d5b
# Agent: ami-0c1f3c5203afb4a8f
us-east-1:
Master: ami-0172070f66a8ebe63
Agent: ami-061ff7cfe905dfbd5
Agent: ami-053c295dd390f2a40
us-east-2:
Master: ami-0bafa3699418551cd
Agent: ami-072947196f5aaa3a5
Agent: ami-07e0b9bf1ea2852d8
us-west-2:
Master: ami-0ceeab680f529cc36
Agent: ami-0d9494a6bae62f03e
Agent: ami-0f74a44ed4ce75025

Parameters:
Keypair: