This repository contains Dockerfiles, scripts, YAML files, Helm charts, and other resources used to scale out AI containers with versions of TensorFlow and PyTorch that have been optimized for Intel platforms. Scaling is done with Python, Docker, Kubernetes, Kubeflow, cnvrg.io, Helm, and other container orchestration frameworks for use in the cloud and on-premises.

intel/ai-containers

AI Containers

Project Setup

Define your project's registry and repository each time you use the project:

# REGISTRY/REPO:TAG
export REGISTRY=<registry_name>
export REPO=<repo_name>

docker login $REGISTRY

# Verify your access permissions
docker pull $REGISTRY/$REPO:latest

The maintainers of AI Containers store their images in Azure, but an open-source container registry such as Harbor is preferred.

Warning

You can skip this step and use placeholder values. However, some container groups depend on other images, so they will try to pull from a registry that you have not defined and fail with an error.
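The failure described in the warning can be caught before any build starts. The following is a minimal sketch, not part of this repository: image_ref is a hypothetical helper that stops early when REGISTRY or REPO is unset and otherwise prints the REGISTRY/REPO:TAG reference used in the commands above.

```shell
# Hypothetical helper (not part of this repository): compose the image
# reference REGISTRY/REPO:TAG, failing early if either variable is unset.
image_ref() {
  tag="${1:-latest}"
  if [ -z "${REGISTRY:-}" ] || [ -z "${REPO:-}" ]; then
    echo "REGISTRY and REPO must be set before pulling or pushing" >&2
    return 1
  fi
  printf '%s/%s:%s\n' "$REGISTRY" "$REPO" "$tag"
}

# Example usage (requires Docker and access to the registry):
#   ref=$(image_ref latest) && docker pull "$ref"
```

Chaining with && keeps docker pull from running with an empty reference when the check fails.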

Set Up Docker Engine

You'll need to install Docker Engine on your development system. Note that while Docker Engine is free to use, Docker Desktop may require you to purchase a license. See the Docker Engine Server installation instructions for details.

Set Up Docker Compose

Ensure you have Docker Compose installed on your machine. If you don't have this tool installed, consult the official Docker Compose installation documentation.

# Install the Compose CLI plugin for the current user
DOCKER_CONFIG=${DOCKER_CONFIG:-$HOME/.docker}
mkdir -p "$DOCKER_CONFIG/cli-plugins"
curl -SL https://github.com/docker/compose/releases/download/v2.26.1/docker-compose-linux-x86_64 -o "$DOCKER_CONFIG/cli-plugins/docker-compose"
chmod +x "$DOCKER_CONFIG/cli-plugins/docker-compose"
docker compose version

Caution

Docker Compose v2.25.0 is the minimum required version for some container groups.
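The minimum version can be checked in a script before building. This is a minimal sketch under stated assumptions: version_ge is a hypothetical helper, and it relies on the version sort provided by sort -V in GNU coreutils, which may be missing from some minimal systems.

```shell
# Hypothetical helper: succeed when version $1 is at least version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  have="${1#v}"   # tolerate a leading "v", as in "v2.26.1"
  want="${2#v}"
  # If the required version sorts first (or both are equal), $1 is new enough.
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Example usage (requires Docker Compose to be installed):
#   version_ge "$(docker compose version --short)" 2.25.0 \
#     || echo "Docker Compose v2.25.0 or newer is required" >&2
```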

Build Containers

Select your framework of choice (TensorFlow*, PyTorch*, Classical ML) and run the docker compose commands:

cd <framework>
docker compose up --build

To configure these containers, prefix the docker compose command with the relevant environment variable, based on the build arguments in the compose file. For example:

# I want to build ipex-base with Intel® Distribution for Python
cd pytorch
PACKAGE_OPTION=idp docker compose up --build ipex-base

Note

If you didn't specify REGISTRY or REPO, you also need to add the idp service to the list of services in the command so that the dependent Python image is built locally.
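The note above can be folded into the build command. The following is a minimal sketch, assuming the dependent Python image is built by a compose service named idp as the note states; services_to_build is a hypothetical helper and not part of this repository.

```shell
# Hypothetical helper: print the services to pass to `docker compose up`.
# When REGISTRY or REPO is unset, the dependent image cannot be pulled,
# so the idp service is added to build it locally.
services_to_build() {
  target="$1"
  if [ -z "${REGISTRY:-}" ] || [ -z "${REPO:-}" ]; then
    printf 'idp %s\n' "$target"
  else
    printf '%s\n' "$target"
  fi
}

# Example usage (requires Docker Compose; run from the pytorch folder).
# The substitution is left unquoted on purpose so each service name
# becomes a separate argument:
#   PACKAGE_OPTION=idp docker compose up --build $(services_to_build ipex-base)
```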

Test Containers

To test the containers, use the Test Runner Framework:

# I want to test ipex-base with Intel® Distribution for Python
# 1. build the container in the above section
# 2. push it to a relevant registry
PACKAGE_OPTION=idp docker compose push ipex-base
cd ..
# 3. install the test runner python requirements
pip install -r test-runner/requirements.txt
# 4. Run the test file
PACKAGE_OPTION=idp python test-runner/test_runner.py -f pytorch/tests/tests.yaml

Tip

To test a container built by GitHub Actions CI/CD, find the run number associated with the workflow run and set the GITHUB_RUN_NUMBER environment variable during execution to pull the desired image.

Deploy Containers

Install Helm

This assumes you've set up kubectl and have a KUBECONFIG.

curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 && \
chmod 700 get_helm.sh && \
./get_helm.sh

Deploy a Helm Chart

cd workflows/charts
# Select a Chart and check its README for a list of customization options and other steps required.
helm install <name> \
  --namespace=<namespace> \
  --set <key>=<value> \
  <chart-folder>
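When a chart needs several --set overrides, it can help to assemble and review the command before running it. This is a minimal sketch only: helm_install_cmd is a hypothetical helper, and the release name, namespace, chart folder, and keys in the usage line are placeholders, not values from this repository. It assumes the values contain no spaces.

```shell
# Hypothetical helper: print a `helm install` command line built from a
# release name, a namespace, a chart folder, and any number of key=value
# overrides. Printing first lets you review the command before running it.
helm_install_cmd() {
  if [ "$#" -lt 3 ]; then
    echo "usage: helm_install_cmd <name> <namespace> <chart-folder> [key=value ...]" >&2
    return 2
  fi
  name="$1"; namespace="$2"; chart="$3"
  shift 3
  cmd="helm install $name --namespace=$namespace"
  for kv in "$@"; do
    cmd="$cmd --set $kv"
  done
  printf '%s %s\n' "$cmd" "$chart"
}

# Example usage with placeholder values:
#   helm_install_cmd demo default ./my-chart replicas=2
```

Consult the selected chart's README for the keys it actually supports.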

Test a Helm Chart

Install Chart Testing.

pip install -r workflows/charts/dev-requirements.txt
brew install chart-testing

Utilize the ct CLI to run helm lint, helm install, and helm test.

ct lint-and-install --namespace=<namespace> --config .github/ct.yaml --charts workflows/charts/<chart>

Troubleshooting

  • See the Docker Troubleshooting Article.
  • Verify that Docker Engine Post-Install Steps are completed.
  • If you encounter a socket error, check the user's group membership and ensure the user is part of the docker group.
  • After changing any Docker files or configuration, restart the Docker service: sudo systemctl restart docker
  • On Windows, enable Docker Desktop for WSL 2.
  • If you are trying to access a container UI from the browser, make sure the port is forwarded, then reconnect.
  • If your environment requires a proxy to access the internet, export your development system's proxy settings to the docker environment:
export DOCKER_BUILD_ARGS="--build-arg ftp_proxy=${ftp_proxy} \
  --build-arg FTP_PROXY=${FTP_PROXY} --build-arg http_proxy=${http_proxy} \
  --build-arg HTTP_PROXY=${HTTP_PROXY} --build-arg https_proxy=${https_proxy} \
  --build-arg HTTPS_PROXY=${HTTPS_PROXY} --build-arg no_proxy=${no_proxy} \
  --build-arg NO_PROXY=${NO_PROXY} --build-arg socks_proxy=${socks_proxy} \
  --build-arg SOCKS_PROXY=${SOCKS_PROXY}"
export DOCKER_RUN_ENVS="-e ftp_proxy=${ftp_proxy} \
  -e FTP_PROXY=${FTP_PROXY} -e http_proxy=${http_proxy} \
  -e HTTP_PROXY=${HTTP_PROXY} -e https_proxy=${https_proxy} \
  -e HTTPS_PROXY=${HTTPS_PROXY} -e no_proxy=${no_proxy} \
  -e NO_PROXY=${NO_PROXY} -e socks_proxy=${socks_proxy} \
  -e SOCKS_PROXY=${SOCKS_PROXY}"
docker build $DOCKER_BUILD_ARGS -t my:tag .
docker run $DOCKER_RUN_ENVS --rm -it my:tag
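The exports above pass every proxy variable, including ones that are empty on your system. As an alternative, the following minimal sketch emits flags only for the variables that are actually set. proxy_build_args is a hypothetical helper, not part of this repository, and it assumes proxy values contain no spaces.

```shell
# Hypothetical helper: emit --build-arg flags only for proxy variables
# that are set and non-empty in the current shell.
proxy_build_args() {
  args=""
  for name in ftp_proxy FTP_PROXY http_proxy HTTP_PROXY https_proxy \
              HTTPS_PROXY no_proxy NO_PROXY socks_proxy SOCKS_PROXY; do
    # Indirect lookup; safe because $name comes from the fixed list above.
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      args="${args:+$args }--build-arg $name=$value"
    fi
  done
  printf '%s\n' "$args"
}

# Example usage (requires Docker). The substitution is left unquoted on
# purpose so each flag becomes a separate argument:
#   docker build $(proxy_build_args) -t my:tag .
```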

Support

The Intel AI MLOps team tracks bugs and enhancement requests using GitHub issues. Before submitting a suggestion or bug report, search the existing GitHub issues to see if your issue has already been reported.

