The FIL backend ships as part of Triton and can be installed via the methods described in the main Triton documentation. To get up and running quickly with a Triton Docker image, follow these steps.
Note: Looking for instructions to build the FIL backend yourself? Check out our build guide.
Triton containers are available from NGC and may be pulled down via

```bash
docker pull nvcr.io/nvidia/tritonserver:22.10-py3
```
Note that the FIL backend cannot be used in the 21.06 version of this container; the 21.06.1 patch release is the earliest Triton version with a working FIL backend implementation.
To actually deploy a model, you will need to provide the serialized model and its configuration file in a specially structured directory called the "model repository." Check out the configuration guide for details on how to do this for your model.
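As a sketch of what that structure looks like, the commands below create a minimal repository skeleton. The model name `fil_model` and the serialized file name `xgboost.json` are hypothetical examples; substitute the names appropriate for your own model, and see the configuration guide for what `config.pbtxt` must contain.

```bash
# Create a repository with one model and one numbered version directory.
# (Names "fil_model" and "xgboost.json" are illustrative placeholders.)
mkdir -p model_repository/fil_model/1
touch model_repository/fil_model/config.pbtxt
# Copy your serialized model into the version directory, e.g.:
# cp xgboost.json model_repository/fil_model/1/
```

The resulting layout is `model_repository/fil_model/config.pbtxt` alongside `model_repository/fil_model/1/`, which holds the serialized model itself.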
Assuming your model repository is on the host system, you can bind-mount it into the container and start the server with the following command (ports 8000, 8001, and 8002 are Triton's default HTTP, gRPC, and metrics ports, respectively):

```bash
docker run --gpus all -p 8000:8000 -p 8001:8001 -p 8002:8002 -v ${MODEL_REPO}:/models --name tritonserver nvcr.io/nvidia/tritonserver:22.10-py3 tritonserver --model-repository=/models
```
Remember that bind-mounts require an absolute path to the host directory, so `${MODEL_REPO}` should be replaced by the absolute path to the model repository directory on the host.
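Once the container is up, you can check that the server has finished loading your models and is ready to accept requests. Triton exposes a standard health endpoint on its HTTP port (8000 in the command above):

```bash
# Prints 200 once the server is ready to serve inference requests;
# prints 000 if nothing is listening on the port yet.
curl -s -o /dev/null -w "%{http_code}\n" localhost:8000/v2/health/ready
```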
Assuming you started your container with the name "tritonserver" as in the snippet above, you can bring the server down and remove the container with:

```bash
docker rm -f tritonserver
```