- Kubernetes
- NVIDIA GPU Node(s)
- NVIDIA GPU Operator
- Kube Prometheus Stack
- Docker image repo with push/pull authorization
- Docker Compose v3
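Before deploying, you can sanity-check the cluster prerequisites. A minimal sketch, assuming the GPU Operator and Kube Prometheus Stack were installed into `gpu-operator` and `monitoring` namespaces (namespace names are assumptions; adjust to match your install):

```bash
# Confirm GPU nodes advertise the nvidia.com/gpu resource.
kubectl get nodes -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'

# Confirm the GPU Operator and monitoring pods are healthy (namespaces are assumptions).
kubectl get pods -n gpu-operator
kubectl get pods -n monitoring
```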
Deploy up to ten models using Triton Inference Server on Kubernetes, with load balancing provided by Traefik and horizontal pod autoscaling driven by metrics from the Triton metrics server.
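Triton exports Prometheus metrics (on port 8002 by default), which the Kube Prometheus Stack scrapes and the horizontal pod autoscaler consumes. A minimal sketch for inspecting them after deployment, assuming a Triton pod label of `app=triton` (the label is an assumption; adjust for your chart):

```bash
# In one shell: forward the metrics port from one of the Triton pods.
kubectl port-forward "$(kubectl get pods -l app=triton -o name | head -n 1)" 8002:8002

# In another shell: inspect the raw inference metrics, e.g. request counts and queue durations.
curl -s localhost:8002/metrics | grep nv_inference

# Check what the autoscaler currently sees.
kubectl get hpa
```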
| Name | Platform | Inputs | Outputs | Max Batch |
| --- | --- | --- | --- | --- |
| densenet-onnx | onnxruntime_onnx | 1 | 1 | 1 |
| inception-graphdef | tensorflow_graphdef | 1 | 1 | 128 |
| log-parsing-onnx | onnxruntime_onnx | 2 | 1 | 32 |
| phishing-bert-onnx | onnxruntime_onnx | 2 | 1 | 32 |
| sid-minibert-onnx | onnxruntime_onnx | 2 | 1 | 32 |
| simple | tensorflow_graphdef | 2 | 2 | 8 |
| simple-dyna-sequence | tensorflow_graphdef | 1 | 1 | 1 |
| simple-identity | tensorflow_savedmodel | 1 | 1 | 8 |
| simple-int8 | tensorflow_graphdef | 2 | 2 | 8 |
| simple-sequence | tensorflow_graphdef | 1 | 1 | 1 |
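Once the server is running, each model's inputs, outputs, and batch limits can be checked against this table through Triton's HTTP API. A minimal sketch, assuming the server is reachable on `localhost:8000` (e.g. via `kubectl port-forward`):

```bash
# Model metadata: names, datatypes, and shapes of inputs and outputs.
curl -s localhost:8000/v2/models/simple | python3 -m json.tool

# Model configuration, including max_batch_size.
curl -s localhost:8000/v2/models/simple/config | python3 -m json.tool
```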
With the exception of `inception-graphdef` and `densenet-onnx`, the models are stored using Git LFS. To download them, you must have Git LFS installed.
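If you are unsure whether Git LFS is available on your machine, you can check before cloning:

```bash
# Prints the installed Git LFS version, or fails if LFS is missing.
git lfs version

# One-time setup that registers the LFS filters with Git.
git lfs install
```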
- Clone this repository.

  ```bash
  git clone https://github.com/tuttlebr/triton-server-demo.git
  cd triton-server-demo
  ```
- Update the model repository with the models stored on Git LFS.

  ```bash
  git lfs pull
  ```
- Build the required Docker images and download the `inception-graphdef` and `densenet-onnx` models. This script also pushes the containers to a repository accessible by your Kubernetes nodes (update the repository reference as needed) and deploys the Helm chart; see the registry sketch below.

  ```bash
  bash 00-start.sh
  ```
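  If your nodes pull from a private registry, point the images there before running the script. A hypothetical sketch (the registry name and image tag below are placeholders, not values from this repository):

  ```bash
  # Log in to the registry your Kubernetes nodes can reach.
  docker login registry.example.com

  # Retag and push an image so the cluster can pull it (names are illustrative).
  docker tag triton-client:latest registry.example.com/triton-client:latest
  docker push registry.example.com/triton-client:latest
  ```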
- Query the Triton Server. It may take a while for all pods to come online and for Triton to load all models; you can rerun this script to check the status.

  ```bash
  bash 01-querytriton.sh
  ```
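  You can also watch progress directly while waiting. A minimal sketch, assuming the default namespace and a `kubectl port-forward` to Triton's HTTP port (8000):

  ```bash
  # Watch the pods come online.
  kubectl get pods -w

  # Triton returns HTTP 200 from this endpoint once all models are loaded and ready.
  curl -sf localhost:8000/v2/health/ready && echo "server ready"
  ```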
- Run the NVIDIA `perf_analyzer` from within a Triton Client pod.

  ```bash
  bash 02-perfanalyzer.sh
  ```
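  To experiment beyond the script, `perf_analyzer` can also be invoked directly inside the client pod. A minimal sketch (the model name, service URL, and concurrency range are illustrative):

  ```bash
  # Measure throughput and latency for one model across increasing concurrency.
  # "triton-server" is a placeholder for your Triton service name.
  perf_analyzer -m simple -u triton-server:8000 --concurrency-range 1:4
  ```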
- Monitor the Triton Server by importing the `triton-server-dashboard.json` file into Grafana as a dashboard.
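  To reach the Grafana instance from the Kube Prometheus Stack, a port-forward is usually enough. A minimal sketch, assuming the chart's default service name and a `monitoring` namespace (both assumptions; adjust for your install):

  ```bash
  # Forward Grafana locally, then import triton-server-dashboard.json via
  # Dashboards -> Import in the UI at http://localhost:3000.
  kubectl port-forward svc/kube-prometheus-stack-grafana 3000:80 -n monitoring
  ```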