The example script `image_classification.py` runs inference using a number of popular image classification models. This script is included in the NVIDIA TensorFlow Docker containers under `/workspace/nvidia-examples`. See Preparing To Use NVIDIA Containers for more information.
You can enable TF-TRT integration by passing the `--use_trt` flag to the script. This causes the script to apply TensorRT inference optimization to speed up execution for the supported portions of the model's graph, and to fall back on native TensorFlow for layers and operations that are not supported. See the Accelerating Inference In TensorFlow With TensorRT User Guide for more information.
When using the TF-TRT integration flag, you can control precision with the `--precision` option. float32 is the default (`--precision fp32`), with float16 (`--precision fp16`) or int8 (`--precision int8`) allowing further performance improvements.

int8 mode requires a calibration step (which is done automatically), but you must also specify the directory in which the calibration dataset is stored with `--calib_data_dir /imagenet_validation_data`. You can use the same data for both calibration and validation.
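For example, an INT8 run with calibration might look like the following sketch; the dataset paths are placeholders for wherever your ImageNet validation TFRecords live:

```bash
# INT8 inference with TF-TRT; calibration runs automatically, but the
# calibration dataset directory must be supplied. Both paths below are
# placeholders -- point them at your ImageNet validation TFRecords.
python image_classification.py --model resnet_v1_50 \
    --data_dir /data/imagenet/train-val-tfrecord \
    --calib_data_dir /data/imagenet/train-val-tfrecord \
    --use_trt \
    --precision int8 \
    --mode validation
```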
We have verified the following models:
- MobileNet v1
- MobileNet v2
- NASNet - Large
- NASNet - Mobile
- ResNet50 v1
- ResNet50 v2
- VGG16
- VGG19
- Inception v3
- Inception v4
For the accuracy numbers of these models on the ImageNet validation dataset, see Verified Models.
If you are running these examples within the NVIDIA TensorFlow Docker container under `/workspace/nvidia-examples/tensorrt/tftrt/examples/image-classification`, run the `install_dependencies.sh` setup script. Then skip below to the Data section.
```bash
cd /workspace/nvidia-examples/tensorrt/tftrt/examples/image-classification
./install_dependencies.sh
cd ../third_party/models
export PYTHONPATH="$PYTHONPATH:$PWD"
```
If you are running these examples within your own TensorFlow environment, perform the following steps:
```bash
# Clone this repository (tensorflow/tensorrt) if you haven't already.
git clone https://github.com/tensorflow/tensorrt.git

# Clone tensorflow/models.
git clone https://github.com/tensorflow/models.git

# Add the models directory to PYTHONPATH to install tensorflow/models.
cd models
export PYTHONPATH="$PYTHONPATH:$PWD"

# Run the TensorFlow Slim setup.
cd research/slim
python setup.py install

# Install the requests package.
pip install requests
```
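If you want a quick sanity check that the TF Slim model code is importable (optional; `nets_factory` is part of the Slim code under `research/slim`, and this check is only a suggestion, not something the example requires):

```bash
# Optional: verify that the TF Slim model definitions can be imported.
# If this fails, re-check the PYTHONPATH export and the Slim setup step above.
python -c "from nets import nets_factory; print('TF Slim models found')"
```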
The `PYTHONPATH` environment variable is not saved between shell sessions. To avoid having to set `PYTHONPATH` in each new shell session, you can add the following line to your `.bashrc` file:

```bash
export PYTHONPATH="$PYTHONPATH:/path/to/tensorflow_models"
```

replacing `/path/to/tensorflow_models` with the path to your `tensorflow/models` repository.
Also see Setting Up The Environment for more information.
The example script supports either using a dataset (TFRecord format for validation mode, JPEG format for benchmark mode) or using autogenerated synthetic data (with the `--use_synthetic` flag). If you use TFRecord files, the script assumes that the TFRecords are named according to the pattern `validation-*-of-00128`.
Note: The reported accuracy numbers are the results of running the scripts on the ImageNet validation dataset.
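For instance, a benchmark-only run on synthetic data could look like the following sketch; since the data is generated on the fly, no ImageNet download should be needed for it (check `--help` for the exact requirements of your version of the script):

```bash
# Benchmark mode reports performance only; --use_synthetic generates the
# input data, so no dataset directory is passed in this sketch.
python image_classification.py --model resnet_v1_50 \
    --use_trt \
    --precision fp16 \
    --mode benchmark \
    --use_synthetic
```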
To download and process the ImageNet data, you can:
- Use the scripts provided in the `nvidia-examples/build_imagenet_data` directory under the NVIDIA TensorFlow Docker container's `/workspace` directory. Follow the `README` file in that directory for instructions on how to use these scripts.

or

- Use the scripts provided by TF Slim in the `tensorflow/models` repository at `research/slim`. Consult the `README` file under `research/slim` for instructions on how to use these scripts. Note that these scripts download both the training and validation sets, and this example only requires the validation set.
Also see Obtaining The ImageNet Data for more information.
You can run the examples as a Jupyter notebook (`image-classification.ipynb`) from this directory:

```bash
jupyter notebook --ip=0.0.0.0
```
If you want to run these examples as a Jupyter notebook within an NVIDIA TensorFlow Docker container, first run the container with the `--publish 0.0.0.0:8888:8888` option to publish Jupyter's port `8888` to the host machine at port `8888` on all network interfaces (`0.0.0.0`). Then use the following command in the `/workspace/nvidia-examples/tensorrt/tftrt/examples/image-classification` directory:

```bash
jupyter notebook --ip=0.0.0.0 --allow-root
```
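For reference, a container launch that publishes the port could look like the following sketch; the image tag is a placeholder for the NVIDIA TensorFlow container release you are using, and the exact GPU runtime flag depends on your Docker setup:

```bash
# Placeholder container launch: substitute the <xx.xx> tag with your release.
# Older Docker installs may need nvidia-docker or --runtime=nvidia instead of
# --gpus all.
docker run --gpus all -it --rm \
    --publish 0.0.0.0:8888:8888 \
    nvcr.io/nvidia/tensorflow:<xx.xx>-py3
```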
The main Python script is `image_classification.py`. Assuming that the ImageNet validation data are located under `/data/imagenet/train-val-tfrecord`, you can evaluate inference with TF-TRT integration using the pre-trained ResNet V1 50 model as follows:
```bash
python image_classification.py --model resnet_v1_50 \
    --data_dir /data/imagenet/train-val-tfrecord \
    --use_trt \
    --precision fp16 \
    --mode validation
```
Where:

- `--model`: Which model to use to run inference, in this case ResNet V1 50.
- `--data_dir`: Path to the ImageNet TFRecord validation files.
- `--use_trt`: Convert the graph to a TensorRT graph.
- `--precision`: Precision mode to use, in this case FP16.
- `--mode`: Which mode to use (`validation` or `benchmark`). In validation mode the script runs inference with both accuracy and performance measurements; in benchmark mode it measures performance only.

Run with `--help` to see all available options.
Also see General Script Usage for more information.
The script first loads the pre-trained model. If given the flag `--use_trt`, the model is converted to a TensorRT graph, and the script displays (in addition to its initial configuration options):

- the number of nodes before conversion (`num_nodes(native_tf)`)
- the number of nodes after conversion (`num_nodes(trt_total)`)
- the number of separate TensorRT nodes (`num_nodes(trt_only)`)
- the size of the graph before conversion (`graph_size(MB)(native_tf)`)
- the size of the graph after conversion (`graph_size(MB)(trt)`)
- how long the conversion took (`time(s)(trt_conversion)`)
For example:

```
num_nodes(native_tf): 741
num_nodes(trt_total): 10
num_nodes(trt_only): 1
graph_size(MB)(native_tf): ***
graph_size(MB)(trt): ***
time(s)(trt_conversion): ***
```
Note: For a list of supported operations that can be converted to a TensorRT graph, see the Supported Ops section of the Accelerating Inference In TensorFlow With TensorRT User Guide.
The script then begins running inference on the ImageNet validation set, displaying the run time of each iteration after the interval defined by the `--display_every` option (default: `100`):
```
running inference...
step 100/6202, iter_time(ms)=**.****, images/sec=***
step 200/6202, iter_time(ms)=**.****, images/sec=***
step 300/6202, iter_time(ms)=**.****, images/sec=***
...
```
On completion, the script prints overall accuracy and timing information over the inference session:
```
results of resnet_v1_50:
accuracy: 75.95
images/sec: ***
99th_percentile(ms): ***
total_time(s): ***
latency_mean(ms): ***
```
The accuracy metric measures the percentage of predictions from inference that match the labels on the ImageNet validation set. The remaining metrics capture various performance measurements:

- the number of images processed per second (`images/sec`)
- the total time of the inference session (`total_time(s)`)
- the mean duration of each iteration (`latency_mean(ms)`)
- the 99th percentile of iteration durations (`99th_percentile(ms)`)