We provide a collection of detection models pre-trained on the COCO dataset, the Kitti dataset, the Open Images dataset, and the AVA v2.1 dataset. These models can be useful for out-of-the-box inference if you are interested in categories already in COCO (e.g., humans, cars) or in Open Images (e.g., surfboard, jacuzzi). They are also useful for initializing your models when training on novel datasets.
In the table below, we list each such pre-trained model including:
- a model name that corresponds to a config file that was used to train this model in the `samples/configs` directory,
- a download link to a tar.gz file containing the pre-trained model,
- model speed: we report running time in ms per 600x600 image (including all pre- and post-processing). These timings depend heavily on the specific hardware configuration (they were measured on an Nvidia GeForce GTX TITAN X card) and should in many cases be treated as relative timings. Also note that desktop GPU timing does not always reflect mobile run time. For example, Mobilenet V2 is faster on mobile devices than Mobilenet V1, but is slightly slower on desktop GPU.
- detector performance on a subset of the COCO validation set or the Open Images test split, as measured by the dataset-specific mAP measure. Here, higher is better, and we only report bounding box mAP rounded to the nearest integer.
- Output types (`Boxes`, and `Masks` if applicable)
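Because the reported speeds are best read as relative timings, you may want to measure latency on your own hardware. The following is a minimal sketch, not the procedure used to produce the numbers in the tables. `run_once` is a hypothetical callable that you supply; to be comparable with the description above, it should cover pre-processing, the model run, and post-processing for one 600x600 image.

```python
import time


def mean_latency_ms(run_once, warmup=2, repeats=10):
    """Return the mean wall-clock time of run_once() in milliseconds.

    run_once is a hypothetical, user-supplied callable that processes
    a single image end to end.
    """
    for _ in range(warmup):
        run_once()  # warm-up calls are excluded from the measurement
    start = time.perf_counter()
    for _ in range(repeats):
        run_once()
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / repeats
```

Warm-up iterations are discarded because the first few calls to a model often include one-time setup costs.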
You can un-tar each tar.gz file with, for example:
tar -xzvf ssd_mobilenet_v1_coco.tar.gz
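If you prefer to unpack the archive from Python rather than the shell, the standard-library `tarfile` module can do the same job. This is a minimal sketch; the helper name `extract_model_archive` is ours, and it assumes the archive has already been downloaded and comes from a source you trust.

```python
import tarfile
from pathlib import Path


def extract_model_archive(archive_path, dest_dir="."):
    """Extract a .tar.gz archive into dest_dir and return its member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter when this Python version has it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(archive_path, mode="r:gz") as archive:
        names = archive.getnames()
        archive.extractall(path=dest, **kwargs)
    return names
```

For example, `extract_model_archive("ssd_mobilenet_v1_coco.tar.gz")` unpacks the archive from the command above into the current directory.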
Inside the un-tar'ed directory, you will find:
- a graph proto (`graph.pbtxt`)
- a checkpoint (`model.ckpt.data-00000-of-00001`, `model.ckpt.index`, `model.ckpt.meta`)
- a frozen graph proto with weights baked into the graph as constants (`frozen_inference_graph.pb`) to be used for out of the box inference (try this out in the Jupyter notebook!)
- a config file (`pipeline.config`) which was used to generate the graph. These directly correspond to a config file in the `samples/configs` directory but often with a modified score threshold. In the case of the heavier Faster R-CNN models, we also provide a version of the model that uses a highly reduced number of proposals for speed.
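A quick way to confirm that an archive unpacked completely is to check for the files listed above. The sketch below uses only the standard library; the helper name `missing_model_files` is ours, and it assumes the directory layout matches the list exactly, which may not hold for every archive.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = (
    "graph.pbtxt",
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
    "frozen_inference_graph.pb",
    "pipeline.config",
)


def missing_model_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty return value means every listed file was found in the directory.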
Some remarks on frozen inference graphs:
- If you try to evaluate the frozen graph, you may find performance numbers for some of the models to be slightly lower than what we report in the tables below. This is because we discard detections with scores below a threshold (typically 0.3) when creating the frozen graph. This effectively corresponds to picking a point on the precision-recall curve of a detector (and discarding the part past that point), which negatively impacts standard mAP metrics.
- Our frozen inference graphs are generated using the v1.8.0 release version of TensorFlow, and we do not guarantee that they will work with other versions. That said, each frozen inference graph can be regenerated using your current version of TensorFlow by re-running the exporter, pointing it at the model directory as well as the corresponding config file in `samples/configs`.
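To illustrate the first remark, the sketch below computes average precision for a single class before and after discarding detections below a score threshold. It is a simplified illustration, not the evaluation code behind the reported numbers: the detections are made-up toy data, and `average_precision` is our own helper using all-point interpolation of the precision-recall curve.

```python
def average_precision(detections, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class.

    detections: list of (score, is_true_positive) pairs.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    true_pos = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        true_pos += int(is_tp)
        precisions.append(true_pos / rank)
        recalls.append(true_pos / num_ground_truth)
    # Interpolate: replace each precision by the best precision
    # achieved at the same or any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas wherever recall increases.
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


SCORE_THRESHOLD = 0.3  # the typical cut-off mentioned above

# Toy data (made up for illustration): (score, is_true_positive).
detections = [
    (0.9, True), (0.8, True), (0.6, False), (0.4, True),
    (0.25, True), (0.2, False), (0.1, True),
]
kept = [d for d in detections if d[0] >= SCORE_THRESHOLD]

full_ap = average_precision(detections, num_ground_truth=5)
filtered_ap = average_precision(kept, num_ground_truth=5)
print(f"AP with all detections: {full_ap:.3f}")
print(f"AP after thresholding:  {filtered_ap:.3f}")
```

The low-scoring true positives that the threshold removes would have extended the curve to higher recall, so the thresholded AP comes out lower.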
Note: The star symbol (☆) at the end of a model name indicates that this model supports TPU training.

Model name | Speed (ms) | Pascal mAP@0.5 | Outputs |
---|---|---|---|
faster_rcnn_resnet101_kitti | 79 | 87 | Boxes |

Model name | Speed (ms) | Open Images mAP@0.5 [2] | Outputs |
---|---|---|---|
faster_rcnn_inception_resnet_v2_atrous_oid | 727 | 37 | Boxes |
faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid | 347 | | Boxes |

Model name | Speed (ms) | Pascal mAP@0.5 | Outputs |
---|---|---|---|
faster_rcnn_resnet101_ava_v2.1 | 93 | 11 | Boxes |

Footnotes

2. This is PASCAL mAP with a slightly different way of computing true positives: see the Open Images evaluation protocol.