diff --git a/LICENSE b/LICENSE index 04c6c8c..acf9694 100644 --- a/LICENSE +++ b/LICENSE @@ -218,10 +218,10 @@ Apache License Advanced Micro Devices software license terms, and open source software license terms. These separate license terms govern your use of the third party programs as set forth in the "THIRD-PARTY-PROGRAMS" file. - -=============================================================================== - -ADVANCED MICRO DEVICES, INC. + + ========================================================================= + + ADVANCED MICRO DEVICES, INC. LICENSE AGREEMENT FOR NON-COMMERCIAL MODELS @@ -298,14 +298,13 @@ OFA-depthwise-resnet50, This License Agreement for Non-Commercial Models (“Agreement”) is a legal agreement between you (either an individual or an entity) and Advanced Micro Devices, Inc. on behalf of itself and its subsidiaries and affiliates (collectively -“AMD”). DO NOT USE THE TRAINED MODELS IDENTIFIED ABOVE UNTIL YOU HAVE CAREFULLY -READ THIS AGREEMENT. BY USING, INSTALLING, MODIFYING, COPYING, TRAINING, -BENCHMARKING, OR DISTRIBUTING THE TRAINED MODELS, YOU AGREE TO AND ACCEPT ALL -TERMS AND CONDITIONS OF THIS AGREEMENT. If you do not accept these terms, do not -use the Trained Models. - -1. Subject to your compliance with this Agreement, AMD grants you a license to -use, modify, and distribute the Trained Models solely for non-commercial and research +“AMD”). DO NOT USE THE TRAINED MODELS IDENTIFIED ABOVE UNTIL YOU HAVE CAREFULLY READ +THIS AGREEMENT. BY USING, INSTALLING, MODIFYING, COPYING, TRAINING, BENCHMARKING, OR +DISTRIBUTING THE TRAINED MODELS, YOU AGREE TO AND ACCEPT ALL TERMS AND CONDITIONS OF +THIS AGREEMENT. If you do not accept these terms, do not use the Trained Models. + +1. Subject to your compliance with this Agreement, AMD grants you a license to use, +modify, and distribute the Trained Models solely for non-commercial and research purposes. 
This means you may use the Trained Models for benchmarking, testing, and evaluating the Trained Models (including non-commercial research undertaken by or funded by a commercial entity) but you cannot use the Trained Models in any commercial @@ -314,17 +313,18 @@ exchange for money or other consideration. 2. Your license to the Trained Models is subject to the following conditions: (a) you cannot alter any copyright, trademark, or other notice in the Trained Models; -(b) you cannot sublicense or distribute the Trained Models under any other terms or conditions; -(c) you cannot use AMD’s trademarks in your applications or technologies in a way that suggests -your applications or technologies are endorsed by AMD; (d) if you distribute a Trained Model, -you must provide corresponding source code for such Trained Model; and (e) if the -Trained Models include any code or content subject to an open source license or third party -license (“Third Party Materials”), you agree to comply with such license terms. - -3. THE TRAINED MODELS (INCLUDING THIRD PARTY MATERIALS, IF ANY) ARE PROVIDED “AS IS” -AND WITHOUT A WARRANTY OF ANY KIND, WHETHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED -TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. -YOU BEAR ALL RISK OF USING THE TRAINED MODELS (INCLUDING THIRD PARTY MATERIALS, IF ANY) AND -YOU AGREE TO RELEASE AMD FROM ANY LIABILITY OR DAMAGES FOR ANY CLAIM OR ACTION ARISING OUT -OF OR IN CONNECTION WITH YOUR USE OF THE TRAINED MODELS AND/OR THIRD PARTY MATERIALS. 
+(b) you cannot sublicense or distribute the Trained Models under any other terms or conditions; +(c) you cannot use AMD’s trademarks in your applications or technologies in a way that suggests +your applications or technologies are endorsed by AMD; (d) if you distribute a Trained Model, +you must provide corresponding source code for such Trained Model; and +(e) if the Trained Models include any code or content subject to an open source license or +third party license (“Third Party Materials”), you agree to comply with such license terms. + +3. THE TRAINED MODELS (INCLUDING THIRD PARTY MATERIALS, IF ANY) ARE PROVIDED “AS IS” AND +WITHOUT A WARRANTY OF ANY KIND, WHETHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO +THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. +YOU BEAR ALL RISK OF USING THE TRAINED MODELS (INCLUDING THIRD PARTY MATERIALS, IF ANY) +AND YOU AGREE TO RELEASE AMD FROM ANY LIABILITY OR DAMAGES FOR ANY CLAIM OR ACTION ARISING +OUT OF OR IN CONNECTION WITH YOUR USE OF THE TRAINED MODELS AND/OR THIRD PARTY MATERIALS. + diff --git a/README.md b/README.md index ab368de..3d72fe2 100644 --- a/README.md +++ b/README.md @@ -1,77 +1,71 @@ -

Unified Inference Frontend (UIF) 1.1 User Guide

+

Unified Inference Frontend (UIF) 1.2 User Guide

# Unified Inference Frontend -Unified Inference Frontend (UIF) is an effort to consolidate the following compute platforms under one AMD inference solution with unified tools and runtime: +Unified Inference Frontend (UIF) consolidates the following compute platforms under one AMD inference solution with unified tools and runtime: -- AMD EPYC™ processors -- AMD Instinct™ GPUs -- AMD Ryzen™ processors -- Versal™ ACAP +- AMD EPYC™ and AMD Ryzen™ processors +- AMD Instinct™ and AMD Radeon™ GPUs +- AMD Versal™ Adaptive SoCs - Field Programmable Gate Arrays (FPGAs) -UIF accelerates deep learning inference applications on all AMD compute platforms for popular machine learning frameworks, including TensorFlow, PyTorch, and ONNXRT. It consists of tools, libraries, models, and example designs optimized for AMD platforms that enable deep learning applications and framework developers to improve inference performance across various workloads such as computer vision, natural language processing, and recommender systems. +UIF accelerates deep learning inference applications on all AMD compute platforms for popular machine learning frameworks, including TensorFlow, PyTorch, and ONNXRT. It consists of tools, libraries, models, and example designs optimized for AMD platforms. These enable deep learning application and framework developers to enhance inference performance across various workloads, including computer vision, natural language processing, and recommender systems. +# Release Highlights -![](/images/slide24.png) - -* **Note:** WinML is supported on Windows OS only. - -# Unified Inference Frontend 1.1 - -UIF 1.1 extends the support to AMD Instinct GPUs in addition to EPYC CPUs starting from UIF 1.0. Currently, [MIGraphX](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX) is the acceleration library for Instinct GPUs for Deep Learning Inference. UIF 1.1 provides 45 optimized models for Instinct GPUs and 84 for EPYC CPUs. 
The Vitis™ AI Optimizer tool is released as part of the Vitis AI 3.0 stack. UIF Quantizer is released in the PyTorch and TensorFlow Docker® images. Leveraging the UIF Optimizer and Quantizer enables performance benefits for customers when running with the MIGraphX and ZenDNN backends for Instinct GPUs and EPYC CPUs, respectively. This release also adds MIGraphX backend for [AMD Inference Server](https://github.com/Xilinx/inference-server). This document provides information about downloading, building, and running the UIF 1.1 release. - -## AMD Instinct GPU - -UIF 1.1 targets support for AMD GPUs. While UIF 1.0 enabled Vitis AI Model Zoo for TensorFlow+ZenDNN and PyTorch+ZenDNN, UIF v1.1 adds support for AMD Instinct™ GPUs. +UIF 1.2 adds support for AMD Radeon™ GPUs in addition to AMD Instinct™ GPUs. Currently, [MIGraphX](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX) is the acceleration library for both Radeon and Instinct GPUs for Deep Learning Inference. UIF supports 50 optimized models for Instinct and Radeon GPUs and 84 for EPYC CPUs. The AMD Vitis™ AI Optimizer tool is released as part of the Vitis AI 3.5 stack. UIF Quantizer is released in the PyTorch and TensorFlow Docker® images. Leveraging the UIF Optimizer and Quantizer enables performance benefits for customers when running with the MIGraphX and ZenDNN backends for Instinct and Radeon GPUs and EPYC CPUs, respectively. This release also adds MIGraphX backend for [AMD Inference Server](https://github.com/Xilinx/inference-server). This document provides information about downloading, building, and running the UIF v1.2 release. -UIF 1.1 also introduces tools for optimizing inference models. GPU support includes the ability to use AMD GPUs for optimizing inference as well the ability to deploy inference using the AMD ROCm™ platform. Additionally, UIF 1.1 has expanded the set of models available for AMD CPUs and introduces new models for AMD GPUs as well. 
+The highlights of this release are as follows: -# Release Highlights +AMD Radeon™ GPU: +* Support for AMD Radeon™ PRO V620 and W6800 GPUs. + For more information about the product, see https://www.amd.com/en/products/professional-graphics/amd-radeon-pro-w6800. +* Tools for optimizing inference models and deploying inference using the AMD ROCm™ platform. +* Inclusion of the [rocAL](https://docs.amd.com/projects/rocAL/en/docs-5.5.0/user_guide/ch1.html) library. -The highlights of this release are as follows: +Model Zoo: +* Expanded set of models for AMD CPUs and new models for AMD GPUs. ZenDNN: * TensorFlow, PyTorch, and ONNXRT with ZenDNN packages for download (from the ZenDNN web site) -* 84 model packages containing FP32/BF16/INT8 models enabled to be run on TensorFlow+ZenDNN, PyTorch+ZenDNN and ONNXRT+ZenDNN -* Up to 20.5x the throughput (images/second) running Medical EDD RefineDet with the Xilinx Vitis AI Model Zoo 3.0 88% pruned INT8 model on 2P AMD Eng Sample: 100-000000894-04 -of the EPYC 9004 96-core processor powered server with ZenDNN v4.0 compared to the baseline FP32 Medical EDD RefineDet model from the same Model Zoo. ([ZD-036](#zd036)) -* Docker containers for running AMD Inference Server ROCm: * Docker containers containing tools for optimizing models for inference -* 30 quantized models enabled to run on AMD ROCm platform using MIGraphX inference engine -* Up to 5.3x the throughput (images/second) running PT-OFA-ResNet50 with the Xilinx Vitis AI Model Zoo 3.0 88% pruned FP16 model on an AMD MI100 accelerator powered production server compared to the baseline FP32 PT- ResNet50v1.5 model from the same Model Zoo. ([ZD-041](#zd041)) +* 50 models enabled to run on AMD ROCm platform using MIGraphX inference engine +* Up to 5.3x the throughput (images/second) running PT-OFA-ResNet50 with 78% pruned FP16 model on an AMD MI100 accelerator powered production server compared to the baseline FP32 PT- ResNet50v1.5 model. 
([ZD-041](#zd041)) * Docker containers for running AMD Inference Server AMD Inference Server provides a common interface for all inference modes: * Common C++ and server APIs for model deployment * Backend interface for using TensorFlow/PyTorch in inference for ZenDNN - * Additional UIF 1.1 optimized models examples for Inference Server + * Additional UIF 1.2 optimized models examples for Inference Server * Integration with KServe +[Introducing Once-For-All (OFA)](/docs/2_model_setup/uifmodelsetup.md#213-once-for-all-ofa-efficient-model-customization-for-various-platforms), a neural architecture search method that efficiently customizes sub-networks for diverse hardware platforms, avoiding high computation costs. OFA can achieve up to 1.69x speedup on MI100 GPUs compared to ResNet50 baselines. + # Prerequisites The following prerequisites must be met for this release of UIF: - -* Hardware based on target platform: - * CPU: AMD EPYC [9004](https://www.amd.com/en/processors/epyc-9004-series) or [7003](https://www.amd.com/en/processors/epyc-7003-series) Series Processors - * GPU: AMD Instinct™ [MI200](https://www.amd.com/en/graphics/instinct-server-accelerators) or [MI100](https://www.amd.com/en/products/server-accelerators/instinct-mi100) Series GPU - * FPGA/AI Engine: Zynq™ SoCs or Versal devices supported in [Vitis AI 3.0](https://github.com/Xilinx/Vitis-AI) - -* Software based on target platform: - * OS: Ubuntu® 18.04 LTS and later, Red Hat® Enterprise Linux® (RHEL) 8.0 and later, CentOS 7.9 and later - * ZenDNN 4.0 for AMD EPYC CPU - * MIGraphX 2.4 for AMD Instinct GPU - * Vitis AI 3.0 FPGA/AIE - * Vitis AI 3.0 Model Zoo - * Inference Server 0.3 - -## Implementing UIF 1.1 +| Component | Supported Hardware | +|--------------------|---------------------------------------------------------| +| CPU | AMD EPYC 9004 or 7003 Series Processors | +| GPU | AMD Radeon™ PRO V620 and W6800, AMD Instinct™ MI200 or MI100 Series GPU | +| FPGA/AI Engine | AMD Zynq™ SoCs or Versal 
devices supported in Vitis AI 3.5
**Note**: The inference server currently supports Vitis AI 3.0 devices| + +| Component | Supported Software | +|-----------------------|-------------------------------------------------------| +| Operating Systems | Ubuntu® 20.04 LTS and later, Red Hat® Enterprise Linux® 8.0 and later, CentOS 7.9 and later | +| ZenDNN | Version 4.0 for AMD EPYC CPU | +| MIGraphX | Version 2.6 for AMD Instinct GPU | +| Vitis AI | Version 3.5 for FPGA/AIE, Model Zoo | +| Inference Server | Version 0.4 | + + +## Getting Started with UIF v1.2 ### Step 1: Installation @@ -115,16 +109,8 @@ The following pages outline debugging and profiling strategies: - 5.1: Debug on GPU - 5.2: Debug on CPU - 5.3: Debug on FPGA - - - ### Step 6: Deploying on PyTorch and Tensorflow - -The following pages outline deploying strategies on PyTorch and Tensorflow: - - PyTorch - - Tensorflow - -
+
[Next >](/docs/1_installation/installation.md) @@ -166,11 +152,11 @@ AOCC CPU OPTIMIZATIONS BINARY IS SUBJECT TO THE LICENSE AGREEMENT ENCLOSED IN TH #### ZD036: -Testing conducted by AMD Performance Labs as of Thursday, January 12, 2023, on the ZenDNN v4.0 software library, Xilinx Vitis AI Model Zoo 3.0, on test systems comprising of AMD Eng Sample of the EPYC 9004 96-core processor, dual socket, with hyperthreading on, 2150 MHz CPU frequency (Max 3700 MHz), 786GB RAM (12 x 64GB DIMMs @ 4800 MT/s; DDR5 - 4800MHz 288-pin Low Profile ECC Registered RDIMM 2RX4), NPS1 mode, Ubuntu® 20.04.5 LTS version, kernel version 5.4.0-131-generic, BIOS TQZ1000F, GCC/G++ version 11.1.0, GNU ID 2.31, Python 3.8.15, AOCC version 4.0, AOCL BLIS version 4.0, TensorFlow version 2.10. Pruning was performed by the Xilinx Vitis AI pruning and quantization tool v3.0. Performance may vary based on use of latest drivers and other factors. ZD036 +Testing conducted by AMD Performance Labs as of Thursday, January 12, 2023, on the ZenDNN v4.0 software library, Xilinx Vitis AI Model Zoo 3.5, on test systems comprising of AMD Eng Sample of the EPYC 9004 96-core processor, dual socket, with hyperthreading on, 2150 MHz CPU frequency (Max 3700 MHz), 786GB RAM (12 x 64GB DIMMs @ 4800 MT/s; DDR5 - 4800MHz 288-pin Low Profile ECC Registered RDIMM 2RX4), NPS1 mode, Ubuntu® 20.04.5 LTS version, kernel version 5.4.0-131-generic, BIOS TQZ1000F, GCC/G++ version 11.1.0, GNU ID 2.31, Python 3.8.15, AOCC version 4.0, AOCL BLIS version 4.0, TensorFlow version 2.10. Pruning was performed by the Xilinx Vitis AI pruning and quantization tool v3.5. Performance may vary based on use of latest drivers and other factors. 
ZD036 #### ZD041: -Testing conducted by AMD Performance Labs as of Wednesday, January 18, 2023, on test systems comprising of: AMD MI100, 1200 MHz CPU frequency, 8x32GB GPU Memory, NPS1 mode, Ubuntu® 20.04 version, kernel version 4.15.0-166-generic, BIOS 2.5.6, GCC/G++ version 9.4.0, GNU ID 2.34, Python 3.7.13, xcompiler version 3.0.0, pytorch-nndct version 3.0.0, xir version 3.0.0, target_factory version 3.0.0, unilog version 3.0.0, ROCm version 5.4.1.50401-84~20.04. Pruning was performed by the Xilinx Vitis AI pruning and quantization tool v3.0. Performance may vary based on use of latest drivers and other factors. ZD-041 +Testing conducted by AMD Performance Labs as of Wednesday, January 18, 2023, on test systems comprising of: AMD MI100, 1200 MHz CPU frequency, 8x32GB GPU Memory, NPS1 mode, Ubuntu® 20.04 version, kernel version 4.15.0-166-generic, BIOS 2.5.6, GCC/G++ version 9.4.0, GNU ID 2.34, Python 3.7.13, xcompiler version 3.5.0, pytorch-nndct version 3.5.0, xir version 3.5.0, target_factory version 3.5.0, unilog version 3.5.0, ROCm version 5.4.1.50401-84~20.04. Pruning was performed by the Xilinx Vitis AI pruning and quantization tool v3.5. Performance may vary based on use of latest drivers and other factors. ZD-041 diff --git a/docs/1_installation/installation.md b/docs/1_installation/installation.md index 4ffc98f..2249137 100644 --- a/docs/1_installation/installation.md +++ b/docs/1_installation/installation.md @@ -1,6 +1,6 @@ - @@ -28,8 +28,7 @@ - [1.4.2: Build an Inference Server Docker Image](#142-build-an-inference-server-docker-image) - _Click [here](/README.md#implementing-uif-11) to go back to the UIF User Guide home page._ - + _Click [here](/README.md#implementing-uif-11) to go back to the UIF User Guide home page._ # 1.1: Pull PyTorch/TensorFlow Docker (for GPU Users) @@ -39,35 +38,11 @@ ROCm™ Userspace API is guaranteed to be compatible with specific older and newer ROCm base driver installations. 
-**Note**: The ROCm userspace is delivered using a Docker® container based on ROCm v5.4. Consult this matrix when running a Docker container with a different version of ROCm than installed on the host. - -**Legend** - -* Green: Shows compatibility between the same versions. - -* Blue: Shows compatibility tested versions. - -* Gray: Not tested. - -**Note:** The color in the figures may vary slightly. - -![](./images/5.4Matrix1.png) - -Kernel space compatibility meets the following condition: - -* Userspace works with -/+ 2 releases of kernel space - -### Framework Compatibility - -The UIF v1.1 release supports the most recent and two prior releases of PyTorch and TensorFlow. - -* UIF v1.1 is based on TensorFlow 2.10 and PyTorch 1.12. - -* UIF v1.1 has been tested with TensorFlow 2.3 to 2.10 and PyTorch 1.2 to 1.12. +**Note**: The ROCm userspace is delivered using a Docker® container based on ROCm v5.6.1. Refer to the following matrix for details on the supported PyTorch and TensorFlow versions: https://rocm.docs.amd.com/en/latest/release/3rd_party_support_matrix.html. ### ROCm Installation -For general information on ROCm installation, refer to the [ROCm Installation Guide](https://docs.amd.com). +For general information on ROCm installation, see [Deploy ROCm on Linux](https://rocm.docs.amd.com/en/latest/deploy/linux/index.html). ### Verifying ROCm Installation @@ -84,7 +59,8 @@ Install the Docker software. If Docker is not installed on your machine, see the The UIF/TensorFlow Docker image provides a superset functionality from ROCm/TensorFlow, and the UIF/PyTorch Docker image provides a superset functionality from ROCm/PyTorch. When the UIF Docker images were created, no items were deleted from underlying PyTorch or TensorFlow Docker images. The items that have been added in the superset include: -* Quantizer and pruner tools as plugins to TensorFlow/PyTorch to enable the use of UIF Docker images to quantize models on a ROCm platform (for GPU or CPU). 
**Note:** To use the pruner, use the Vitis™ AI 3.0 ROCm Dockers. See the [Host Installation Instructions](https://xilinx.github.io/Vitis-AI/docs/install/install.html#docker-install-and-verification) in the Vitis AI documentation for details. +* Quantizer and pruner tools as plugins to TensorFlow/PyTorch to enable the use of UIF Docker images to quantize models on a ROCm platform (for GPU or CPU). + **Note:** To use the pruner, use the Vitis™ AI 3.5 ROCm Dockers. See the [Host Installation Instructions](https://xilinx.github.io/Vitis-AI/3.5/html/docs/install/install.html#docker-install-and-verification) in the Vitis AI documentation for details. * MIGraphX to enable the use of UIF Docker images for GPU inference ### PyTorch @@ -95,15 +71,15 @@ Follow these steps: 1. Obtain the latest Docker image. - docker pull amdih/uif-pytorch:uif1.1_rocm5.4.1_vai3.0_py3.7_pytorch1.12 + docker pull amdih/uif-pytorch:uif1.2_rocm5.6.1_vai3.5_py3.8_pytorch1.13 - The above instruction will download the UIF container, including PyTorch and optimization tools. + The previous instruction downloads the UIF container, including PyTorch and optimization tools. 2. Start a Docker container using the image. - docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G amdih/uif-pytorch:uif1.1_rocm5.4.1_vai3.0_py3.7_pytorch1.12 + docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G amdih/uif-pytorch:uif1.2_rocm5.6.1_vai3.5_py3.8_pytorch1.13 You can also pass the `-v` argument to mount any data directories from the host onto the container. @@ -115,15 +91,15 @@ Follow these steps: 1. Obtain the latest Docker image. 
- docker pull amdih/uif-tensorflow:uif1.1_rocm5.4.1_vai3.0_tf2.10 + docker pull amdih/uif-tensorflow:uif1.2_rocm5.6.1_vai3.5_tensorflow2.12 - The above instruction will download the UIF container, including TensorFlow and optimization tools. + The previous instruction downloads the UIF container, including TensorFlow and optimization tools. 2. Start a Docker container using the image. - docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G amdih/uif-tensorflow:uif1.1_rocm5.4.1_vai3.0_tf2.10 + docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G amdih/uif-tensorflow:uif1.2_rocm5.6.1_vai3.5_tensorflow2.12 You can also pass the `-v` argument to mount any data directories from the host onto the container. @@ -135,7 +111,7 @@ Install the Docker software. If Docker is not installed on your machine yet, see ## 1.2.2: Pull a Vitis AI Docker Image -For instuctions on how to pull a Docker image for the Vitis AI development environment, see the [Vitis AI Docker Installation](https://gitenterprise.xilinx.com/Vitis/vitis-ai-staging/blob/vai3.0_update/docs/docs/install/install.html). +For instructions on how to pull a Docker image for the Vitis AI development environment, see the [Vitis AI Docker Installation](https://xilinx.github.io/Vitis-AI/3.5/html/docs/install/install#docker-install-and-verification). # 1.3: Install ZenDNN Package (for CPU Users) @@ -164,7 +140,7 @@ TensorFlow+ZenDNN installation completes. To run inference on the PyTorch model using ZenDNN, download and install the PyTorch+ZenDNN package. Perform the following steps to complete the PyTorch+ZenDNN installation: -1. 
Download PTv1.13+ZenDNN_v4.0 release package from the [AMD ZenDNN page](https://www.amd.com/en/developer/zendnn.html). 2. Unzip the package. For example: `PT_v1.12.0_ZenDNN_v4.0_Python_v3.8.zip`. @@ -202,19 +178,18 @@ To run inference on the ONNXRT model using ZenDNN, download and install the ONNX The AMD Inference Server is integrated with [ZenDNN](https://www.amd.com/en/developer/zendnn.html), [MIGraphX](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX) and [Vitis AI](https://www.xilinx.com/products/design-tools/vitis/vitis-ai.html) and can be used for [model serving](/docs/4_deploy_your_own_model/serve_model/servingmodelwithinferenceserver.md). To use the inference server, you need a Docker image for it, which you can get by using a prebuilt image or building one from the [inference server repository](https://github.com/Xilinx/inference-server) on GitHub. -The instructions provided here are an overview, but you can see more complete information about the AMD Inference Server in the [documentation](https://xilinx.github.io/inference-server/0.3.0/index.html). +The instructions provided here are an overview, but you can see more complete information about the AMD Inference Server in the [documentation](https://xilinx.github.io/inference-server/0.4.0/index.html). ## 1.4.1: Use a Prebuilt Docker Image You can pull the appropriate deployment Docker image(s) from DockerHub using: ``` -docker pull amdih/serve:uif1.1_zendnn_amdinfer_0.3.0 -docker pull amdih/serve:uif1.1_migraphx_amdinfer_0.3.0 +docker pull amdih/serve:uif1.2_migraphx_amdinfer_0.4.0 +docker pull amdih/serve:uif1.2_vai_amdinfer_0.4.0 +docker pull amdih/serve:uif1.2_zendnn_amdinfer_0.4.0 ``` -The Vitis AI deployment image and development images for all platforms are not prebuilt and must be built by the user. - ## 1.4.2: Build an Inference Server Docker Image You need Docker (18.09+) to build the image. @@ -224,8 +199,8 @@ You need Docker (18.09+) to build the image. 
``` git clone https://github.com/Xilinx/inference-server cd inference-server -# version 0.3.0 corresponds to this documentation -git checkout v0.3.0 +# version 0.4.0 corresponds to this documentation +git checkout v0.4.0 python3 docker/generate.py ./amdinfer dockerize ``` @@ -235,10 +210,10 @@ python3 docker/generate.py - `--production`: Builds the deployment version of the image instead of the default development one. - `--vitis`: Enables FPGAs with Vitis AI in the image. - `--migraphx`: Enables GPUs with MIGraphX in the image. -- `--tfzendnn=`: Enables CPUs with TF+ZenDNN in the image. You need to download [TF_v2.10_ZenDNN_v4.0_C++_API.zip](https://www.amd.com/en/developer/zendnn.html) and pass the path to it. -- `--ptzendnn=`: Enables CPUs with PT+ZenDNN in the image. You need to download [PT_v1.12_ZenDNN_v4.0_C++_API.zip](https://www.amd.com/en/developer/zendnn.html) and pass the path to it. +- `--tfzendnn=`: Enables CPUs with TF+ZenDNN in the image. You need to download [TF_v2.12_ZenDNN_v4.0_C++_API.zip](https://www.amd.com/en/developer/zendnn.html) and pass the path to it. +- `--ptzendnn=`: Enables CPUs with PT+ZenDNN in the image. You need to download [PT_v1.13_ZenDNN_v4.0_C++_API.zip](https://www.amd.com/en/developer/zendnn.html) and pass the path to it. - **Note:** The downloaded ZenDNN package(s) must be inside the inference-server folder since the Docker will not be able to access files outside the repository. + **Note:** The downloaded ZenDNN package(s) must be inside the inference-server folder since the Docker cannot access files outside the repository. You can pass these flags in any combination. Use `./amdinfer dockerize --help` for the full documentation on the available flags. diff --git a/docs/2_model_setup/gpu_model_example.md b/docs/2_model_setup/gpu_model_example.md index b86d830..4b85cca 100644 --- a/docs/2_model_setup/gpu_model_example.md +++ b/docs/2_model_setup/gpu_model_example.md @@ -1,6 +1,6 @@

Unified Inference Frontend (UIF) 1.1 User Guide

+

Unified Inference Frontend (UIF) 1.2 User Guide

- @@ -19,7 +19,7 @@ _Click [here](/README.md#implementing-uif-11) to go back to the UIF User Guide home page._ -UIF accelerates deep learning inference applications on all AMD compute platforms for popular machine learning frameworks, including TensorFlow, PyTorch, and ONNXRT. UIF 1.1 extends the support to AMD Instinct™ GPUs. Currently, [MIGraphX](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX) is the acceleration library for Deep Learning Inference running on AMD Instinct GPUs. +UIF accelerates deep learning inference applications on all AMD compute platforms for popular machine learning frameworks, including TensorFlow, PyTorch, and ONNXRT. UIF 1.2 extends the support to AMD Radeon™ GPUs in addition to AMD Instinct™ GPUs. Currently, [MIGraphX](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX) is the acceleration library for Deep Learning Inference running on AMD Instinct GPUs. The following example takes a PyTorch ResNet-50-v1.5 model selected from UIF Model Zoo as an example to show how it works on different GPU platforms. @@ -31,7 +31,7 @@ After unzipping the PyTorch ResNet50-v1.5 model, you need to set environment and 1. Download the UIF Docker image with the following instruction: - docker pull amdih/uif-pytorch:uif1.1_rocm5.4.1_vai3.0_py3.7_pytorch1.12 + docker pull amdih/uif-pytorch:uif1.2_rocm5.6.1_vai3.5_py3.8_pytorch1.13 **Note**: AMD GPU dependencies and PyTorch are pre-installed in the Docker. @@ -43,7 +43,7 @@ After unzipping the PyTorch ResNet50-v1.5 model, you need to set environment and ```shell $ pip install --user -r requirements.txt ``` - +**Note**: Modern servers commonly feature multi-socket or multi-core architectures. GPU nodes possess an affinity for specific CPU cores, crucial for efficient memory transfers between the GPU and the host. Ensuring these transfers occur through the designated CPU cores is essential. Neglecting GPU-CPU affinity, especially for larger models, can result in unstable performance. 
To address this, you can enhance GPU-CPU affinity by setting up Non-Uniform Memory Access (NUMA) bindings when running the Docker image using the `--cpuset-cpus` and `ROCR_VISIBLE_DEVICES` arguments. # 2.6.2: Data Preparation @@ -104,7 +104,7 @@ Loading model: float/resnet50_pretrained.pth The onnx models are provided in the `float` folder, so you can choose to skip this step. -- You can then run inference and evaluation with MIGraphX. Here, scripts are provided to evaluate the model's accuracy with MIGraphX. +- Run inference and evaluation with MIGraphX. Here, scripts are provided to evaluate the model's accuracy with MIGraphX. ``` $ sh run_test_migraphx.sh @@ -168,13 +168,11 @@ Overhead: 1%, -14% # 2.6.4: Performance -The accuracy and performance of the FP32/16 onnx model on AMD GPU MI100 (MIGraphX driver 2.4) are evaluated as follows: - -|Resnet50 Model |Input Size|FLOPs| Top-1/Top-5 Accuracy, %| Performance Rate, /sec | -|----|---|---|---|---| -|PyTorch model| 224x224 | 8.2G| 76.1/92.9 | - | -|FP32 onnx model| 224x224 | 8.2G| 76.1/92.9 | bs=1: 484.147
bs=64: 5176.15| -|FP16 onnx model| 224x224 | 8.2G| 76.1/92.9 | bs=1: 734.367
bs=64: 5176.15| +|Resnet50 Model |Input Size|FLOPs| Top-1/Top-5 Accuracy, %| +|----|---|---|---| +|PyTorch model| 224x224 | 8.2G| 76.1/92.9 | +|FP32 onnx model| 224x224 | 8.2G| 76.1/92.9 | +|FP16 onnx model| 224x224 | 8.2G| 76.1/92.9 | # 2.6.5: Data Preprocessing for Inference diff --git a/docs/2_model_setup/model-list/pt_albert_basev1.5_1.2_M2.6/model.yaml b/docs/2_model_setup/model-list/pt_albert_basev1.5_1.2_M2.6/model.yaml new file mode 100755 index 0000000..7bfd437 --- /dev/null +++ b/docs/2_model_setup/model-list/pt_albert_basev1.5_1.2_M2.6/model.yaml @@ -0,0 +1,45 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +description: Albert base v1.5 trained on Wikitext2 +float ops: 70.66G +task: Language Modeling +framework: pytorch +prune: 'no' +version: 1.2 +files: +- name: pt_albert_base_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_albert_base_1.2_M2.6.tar.gz + checksum: 781b8934d5aae7f85947e3aa9e8877c8 +- name: pt_albert_base_1.2_M2.6_MI100.tar.gz + type: Ymodel + board: M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_albert_base_1.2_M2.6_MI100.tar.gz + checksum: 82c17890e6e3c02f851f2f8c45d6adcf +- name: pt_albert_base_1.2_M2.6_MI210.tar.gz + type: Ymodel + board: M210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_albert_base_1.2_M2.6_MI210.tar.gz + checksum: ee7ed14378e72374f5674417207078c0 +- name: pt_albert_base_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_albert_base_1.2_M2.6_NAVI2.tar.gz + checksum: 7dc7754fa82488dcb0070d87b2e3d881 +license: https://github.com/amd/UIF/blob/main/LICENSE + + diff --git a/docs/2_model_setup/model-list/pt_albert_largev1.5_1.2_M2.6/model.yaml b/docs/2_model_setup/model-list/pt_albert_largev1.5_1.2_M2.6/model.yaml new file mode 100755 index 0000000..9684207 --- /dev/null +++ b/docs/2_model_setup/model-list/pt_albert_largev1.5_1.2_M2.6/model.yaml @@ -0,0 +1,47 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + + +description: Albert large v1.5 trained on Wikitext2 +float ops: 246.44G +task: Language Modeling +framework: pytorch +prune: 'no' +version: 1.2 +files: +- name: pt_albert_large_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_albert_large_1.2_M2.6.tar.gz + checksum: b7c624a91924e9bcc668ae7c5fb67de9 +- name: pt_albert_large_1.2_M2.6_MI100.tar.gz + type: Ymodel + board: M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_albert_large_1.2_M2.6_MI100.tar.gz + checksum: 88cc36b5e5f38efea243b7bbb63b7905 +- name: pt_albert_large_1.2_M2.6_MI210.tar.gz + type: Ymodel + board: M210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_albert_large_1.2_M2.6_MI210.tar.gz + checksum: b7f63736b5fb93934b3c0d9b74b0ffa8 +- name: pt_albert_large_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_albert_large_1.2_M2.6_NAVI2.tar.gz + checksum: df92b53eff089eeb23b98a36049544de +license: https://github.com/amd/UIF/blob/main/LICENSE \ No newline at end of file diff --git a/docs/2_model_setup/model-list/pt_bert_base_1.1_M2.6/model.yaml b/docs/2_model_setup/model-list/pt_bert_base_1.1_M2.6/model.yaml new file mode 100755 index 0000000..d5e9b5e --- /dev/null +++ b/docs/2_model_setup/model-list/pt_bert_base_1.1_M2.6/model.yaml @@ -0,0 +1,47 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. 
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +description: BERT Base for SQuADv1.1 Question Answering. +input size: '384' +float ops: 70.66G +task: Question Answering +framework: pytorch +prune: 'no' +version: 1.1 +files: +- name: pt_bert_base_1.1_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_base_1.1_M2.6.tar.gz + checksum: 13c9f3bd24153ebcbb704b24118d44c9 +- name: pt_bert_base_1.1_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_base_1.1_M2.6_MI100.tar.gz + checksum: 6b683b0e2956ffa2c0d928b22ebc292a +- name: pt_bert_base_1.1_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_base_1.1_M2.6_MI210.tar.gz + checksum: 07d3a041b5b8c3a423d0bb645f42c5e8 +- name: pt_bert_base_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_base_1.1_M2.6_NAVI2.tar.gz + checksum: f153a8dc283c2fe1b1c594dc7739ba70 diff --git a/docs/2_model_setup/model-list/pt_bert_large_SQuADv1.1_384_246.42G_1.1_M2.4/model.yaml 
b/docs/2_model_setup/model-list/pt_bert_large_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 59% rename from docs/2_model_setup/model-list/pt_bert_large_SQuADv1.1_384_246.42G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_bert_large_1.1_M2.6/model.yaml index 4d87666..37927ad --- a/docs/2_model_setup/model-list/pt_bert_large_SQuADv1.1_384_246.42G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_bert_large_1.1_M2.6/model.yaml @@ -21,19 +21,23 @@ framework: pytorch prune: 'no' version: 1.1 files: -- name: +- name: pt_bert_large_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_large_SQuADv1.1_384_246.42G_1.1_M2.4.tar - checksum: a2169bc97a997ac7aac3d5c8ad815153 -- name: pt_bert_large_SQuADv1.1_384_246.42G_1.1_M2.4_MI100 - type: Ymodel - board: M100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_large_SQuADv1.1_384_246.42G_1.1_M2.4_MI100.tar - checksum: b0f41ea68e212f801438d5e562e9c39e -- name: pt_bert_large_SQuADv1.1_384_246.42G_1.1_M2.4_MI210 - type: Ymodel - board: M210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_large_SQuADv1.1_384_246.42G_1.1_M2.4_MI210.tar - checksum: 1ecea49b21fc76582600b29bbf18080c -license: license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_large_1.1_M2.6.tar.gz + checksum: 44b7101a59eeef1152ba3857bc5a9832 +- name: pt_bert_large_1.1_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_large_1.1_M2.6_MI100.tar.gz + checksum: 774b99d85a806ca0fe753398538ad532 +- name: pt_bert_large_1.1_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_large_1.1_M2.6_MI210.tar.gz + checksum: 4420fcc5cd998b98713fd62edd3a508c +- name: 
pt_bert_large_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_large_1.1_M2.6_NAVI2.tar.gz + checksum: c2533aac489c90e02195d16fe331b8b6 diff --git a/docs/2_model_setup/model-list/pt_detr_1.2_M2.6/model.yaml b/docs/2_model_setup/model-list/pt_detr_1.2_M2.6/model.yaml new file mode 100755 index 0000000..ee49f62 --- /dev/null +++ b/docs/2_model_setup/model-list/pt_detr_1.2_M2.6/model.yaml @@ -0,0 +1,43 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +description: DeTR for Object Detection +float ops: 72.0G +framework: pytorch +prune: 'no' +version: 1.2 +files: +- name: pt_detr_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_detr_1.2_M2.6.tar.gz + checksum: 5393d7233a5fef344dd477bac9c893da +- name: pt_detr_1.2_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_detr_1.2_M2.6_MI100.tar.gz + checksum: 3cd9f2429a67e507440c94e3ae3af17f +- name: pt_detr_1.2_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_detr_1.2_M2.6_MI210.tar.gz + checksum: e3e065a5641ffbf6dc8d935e6597e7ad +- name: pt_detr_1.2_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_detr_1.2_M2.6_NAVI2.tar.gz + checksum: 465c19721638efdea8d42fd44e8da3c3 +license: https://github.com/amd/UIF/blob/main/LICENSE + diff --git a/docs/2_model_setup/model-list/pt_distilbertv1.5_1.2_M2.6/model.yaml b/docs/2_model_setup/model-list/pt_distilbertv1.5_1.2_M2.6/model.yaml new file mode 100755 index 0000000..28b5015 --- /dev/null +++ b/docs/2_model_setup/model-list/pt_distilbertv1.5_1.2_M2.6/model.yaml @@ -0,0 +1,55 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +description: DistilBERT v1.5 for SQuADv1.1 Question Answering +input size: '384' +float ops: 35.34G +task: Question Answering +framework: PyTorch +prune: 'no' +version: 1.2 +files: +- name: pt_distilbert_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_distilbert_1.2_M2.6.tar.gz + checksum: 986c3ddaa0966819f39f4a63bb4b62f0 +- name: pt_distilbert_1.2_M2.6_MI100.tar.gz + type: Ymodel + board: M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_distilbert_1.2_M2.6_MI100.tar.gz + checksum: 69aee805421f160f1d682882d1a7699e +- name: pt_distilbert_1.2_M2.6_MI210.tar.gz + type: Ymodel + board: M210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_distilbert_1.2_M2.6_MI210.tar.gz + checksum: b3feaf2f91c838be22c53ab81f2b6d3b +- name: pt_distilbert_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_distilbert_1.2_M2.6_NAVI2.tar.gz + checksum: 2154e8a2b9959fb89a5817cf226fa270 +license: https://github.com/amd/UIF/blob/main/LICENSE + + + diff --git a/docs/2_model_setup/model-list/pt_distilgpt2_wikitext2_1024_106.3G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_distilgpt2_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 59% rename from docs/2_model_setup/model-list/pt_distilgpt2_wikitext2_1024_106.3G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_distilgpt2_1.1_M2.6/model.yaml index 199e062..34fa2b4 --- a/docs/2_model_setup/model-list/pt_distilgpt2_wikitext2_1024_106.3G_1.1_M2.4/model.yaml +++ 
b/docs/2_model_setup/model-list/pt_distilgpt2_1.1_M2.6/model.yaml @@ -21,21 +21,23 @@ framework: pytorch prune: 'no' version: 1.1 files: -- name: +- name: pt_distilgpt2_1.1_M2.6.tar type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_distilgpt2_wikitext2_1024_106.3G_1.1_M2.4.tar - checksum: fdd6f0c9fa2cae4985e734281267a6cd -- name: pt_distilgpt2_wikitext2_1024_106.3G_1.1_M2.4_MI100 - type: Ymodel - board: M100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_distilgpt2_wikitext2_1024_106.3G_1.1_M2.4_MI100.tar - checksum: 207f1c411138d7e46393c5879ba2c5bb -- name: pt_distilgpt2_wikitext2_1024_106.3G_1.1_M2.4_MI210 - type: Ymodel - board: M210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_distilgpt2_wikitext2_1024_106.3G_1.1_M2.4_MI210.tar - checksum: 57baa2fc8044b45b8c9337a9fdc1153c -license: license: https://github.com/amd/UIF/blob/main/LICENSE - - + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_distilgpt2_1.1_M2.6.tar + checksum: 29df8558eb91e90e1dd77c95b63f7857 +- name: pt_distilgpt2_1.1_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_distilgpt2_1.1_M2.6_MI100.tar.gz + checksum: 29df8558eb91e90e1dd77c95b63f7857 +- name: pt_distilgpt2_1.1_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_distilgpt2_1.1_M2.6_MI210.tar.gz + checksum: f1b7ca8d0f36e97b577ef5761e1712f5 +- name: pt_distilgpt2_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_distilgpt2_1.1_M2.6_NAVI2.tar.gz + checksum: f324690ae8fa2ec52dc4f883b95159df \ No newline at end of file diff --git a/docs/2_model_setup/model-list/pt_reid-resnet50_market1501_256_128_0.6_2.1G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_dlrm40M_1.2_M2.6/model.yaml 
similarity index 83% rename from docs/2_model_setup/model-list/pt_reid-resnet50_market1501_256_128_0.6_2.1G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_dlrm40M_1.2_M2.6/model.yaml index ed50941..3365593 100755 --- a/docs/2_model_setup/model-list/pt_reid-resnet50_market1501_256_128_0.6_2.1G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_dlrm40M_1.2_M2.6/model.yaml @@ -59,7 +59,10 @@ # Reid_resnet50 pruned0.5, # Reid_resnet50 pruned0.6, # Reid_resnet50 pruned0.7, -# DLRM +# DLRM +# ViT +# PointPillars +# Resnet50_v1.5 ofa # # This License Agreement for Non-Commercial Models (“Agreement”) is a legal agreement between you (either an individual or # an entity) and Advanced Micro Devices, Inc. (“AMD”). DO NOT USE THE TRAINED MODELS IDENTIFIED ABOVE UNTIL YOU HAVE @@ -83,27 +86,32 @@ # YOUR USE OF THE TRAINED MODEL AND/OR THIRD PARTY MATERIALS. -description: resnet50-based person re-identification model. -input size: 256x128 -float ops: 2.1G -task: person reid +description: DLRM 40M Trained on CriteoTerabyte Dataset +input size: 13,26 +float ops: 20K +task: Recommendation System framework: pytorch -prune: 'yes' -version: 1.1 +prune: 'no' +version: 1.2 files: -- name: pt_reid-resnet50_market1501_256_128_0.6_2.1G_1.1_M2.4 +- name: pt_dlrm40M_1.2_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid-resnet50_market1501_256_128_0.6_2.1G_1.1_M2.4.zip - checksum: 80af92d85628143bb04f5d5e89a7d6ba -- name: pt_reid-resnet50_market1501_256_128_0.6_2.1G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_dlrm40M_1.2_M2.6.tar.gz + checksum: 7e8ec40f4cc93d70efe625afa1ee9a21 +- name: pt_dlrm40m_1.2_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid-resnet50_market1501_256_128_0.6_2.1G_1.1_M2.4_MI100.zip - checksum: 95943bcf0b67396712ca9d37a5d36ebb -- name: 
pt_reid-resnet50_market1501_256_128_0.6_2.1G_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_dlrm40m_1.2_M2.6_MI100.tar.gz + checksum: 642951c34c99bea3fefa06f8f72c675c +- name: pt_dlrm40m_1.2_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid-resnet50_market1501_256_128_0.6_2.1G_1.1_M2.4_MI210.zip - checksum: 2e945a5eca7b5127c3a82ff69cdfa01d -license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_dlrm40m_1.2_M2.6_MI210.tar.gz + checksum: e69ef0943393a8af137da0388dd946a5 +- name: pt_dlrm40m_1.2_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_dlrm40m_1.2_M2.6_NAVI2.tar.gz + checksum: deaa620c6cac0ad32e7bb66cd2c5db8d +license: https://github.com/amd/UIF/blob/main/LICENSE diff --git a/docs/2_model_setup/model-list/pt_dlrm_terabytes_13_26_20K_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_dlrm_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 64% rename from docs/2_model_setup/model-list/pt_dlrm_terabytes_13_26_20K_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_dlrm_1.1_M2.6/model.yaml index 0732d7e..9dbbbce --- a/docs/2_model_setup/model-list/pt_dlrm_terabytes_13_26_20K_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_dlrm_1.1_M2.6/model.yaml @@ -21,19 +21,23 @@ framework: pytorch prune: 'no' version: 1.1 files: -- name: pt_dlrm_CriteoTerabyte_13_26_20K_1.1_M2.4 +- name: pt_dlrm_1.1_M2.6.zip type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_dlrm_criteoterabyte_13_26_20k_1.1_M2.4.zip - checksum: bd56f87bcb04f2481225f40a68c57368 -- name: pt_dlrm_CriteoTerabyte_13_26_20K_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_dlrm_1.1_M2.6.zip + checksum: 
fefaf59393b8920a9c28a8fbf2ca72ad +- name: pt_dlrm10m_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_dlrm_criteoterabyte_13_26_20k_1.1_M2.4_M100.zip - checksum: ead98fabe589f69f0e655510cd3455a2 -- name: pt_dlrm_CriteoTerabyte_13_26_20K_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_dlrm10m_1.1_M2.6_MI100.tar.gz + checksum: 4eea1ffe7b98033696fbdc73e50c83c0 +- name: pt_dlrm10m_1.1_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_dlrm_criteoterabyte_13_26_20k_1.1_M2.4_M210.zip - checksum: 4a8b35d92e9dae77cee77381b694821e -license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_dlrm10m_1.1_M2.6_MI210.tar.gz + checksum: 79e9ce6f8fd014fa31028f11669203e8 +- name: pt_dlrm10m_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_dlrm10m_1.1_M2.6_NAVI2.tar.gz + checksum: 69ee508ca8cda10a56f2b0b546dcebd3 \ No newline at end of file diff --git a/docs/2_model_setup/model-list/pt_gpt2_large_1.2_M2.6/model.yaml b/docs/2_model_setup/model-list/pt_gpt2_large_1.2_M2.6/model.yaml new file mode 100755 index 0000000..d8dd4ef --- /dev/null +++ b/docs/2_model_setup/model-list/pt_gpt2_large_1.2_M2.6/model.yaml @@ -0,0 +1,45 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + + +description: GPT2 large trained on Wikitext2 +float ops: 1.6T +task: Language Modeling +framework: pytorch +prune: 'no' +version: 1.2 +files: +- name: pt_gpt2_large_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_large_1.2_M2.6.tar.gz + checksum: 7e80499eb9bcce46ef3f306940f4bf9f +- name: pt_gpt2_large_1.2_M2.6_MI100.tar.gz + type: Ymodel + board: M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_large_1.2_M2.6_MI100.tar.gz + checksum: daa09797759e46f64db2ef11f007f113 +- name: pt_gpt2_large_1.2_M2.6_MI210.tar.gz + type: Ymodel + board: M210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_large_1.2_M2.6_MI210.tar.gz + checksum: 21e4b5e7cc93f5623b3f216c00f807f0 +- name: pt_gpt2_large_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_large_1.2_M2.6_NAVI2.tar.gz + checksum: 999ab8b1c5079a466a4832a8dd55d9a9 +license: https://github.com/amd/UIF/blob/main/LICENSE + + diff --git a/docs/2_model_setup/model-list/pt_gpt2_medium_1.2_M2.6/model.yaml b/docs/2_model_setup/model-list/pt_gpt2_medium_1.2_M2.6/model.yaml new file mode 100755 index 0000000..8898a2a --- /dev/null +++ b/docs/2_model_setup/model-list/pt_gpt2_medium_1.2_M2.6/model.yaml @@ -0,0 +1,44 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +description: GPT2 medium trained on Wikitext2 +float ops: 1.6T +task: Language Modeling +framework: pytorch +prune: 'no' +version: 1.2 +files: +- name: pt_gpt2_medium_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_medium_1.2_M2.6.tar.gz + checksum: 75bec314bf2a24d81d4dcbea26b28802 +- name: pt_gpt2_medium_1.2_M2.6_MI100.tar.gz + type: Ymodel + board: M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_medium_1.2_M2.6_MI100.tar.gz + checksum: 89f3566a20fea5c1c6fa15b32809fb34 +- name: pt_gpt2_medium_1.2_M2.6_MI210.tar.gz + type: Ymodel + board: M210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_medium_1.2_M2.6_MI210.tar.gz + checksum: cef07ea0e16329451085bad6b9001d9f +- name: pt_gpt2_medium_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_medium_1.2_M2.6_NAVI2.tar.gz + checksum: 72023855fc0bfdb376cddf83fad54476 +license: https://github.com/amd/UIF/blob/main/LICENSE + diff --git a/docs/2_model_setup/model-list/pt_gpt2_small_wikitext2_1024_212.6G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_gpt2_small_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 59% rename from docs/2_model_setup/model-list/pt_gpt2_small_wikitext2_1024_212.6G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_gpt2_small_1.1_M2.6/model.yaml index f42a7a3..059c5ec --- 
a/docs/2_model_setup/model-list/pt_gpt2_small_wikitext2_1024_212.6G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_gpt2_small_1.1_M2.6/model.yaml @@ -21,21 +21,23 @@ framework: pytorch prune: 'no' version: 1.1 files: -- name: +- name: pt_gpt2_small_1.1_M2.6.tar type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_small_wikitext2_1024_212.6G_1.1_M2.4.tar - checksum: 4d43357e6a4fe96e61e4b58a91dafc56 -- name: pt_gpt2_small_wikitext2_1024_212.6G_1.1_M2.4_MI100 - type: Ymodel - board: M100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_small_wikitext2_1024_212.6G_1.1_M2.4_MI100.tar - checksum: befa26246a709a39d21524c6c0a221f4 -- name: pt_gpt2_small_wikitext2_1024_212.6G_1.1_M2.4_MI210 - type: Ymodel - board: M210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_small_wikitext2_1024_212.6G_1.1_M2.4_MI210.tar - checksum: ccf8af9a49c19f23b48c47026d38ced3 -license: license: https://github.com/amd/UIF/blob/main/LICENSE - - + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_small_1.1_M2.6.tar + checksum: 464712362167f99ecaa5d164a95b83fa +- name: pt_gpt2_small_1.1_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_small_1.1_M2.6_MI100.tar.gz + checksum: 7e247d5ad73f26a8fef7a0fb4599ccde +- name: pt_gpt2_small_1.1_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_small_1.1_M2.6_MI210.tar.gz + checksum: 0de36a9489a4aadbe8e03de3a1ae8d0b +- name: pt_gpt2_small_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_small_1.1_M2.6_NAVI2.tar.gz + checksum: 9b46b1778b1c73061da0dc7f7d7ec315 diff --git a/docs/2_model_setup/model-list/pt_gpt2_xl_1.2_M2.6/model.yaml 
b/docs/2_model_setup/model-list/pt_gpt2_xl_1.2_M2.6/model.yaml new file mode 100755 index 0000000..c165b39 --- /dev/null +++ b/docs/2_model_setup/model-list/pt_gpt2_xl_1.2_M2.6/model.yaml @@ -0,0 +1,46 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +description: GPT2 XL trained on Wikitext2 +float ops: 3.3T +task: Language Modeling +framework: pytorch +prune: 'no' +version: 1.2 +files: +- name: pt_gpt2_xl_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_xl_1.2_M2.6.tar.gz + checksum: 4413a1a610cc4b25971fa69500d8a0e6 +- name: pt_gpt2_xl_1.2_M2.6_MI100.tar.gz + type: Ymodel + board: M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_xl_1.2_M2.6_MI100.tar.gz + checksum: 01dfe2ad3646fb3a9da8d19a9a8522f3 +- name: pt_gpt2_xl_1.2_M2.6_MI210.tar.gz + type: Ymodel + board: M210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_xl_1.2_M2.6_MI210.tar.gz + checksum: 9d0dfa60fc3c8dab161352e17da824ad +- name: pt_gpt2_xl_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_gpt2_xl_1.2_M2.6_NAVI2.tar.gz + checksum: 8b47faae99cd0d8567aa93c46a0604f8 +license: https://github.com/amd/UIF/blob/main/LICENSE + + + diff --git 
a/docs/2_model_setup/model-list/pt_inceptionv3_imagenet_299_299_0.4_6.8G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_inceptionv3_0.4_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 61% rename from docs/2_model_setup/model-list/pt_inceptionv3_imagenet_299_299_0.4_6.8G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_inceptionv3_0.4_1.1_M2.6/model.yaml index 4f4fa9c..61d47ae --- a/docs/2_model_setup/model-list/pt_inceptionv3_imagenet_299_299_0.4_6.8G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_inceptionv3_0.4_1.1_M2.6/model.yaml @@ -21,20 +21,23 @@ framework: pytorch prune: 'yes' version: 1.1 files: -- name: pt_inceptionv3_imagenet_299_299_0.4_6.8G_1.1_M2.4 +- name: pt_inceptionv3_0.4_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inceptionv3_imagenet_299_299_0.4_6.8G_1.1_M2.4.zip - checksum: 04a2de0728b1d721f42bf8b09b1ec754 -- name: pt_inceptionv3_imagenet_299_299_0.4_6.8G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inceptionv3_0.4_1.1_M2.6.tar.gz + checksum: 9fc47a630be685fbda7a5fcc6d3aece6 +- name: pt_inception_0.4_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inceptionv3_imagenet_299_299_0.4_6.8G_1.1_M2.4_MI100.zip - checksum: feb5fdef257a3e638c9b2dfee63bfeff -- name: pt_inceptionv3_imagenet_299_299_0.4_6.8G_1.1_M2.4_MI200 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inception_0.4_1.1_M2.6_MI100.tar.gz + checksum: 292c312477ff6887ba791b144653ecf3 +- name: pt_inception_0.4_1.1_M2.6_MI210.tar.gz type: ymodel - board: MI200 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inceptionv3_imagenet_299_299_0.4_6.8G_1.1_M2.4_MI210.zip - checksum: e46074151d38d6879941d9bde46a3e6a -license: https://github.com/amd/UIF/blob/main/LICENSE - + board: MI210 + download link: 
https://www.xilinx.com/bin/public/openDownload?filename=pt_inception_0.4_1.1_M2.6_MI210.tar.gz + checksum: b0c3725d615168ccd97646bf0ae44c6f +- name: pt_inception_0.4_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inception_0.4_1.1_M2.6_NAVI2.tar.gz + checksum: 17dcd39b3cad25f6bf8216a8528ee566 diff --git a/docs/2_model_setup/model-list/pt_inceptionv3_imagenet_299_299_0.6_4.5G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_inceptionv3_0.6_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 62% rename from docs/2_model_setup/model-list/pt_inceptionv3_imagenet_299_299_0.6_4.5G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_inceptionv3_0.6_1.1_M2.6/model.yaml index b30480f..3c8aff4 --- a/docs/2_model_setup/model-list/pt_inceptionv3_imagenet_299_299_0.6_4.5G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_inceptionv3_0.6_1.1_M2.6/model.yaml @@ -21,19 +21,23 @@ framework: pytorch prune: 'yes' version: 1.1 files: -- name: pt_inceptionv3_imagenet_299_299_0.6_4.5G_1.1_M2.4 +- name: pt_inceptionv3_0.6_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inceptionv3_imagenet_299_299_0.6_4.5G_1.1_M2.4.zip - checksum: df88526caf0155e0ff16fe737d233f66 -- name: pt_inceptionv3_imagenet_299_299_0.6_4.5G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inceptionv3_0.6_1.1_M2.6.tar.gz + checksum: dcc3f88838375c0b2bce22273fddbafd +- name: pt_inception_0.6_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inceptionv3_imagenet_299_299_0.6_4.5G_1.1_M2.4_MI100.zip - checksum: 72da869b6364e2ad3f14b8417d470104 -- name: pt_inceptionv3_imagenet_299_299_0.6_4.5G_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inception_0.6_1.1_M2.6_MI100.tar.gz + 
checksum: b8ee4450dab0b4b2f2d913fb23fa7beb +- name: pt_inception_0.6_1.1_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inceptionv3_imagenet_299_299_0.6_4.5G_1.1_M2.4_MI210.zip - checksum: a510ca8a93e276af6d699702dd056f08 -license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inception_0.6_1.1_M2.6_MI210.tar.gz + checksum: 49f2b0df22f3b8146e0b95397866a80c +- name: pt_inception_0.6_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inception_0.6_1.1_M2.6_NAVI2.tar.gz + checksum: 2f4dc51231e0feb9369bcf63793f016f \ No newline at end of file diff --git a/docs/2_model_setup/model-list/pt_inceptionv3_imagenet_299_299_11.4G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_inceptionv3_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 62% rename from docs/2_model_setup/model-list/pt_inceptionv3_imagenet_299_299_11.4G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_inceptionv3_1.1_M2.6/model.yaml index f2c9046..8c737d9 --- a/docs/2_model_setup/model-list/pt_inceptionv3_imagenet_299_299_11.4G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_inceptionv3_1.1_M2.6/model.yaml @@ -21,19 +21,23 @@ framework: pytorch prune: 'no' version: 1.1 files: -- name: pt_inceptionv3_imagenet_299_299_11.4G_1.1_M2.4 +- name: pt_inceptionv3_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inceptionv3_imagenet_299_299_11.4G_1.1_M2.4.zip - checksum: fadb2c13d7d6cb7077c2be2218d1967e -- name: pt_inceptionv3_imagenet_299_299_11.4G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inceptionv3_1.1_M2.6.tar.gz + checksum: 4ca00d96033ea067c9765318eb5fdc3a +- name: pt_inception_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - 
download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inceptionv3_imagenet_299_299_11.4G_1.1_M2.4_M100.zip - checksum: 8140158f99bfc3a88ab45c11e2101aa1 -- name: pt_inceptionv3_imagenet_299_299_11.4G_1.1_M2.4_MI200 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inception_1.1_M2.6_MI100.tar.gz + checksum: 214cc71af0cd17efa17958f017c0879a +- name: pt_inception_1.1_M2.6_MI210.tar.gz type: ymodel - board: MI200 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inceptionv3_imagenet_299_299_11.4G_1.1_M2.4_M210.zip - checksum: b85996948055463c2103105d431f6fa9 -license: https://github.com/amd/UIF/blob/main/LICENSE + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inception_1.1_M2.6_MI210.tar.gz + checksum: 81bad07219e41645e1682684f8dd5263 +- name: pt_inception_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_inception_1.1_M2.6_NAVI2.tar.gz + checksum: 2f33685eb97311b93841b37f9d38b866 diff --git a/docs/2_model_setup/model-list/pt_mobilebertv1.5_1.2_M2.6/model.yaml b/docs/2_model_setup/model-list/pt_mobilebertv1.5_1.2_M2.6/model.yaml new file mode 100755 index 0000000..9b52f6a --- /dev/null +++ b/docs/2_model_setup/model-list/pt_mobilebertv1.5_1.2_M2.6/model.yaml @@ -0,0 +1,43 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + + +description: MobileBERT v1.5 for SQuADv1.1 Question Answering +float ops: 30.8G +task: Question Answering +framework: pytorch +prune: 'no' +version: 1.2 +files: +- name: pt_mobilebert_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_large_SQuADv1.1_384_246.42G_1.1_M2.4.tar + checksum: 76ecd09d231c4e3534aa5ac078a52143 +- name: pt_mobilebert_1.2_M2.6_MI100.tar.gz + type: Ymodel + board: M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_mobilebert_1.2_M2.6_MI100.tar.gz + checksum: 0ba43546d6495e2ea3b075962e507db1 +- name: pt_mobilebert_1.2_M2.6_MI210.tar.gz + type: Ymodel + board: M210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_mobilebert_1.2_M2.6_MI210.tar.gz + checksum: 207ee94a8c157ee72d550bab41a88e45 +- name: pt_mobilebert_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_mobilebert_1.2_M2.6_NAVI2.tar.gz + checksum: 127a9dffdabf14deaa19d5bee4c5875f +license: https://github.com/amd/UIF/blob/main/LICENSE diff --git a/docs/2_model_setup/model-list/pt_ofa_resnet50_imagenet_224_224_0.45_8.2G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_ofa_resnet50_0.45_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 55% rename from docs/2_model_setup/model-list/pt_ofa_resnet50_imagenet_224_224_0.45_8.2G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_ofa_resnet50_0.45_1.1_M2.6/model.yaml index a476631..cd8b01e --- a/docs/2_model_setup/model-list/pt_ofa_resnet50_imagenet_224_224_0.45_8.2G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_ofa_resnet50_0.45_1.1_M2.6/model.yaml @@ -15,25 +15,28 @@ description: ofa-resnet50 for Image Classification. 
input size: 224*224 -float ops: 8.2G task: classification framework: pytorch -prune: 0.45 +prune: '0.45' version: 1.1 files: -- name: +- name: pt_ofa_resnet_0.45_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_imagenet_224_224_0.45_8.2G_1.1_M2.4.tar.gz - checksum: 594f2fca42ea63e54e5c7bb51d0e9e7f -- name: pt_ofa_resnet50_imagenet_224_224_0.45_8.2G_1.1_M2.4_MI100 - type: Ymodel - board: M100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_imagenet_224_224_0.45_8.2G_1.1_M2.4_MI100.zip - checksum: 7be957abddded2b40f97cca6fca9e1aa -- name: pt_ofa_resnet50_imagenet_224_224_0.45_8.2G_1.1_M2.4_MI210 - type: Ymodel - board: M210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_imagenet_224_224_0.45_8.2G_1.1_M2.4_MI210.zip - checksum: 9bb2f244ca24d2228308618475666166 -license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet_0.45_1.1_M2.6.tar.gz + checksum: 09730abb2a95603e27a8c9ace04735d5 +- name: pt_ofa_resnet50_0.45_1.1_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_0.45_1.1_M2.6_MI100.tar.gz + checksum: aeb87ee1f910f802562ac8a6298ebf7b +- name: pt_ofa_resnet50_0.45_1.1_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_0.45_1.1_M2.6_MI210.tar.gz + checksum: 94076e4f144022ab3e5732124a2a1848 +- name: pt_ofa_resnet50_0.45_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_0.45_1.1_M2.6_NAVI2.tar.gz + checksum: 76802555ea0047ddc33a830bfe38fda1 \ No newline at end of file diff --git 
a/docs/2_model_setup/model-list/pt_ofa_resnet50_imagenet_224_224_0.60_6.0G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_ofa_resnet50_0.60_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 55% rename from docs/2_model_setup/model-list/pt_ofa_resnet50_imagenet_224_224_0.60_6.0G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_ofa_resnet50_0.60_1.1_M2.6/model.yaml index 17d567b..3981d57 --- a/docs/2_model_setup/model-list/pt_ofa_resnet50_imagenet_224_224_0.60_6.0G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_ofa_resnet50_0.60_1.1_M2.6/model.yaml @@ -15,25 +15,28 @@ description: ofa-resnet50 for Image Classification. input size: 224*224 -float ops: 6.0G task: classification framework: pytorch -prune: 0.6 +prune: '0.60' version: 1.1 files: -- name: +- name: pt_ofa_resnet_0.60_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_imagenet_224_224_0.60_6.0G_1.1_M2.4.tar.gz - checksum: 05b0bf1de20211c55cd461708e694db0 -- name: pt_ofa_resnet50_imagenet_224_224_0.60_6.0G_1.1_M2.4_MI100 - type: Ymodel - board: M100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_imagenet_224_224_0.60_6.0G_1.1_M2.4_MI100.zip - checksum: 0f8ad6fcd1444c4bfc82f0f65f8017f1 -- name: pt_ofa_resnet50_imagenet_224_224_0.60_6.0G_1.1_M2.4_MI210 - type: Ymodel - board: M210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_imagenet_224_224_0.60_6.0G_1.1_M2.4_MI210.zip - checksum: 0236ca7df689734ffb0a00499caa0c92 -license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet_0.60_1.1_M2.6.tar.gz + checksum: 3fb3e57dff86a83fc9fa265a529b26dd +- name: pt_ofa_resnet50_0.60_1.1_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: 
https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_0.60_1.1_M2.6_MI100.tar.gz + checksum: ce6708e58df930efa86c9789f8c0bf38 +- name: pt_ofa_resnet50_0.60_1.1_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_0.60_1.1_M2.6_MI210.tar.gz + checksum: c48e511c0dca808399124de12c5f9482 +- name: pt_ofa_resnet50_0.60_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_0.60_1.1_M2.6_NAVI2.tar.gz + checksum: f601a883ec178e60963a7c1ef5af16b7 \ No newline at end of file diff --git a/docs/2_model_setup/model-list/pt_ofa_resnet50_imagenet_192_192_0.74_3.6G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_ofa_resnet50_0.74_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 55% rename from docs/2_model_setup/model-list/pt_ofa_resnet50_imagenet_192_192_0.74_3.6G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_ofa_resnet50_0.74_1.1_M2.6/model.yaml index 9dad4c7..71181b6 --- a/docs/2_model_setup/model-list/pt_ofa_resnet50_imagenet_192_192_0.74_3.6G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_ofa_resnet50_0.74_1.1_M2.6/model.yaml @@ -15,26 +15,28 @@ description: ofa-resnet50 for Image Classification. 
input size: 192*192 -float ops: 3.6G task: classification framework: pytorch -prune: None +prune: '0.74' version: 1.1 files: -- name: pt_ofa_resnet50_imagenet_192_192_0.74_3.6G_1.1_M2.4 +- name: pt_ofa_resnet_0.74_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_imagenet_192_192_0.74_3.6G_1.1_M2.4.tar.gz - checksum: ffd0c5f8922a2b6befb697cc65b32690 -- name: pt_ofa_resnet50_imagenet_192_192_0.74_3.6G_1.1_M2.4_MI100 - type: Ymodel - board: M100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_imagenet_192_192_0.74_3.6G_1.1_M2.4_MI100.zip - checksum: 3fa7c3bcddeebaeff121e5566a95f96a -- name: pt_ofa_resnet50_imagenet_192_192_0.74_3.6G_1.1_M2.4_MI210 - type: Ymodel - board: M210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_imagenet_192_192_0.74_3.6G_1.1_M2.4_MI210.zip - checksum: ea8a6c2ac929e4ccbad89d038d74a03e -license: license: https://github.com/amd/UIF/blob/main/LICENSE - + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet_0.74_1.1_M2.6.tar.gz + checksum: d4dadd0511a0245fe27edb0d8da943fa +- name: pt_ofa_resnet50_0.74_1.1_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_0.74_1.1_M2.6_MI100.tar.gz + checksum: 3c51125ec14945a509e88ab36bcd3dec +- name: pt_ofa_resnet50_0.74_1.1_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_0.74_1.1_M2.6_MI210.tar.gz + checksum: c94de1c0d54cc193da4bc9529928cf41 +- name: pt_ofa_resnet50_0.74_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_0.74_1.1_M2.6_NAVI2.tar.gz + checksum: 74124a98cb83de215f18cd0b82c3590b diff --git 
a/docs/2_model_setup/model-list/pt_ofa_resnet50_imagenet_160_160_0.88_1.8G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_ofa_resnet50_0.88_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 54% rename from docs/2_model_setup/model-list/pt_ofa_resnet50_imagenet_160_160_0.88_1.8G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_ofa_resnet50_0.88_1.1_M2.6/model.yaml index cd4bb45..20c9d0b --- a/docs/2_model_setup/model-list/pt_ofa_resnet50_imagenet_160_160_0.88_1.8G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_ofa_resnet50_0.88_1.1_M2.6/model.yaml @@ -15,25 +15,28 @@ description: ofa-resnet50 for Image Classification. input size: 160*160 -float ops: 1.8G task: classification framework: pytorch -prune: 0.88 -version: 2.4 +prune: '0.88' +version: 1.1 files: -- name: +- name: pt_ofa_resnet_0.88_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_imagenet_160_160_0.88_1.8G_1.1_M2.4.tar.gz - checksum: 8a69ce6ffba92e2af31d09c858d3434e -- name: pt_ofa_resnet50_imagenet_160_160_0.88_1.8G_1.1_M2.4_MI100 - type: Ymodel - board: M100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_imagenet_160_160_0.88_1.8G_1.1_M2.4_MI100.zip - checksum: 6fda304ec7a2dcd881107614b0603f0b -- name: pt_ofa_resnet50_imagenet_160_160_0.88_1.8G_1.1_M2.4_MI210 - type: Ymodel - board: M210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_imagenet_160_160_0.88_1.8G_1.1_M2.4_MI210.zip - checksum: 8f6b0130e412a6e4aa4e820a560db125 -license: license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet_0.88_1.1_M2.6.tar.gz + checksum: d57ecb4619c7b4b6b13ed9620e6cbd9a +- name: pt_ofa_resnet50_0.88_1.1_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: 
https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_0.88_1.1_M2.6_MI100.tar.gz + checksum: 36f5fcd46b14711fd0ce3520f09502c7 +- name: pt_ofa_resnet50_0.88_1.1_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_0.88_1.1_M2.6_MI210.tar.gz + checksum: a6cc6898680f17772d195be9e259eb2e +- name: pt_ofa_resnet50_0.88_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet50_0.88_1.1_M2.6_NAVI2.tar.gz + checksum: 598dc97335b935d33466e03c652110b2 diff --git a/docs/2_model_setup/model-list/pt_ofa_resnet_0.88_1.2_M2.6/model.yaml b/docs/2_model_setup/model-list/pt_ofa_resnet_0.88_1.2_M2.6/model.yaml new file mode 100755 index 0000000..b090ea7 --- /dev/null +++ b/docs/2_model_setup/model-list/pt_ofa_resnet_0.88_1.2_M2.6/model.yaml @@ -0,0 +1,50 @@ +# Copyright 2019 Xilinx Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +description: ofa-resnet50 for Image Classification. 
+input size: 160*160 +float ops: 1.8G +task: classification +framework: PyTorch +prune: '0.88' +version: 1.2 +files: + +pt_ofa_resnet_0.88_1.2_M2.6.tar.gz f17ec850364e6aede8ffe9b11d5477a1 +pt_ofa_resnet_0.88_1.2_M2.6_NAVI2.tar.gz 2ae2058543996204996bb1149ff9424f +pt_ofa_resnet_1.2_M2.6_MI100.tar.gz 1df797727806a90ae6fffe27d185d3e5 +pt_ofa_resnet_1.2_M2.6_MI210.tar.gz a401e30ac197a861995d000a52d271b8 + +- name: pt_ofa_resnet_0.88_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet_0.88_1.2_M2.6.tar.gz + checksum: f17ec850364e6aede8ffe9b11d5477a1 +- name: pt_ofa_resnet_1.2_M2.6_MI100.tar.gz + type: Ymodel + board: M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet_1.2_M2.6_MI100.tar.gz + checksum: 2ae2058543996204996bb1149ff9424f +- name: pt_ofa_resnet_1.2_M2.6_MI210.tar.gz + type: Ymodel + board: M210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet_1.2_M2.6_MI210.tar.gz + checksum: 1df797727806a90ae6fffe27d185d3e5 +- name: pt_ofa_resnet_0.88_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_ofa_resnet_0.88_1.2_M2.6_NAVI2.tar.gz + checksum: a401e30ac197a861995d000a52d271b8 +license: https://github.com/amd/UIF/blob/main/LICENSE diff --git a/docs/2_model_setup/model-list/pt_reid-resnet50_market1501_256_128_5.3G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_pointpillars_1.2_M2.6/model.yaml similarity index 84% rename from docs/2_model_setup/model-list/pt_reid-resnet50_market1501_256_128_5.3G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_pointpillars_1.2_M2.6/model.yaml index 0f390ba..587a94c 100755 --- a/docs/2_model_setup/model-list/pt_reid-resnet50_market1501_256_128_5.3G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_pointpillars_1.2_M2.6/model.yaml @@ -59,7 +59,10 @@ # Reid_resnet50 pruned0.5, # 
Reid_resnet50 pruned0.6, # Reid_resnet50 pruned0.7, -# DLRM +# DLRM +# ViT +# PointPillars +# Resnet50_v1.5 ofa # # This License Agreement for Non-Commercial Models (“Agreement”) is a legal agreement between you (either an individual or # an entity) and Advanced Micro Devices, Inc. (“AMD”). DO NOT USE THE TRAINED MODELS IDENTIFIED ABOVE UNTIL YOU HAVE @@ -82,28 +85,30 @@ # PURPOSE AND NON-INFRINGEMENT. YOU BEAR ALL RISK OF USING THE TRAINED MODELS (INCLUDING THIRD PARTY PART MATERIALS, IF ANY) # AND YOU AGREE TO RELEASE AMD FROM ANY LIABILITY OR DAMAGES FOR ANY CLAIM OR ACTION ARISING OUT OF OR IN CONNECTION WITH # YOUR USE OF THE TRAINED MODEL AND/OR THIRD PARTY MATERIALS. - -description: resnet50-based person re-identification model. -input size: 256x128 -float ops: 5.3G -task: person reid +description: PointPillars for 3D Detection +float ops: 10.8G framework: pytorch prune: 'no' -version: 1.1 +version: 1.2 files: -- name: pt_reid-resnet50_market1501_256_128_5.3G_1.1_M2.4 +- name: pt_pointpillars_1.2_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid-resnet50_market1501_256_128_5.3G_1.1_M2.4.zip - checksum: d74982e0328426199f7b88a716a08b20 -- name: pt_reid-resnet50_market1501_256_128_5.3G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_pointpillars_1.2_M2.6.tar.gz + checksum: b3b9d655955682d4749692ef2929f959 +- name: pt_pointpillars_1.2_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid-resnet50_market1501_256_128_5.3G_1.1_M2.4_MI100.zip - checksum: e5cd3e2da3e652ade5821eb2495a4b0b -- name: pt_reid-resnet50_market1501_256_128_5.3G_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_pointpillars_1.2_M2.6_MI100.tar.gz + checksum: cbfcf0db0c8dd63080d4d10eb70a544f +- name: pt_pointpillars_1.2_M2.6_MI210.tar.gz type: ymodel board: MI210 - download 
link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid-resnet50_market1501_256_128_5.3G_1.1_M2.4_M210.zip - checksum: ecdd87801809694939c69c9a6e431268 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_pointpillars_1.2_M2.6_MI210.tar.gz + checksum: 77f46c5cc65281663719767d103808b4 +- name: pt_pointpillars_1.2_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_pointpillars_1.2_M2.6_NAVI2.tar.gz + checksum: af7d97c98232c39adb8a05b821433e66 license: https://github.com/amd/UIF/blob/main/LICENSE diff --git a/docs/2_model_setup/model-list/pt_reid_resnet50_0.6_1.1_M2.6/model.yaml b/docs/2_model_setup/model-list/pt_reid_resnet50_0.6_1.1_M2.6/model.yaml new file mode 100755 index 0000000..935fbf6 --- /dev/null +++ b/docs/2_model_setup/model-list/pt_reid_resnet50_0.6_1.1_M2.6/model.yaml @@ -0,0 +1,43 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +description: resnet50-based person re-identification model. 
+input size: 256x128 +float ops: 2.1G +task: person reid +framework: pytorch +prune: '0.6' +version: 1.1 +files: +- name: pt_reid_resnet50_0.6_1.1_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid_resnet50_0.6_1.1_M2.6.tar.gz + checksum: ade7c9f0a658d14a77fe36ae8c95e6fb +- name: pt_reid_resnet50_0.6_1.1_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid_resnet50_0.6_1.1_M2.6_MI100.tar.gz + checksum: a45fc0591e71110652a41378a00f58e7 +- name: pt_reid_resnet50_0.6_1.1_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid_resnet50_0.6_1.1_M2.6_MI210.tar.gz + checksum: 6a30505f65d417b510c11c5a648899b5 +- name: pt_reid_resnet50_0.6_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid_resnet50_0.6_1.1_M2.6_NAVI2.tar.gz + checksum: 92919185501dd45247180b4fd0483e88 diff --git a/docs/2_model_setup/model-list/pt_reid_resnet50_0.7_1.1_M2.6/model.yaml b/docs/2_model_setup/model-list/pt_reid_resnet50_0.7_1.1_M2.6/model.yaml new file mode 100755 index 0000000..35951c3 --- /dev/null +++ b/docs/2_model_setup/model-list/pt_reid_resnet50_0.7_1.1_M2.6/model.yaml @@ -0,0 +1,43 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + + +description: resnet50-based person re-identification model. +input size: 256x128 +float ops: 1.6G +task: person reid +framework: pytorch +prune: '0.7' +version: 1.1 +files: +- name: pt_reid_resnet50_0.7_1.1_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid_resnet50_0.7_1.1_M2.6.tar.gz + checksum: 0bcf55d14ca3c839baf9b7d4a07afdd4 +- name: pt_reid_resnet50_0.7_1.1_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid_resnet50_0.7_1.1_M2.6_MI100.tar.gz + checksum: 231a509fbb18d8a3c08262dea17f6cd4 +- name: pt_reid_resnet50_0.7_1.1_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid_resnet50_0.7_1.1_M2.6_MI210.tar.gz + checksum: 595907af84ef63dde61a80a40c00a4e0 +- name: pt_reid_resnet50_0.7_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid_resnet50_0.7_1.1_M2.6_NAVI2.tar.gz + checksum: e9a722459e729bb0c69cc429f33db859 diff --git a/docs/2_model_setup/model-list/pt_reid_resnet50_1.1_M2.6/model.yaml b/docs/2_model_setup/model-list/pt_reid_resnet50_1.1_M2.6/model.yaml new file mode 100755 index 0000000..0c74db4 --- /dev/null +++ b/docs/2_model_setup/model-list/pt_reid_resnet50_1.1_M2.6/model.yaml @@ -0,0 +1,43 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +description: resnet50-based person re-identification model. +input size: 256x128 +float ops: 5.3G +task: person reid +framework: pytorch +prune: 'no' +version: 1.1 +files: +- name: pt_reid_resnet50_1.1_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid_resnet50_1.1_M2.6.tar.gz + checksum: 3f202a236943128cbf2335acb98be1e0 +- name: pt_reid_resnet50_1.1_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid_resnet50_1.1_M2.6_MI100.tar.gz + checksum: 4b8e3bda9d2f439c22f81896aee30122 +- name: pt_reid_resnet50_1.1_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid_resnet50_1.1_M2.6_MI210.tar.gz + checksum: e24622d6a0228c3a0c589c7193e147be +- name: pt_reid_resnet50_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid_resnet50_1.1_M2.6_NAVI2.tar.gz + checksum: 44c578ab8db675f08c1bd524a355d9e6 \ No newline at end of file diff --git a/docs/2_model_setup/model-list/pt_resnet50v1.5_imagenet_224_224_0.4_4.9G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_resnet50v1.5_0.4_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 61% rename from docs/2_model_setup/model-list/pt_resnet50v1.5_imagenet_224_224_0.4_4.9G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_resnet50v1.5_0.4_1.1_M2.6/model.yaml index 56c289a..9b012f2 --- 
a/docs/2_model_setup/model-list/pt_resnet50v1.5_imagenet_224_224_0.4_4.9G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_resnet50v1.5_0.4_1.1_M2.6/model.yaml @@ -18,23 +18,26 @@ input size: '224' float ops: 4.9G task: image classfication framework: pytorch -prune: 'yes' +prune: '0.4' version: 1.1 files: -- name: pt_resnet50v1.5_imagenet_224_224_0.4_4.9G_1.1_M2.4 +- name: pt_resnet50v1.5_0.4_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50v1.5_imagenet_224_224_0.4_4.9G_1.1_M2.4.zip - checksum: e10d50b8a8bba45f355e46cb74c79630 -- name: pt_resnet50v1.5_imagenet_224_224_0.4_4.9G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50v1.5_0.4_1.1_M2.6.tar.gz + checksum: 9051a7272df49c27e009474e2727ab97 +- name: pt_resnet50_0.4_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50v1.5_imagenet_224_224_0.4_4.9G_1.1_M2.4_MI100.zip - checksum: bc796bd467e0d77b14583d1d42b6c72e -- name: pt_resnet50v1.5_imagenet_224_224_0.4_4.9G_1.1_M2.4_MI200 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50_0.4_1.1_M2.6_MI100.tar.gz + checksum: 6140010a395c433d89480f986b2355fe +- name: pt_resnet50_0.4_1.1_M2.6_MI210.tar.gz type: ymodel - board: MI200 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50v1.5_imagenet_224_224_0.4_4.9G_1.1_M2.4._MI210.zip - checksum: fe03401c6ba455c5397a6f3b9d1ba1a1 -license: https://github.com/amd/UIF/blob/main/LICENSE - + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50_0.4_1.1_M2.6_MI210.tar.gz + checksum: fcd7c0d84dfbfe89ffe614c23744d314 +- name: pt_resnet50_0.4_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50_0.4_1.1_M2.6_NAVI2.tar.gz + checksum: 
de89180191bd00878e47b22c9905fe9d diff --git a/docs/2_model_setup/model-list/pt_resnet50v1.5_imagenet_224_224_0.6_3.3G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_resnet50v1.5_0.6_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 62% rename from docs/2_model_setup/model-list/pt_resnet50v1.5_imagenet_224_224_0.6_3.3G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_resnet50v1.5_0.6_1.1_M2.6/model.yaml index 0b3f86b..426fb8e --- a/docs/2_model_setup/model-list/pt_resnet50v1.5_imagenet_224_224_0.6_3.3G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_resnet50v1.5_0.6_1.1_M2.6/model.yaml @@ -21,19 +21,24 @@ framework: pytorch prune: '0.6' version: 1.1 files: -- name: pt_resnet50v1.5_imagenet_224_224_0.6_3.3G_1.1_M2.4 +- name: pt_resnet50v1.5_0.6_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50v1.5_imagenet_224_224_0.6_3.3G_1.1_M2.4.zip - checksum: 0c89ea324adcb6c49689b58649634a16 -- name: pt_resnet50v1.5_imagenet_224_224_0.6_3.3G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50v1.5_0.6_1.1_M2.6.tar.gz + checksum: 935d68911625b6f53298ab862fbcccf5 +- name: pt_resnet50_0.6_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50v1.5_imagenet_224_224_0.6_3.3G_1.1_M2.4_MI100.zip - checksum: 1aefea808a915178aac97160784345d1 -- name: pt_resnet50v1.5_imagenet_224_224_0.6_3.3G_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50_0.6_1.1_M2.6_MI100.tar.gz + checksum: e179ea3f0bd0b6d8e3090f60c1a8f16d +- name: pt_resnet50_0.6_1.1_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50v1.5_imagenet_224_224_0.6_3.3G_1.1_M2.4_MI210.zip - checksum: a2acaa38414f706b626e12dfa37d8bd3 -license: 
https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50_0.6_1.1_M2.6_MI210.tar.gz + checksum: 3e137fe209b16c8e364e7a3333fcc72f +- name: pt_resnet50_0.6_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50_0.6_1.1_M2.6_NAVI2.tar.gz + checksum: f1a6dd89eb60401714159050cb3ff9c3 + diff --git a/docs/2_model_setup/model-list/pt_resnet50v1.5_imagenet_224_224_8.2G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_resnet50v1.5_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 63% rename from docs/2_model_setup/model-list/pt_resnet50v1.5_imagenet_224_224_8.2G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_resnet50v1.5_1.1_M2.6/model.yaml index a2c50e2..515faf1 --- a/docs/2_model_setup/model-list/pt_resnet50v1.5_imagenet_224_224_8.2G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_resnet50v1.5_1.1_M2.6/model.yaml @@ -21,19 +21,24 @@ framework: pytorch prune: 'no' version: 1.1 files: -- name: pt_resnet50v1.5_imagenet_224_224_8.2G_1.1_M2.4 +- name: pt_resnet50v1.5_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50v1.5_imagenet_224_224_8.2G_1.1_M2.4.zip - checksum: 66f3c4ac2e2ab650eed49fd2b17acc4f -- name: pt_resnet50v1.5_imagenet_224_224_8.2G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50v1.5_1.1_M2.6.tar.gz + checksum: 100109b2861e0a5f6ce0687942b5ba90 +- name: pt_resnet50_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50v1.5_imagenet_224_224_8.2G_1.1_M2.4_MI100.zip - checksum: a9c87c7035e6eef69ccd4afb7d97c524 -- name: pt_resnet50v1.5_imagenet_224_224_8.2G_1.1_M2.4_MI210 + download link: 
https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50_1.1_M2.6_MI100.tar.gz + checksum: 31949f05d91af239fef9f98f78fa5931 +- name: pt_resnet50_1.1_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50v1.5_imagenet_224_224_8.2G_1.1_M2.4_MI210.zip - checksum: 4336de53480c81ab3bf6fdcf0518860e -license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50_1.1_M2.6_MI210.tar.gz + checksum: f202b10656ba6c6d5e07c27a49c77d89 +- name: pt_resnet50_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_resnet50_1.1_M2.6_NAVI2.tar.gz + checksum: e6c604075329ac6fa3a2b47678391e28 + diff --git a/docs/2_model_setup/model-list/pt_retinanet_1.2_M2.6/model.yaml b/docs/2_model_setup/model-list/pt_retinanet_1.2_M2.6/model.yaml new file mode 100755 index 0000000..daa5e6e --- /dev/null +++ b/docs/2_model_setup/model-list/pt_retinanet_1.2_M2.6/model.yaml @@ -0,0 +1,42 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +description: Retinanet for Object Detection +float ops: 403.2G +framework: pytorch +prune: 'no' +version: 1.2 +files: +- name: pt_retinanet_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_retinanet_1.2_M2.6.tar.gz + checksum: 246568913595b0f8b090e37fe572d662 +- name: pt_retinanet_1.2_M2.6_MI100.tar.gz + type: Ymodel + board: M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_retinanet_1.2_M2.6_MI100.tar.gz + checksum: e00aa198c9961e693bd9043e15a39105 +- name: pt_retinanet_1.2_M2.6_MI210.tar.gz + type: Ymodel + board: M210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_retinanet_1.2_M2.6_MI210.tar.gz + checksum: eeafad0ddc105da53557d9975141cd09 +- name: pt_retinanet_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_retinanet_1.2_M2.6_NAVI2.tar.gz + checksum: fe1afeb059188c95f18655b11b2225a0 +license: license: https://github.com/amd/UIF/blob/main/LICENSE diff --git a/docs/2_model_setup/model-list/pt_reid-resnet50_market1501_256_128_0.7_1.6G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/pt_vit_1.2_M2.6/model.yaml similarity index 83% rename from docs/2_model_setup/model-list/pt_reid-resnet50_market1501_256_128_0.7_1.6G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/pt_vit_1.2_M2.6/model.yaml index 0e97557..4d7ccdc 100755 --- a/docs/2_model_setup/model-list/pt_reid-resnet50_market1501_256_128_0.7_1.6G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/pt_vit_1.2_M2.6/model.yaml @@ -59,7 +59,10 @@ # Reid_resnet50 pruned0.5, # Reid_resnet50 pruned0.6, # Reid_resnet50 pruned0.7, -# DLRM +# DLRM +# ViT +# PointPillars +# Resnet50_v1.5 ofa # # This License Agreement for Non-Commercial Models (“Agreement”) is a legal agreement between you (either an individual or # an entity) and Advanced Micro Devices, Inc. (“AMD”). 
DO NOT USE THE TRAINED MODELS IDENTIFIED ABOVE UNTIL YOU HAVE @@ -83,28 +86,33 @@ # YOUR USE OF THE TRAINED MODEL AND/OR THIRD PARTY MATERIALS. -description: resnet50-based person re-identification model. -input size: 256x128 -float ops: 1.6G -task: person reid -framework: pytorch -prune: 'yes' -version: 1.1 +description: Vision Transformer model for Image Classification +input size: '224*224' +float ops: 98.7G +task: classification +framework: PyTorch +prune: 'no' +version: 1.2 files: -- name: pt_reid-resnet50_market1501_256_128_0.7_1.6G_1.1_M2.4 +- name: pt_vit_1.2_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid-resnet50_market1501_256_128_0.7_1.6G_1.1_M2.4.zip - checksum: 8849e2fd4803ef3ebba05aa4d4da2b72 -- name: pt_reid-resnet50_market1501_256_128_0.7_1.6G_1.1_M2.4_MI100 - type: ymodel - board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid-resnet50_market1501_256_128_0.7_1.6G_1.1_M2.4_MI100.zip - checksum: 1a9a103f142c8aad29df2bff540fb4cb -- name: pt_reid-resnet50_market1501_256_128_0.7_1.6G_1.1_M2.4_MI210 - type: ymodel - board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_reid-resnet50_market1501_256_128_0.7_1.6G_1.1_M2.4_MI210.zip - checksum: 0ae6cf6d9a034b5bac7b6edb7fde570c + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_vit_1.2_M2.6.tar.gz + checksum: cc9e86ec36676347ca5ac6e103e0e462 +- name: pt_vit_1.2_M2.6_MI100.tar.gz + type: Ymodel + board: M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_vit_1.2_M2.6_MI100.tar.gz + checksum: a07982e10b85b66f7f6b23ffb04da347 +- name:pt_vit_1.2_M2.6_MI210.tar.gz + type: Ymodel + board: M210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_vit_1.2_M2.6_MI210.tar.gz + checksum: 239606755cd27909bcb9cc6aab5b62d6 +- name: pt_vit_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: 
https://www.xilinx.com/bin/public/openDownload?filename=pt_vit_1.2_M2.6_NAVI2.tar.gz + checksum: 2060cad93e3ef0693fa82426dfe31e2d license: https://github.com/amd/UIF/blob/main/LICENSE diff --git a/docs/2_model_setup/model-list/pt_wd_1.2_M2.6/model.yaml b/docs/2_model_setup/model-list/pt_wd_1.2_M2.6/model.yaml new file mode 100755 index 0000000..ddd7f9a --- /dev/null +++ b/docs/2_model_setup/model-list/pt_wd_1.2_M2.6/model.yaml @@ -0,0 +1,43 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +description: Wide&Deep model for recommend system +float ops: 10.384k +task: recommend system +framework: PyTorch +prune: 'no' +version: 1.2 +files: +- name: pt_wd_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_wd_1.2_M2.6.tar.gz + checksum: 4a6d089a57a4b70457156d4d1dd2348c +- name: pt_wd_1.2_M2.6_MI100.tar.gz + type: Ymodel + board: M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_wd_1.2_M2.6_MI100.tar.gz + checksum: 52d70e8aa8fcde19a8a923db875bb1a8 +- name: pt_wd_1.2_M2.6_MI210.tar.gz + type: Ymodel + board: M210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_wd_1.2_M2.6_MI210.tar.gz + checksum: 6c702bc25ce8056b19c434f7a75b052e +- name: pt_wd_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_wd_1.2_M2.6_NAVI2.tar.gz + checksum: e742846fb4c740681ad4c2b77854f744 +license: https://github.com/amd/UIF/blob/main/LICENSE diff --git a/docs/2_model_setup/model-list/tf2_2dunet_0.7_1.2_M2.6/model.yaml b/docs/2_model_setup/model-list/tf2_2dunet_0.7_1.2_M2.6/model.yaml new file mode 100755 index 0000000..d0fde97 --- /dev/null +++ b/docs/2_model_setup/model-list/tf2_2dunet_0.7_1.2_M2.6/model.yaml @@ -0,0 +1,42 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +description: 2D Unet for Medical Segmentation +float ops: 2.27G +framework: tensorflow2 +prune: '0.7' +version: 1.2 +files: +- name: tf2_2dunet_0.7_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_2dunet_0.7_1.2_M2.6.tar.gz + checksum: 1b59e6973df57af0091df39572b923d9 +- name: tf2_2dunet_0.7_1.2_M2.6_MI100.tar.gz + type: Ymodel + board: M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_2dunet_0.7_1.2_M2.6_MI100.tar.gz + checksum: 5684348efc7c5114192ed1806093eecf +- name: tf2_2dunet_0.7_1.2_M2.6_MI210.tar.gz + type: Ymodel + board: M210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_2dunet_0.7_1.2_M2.6_MI210.tar.gz + checksum: 68b39a526aeb9be2659bf773270d613f +- name: tf2_2dunet_0.7_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_2dunet_0.7_1.2_M2.6_NAVI2.tar.gz + checksum: 864377d8a414278e30d6c9178df98969 +license: https://github.com/amd/UIF/blob/main/LICENSE diff --git a/docs/2_model_setup/model-list/tf2_2dunet_1.2_M2.6/model.yaml b/docs/2_model_setup/model-list/tf2_2dunet_1.2_M2.6/model.yaml new file mode 100755 index 0000000..5155569 --- /dev/null +++ b/docs/2_model_setup/model-list/tf2_2dunet_1.2_M2.6/model.yaml @@ -0,0 +1,42 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + + +description: 2D Unet for Medical Segmentation +float ops: 7.66G +framework: tensorflow2 +prune: 'no' +version: 1.2 +files: +- name: tf2_2dunet_0.7_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_2dunet_0.7_1.2_M2.6.tar.gz + checksum: 1b59e6973df57af0091df39572b923d9 +- name: tf2_2dunet_0.7_1.2_M2.6_MI100.tar.gz + type: Ymodel + board: M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_2dunet_0.7_1.2_M2.6_MI100.tar.gz + checksum: 5684348efc7c5114192ed1806093eecf +- name: tf2_2dunet_0.7_1.2_M2.6_MI210.tar.gz + type: Ymodel + board: M210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_2dunet_0.7_1.2_M2.6_MI210.tar.gz + checksum: 68b39a526aeb9be2659bf773270d613f +- name: tf2_2dunet_0.7_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_2dunet_0.7_1.2_M2.6_NAVI2.tar.gz + checksum: 864377d8a414278e30d6c9178df98969 +license: https://github.com/amd/UIF/blob/main/LICENSE diff --git a/docs/2_model_setup/model-list/pt_bert_base_SQuADv1.1_384_70.66G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/tf2_bert_basev1.5_1.2_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 59% rename from docs/2_model_setup/model-list/pt_bert_base_SQuADv1.1_384_70.66G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/tf2_bert_basev1.5_1.2_M2.6/model.yaml index 973a89c..02b32a0 --- a/docs/2_model_setup/model-list/pt_bert_base_SQuADv1.1_384_70.66G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/tf2_bert_basev1.5_1.2_M2.6/model.yaml @@ -13,29 +13,34 @@ # limitations under the License. -description: BERT Base for SQuADv1.1 Question Answering. 
+description: BERT Base v1.5 for SQuADv1.1 Question Answering input size: '384' float ops: 70.66G task: Question Answering -framework: pytorch +framework: TensorFlow 2.x prune: 'no' -version: 1.1 +version: 1.2 files: -- name: +- name: tf2_bert_base_1.2_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_base_SQuADv1.1_384_70.66G_1.1_M2.4.tar - checksum: b5d636933e47eeda6a3c933d9962e14c -- name: pt_bert_base_SQuADv1.1_384_70.66G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_bert_base_1.2_M2.6.tar.gz + checksum: 810ae23bdc7c2eaa74b760670aa260bc +- name: tf2_bert_base_1.2_M2.6_MI100.tar.gz type: Ymodel board: M100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_base_SQuADv1.1_384_70.66G_1.1_M2.4_MI100.tar - checksum: a1faf52f27ae9664db28c3f8fcbf01ef -- name: pt_bert_base_SQuADv1.1_384_70.66G_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_bert_base_1.2_M2.6_MI100.tar.gz + checksum: 8c5e56f47a83592b5b627680808b89e2 +- name: tf2_bert_base_1.2_M2.6_MI210.tar.gz type: Ymodel board: M210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=pt_bert_base_SQuADv1.1_384_70.66G_1.1_M2.4_MI210.tar - checksum: d8bdb4b6ebaa016e5cd854f68b39afd3 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_bert_base_1.2_M2.6_MI210.tar.gz + checksum: 584f593952064b208daaf7d3a8ae1e37 +- name: tf2_bert_base_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_bert_base_1.2_M2.6_NAVI2.tar.gz + checksum: 4f5f978a176731a496aad451df91c984 license: https://github.com/amd/UIF/blob/main/LICENSE diff --git a/docs/2_model_setup/model-list/tf2_bert_largev1.5_1.2_M2.6/model.yaml b/docs/2_model_setup/model-list/tf2_bert_largev1.5_1.2_M2.6/model.yaml new file mode 100755 index 0000000..1c405a9 --- 
/dev/null +++ b/docs/2_model_setup/model-list/tf2_bert_largev1.5_1.2_M2.6/model.yaml @@ -0,0 +1,46 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +description: BERT Large v1.5 for SQuADv1.1 Question Answering +input size: '384' +float ops: 246.42G +task: Question Answering +framework: TensorFlow 2.x +prune: 'no' +version: 1.2 +files: +- name: tf2_bert_large_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_bert_large_1.2_M2.6.tar.gz + checksum: 810ae23bdc7c2eaa74b760670aa260bc +- name: tf2_bert_large_1.2_M2.6_MI100.tar.gz + type: Ymodel + board: M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_bert_large_1.2_M2.6_MI100.tar.gz + checksum: 8c5e56f47a83592b5b627680808b89e2 +- name: tf2_bert_large_1.2_M2.6_MI210.tar.gz + type: Ymodel + board: M210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_bert_large_1.2_M2.6_MI210.tar.gz + checksum: 584f593952064b208daaf7d3a8ae1e37 +- name: tf2_bert_large_1.2_M2.6_NAVI2.tar.gz + type: Ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_bert_large_1.2_M2.6_NAVI2.tar.gz + checksum: 4f5f978a176731a496aad451df91c984 +license: https://github.com/amd/UIF/blob/main/LICENSE + + diff --git a/docs/2_model_setup/model-list/tf2_efficientdet_1.2_M2.6/model.yaml 
b/docs/2_model_setup/model-list/tf2_efficientdet_1.2_M2.6/model.yaml new file mode 100755 index 0000000..b42e04d --- /dev/null +++ b/docs/2_model_setup/model-list/tf2_efficientdet_1.2_M2.6/model.yaml @@ -0,0 +1,42 @@ +# Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +description: EfficientDet for Object Detection +float ops: 11.0G +framework: tensorflow2 +prune: 'no' +version: 1.2 +files: +- name: tf2_efficientdet_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_efficientdet_1.2_M2.6.tar.gz + checksum: 6882e1a9e9973621cc90575cd168e390 +- name: tf2_efficientdet_1.2_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_efficientdet_1.2_M2.6_MI100.tar.gz + checksum: 782e0c36fb83498330fe65f692425222 +- name: tf2_efficientdet_1.2_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_efficientdet_1.2_M2.6_MI210.tar.gz + checksum: 0725ee52f28d310ef3925cb18bd8b127 +- name: tf2_efficientdet_1.2_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_efficientdet_1.2_M2.6_NAVI2.tar.gz + checksum: b505c4e37ec4f3c2f4d9a1682d84d5f5 +license: https://github.com/amd/UIF/blob/main/LICENSE diff --git 
a/docs/2_model_setup/model-list/tf2_inceptionv3_imagenet_299_299_0.4_6.93G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/tf2_inceptionv3_0.4_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 62% rename from docs/2_model_setup/model-list/tf2_inceptionv3_imagenet_299_299_0.4_6.93G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/tf2_inceptionv3_0.4_1.1_M2.6/model.yaml index db30afb..b870c1c --- a/docs/2_model_setup/model-list/tf2_inceptionv3_imagenet_299_299_0.4_6.93G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/tf2_inceptionv3_0.4_1.1_M2.6/model.yaml @@ -21,19 +21,23 @@ framework: tensorflow2 prune: '0.4' version: 1.1 files: -- name: tf2_inceptionv3_imagenet_299_299_0.4_6.93G_1.1_M2.4 +- name: tf2_inceptionv3_0.4_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inceptionv3_imagenet_299_299_0.4_6.93G_1.1_M2.4.zip - checksum: 1f69584b9f82eebbe7e3706e1352e8cf -- name: tf2_inceptionv3_imagenet_299_299_0.4_6.93G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inceptionv3_0.4_1.1_M2.6.tar.gz + checksum: 1d1d202c18cc7599410d7c84788d449b +- name: tf2_inception_0.4_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inceptionv3_imagenet_299_299_0.4_6.93G_1.1_M2.4_MI100.zip - checksum: 2fb4ea3a2737a60414c1ea81bf86a839 -- name: tf2_inceptionv3_imagenet_299_299_0.4_6.93G_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inception_0.4_1.1_M2.6_MI100.tar.gz + checksum: 03ba5af7ca85a1516d19a90d80e6864f +- name: tf2_inception_0.4_1.1_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inceptionv3_imagenet_299_299_0.4_6.93G_1.1_M2.4_MI210.zip - checksum: ef71a428b73608b1f4f45532225e1ea2 -license: https://github.com/amd/UIF/blob/main/LICENSE + 
download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inception_0.4_1.1_M2.6_MI210.tar.gz + checksum: a3a006e484e8e5b00b3775f8def000d0 +- name: tf2_inception_0.4_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inception_0.4_1.1_M2.6_NAVI2.tar.gz + checksum: 04e412a8330f6893bdf475e8c8e6cd9e diff --git a/docs/2_model_setup/model-list/tf2_inceptionv3_imagenet_299_299_0.6_4.62G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/tf2_inceptionv3_0.6_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 62% rename from docs/2_model_setup/model-list/tf2_inceptionv3_imagenet_299_299_0.6_4.62G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/tf2_inceptionv3_0.6_1.1_M2.6/model.yaml index 6f7e033..4d84e58 --- a/docs/2_model_setup/model-list/tf2_inceptionv3_imagenet_299_299_0.6_4.62G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/tf2_inceptionv3_0.6_1.1_M2.6/model.yaml @@ -21,19 +21,23 @@ framework: tensorflow2 prune: '0.6' version: 1.1 files: -- name: tf2_inceptionv3_imagenet_299_299_0.6_4.62G_1.1_M2.4 +- name: tf2_inceptionv3_0.6_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inceptionv3_imagenet_299_299_0.6_4.62G_1.1_M2.4.zip - checksum: 20b3ef77f38198dbd5aa5790f4e17e1e -- name: tf2_inceptionv3_imagenet_299_299_0.6_4.62G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inceptionv3_0.6_1.1_M2.6.tar.gz + checksum: 087aa9e2f17fedb38fce46594fd2b88d +- name: tf2_inception_0.6_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inceptionv3_imagenet_299_299_0.6_4.62G_1.1_M2.4_MI100.zip - checksum: 94176d23d6a17dd8b81981949dfe8275 -- name: tf2_inceptionv3_imagenet_299_299_0.6_4.62G_1.1_M2.4_MI210 + download link: 
https://www.xilinx.com/bin/public/openDownload?filename=tf2_inception_0.6_1.1_M2.6_MI100.tar.gz + checksum: ffa5058049e5386abb462f40635c558a +- name: tf2_inception_0.6_1.1_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inceptionv3_imagenet_299_299_0.6_4.62G_1.1_M2.4_MI210.zip - checksum: 554359ced8bf1b4a936a7251c31ca8b5 -license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inception_0.6_1.1_M2.6_MI210.tar.gz + checksum: ff24da92e3f9f4cbaba5215adcb01e30 +- name: tf2_inception_0.6_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inception_0.6_1.1_M2.6_NAVI2.tar.gz + checksum: 0af1d3bf2310b63bd31e87d488b12f3e diff --git a/docs/2_model_setup/model-list/tf2_inceptionv3_imagenet_299_299_11.5G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/tf2_inceptionv3_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 63% rename from docs/2_model_setup/model-list/tf2_inceptionv3_imagenet_299_299_11.5G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/tf2_inceptionv3_1.1_M2.6/model.yaml index 72a943a..9bedeb1 --- a/docs/2_model_setup/model-list/tf2_inceptionv3_imagenet_299_299_11.5G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/tf2_inceptionv3_1.1_M2.6/model.yaml @@ -21,19 +21,23 @@ framework: tensorflow2 prune: 'no' version: 1.1 files: -- name: tf2_inceptionv3_imagenet_299_299_11.5G_1.1_M2.4 +- name: tf2_inceptionv3_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inceptionv3_imagenet_299_299_11.5G_1.1_M2.4.zip - checksum: 228a00de7c6dffb6049d7755b5c7c32e -- name: tf2_inceptionv3_imagenet_299_299_11.5G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inceptionv3_1.1_M2.6.tar.gz + checksum: 
442d6745bc0be7eff700bdd374e4fa14 +- name: tf2_inception_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inceptionv3_imagenet_299_299_11.5G_1.1_M2.4_MI100.zip - checksum: 521812d73dc518793fa2bc17fb400781 -- name: tf2_inceptionv3_imagenet_299_299_11.5G_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inception_1.1_M2.6_MI100.tar.gz + checksum: 03dc8ce73b8350b27a775e6a36956ff8 +- name: tf2_inception_1.1_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inceptionv3_imagenet_299_299_11.5G_1.1_M2.4_MI210.zip - checksum: bd5e674ca2c91ae0cecb8f5537c02a28 -license: license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inception_1.1_M2.6_MI210.tar.gz + checksum: 6aee0bd064eb4198f4e2e50cde8ec5f8 +- name: tf2_inception_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_inception_1.1_M2.6_NAVI2.tar.gz + checksum: 4b31867f7ec9e726b2cf030db62bb29a diff --git a/docs/2_model_setup/model-list/tf2_mobilenetv1_imagenet_224_224_0.3_0.8G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/tf2_mobilenetv1_0.3_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 60% rename from docs/2_model_setup/model-list/tf2_mobilenetv1_imagenet_224_224_0.3_0.8G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/tf2_mobilenetv1_0.3_1.1_M2.6/model.yaml index 73a3699..5d54a49 --- a/docs/2_model_setup/model-list/tf2_mobilenetv1_imagenet_224_224_0.3_0.8G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/tf2_mobilenetv1_0.3_1.1_M2.6/model.yaml @@ -15,26 +15,29 @@ description: MobileNetV1 pruned model for classification. 
input size: '224' -float ops: 0.8G +float ops: 0.80G task: classification framework: tensorflow2 -prune: 'yes' +prune: '0.3' version: 1.1 files: -- name: tf2_mobilenetv1_imagenet_224_224_0.3_0.8G_1.1_M2.4 +- name: tf2_mobilenetv1_0.3_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenetv1_imagenet_224_224_0.3_0.8G_1.1_M2.4.zip - checksum: 7ae9688993eef831f7fab6dca1eeca35 -- name: tf2_mobilenetv1_imagenet_224_224_0.3_0.8G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenetv1_0.3_1.1_M2.6.tar.gz + checksum: 4acabee72d9fd696f8aa22f1dd1505ef +- name: tf2_mobilenet_0.3_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenetv1_imagenet_224_224_0.3_0.8G_1.1_M2.4_MI100.zip - checksum: e05b292a32ff7de65e711e69bacc89f1 -- name: tf2_mobilenetv1_imagenet_224_224_0.3_0.8G_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenet_0.3_1.1_M2.6_MI100.tar.gz + checksum: ba94c5cbbbc6e6b4992b51b84e3f1460 +- name: tf2_mobilenet_0.3_1.1_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenetv1_imagenet_224_224_0.3_0.8G_1.1_M2.4_MI210.zip - checksum: 2053dbb6f6489c43318b672d978b5303 -license: https://github.com/amd/UIF/blob/main/LICENSE - + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenet_0.3_1.1_M2.6_MI210.tar.gz + checksum: 27dec317cc169facc909cb8c7a7cea6b +- name: tf2_mobilenet_0.3_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenet_0.3_1.1_M2.6_NAVI2.tar.gz + checksum: 934d833b722e25e016b6f350f2049009 \ No newline at end of file diff --git a/docs/2_model_setup/model-list/tf2_mobilenetv1_imagenet_224_224_0.5_0.58G_1.1_M2.4/model.yaml 
b/docs/2_model_setup/model-list/tf2_mobilenetv1_0.5_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 61% rename from docs/2_model_setup/model-list/tf2_mobilenetv1_imagenet_224_224_0.5_0.58G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/tf2_mobilenetv1_0.5_1.1_M2.6/model.yaml index f74e78e..5b3e0a7 --- a/docs/2_model_setup/model-list/tf2_mobilenetv1_imagenet_224_224_0.5_0.58G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/tf2_mobilenetv1_0.5_1.1_M2.6/model.yaml @@ -18,22 +18,26 @@ input size: '224' float ops: 0.58G task: classification framework: tensorflow2 -prune: 'yes' +prune: '0.5' version: 1.1 files: -- name: tf2_mobilenetv1_imagenet_224_224_0.5_0.58G_1.1_M2.4 +- name: tf2_mobilenetv1_0.5_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenetv1_imagenet_224_224_0.5_0.58G_1.1_M2.4.zip - checksum: a8d47e0c1458d2717dd3f2e86866d9d0 -- name: tf2_mobilenetv1_imagenet_224_224_0.5_0.58G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenetv1_0.5_1.1_M2.6.tar.gz + checksum: 3875425cdb4df535191f61bbef442506 +- name: tf2_mobilenet_0.5_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenetv1_imagenet_224_224_0.5_0.58G_1.1_M2.4_MI100.zip - checksum: e4db19daec3063b329bae20d0182a5ad -- name: tf2_mobilenetv1_imagenet_224_224_0.5_0.58G_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenet_0.5_1.1_M2.6_MI100.tar.gz + checksum: e788f4b39d92917688abba48c220feb5 +- name: tf2_mobilenet_0.5_1.1_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenetv1_imagenet_224_224_0.5_0.58G_1.1_M2.4_MI210.zip - checksum: e4fd032db9b04936c9cac15125009b33 -license: https://github.com/amd/UIF/blob/main/LICENSE + download link: 
https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenet_0.5_1.1_M2.6_MI210.tar.gz + checksum: de7abc67ce471fc077478f8d1fb45bd +- name: tf2_mobilenet_0.5_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenet_0.5_1.1_M2.6_NAVI2.tar.gz + checksum: 002cfcbda5e18a1c17ce34b50e0598a5 diff --git a/docs/2_model_setup/model-list/tf2_mobilenetv1_imagenet_224_224_1.15G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/tf2_mobilenetv1_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 63% rename from docs/2_model_setup/model-list/tf2_mobilenetv1_imagenet_224_224_1.15G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/tf2_mobilenetv1_1.1_M2.6/model.yaml index 388881a..88ada0b --- a/docs/2_model_setup/model-list/tf2_mobilenetv1_imagenet_224_224_1.15G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/tf2_mobilenetv1_1.1_M2.6/model.yaml @@ -21,19 +21,24 @@ framework: tensorflow2 prune: 'no' version: 1.1 files: -- name: tf2_mobilenetv1_imagenet_224_224_1.15G_1.1_M2.4 +- name: tf2_mobilenetv1_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenetv1_imagenet_224_224_1.15G_1.1_M2.4.zip - checksum: 0093e2c9b840de04b1d716c27fe31f1d -- name: tf2_mobilenetv1_imagenet_224_224_1.15G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenetv1_1.1_M2.6.tar.gz + checksum: 988971e4e28046bbc3c5aabed2f27daa +- name: tf2_mobilenet_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenetv1_imagenet_224_224_1.15G_1.1_M2.4_MI100.zip - checksum: 4817ccb55eaec564fc785e07993d98ab -- name: tf2_mobilenetv1_imagenet_224_224_1.15G_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenet_1.1_M2.6_MI100.tar.gz + checksum: 
5f03aef2aff579027e42b56134c8442a +- name: tf2_mobilenet_1.1_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenetv1_imagenet_224_224_1.15G_1.1_M2.4_MI210.zip - checksum: 8793178779c9a90f46ca57f27dc6ccaa -license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenet_1.1_M2.6_MI210.tar.gz + checksum: 374c8c229af8ca94088da61b416e4b21 +- name: tf2_mobilenet_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_mobilenet_1.1_M2.6_NAVI2.tar.gz + checksum: c54e211ce815a01af2ca80b871b67c30 + diff --git a/docs/2_model_setup/model-list/tf2_resnet34ssd_coco2017_1200_1200_0.19_349.6G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/tf2_resnet34ssd_0.19_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 59% rename from docs/2_model_setup/model-list/tf2_resnet34ssd_coco2017_1200_1200_0.19_349.6G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/tf2_resnet34ssd_0.19_1.1_M2.6/model.yaml index 47bd85b..d80bedf --- a/docs/2_model_setup/model-list/tf2_resnet34ssd_coco2017_1200_1200_0.19_349.6G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/tf2_resnet34ssd_0.19_1.1_M2.6/model.yaml @@ -18,22 +18,26 @@ input size: 1200*1200 float ops: 349.6G task: object detection framework: tensorflow2 -prune: 'yes' +prune: '0.19' version: 1.1 files: -- name: tf2_resnet34ssd_coco2017_1200_1200_0.19_349.6G_1.1_M2.4 +- name: tf2_resnet34ssd_0.19_1.1_M2.6.zip type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_coco2017_1200_1200_0.19_349.6G_1.1_M2.4.zip - checksum: 0cecba415882cbf1f55a4be9af37c272 -- name: tf2_resnet34ssd_coco2017_1200_1200_0.19_349.6G_1.1_M2.4_M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_0.19_1.1_M2.6.zip + 
checksum: aed3e1833a2c61067c6243dbc23070d4 +- name: tf2_resnet34ssd_0.19_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_coco2017_1200_1200_0.19_349.6G_1.1_M2.4_MI100.zip - checksum: 9b8966057292c4343e432627fb0a987a -- name: tf2_resnet34ssd_coco2017_1200_1200_0.19_349.6G_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_0.19_1.1_M2.6_MI100.tar.gz + checksum: 51802abdf920badf978f5ec1a9e2fe7d +- name: tf2_resnet34ssd_0.19_1.1_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_coco2017_1200_1200_0.19_349.6G_1.1_M2.4_MI210.zip - checksum: 869639139d39e024bcb240b2d9113f04 -license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_0.19_1.1_M2.6_MI210.tar.gz + checksum: 36f0f62ac81600343634469c34153857 +- name: tf2_resnet34ssd_0.19_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_0.19_1.1_M2.6_NAVI2.tar.gz + checksum: 3bb15262ff4e0312d69b115afea47982 diff --git a/docs/2_model_setup/model-list/tf2_resnet34ssd_coco2017_1200_1200_0.29_306.1G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/tf2_resnet34ssd_0.29_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 58% rename from docs/2_model_setup/model-list/tf2_resnet34ssd_coco2017_1200_1200_0.29_306.1G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/tf2_resnet34ssd_0.29_1.1_M2.6/model.yaml index 5becf29..e067d24 --- a/docs/2_model_setup/model-list/tf2_resnet34ssd_coco2017_1200_1200_0.29_306.1G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/tf2_resnet34ssd_0.29_1.1_M2.6/model.yaml @@ -12,27 +12,33 @@ # See the License for the specific language governing permissions and # limitations under 
the License. + description: resnet34 ssd input size: 1200*1200 float ops: 306.1G task: object detection framework: tensorflow2 -prune: 'yes' +prune: '0.29' version: 1.1 files: -- name: tf2_resnet34ssd_coco2017_1200_1200_0.29_306.1G_1.1_M2.4 +- name: tf2_resnet34ssd_0.29__1.1_M2.6.zip type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_coco2017_1200_1200_0.29_306.1G_1.1_M2.4.zip - checksum: cb4ed3641c6a29c29eeb5f1dd7eda3b4 -- name: tf2_resnet34ssd_coco2017_1200_1200_0.29_306.1G_1.1_M2.4_M100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_0.29__1.1_M2.6.zip + checksum: 1617115f93f0be1a0a257098d09f32bb +- name: tf2_resnet34ssd_0.29__1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_coco2017_1200_1200_0.29_306.1G_1.1_M2.4_MI100.zip - checksum: ce5ea8b8729e351e99ae0e3ec3ad8352 -- name: tf2_resnet34ssd_coco2017_1200_1200_0.29_306.1G_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_0.29__1.1_M2.6_MI100.tar.gz + checksum: 78eda002e48b40c56af75fee6550c865 +- name: tf2_resnet34ssd_0.29__1.1_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_coco2017_1200_1200_0.29_306.1G_1.1_M2.4_MI210.zip - checksum: acbdebd0443de1bfa401b9bf8ade438b -license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_0.29__1.1_M2.6_MI210.tar.gz + checksum: 6f801ad19cc51438ca4bb36fc71cad68 +- name: tf2_resnet34ssd_0.29__1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_0.29__1.1_M2.6_NAVI2.tar.gz + checksum: 5637e893e5b604c46e3fac188772aa70 + diff --git 
a/docs/2_model_setup/model-list/tf2_resnet34ssd_coco2017_1200_1200_432.9G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/tf2_resnet34ssd_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 57% rename from docs/2_model_setup/model-list/tf2_resnet34ssd_coco2017_1200_1200_432.9G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/tf2_resnet34ssd_1.1_M2.6/model.yaml index eddf913..69f8f2e --- a/docs/2_model_setup/model-list/tf2_resnet34ssd_coco2017_1200_1200_432.9G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/tf2_resnet34ssd_1.1_M2.6/model.yaml @@ -21,19 +21,24 @@ framework: tensorflow2 prune: 'no' version: 1.1 files: -- name: tf2_resnet34ssd_coco2017_1200_1200_432.9G_1.1_M2.4 +- name: tf2_resnet34ssd_1.1_M2.6.zip type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_coco2017_1200_1200_432.9G_1.1_M2.4.zip - checksum: c76e9ee1f8220fc6ed8cc75b3ec10573 -- name: tf2_resnet34ssd_coco2017_1200_1200_432.9G_1.1_M2.4_MI100 - type: Ymodel - board: M100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_coco2017_1200_1200_432.9G_1.1_M2.4_MI100.zip - checksum: 8b4dd65738383ec6038ca94a02ce36c8 -- name: tf2_resnet34ssd_coco2017_1200_1200_432.9G_1.1_M2.4_MI210 - type: Ymodel - board: M210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_coco2017_1200_1200_432.9G_1.1_M2.4_MI210.zip - checksum: 0f6f67c63e3776b6a4c0dca2e40990b9 -license: license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_1.1_M2.6.zip + checksum: 967ba40d7d859a05b0120b068d3e1d5a +- name: tf2_resnet34ssd_1.1_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_1.1_M2.6_MI100.tar.gz + checksum: d88bb7d58c03813e1f124a2e27862709 +- name: tf2_resnet34ssd_1.1_M2.6_MI210.tar.gz + 
type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_1.1_M2.6_MI210.tar.gz + checksum: 36a260f1bb7f62c5f6d58ba7662a070b +- name: tf2_resnet34ssd_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet34ssd_1.1_M2.6_NAVI2.tar.gz + checksum: 4e80275c33bec383f2fb74187a85c9ea + diff --git a/docs/2_model_setup/model-list/tf2_resnet50v1_imagenet_224_224_0.5_3.84G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/tf2_resnet50v1_0.5_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 61% rename from docs/2_model_setup/model-list/tf2_resnet50v1_imagenet_224_224_0.5_3.84G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/tf2_resnet50v1_0.5_1.1_M2.6/model.yaml index e39f56a..2b189e3 --- a/docs/2_model_setup/model-list/tf2_resnet50v1_imagenet_224_224_0.5_3.84G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/tf2_resnet50v1_0.5_1.1_M2.6/model.yaml @@ -18,22 +18,26 @@ input size: '224' float ops: 3.84G task: classification framework: tensorflow2 -prune: 'yes' +prune: '0.5' version: 1.1 files: -- name: tf2_resnet50v1_imagenet_224_224_0.5_3.84G_1.1_M2.4 +- name: tf2_resnet50v1_0.5_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50v1_imagenet_224_224_0.5_3.84G_1.1_M2.4.zip - checksum: 0cba275a9270d1ffc049e27182084149 -- name: tf2_resnet50v1_imagenet_224_224_0.5_3.84G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50v1_0.5_1.1_M2.6.tar.gz + checksum: 7b4ed4458eb9c3f36bce78d204cd6ba1 +- name: tf2_resnet50_0.5_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50v1_imagenet_224_224_0.5_3.84G_1.1_M2.4_MI100.zip - checksum: 86e5322b15b5676e888f9e557181961e -- name: 
tf2_resnet50v1_imagenet_224_224_0.5_3.84G_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50_0.5_1.1_M2.6_MI100.tar.gz + checksum: 88c8036ee453d9fa0b964081b8592f07 +- name: tf2_resnet50_0.5_1.1_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50v1_imagenet_224_224_0.5_3.84G_1.1_M2.4_MI210.zip - checksum: 4a82b71b602155318cf69687131a844a -license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50_0.5_1.1_M2.6_MI210.tar.gz + checksum: a67c546531894bed323dcf9c95883608 +- name: tf2_resnet50_0.5_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50_0.5_1.1_M2.6_NAVI2.tar.gz + checksum: c62bfbcf476b2341fbd0f87ef6c50bf5 diff --git a/docs/2_model_setup/model-list/tf2_resnet50v1_imagenet_224_224_0.7_2.88G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/tf2_resnet50v1_0.7_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 61% rename from docs/2_model_setup/model-list/tf2_resnet50v1_imagenet_224_224_0.7_2.88G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/tf2_resnet50v1_0.7_1.1_M2.6/model.yaml index 3886b22..3bab3ce --- a/docs/2_model_setup/model-list/tf2_resnet50v1_imagenet_224_224_0.7_2.88G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/tf2_resnet50v1_0.7_1.1_M2.6/model.yaml @@ -18,22 +18,26 @@ input size: '224' float ops: 2.88G task: classification framework: tensorflow2 -prune: 'yes' +prune: '0.7' version: 1.1 files: -- name: tf2_resnet50v1_imagenet_224_224_0.7_2.88G_1.1_M2.4 +- name: tf2_resnet50v1_0.7_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50v1_imagenet_224_224_0.7_2.88G_1.1_M2.4.zip - checksum: 8a19608046acd5da54d0e1326f65e64b -- name: 
tf2_resnet50v1_imagenet_224_224_0.7_2.88G_1.1_M2.4_MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50v1_0.7_1.1_M2.6.tar.gz + checksum: b806451de9eef3da73372919e68f3439 +- name: tf2_resnet50_0.7_1.1_M2.6_MI100.tar.gz type: ymodel board: MI100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50v1_imagenet_224_224_0.7_2.88G_1.1_M2.4_MI100.zip - checksum: b8883af7c35ce7fd659a068905b18f6a -- name: tf2_resnet50v1_imagenet_224_224_0.7_2.88G_1.1_M2.4_MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50_0.7_1.1_M2.6_MI100.tar.gz + checksum: 173ef2a55205639ffad84e780bace5a7 +- name: tf2_resnet50_0.7_1.1_M2.6_MI210.tar.gz type: ymodel board: MI210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50v1_imagenet_224_224_0.7_2.88G_1.1_M2.4_MI210.zip - checksum: f88d5e650ddb27e1f6e34e24097e19e1 -license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50_0.7_1.1_M2.6_MI210.tar.gz + checksum: f212f3648ae4280abad85089f93bdb27 +- name: tf2_resnet50_0.7_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50_0.7_1.1_M2.6_NAVI2.tar.gz + checksum: 4bddbf1e13313464245961d88756ea3b diff --git a/docs/2_model_setup/model-list/tf2_resnet50v1_imagenet_224_224_7.73G_1.1_M2.4/model.yaml b/docs/2_model_setup/model-list/tf2_resnet50v1_1.1_M2.6/model.yaml old mode 100644 new mode 100755 similarity index 59% rename from docs/2_model_setup/model-list/tf2_resnet50v1_imagenet_224_224_7.73G_1.1_M2.4/model.yaml rename to docs/2_model_setup/model-list/tf2_resnet50v1_1.1_M2.6/model.yaml index 2f9ffd4..02f9fc0 --- a/docs/2_model_setup/model-list/tf2_resnet50v1_imagenet_224_224_7.73G_1.1_M2.4/model.yaml +++ b/docs/2_model_setup/model-list/tf2_resnet50v1_1.1_M2.6/model.yaml @@ -21,19 +21,23 @@ 
framework: tensorflow2 prune: 'no' version: 1.1 files: -- name: +- name: tf2_resnet50v1_1.1_M2.6.tar.gz type: float board: GPU - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50v1_imagenet_224_224_7.73G_1.1_M2.4.zip - checksum: 2b41c591cd633d378c85566fd8f62411 -- name: tf2_resnet50v1_imagenet_224_224_7.73G_1.1_M2.4_MI100 - type: Ymodel - board: M100 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50v1_imagenet_224_224_7.73G_1.1_M2.4_MI100.zip - checksum: b201c9ccb9bc38693ddaa60ad9f2a8e6 -- name: tf2_resnet50v1_imagenet_224_224_7.73G_1.1_M2.4_MI210 - type: Ymodel - board: M210 - download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50v1_imagenet_224_224_7.73G_1.1_M2.4_MI210.zip - checksum: 9994ab9bbe8e84c24184f746f4e88c94 -license: https://github.com/amd/UIF/blob/main/LICENSE + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50v1_1.1_M2.6.tar.gz + checksum: 2ad093faf2bbd3774df9b4046b1edc46 +- name: tf2_resnet50_1.1_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50_1.1_M2.6_MI100.tar.gz + checksum: d7b516b392b9452089d2e97ca9b01d0d +- name: tf2_resnet50_1.1_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50_1.1_M2.6_MI210.tar.gz + checksum: a7ce7db2daf3445bb980df2996de2599 +- name: tf2_resnet50_1.1_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_resnet50_1.1_M2.6_NAVI2.tar.gz + checksum: 896d3fe1283dadf82052ce85cd12424e diff --git a/docs/2_model_setup/model-list/tf2_yolov3_1.2_M2.6/model.yaml b/docs/2_model_setup/model-list/tf2_yolov3_1.2_M2.6/model.yaml new file mode 100755 index 0000000..7d406a9 --- /dev/null +++ b/docs/2_model_setup/model-list/tf2_yolov3_1.2_M2.6/model.yaml @@ -0,0 +1,42 @@ +# 
Copyright (c) 2018-2022 Advanced Micro Devices, Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +description: YOLOv3 for object detection. +float ops: 1.15G +framework: tensorflow2 +prune: 'no' +version: 1.2 +files: +- name: tf2_yolov3_1.2_M2.6.tar.gz + type: float + board: GPU + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_yolov3_1.2_M2.6.tar.gz + checksum: f6b9310be93fd7067069d399f21429d8 +- name: tf2_yolov3_1.2_M2.6_MI100.tar.gz + type: ymodel + board: MI100 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_yolov3_1.2_M2.6_MI100.tar.gz + checksum: f3cdeae17ec8f5987713216bafc88c3d +- name: tf2_yolov3_1.2_M2.6_MI210.tar.gz + type: ymodel + board: MI210 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_yolov3_1.2_M2.6_MI210.tar.gz + checksum: de419d0f0865510838a4a35183d21b2c +- name: tf2_yolov3_1.2_M2.6_NAVI2.tar.gz + type: ymodel + board: NAVI2 + download link: https://www.xilinx.com/bin/public/openDownload?filename=tf2_yolov3_1.2_M2.6_NAVI2.tar.gz + checksum: f462d42ad4b168aa4a94ca9740d9310a +license: https://github.com/amd/UIF/blob/main/LICENSE diff --git a/docs/2_model_setup/uifmodelsetup.md b/docs/2_model_setup/uifmodelsetup.md index deeabac..5645c86 100644 --- a/docs/2_model_setup/uifmodelsetup.md +++ b/docs/2_model_setup/uifmodelsetup.md @@ -1,6 +1,6 @@

-Unified Inference Frontend (UIF) 1.1 User Guide
+Unified Inference Frontend (UIF) 1.2 User Guide
- - @@ -14,200 +14,171 @@ - [2.1: UIF Model Zoo Introduction](#21-uif-model-zoo-introduction) - [2.1.1: Standard Naming Rules](#211-standard-naming-rules) - [2.1.2: Model List](#212-model-list) -- [2.2: Get ZenDNN Models from UIF Model Zoo](#22-get-zendnn-models-from-uif-model-zoo) -- [2.3: Get MIGraphX Models from UIF Model Zoo](#23-get-migraphx-models-from-uif-model-zoo) -- [2.4: Set Up MIGraphX YModel](#24-set-up-migraphx-ymodel) -- [2.5: Get Vitis AI Models from UIF Model Zoo](#25-get-vitis-ai-models-from-uif-model-zoo) + - [2.1.3: Once-For-All: Efficient Model Customization for Various Platforms](#213-once-for-all-ofa-efficient-model-customization-for-various-platforms) +- [2.2: Get MIGraphX Models from UIF Model Zoo](#22-get-migraphx-models-from-uif-model-zoo) +- [2.3: Set Up MIGraphX YModel](#23-set-up-migraphx-ymodel) +- [2.4: Get Vitis AI Models from UIF Model Zoo](#24-get-vitis-ai-models-from-uif-model-zoo) +- [2.5: Get ZenDNN Models from UIF Model Zoo](#25-get-zendnn-models-from-uif-model-zoo) - _Click [here](/README.md#implementing-uif-11) to go back to the UIF User Guide home page._ + _Click [here](/README.md#implementing-uif-12) to go back to the UIF User Guide home page._ # 2.1: UIF Model Zoo Introduction -UIF 1.1 Model Zoo provides 30 models for AMD Instinct™ GPUs (MIGraphX) and 84 models for AMD EPYC™ CPUs (ZenDNN). In the Vitis™ AI development environment, 130 reference models for different FPGA adaptive platforms are also provided. Refer to the following model lists for details. +UIF 1.2 Model Zoo provides 50 models for AMD Instinct™ GPUs (MIGraphX) including 20 new models and 30 models inherited from UIF 1.1. The Vitis™ AI development environment provides 106 reference models for different FPGA adaptive platforms. +Also, you could use the 84 models for AMD EPYC™ CPUs (ZenDNN) inherited from UIF1.1. + +Model information is located in [model-list](/docs/2_model_setup/model-list). 
**Note:** If a model is marked as limited to non-commercial use, you must comply with the [AMD license agreement for non-commercial models](/docs/2_model_setup/AMD-license-agreement-for-non-commercial-models.md). -**UIF 1.1 Models for Vitis AI** +**UIF 1.2 Models for MIGraphX** + +
+ Click here to view details + +| # | Model | Original Platform | Datatype FP32 | Datatype FP16 | Pruned | Reminder for limited use scope | +| ---- | ----------------------- | ----------------- | ------------- | ------------- | ------ | ------------------------------ | +| 1 | 2D-Unet | TensorFlow2 | √ | √ | × | | +| 2 | 2D-Unet pruned0.7 | TensorFlow2 | √ | √ | √ | | +| 3 | Albert-base | PyTorch | √ | √ | × | | +| 4 | Albert-large | PyTorch | √ | √ | × | | +| 5 | Bert-base | TensorFlow2 | √ | √ | × | | +| 6 | Bert-large | TensorFlow2 | √ | √ | × | | +| 7 | DETR | PyTorch | √ | √ | × | | +| 8 | DistillBert | PyTorch | √ | √ | × | | +| 9 | DLRM (40M) | PyTorch | √ | √ | × | Non-Commercial Use Only | +| 10 | EfficientDet | TensorFlow2 | √ | √ | × | | +| 11 | GPT2-large | PyTorch | √ | √ | × | | +| 12 | GPT2-medium | PyTorch | √ | √ | × | | +| 13 | GPT2-XL | PyTorch | √ | √ | × | | +| 14 | MobileBert | PyTorch | √ | √ | × | | +| 15 | PointPillars | PyTorch | √ | √ | × | Non-Commercial Use Only | +| 16 | Resnet50_v1.5 ofa | PyTorch | √ | √ | √ | Non-Commercial Use Only | +| 17 | RetinaNet | PyTorch | √ | √ | × | | +| 18 | ViT | PyTorch | √ | √ | × | Non-Commercial Use Only | +| 19 | W&D | PyTorch | √ | √ | × | | +| 20 | yolov3 | TensorFlow2 | √ | √ | × | | + +
+ + +**UIF 1.2 Models for Vitis AI**
Click here to view details | # | Model | Platform | Datatype FP32 | Datatype INT8 | Pruned | Reminder for limited use scope | | ---- | :--------------------------- | :--------- | :-----------: | :-----------: | :----: | ------------------------------ | -| 1 | inception-resnetv2 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 2 | inceptionv1 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 3 | inceptionv1 pruned0.09 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 4 | inceptionv1 pruned0.16 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 5 | inceptionv2 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 6 | inceptionv3 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 7 | inceptionv3 pruned0.2 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 8 | inceptionv3 pruned0.4 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 9 | inceptionv4 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 10 | inceptionv4 pruned0.2 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 11 | inceptionv4 pruned0.4 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 12 | mobilenetv1_0.25 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 13 | mobilenetv1_0.5 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 14 | mobilenetv1_1.0 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 15 | mobilenetv1_1.0 pruned0.11 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 16 | mobilenetv1_1.0 pruned0.12 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 17 | mobilenetv2_1.0 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 18 | mobilenetv2_1.4 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 19 | resnetv1_50 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 20 | resnetv1_50 pruned0.38 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 21 | resnetv1_50 pruned0.65 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 22 | resnetv1_101 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 23 
| resnetv1_101 pruned0.35 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 24 | resnetv1_101 pruned0.57 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 25 | resnetv1_152 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 26 | resnetv1_152 pruned0.51 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 27 | resnetv1_152pruned0.60 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 28 | vgg16 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 29 | vgg16 pruned0.43 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 30 | vgg16 pruned0.50 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 31 | vgg19 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 32 | vgg19 pruned0.24 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 33 | vgg19 pruned0.39 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | -| 34 | resnetv2_50 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 35 | resnetv2_101 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 36 | resnetv2_152 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 37 | efficientnet-edgetpu-S | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 38 | efficientnet-edgetpu-M | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 39 | efficientnet-edgetpu-L | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 40 | mlperf_resnet50 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 41 | mobilenetEdge1.0 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 42 | mobilenetEdge0.75 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 43 | resnet50 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 44 | mobilenetv1 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 45 | inceptionv3 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 46 | efficientnet-b0 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 47 | mobilenetv3 | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 48 | efficientnet-lite | TensorFlow | √ | √ | × | Non-Commercial Use 
Only | -| 49 | ViT | TensorFlow | √ | √ | × | Non-Commercial Use Only | -| 50 | ssdmobilenetv1 | TensorFlow | √ | √ | × | | -| 51 | ssdmobilenetv2 | TensorFlow | √ | √ | × | | -| 52 | ssdresnet50v1_fpn | TensorFlow | √ | √ | × | | -| 53 | yolov3 | TensorFlow | √ | √ | × | | -| 54 | mlperf_resnet34 | TensorFlow | √ | √ | × | | -| 55 | ssdlite_mobilenetv2 | TensorFlow | √ | √ | × | | -| 56 | ssdinceptionv2 | TensorFlow | √ | √ | × | | -| 57 | refinedet | TensorFlow | √ | √ | × | | -| 58 | efficientdet-d2 | TensorFlow | √ | √ | × | | -| 59 | yolov3 | TensorFlow | √ | √ | × | | -| 60 | yolov4_416 | TensorFlow | √ | √ | × | | -| 61 | yolov4_512 | TensorFlow | √ | √ | × | | -| 62 | RefineDet-Medical | TensorFlow | √ | √ | × | | -| 63 | RefineDet-Medical pruned0.50 | TensorFlow | √ | √ | √ | | -| 64 | RefineDet-Medical pruned0.75 | TensorFlow | √ | √ | √ | | -| 65 | RefineDet-Medical pruned0.85 | TensorFlow | √ | √ | √ | | -| 66 | RefineDet-Medical pruned0.88 | TensorFlow | √ | √ | √ | | -| 67 | mobilenetv2 (segmentation) | TensorFlow | √ | √ | × | | -| 68 | erfnet | TensorFlow | √ | √ | × | | -| 69 | 2d-unet | TensorFlow | √ | √ | × | | -| 70 | bert-base | TensorFlow | √ | √ | × | | -| 71 | superpoint | TensorFlow | √ | √ | × | | -| 72 | HFNet | TensorFlow | √ | √ | × | | -| 73 | rcan | TensorFlow | √ | √ | × | | -| 74 | inceptionv3 | PyTorch | √ | √ | × | Non-Commercial Use Only | -| 75 | inceptionv3 pruned0.3 | PyTorch | √ | √ | √ | Non-Commercial Use Only | -| 76 | inceptionv3 pruned0.4 | PyTorch | √ | √ | √ | Non-Commercial Use Only | -| 77 | inceptionv3 pruned0.5 | PyTorch | √ | √ | √ | Non-Commercial Use Only | -| 78 | inceptionv3 pruned0.6 | PyTorch | √ | √ | √ | Non-Commercial Use Only | -| 79 | squeezenet | PyTorch | √ | √ | × | Non-Commercial Use Only | -| 80 | resnet50_v1.5 | PyTorch | √ | √ | × | Non-Commercial Use Only | -| 81 | resnet50_v1.5 pruned0.3 | PyTorch | √ | √ | √ | Non-Commercial Use Only | -| 82 | resnet50_v1.5 pruned0.4 | PyTorch | √ | √ | √ | 
Non-Commercial Use Only | -| 83 | resnet50_v1.5 pruned0.5 | PyTorch | √ | √ | √ | Non-Commercial Use Only | -| 84 | resnet50_v1.5 pruned0.6 | PyTorch | √ | √ | √ | Non-Commercial Use Only | -| 85 | resnet50_v1.5 pruned0.7 | PyTorch | √ | √ | √ | Non-Commercial Use Only | -| 86 | OFA-resnet50 | PyTorch | √ | √ | × | Non-Commercial Use Only | -| 87 | OFA-resnet50 pruned0.45 | PyTorch | √ | √ | √ | Non-Commercial Use Only | -| 88 | OFA-resnet50 pruned0.60 | PyTorch | √ | √ | √ | Non-Commercial Use Only | -| 89 | OFA-resnet50 pruned0.74 | PyTorch | √ | √ | √ | Non-Commercial Use Only | -| 90 | OFA-resnet50 pruned0.88 | PyTorch | √ | √ | √ | Non-Commercial Use Only | -| 91 | OFA-depthwise-resnet50 | PyTorch | √ | √ | × | Non-Commercial Use Only | -| 92 | vehicle type classification | PyTorch | √ | √ | × | | -| 93 | vehicle make classification | PyTorch | √ | √ | × | | -| 94 | vehicle color classification | PyTorch | √ | √ | × | | -| 95 | OFA-yolo | PyTorch | √ | √ | × | | -| 96 | OFA-yolo pruned0.3 | PyTorch | √ | √ | √ | | -| 97 | OFA-yolo pruned0.6 | PyTorch | √ | √ | √ | | -| 98 | yolox-nano | PyTorch | √ | √ | × | | -| 99 | yolov4csp | PyTorch | √ | √ | × | | -| 100 | yolov5-large | PyTorch | √ | √ | × | | -| 101 | yolov5-nano | PyTorch | √ | √ | × | | -| 102 | yolov5s6 | PyTorch | √ | √ | × | | -| 103 | yolov6m | PyTorch | √ | √ | × | | -| 104 | pointpillars | PyTorch | √ | √ | × | | -| 105 | CLOCs | PyTorch | √ | √ | × | | -| 106 | Enet | PyTorch | √ | √ | × | | -| 107 | SemanticFPN-resnet18 | PyTorch | √ | √ | × | | -| 108 | SemanticFPN-mobilenetv2 | PyTorch | √ | √ | × | | -| 109 | salsanext pruned0.60 | PyTorch | √ | √ | √ | | -| 110 | salsanextv2 pruned0.75 | PyTorch | √ | √ | √ | | -| 111 | SOLO | PyTorch | √ | √ | × | | -| 112 | HRNet | PyTorch | √ | √ | × | | -| 113 | CFLOW | PyTorch | √ | √ | × | | -| 114 | 3D-UNET | PyTorch | √ | √ | × | | -| 115 | MaskRCNN | PyTorch | √ | √ | × | | -| 116 | bert-base | PyTorch | √ | √ | × | | -| 117 | bert-large | 
PyTorch | √ | √ | × | | -| 118 | bert-tiny | PyTorch | √ | √ | × | | -| 119 | face-mask-detection | PyTorch | √ | √ | × | | -| 120 | movenet | PyTorch | √ | √ | × | | -| 121 | fadnet | PyTorch | √ | √ | × | | -| 122 | fadnet pruned0.65 | PyTorch | √ | √ | √ | | -| 123 | fadnetv2 | PyTorch | √ | √ | × | | -| 124 | fadnetv2 pruned0.51 | PyTorch | √ | √ | √ | | -| 125 | psmnet pruned0.68 | PyTorch | √ | √ | √ | | -| 126 | pmg | PyTorch | √ | √ | × | | -| 127 | SESR-S | PyTorch | √ | √ | × | | -| 128 | OFA-rcan | PyTorch | √ | √ | × | | -| 129 | DRUNet | PyTorch | √ | √ | × | | -| 130 | xilinxSR | PyTorch | √ | √ | × | | +| 1 | inceptionv1 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 2 | inceptionv1 pruned0.09 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 3 | inceptionv1 pruned0.16 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 4 | inceptionv3 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 5 | inceptionv3 pruned0.2 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 6 | inceptionv3 pruned0.4 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 7 | inceptionv4 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 8 | inceptionv4 pruned0.2 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 9 | inceptionv4 pruned0.4 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 10 | mobilenetv1_0.25 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 11 | mobilenetv1_1.0 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 12 | mobilenetv1_1.0 pruned0.11 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 13 | mobilenetv1_1.0 pruned0.12 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 14 | mobilenetv2_1.0 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 15 | mobilenetv2_1.4 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 16 | resnetv1_50 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 17 | resnetv1_50 pruned0.38 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 18 | resnetv1_50 
pruned0.65 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 19 | resnetv1_101 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 20 | resnetv1_101 pruned0.35 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 21 | resnetv1_101 pruned0.57 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 22 | resnetv1_152 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 23 | resnetv1_152 pruned0.51 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 24 | resnetv1_152pruned0.60 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 25 | vgg16 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 26 | vgg16 pruned0.43 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 27 | vgg16 pruned0.50 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 28 | vgg19 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 29 | vgg19 pruned0.24 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 30 | vgg19 pruned0.39 | TensorFlow | √ | √ | √ | Non-Commercial Use Only | +| 31 | resnetv2_50 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 32 | resnetv2_101 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 33 | resnetv2_152 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 34 | efficientnet-edgetpu-S | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 35 | efficientnet-edgetpu-M | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 36 | efficientnet-edgetpu-L | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 37 | mlperf_resnet50 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 38 | resnet50 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 39 | mobilenetv1 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 40 | inceptionv3 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 41 | efficientnet-b0 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 42 | mobilenetv3 | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 43 | efficientnet-lite | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 44 | 
ViT | TensorFlow | √ | √ | × | Non-Commercial Use Only | +| 45 | ssdmobilenetv1 | TensorFlow | √ | √ | × | | +| 46 | ssdmobilenetv2 | TensorFlow | √ | √ | × | | +| 47 | yolov3 | TensorFlow | √ | √ | × | | +| 48 | mlperf_resnet34 | TensorFlow | √ | √ | × | | +| 49 | efficientdet-d2 | TensorFlow | √ | √ | × | | +| 50 | yolov3 | TensorFlow | √ | √ | × | | +| 51 | yolov4_416 | TensorFlow | √ | √ | × | | +| 52 | yolov4_512 | TensorFlow | √ | √ | × | | +| 53 | RefineDet-Medical | TensorFlow | √ | √ | × | | +| 54 | RefineDet-Medical pruned0.50 | TensorFlow | √ | √ | √ | | +| 55 | RefineDet-Medical pruned0.75 | TensorFlow | √ | √ | √ | | +| 56 | RefineDet-Medical pruned0.85 | TensorFlow | √ | √ | √ | | +| 57 | RefineDet-Medical pruned0.88 | TensorFlow | √ | √ | √ | | +| 58 | bert-base | TensorFlow | √ | √ | × | | +| 59 | superpoint | TensorFlow | √ | √ | × | | +| 60 | HFNet | TensorFlow | √ | √ | × | | +| 61 | rcan | TensorFlow | √ | √ | × | | +| 62 | inceptionv3 | PyTorch | √ | √ | × | Non-Commercial Use Only | +| 63 | inceptionv3 pruned0.3 | PyTorch | √ | √ | √ | Non-Commercial Use Only | +| 64 | inceptionv3 pruned0.4 | PyTorch | √ | √ | √ | Non-Commercial Use Only | +| 65 | inceptionv3 pruned0.5 | PyTorch | √ | √ | √ | Non-Commercial Use Only | +| 66 | inceptionv3 pruned0.6 | PyTorch | √ | √ | √ | Non-Commercial Use Only | +| 67 | squeezenet | PyTorch | √ | √ | × | Non-Commercial Use Only | +| 68 | resnet50_v1.5 | PyTorch | √ | √ | × | Non-Commercial Use Only | +| 69 | resnet50_v1.5 pruned0.3 | PyTorch | √ | √ | √ | Non-Commercial Use Only | +| 70 | resnet50_v1.5 pruned0.4 | PyTorch | √ | √ | √ | Non-Commercial Use Only | +| 71 | resnet50_v1.5 pruned0.5 | PyTorch | √ | √ | √ | Non-Commercial Use Only | +| 72 | resnet50_v1.5 pruned0.6 | PyTorch | √ | √ | √ | Non-Commercial Use Only | +| 73 | resnet50_v1.5 pruned0.7 | PyTorch | √ | √ | √ | Non-Commercial Use Only | +| 74 | OFA-resnet50 | PyTorch | √ | √ | × | Non-Commercial Use Only | +| 75 | OFA-resnet50 pruned0.45 | 
PyTorch | √ | √ | √ | Non-Commercial Use Only | +| 76 | OFA-resnet50 pruned0.60 | PyTorch | √ | √ | √ | Non-Commercial Use Only | +| 77 | OFA-resnet50 pruned0.74 | PyTorch | √ | √ | √ | Non-Commercial Use Only | +| 78 | OFA-resnet50 pruned0.88 | PyTorch | √ | √ | √ | Non-Commercial Use Only | +| 79 | OFA-depthwise-resnet50 | PyTorch | √ | √ | × | Non-Commercial Use Only | +| 80 | vehicle type classification | PyTorch | √ | √ | × | | +| 81 | vehicle make classification | PyTorch | √ | √ | × | | +| 82 | vehicle color classification | PyTorch | √ | √ | × | | +| 83 | OFA-yolo | PyTorch | √ | √ | × | | +| 84 | OFA-yolo pruned0.3 | PyTorch | √ | √ | √ | | +| 85 | OFA-yolo pruned0.6 | PyTorch | √ | √ | √ | | +| 86 | yolox-nano | PyTorch | √ | √ | × | | +| 87 | yolov4csp | PyTorch | √ | √ | × | | +| 88 | yolov6m | PyTorch | √ | √ | × | | +| 89 | pointpillars | PyTorch | √ | √ | × | | +| 90 | HRNet | PyTorch | √ | √ | × | | +| 91 | 3D-UNET | PyTorch | √ | √ | × | | +| 92 | bert-base | PyTorch | √ | √ | × | | +| 93 | bert-large | PyTorch | √ | √ | × | | +| 94 | bert-tiny | PyTorch | √ | √ | × | | +| 95 | face-mask-detection | PyTorch | √ | √ | × | | +| 96 | movenet | PyTorch | √ | √ | × | | +| 97 | fadnet | PyTorch | √ | √ | × | | +| 98 | fadnet pruned0.65 | PyTorch | √ | √ | √ | | +| 99 | fadnetv2 | PyTorch | √ | √ | × | | +| 100 | fadnetv2 pruned0.51 | PyTorch | √ | √ | √ | | +| 101 | psmnet pruned0.68 | PyTorch | √ | √ | √ | | +| 102 | SESR-S | PyTorch | √ | √ | × | | +| 103 | OFA-rcan | PyTorch | √ | √ | × | | +| 104 | xilinxSR | PyTorch | √ | √ | × | | +| 105 | yolov7 | PyTorch | √ | √ | × | | +| 106 | 2D-UNET | PyTorch | √ | √ | × | |
-**UIF 1.1 Models for MIGraphX** - -
- Click here to view details - -| # | Model | Original Platform | Converted Format | Datatype FP32 | Datatype FP16 | Datatype INT8 | Pruned | Reminder for limited use scope | -| ---- | ----------------------- | ----------------- | ---------------- | ------------- | ------------- | ------------- | ------ | ------------------------------ | -| 1 | Resnet50_v1 | TensorFlow | .PB | √ | √ | × | × | Non-Commercial Use Only | -| 2 | Resnet50_v1 pruned0.5 | TensorFlow | .PB | √ | √ | × | √ | Non-Commercial Use Only | -| 3 | Resnet50_v1 pruned0.7 | TensorFlow | .PB | √ | √ | × | √ | Non-Commercial Use Only | -| 4 | Inception_v3 | TensorFlow | .PB | √ | √ | × | × | Non-Commercial Use Only | -| 5 | Inception_v3 pruned0.4 | TensorFlow | .PB | √ | √ | × | √ | Non-Commercial Use Only | -| 6 | Inception_v3 pruned0.6 | TensorFlow | .PB | √ | √ | × | √ | Non-Commercial Use Only | -| 7 | Mobilenet_v1 | TensorFlow | .PB | √ | √ | × | × | Non-Commercial Use Only | -| 8 | Mobilenet_v1 pruned0.3 | TensorFlow | .PB | √ | √ | × | √ | Non-Commercial Use Only | -| 9 | Mobilenet_v1 pruned0.5 | TensorFlow | .PB | √ | √ | × | √ | Non-Commercial Use Only | -| 10 | Resnet34-ssd | TensorFlow | .PB | √ | √ | × | × | Non-Commercial Use Only | -| 11 | Resnet34-ssd pruned0.19 | TensorFlow | .PB | √ | √ | × | √ | Non-Commercial Use Only | -| 12 | Resnet34-ssd pruned0.29 | TensorFlow | .PB | √ | √ | × | √ | Non-Commercial Use Only | -| 13 | Bert-base | PyTorch | .ONNX | √ | √ | × | × | | -| 14 | Bert-large | PyTorch | .ONNX | √ | √ | × | × | | -| 15 | DLRM | PyTorch | .ONNX | √ | √ | × | × | Non-Commercial Use Only | -| 16 | Resnet50_v1.5 | PyTorch | .ONNX | √ | √ | × | × | Non-Commercial Use Only | -| 17 | Resnet50_v1.5 pruned0.4 | PyTorch | .ONNX | √ | √ | × | √ | Non-Commercial Use Only | -| 18 | Resnet50_v1.5 pruned0.6 | PyTorch | .ONNX | √ | √ | × | √ | Non-Commercial Use Only | -| 19 | Inception_v3 | PyTorch | .ONNX | √ | √ | × | × | Non-Commercial Use Only | -| 20 | Inception_v3 pruned0.4 | 
PyTorch | .ONNX | √ | √ | × | √ | Non-Commercial Use Only | -| 21 | Inception_v3 pruned0.6 | PyTorch | .ONNX | √ | √ | × | √ | Non-Commercial Use Only | -| 22 | Reid_resnet50 | PyTorch | .ONNX | √ | √ | × | × | Non-Commercial Use Only | -| 23 | Reid_resnet50 pruned0.6 | PyTorch | .ONNX | √ | √ | × | √ | Non-Commercial Use Only | -| 24 | Reid_resnet50 pruned0.7 | PyTorch | .ONNX | √ | √ | × | √ | Non-Commercial Use Only | -| 25 | OFA_resnet50 pruned0.45 | PyTorch | .ONNX | √ | √ | × | √ | Non-Commercial Use Only | -| 26 | OFA_resnet50 pruned0.60 | PyTorch | .ONNX | √ | √ | × | √ | Non-Commercial Use Only | -| 27 | OFA_resnet50 pruned0.74 | PyTorch | .ONNX | √ | √ | × | √ | Non-Commercial Use Only | -| 28 | OFA_resnet50 pruned0.88 | PyTorch | .ONNX | √ | √ | × | √ | Non-Commercial Use Only | -| 29 | GPT-2 small | PyTorch | .ONNX | √ | √ | × | × | | -| 30 | DistillGPT | PyTorch | .ONNX | √ | √ | × | × | | - -
- **UIF 1.1 Models for ZenDNN**
@@ -301,31 +272,150 @@ UIF 1.1 Model Zoo provides 30 models for AMD Instinct™ GPUs (MIGraphX) and 84 | 84 | VGG16 pruned0.50 | ONNXRT | .ONNX | √ | × | √ | √ | Non-Commercial Use Only |
- + ### 2.1.1: Standard Naming Rules -Model name: `F_M_(D)_H_W_(P)_C_V_Z` +Model name: `F_M_(P)_V_Z` * `F` specifies the training framework: `tf` is TensorFlow 1.x, `tf2` is TensorFlow 2.x, `pt` is PyTorch, `onnx` is ONNXRT. * `M` specifies the model. -* `D` specifies the dataset. It is optional depending on whether the dataset is public or private. -* `H` specifies the height of input data. -* `W` specifies the width of input data. * `P` specifies the pruning ratio, meaning how much computation is reduced. It is optional depending on whether the model is pruned or not. -* `C` specifies the computation of the model: how many Gops per image. * `V` specifies the version of UIF. * `Z` specifies the version of ZenDNN or MIGraphX. -For example, `pt_inceptionv3_imagenet_299_299_0.6_4.25G_1.1_Z4.0` is an `Inception v3` model trained with `PyTorch` using `Imagenet` dataset, the input size is `299*299`, `60%` pruned, the computation per image is `4.25G flops`, the UIF release version is `1.1`, and ZenDNN version is `4.0`. - -`pt_resnet50v1.5_imagenet_224_224_8.2G_1.1_M2.4` is a `resnet50 v1.5` model trained with `PyTorch` using the `Imagenet` dataset, the input size is `224*224`, `No` pruned, the computation per image is `8.2G flops`, the UIF release version is `1.1`, and the MIGraphX version is `2.4`. +For example, `pt_resnet50v1.5_pruned0.6_1.2_M2.6` is a `resnet50 v1.5` model trained with `PyTorch` and `60%` pruned; the UIF release version is `1.2` and the MIGraphX version is `2.6`. ### 2.1.2: Model List -Visit [model-list](/docs/2_model_setup/model-list). The models are named according to standard naming rules. From here, you can get a download link and MD5 checksum of all released models running on different hardware platforms. You can download it manually or use the automatic download script described below to search for models by keyword. +Visit [model-list](/docs/2_model_setup/model-list). The models are named according to standard naming rules. 
+ +You can get a download link and MD5 checksum of all released models running on different hardware platforms from here. You can download them manually or use the following automatic download script to search for models by keyword. + +### 2.1.3: Once-For-All (OFA): Efficient Model Customization for Various Platforms + +Once-For-All (OFA) is an efficient neural architecture search (NAS) method that enables the customization of sub-networks for various hardware platforms. It decouples training and search, enabling quick derivation of specialized models optimized for specific platforms with efficient inference performance. OFA offers the following benefits: + +1. OFA requires only one training process and can specialize for diverse hardware platforms (for example, DPU, GPU, CPU, IPU) without incurring the heavy computation costs associated with manual design or conventional RL-based NAS methods. + +2. By decoupling supernet training and subnetwork searching, OFA optimizes networks with actual latency constraints of a specific hardware platform, avoiding discrepancies between estimated and actual latencies. + +3. OFA offers strong flexibility and scalability regarding the search space, supporting various network architectures (for example, CNN and Transformer) with different granularities (for example, operation-wise, layer-wise, block-wise) to cater to different tasks (for example, CV, NLP). + +4. Extensive experiments with OFA demonstrate consistent performance acceleration while maintaining similar accuracy levels compared to baselines on different devices. For instance, OFA-ResNet achieves a speedup ratio of 69% on MI100, 78% on MI210, and 97% on Navi2 compared to ResNet50 baselines with a batch size of 1 and prune ratio of 78%. + +5. OFA has the potential to search for optimized large language models (LLM) by stitching pretrained models for accuracy-efficiency trade-offs. 
A proof of concept for Vision Transformer (ViT) shows that OFA can derive an optimized ViT with a 20% to 40% speedup ratio compared to ViT-base on MI100. + +#### 2.1.3.1: OFA-ResNet for AMD GPUs with MIGraphX + +- Task Description: This case aims to search for ResNet-like models optimized for AMD GPUs with MIGraphX. The latency on MI100 is used as constraints for optimization. Channel numbers are aligned with GPU capabilities. +- Search Space Design: The search space has the following parameters: + - Stage output ratio (The ratio of output channel for each stage of the model): [0.65, 0.8, 1.0] + - Depth (The number of blocks for each stage): [0, 1, 2] + - Expand ratio (The ratio of output channel for each block in each stage): [0.2, 0.25, 0.35] + - Resolution (The input size of the model): [128, 160, 192, 224] +- Results: The OFA-ResNet model achieves a speedup ratio of 69% on MI100, 78% on MI210, 97% on Navi2 compared to ResNet50 baseline (with batch size of 1 and pruned ratio of 78%) while maintaining similar accuracy. + +#### 2.1.3.2: Performance Comparison + +| Model | Float Accuracy (ImageNet 1K) | FLOPs (G) | Pruned Ratio | Speedup ratio on MI100 | Speedup ratio on MI210 | Speedup ratio on Navi2 | +|-------------|------------------------------|-----------|--------------|------------------------|------------------------|------------------------| +| ResNet50 | 76.1% | 8.2 | 0% | - | - | - | +| OFA-ResNet | 75.8% | 1.77 | 78% | 1.69x | 1.78x | 1.97x | + + +# 2.2: Get MIGraphX Models from UIF Model Zoo + +Perform the following steps to install UIF 1.2 models: + +1. Set up and run the model downloader tool. + + ``` + git clone https://github.com/AMD/uif.git + cd uif/docs/2_model_setup + python3 downloader.py + ``` + It has the following provision to specify the frameworks: + + ``` + Tip: + You need to input framework and model name. 
Use space divide such as tf vgg16 + tf:tensorflow1.x tf2:tensorflow2.x onnx:onnxruntime dk:darknet pt:pytorch all: list all model + input: + ``` +2. Download UIF 1.2 GPU models. Provide `pt` as input to get the list of models. + + ``` + input:pt + chose model + 0 : all + 1 : pt_inceptionv3_1.2_M2.6 + 2 : pt_bert_base_1.2_M2.6 + 3 : pt_bert_large_1.2_M2.6 + + + ... + input num: + ``` + +The models with 1.2 as suffix are UIF 1.2 models. MI100 means the model has been tuned for MI-100 GPU, and MI210 indicates the model has been tuned for MI-210 GPU. Without either of these suffixes, the model is a model version that should be used for training, including the GPU example later in this section. + + ``` + input num:1 + chose model type + 0: all + 1: GPU + 2: MI100 + 3: MI200 + ... + ... + input num: + + ``` + +Provide `1` as input to download the GPU model. + + ``` + input num:1 + pt_inceptionv3_1.2_M2.6.zip + 100.0%|100% + done + ``` + +# 2.3: Set Up MIGraphX YModel + +YModel is designed to provide significant inference performance through MIGraphX. Prior to the introduction of YModel, performance tuning was conditioned by having tuned kernel configs stored in a `/home` local User DB. If users move their model to a different server or allow a different user to use it, they would have to run through the MIOpen tuning process again to populate the next User DB with the best kernel configs and corresponding solvers. Tuning is time consuming, and if the users have not performed tuning, they would see discrepancies between expected or claimed inference performance and the actual inference performance. This leads to repetitive and time-consuming tuning tasks for each user. + +MIGraphX introduces a feature known as YModel that stores the kernel config parameters found during tuning in the same file as the model itself. This ensures the same level of expected performance even when a model is copied to a different user or system. 
+ +**Note:** The YModel feature is available starting with the ROCm™ v5.4.1 and UIF v1.1 releases. For more information on ROCm and the YModel feature, see https://rocm.docs.amd.com/en/latest/examples/machine_learning/migraphx_optimization.html#ymodel. + +**Note:** YModel does not support MIOpen fusions; disable them while generating a YModel. + +To set up the YModel for GPUs: + +1. Tune the kernels for your architecture with `MIOPEN_FIND_ENFORCE=3 migraphx-driver run `. + +2. Build a `*.mxr` file with a compile option: + +``` + migraphx-driver compile file.onnx --enable-offload-copy --gpu --binary -o file.mxr + +``` + +3. Load the compiled `*.mxr` file in your Python program: +``` + model=migraphx.load("file.mxr") +``` + +**Note:** If the Model Zoo contains a `*.mxr` file for your architecture, you can skip steps 1 and 2. + +# 2.4: Get Vitis AI Models from UIF Model Zoo + +Follow the instructions in the [Vitis AI](https://github.com/Xilinx/Vitis-AI/tree/master/model_zoo) Model Zoo page. -# 2.2: Get ZenDNN Models from UIF Model Zoo + +# 2.5: Get ZenDNN Models from UIF Model Zoo Perform the following steps to install UIF 1.1 models: @@ -447,96 +537,6 @@ Perform the following steps to install UIF 1.1 models: 100.0%|100% done ``` - -# 2.3: Get MIGraphX Models from UIF Model Zoo - -Perform the following steps to install UIF 1.1 models: - -1. Set up and run the model downloader tool. - - ``` - git clone https://github.com/AMD/uif.git - cd uif/docs/2_model_setup - python3 downloader.py - ``` - It has the following provision to specify the frameworks: - - ``` - Tip: - You need to input framework and model name, use space divide such as tf vgg16 - tf:tensorflow1.x tf2:tensorflow2.x onnx:onnxruntime dk:darknet pt:pytorch all: list all model - input: - ``` -2. Download UIF 1.1 GPU models. Provide `pt` as input to get the list of models. 
- - ``` - input:pt - chose model - 0 : all - 1 : pt_inceptionv3_imagenet_299_299_11.4G_1.1_M2.4 - 2 : pt_3dunet_kits19_128_128_128_0.3_763.8G_1.1_Z4.0 - 3 : pt_bert_base_SQuADv1.1_384_70.66G_1.1_M2.4 - 4 : pt_bert_large_SQuADv1.1_384_246.42G_1.1_M2.4 - 5 : pt_personreid-res50_market1501_256_128_0.4_3.3G_1.1_Z4.0 - - ... - input num: - ``` - -The models with 1.1 as suffix are UIF 1.1 models. MI100 means the model has been tuned for MI-100 GPU, and MI210 indicates the model has been tuned for MI-210 GPU. Without either of these suffixes, the model is a model version that should be used for training, including the GPU example later in this chapter. - - ``` - input num:1 - chose model type - 0: all - 1: GPU - 2: MI100 - 3: MI200 - ... - ... - input num: - - ``` - -Provide `1` as input to download the GPU model. - - ``` - input num:1 - pt_inceptionv3_imagenet_299_299_11.4G_1.1_M2.4.zip - 100.0%|100% - done - ``` - -# 2.4: Set Up MIGraphX YModel - -YModel is designed to provide significant inference performance through MIGraphX. Prior to the introduction of YModel, performance tuning was conditioned by having tuned kernel configs stored in a `/home` local User DB. If users move their model to a different server or allow a different user to use it, they would have to run through the MIOpen tuning process again to populate the next User DB with the best kernel configs and corresponding solvers. Tuning is time consuming, and if the users have not performed tuning, they would see discrepancies between expected or claimed inference performance and the actual inference performance. This leads to repetitive and time-consuming tuning tasks for each user. - -MIGraphX introduces a feature known as YModel that stores the kernel config parameters found during tuning in the same file as the model itself. This ensures the same level of expected performance even when a model is copied to a different user or system. 
- -**Note:** The YModel feature is available in the ROCm™ v5.4.1 and UIF v1.1 release. For more information on ROCm, refer to https://docs.amd.com. - -To set up the YModel for GPUs: - -1. Tune the kernels for your architecture with `MIOPEN_FIND_ENFORCE=3 migraphx-driver run `. - -2. Build a `*.mxr` file with a compile option: - -``` - migraphx-driver compile file.onnx --enable-offload-copy --gpu --binary -o file.mxr - -``` - -3. Run the (Python) program with a command line: - -``` - model=migraphx.load("file.mxr") -``` - -**Note:** If the Model Zoo contains a `*.mxr` file for your architecture, you can skip steps 1 and 2. - -# 2.5: Get Vitis AI Models from UIF Model Zoo - -Follow the instructions in the [Vitis AI](https://github.com/Xilinx/Vitis-AI/tree/master/model_zoo) Model Zoo page.
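To make the `F_M_(P)_V_Z` naming rule from section 2.1.1 concrete, the following is a minimal sketch that splits a model name into its fields. The function `parse_model_name`, the `ModelName` type, and the regular expression are hypothetical helpers written for this illustration; they are not part of the UIF downloader. The sketch assumes that the pruning field has the form `pruned<ratio>` and that the last field starts with `Z` for ZenDNN or `M` for MIGraphX, as the example names in this document suggest.

```python
import re
from typing import NamedTuple, Optional

# Framework prefixes as listed in the naming rules.
FRAMEWORKS = {
    "tf": "TensorFlow 1.x",
    "tf2": "TensorFlow 2.x",
    "pt": "PyTorch",
    "onnx": "ONNXRT",
}


class ModelName(NamedTuple):
    framework: str
    model: str
    pruning_ratio: Optional[float]  # None when the model is not pruned
    uif_version: str
    backend: str
    backend_version: str


# Assumed layout: F_M[_pruned<ratio>]_V_<Z|M><version>
_PATTERN = re.compile(
    r"^(?P<f>tf2|tf|pt|onnx)_"          # F: training framework
    r"(?P<m>.+?)"                       # M: model (may contain underscores)
    r"(?:_pruned(?P<p>\d*\.?\d+))?"     # P: optional pruning ratio
    r"_(?P<v>\d+\.\d+)"                 # V: UIF version
    r"_(?P<b>[ZM])(?P<z>\d+\.\d+)$"     # Z: ZenDNN (Z) or MIGraphX (M) version
)


def parse_model_name(name: str) -> ModelName:
    """Split a model name into the fields of the F_M_(P)_V_Z rule."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"name does not follow F_M_(P)_V_Z: {name!r}")
    ratio = match.group("p")
    return ModelName(
        framework=FRAMEWORKS[match.group("f")],
        model=match.group("m"),
        pruning_ratio=float(ratio) if ratio is not None else None,
        uif_version=match.group("v"),
        backend="ZenDNN" if match.group("b") == "Z" else "MIGraphX",
        backend_version=match.group("z"),
    )
```

For `pt_resnet50v1.5_pruned0.6_1.2_M2.6`, the sketch reports a PyTorch `resnet50v1.5` model with a pruning ratio of 0.6, UIF version 1.2, and MIGraphX version 2.6. Names from the older `F_M_(D)_H_W_(P)_C_V_Z` scheme carry extra fields that this pattern does not separate.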
diff --git a/docs/3_run_example/inference_server_example.md b/docs/3_run_example/inference_server_example.md index 81bffaa..106818b 100644 --- a/docs/3_run_example/inference_server_example.md +++ b/docs/3_run_example/inference_server_example.md @@ -1,6 +1,6 @@

Unified Inference Frontend (UIF) 1.1 User Guide

+

Unified Inference Frontend (UIF) 1.2 User Guide

- @@ -9,7 +9,8 @@

-This example walks you through running ResNet50 examples with the Inference Server using the development container based on the [development quickstart](https://xilinx.github.io/inference-server/0.3.0/quickstart_development.html) guide. +This example walks you through running ResNet50 examples with the Inference Server using the development and deployment containers based on the [developer quickstart](https://xilinx.github.io/inference-server/0.4.0/quickstart_development.html) guide. +An easier example using just the deployment container is in the [quickstart](https://xilinx.github.io/inference-server/0.4.0/quickstart.html). The full example files described here are available in the Inference Server [repository](https://github.com/Xilinx/inference-server/tree/main/examples/resnet50). The repository has examples for three backends: CPU (with ZenDNN), GPU (with MIGraphX), and FPGA (with Vitis™ AI). This example uses the GPU backend but all backends behave similarly. @@ -44,13 +45,13 @@ python3 docker/generate.py This builds the development image with the name `$(whoami)/amdinfer-dev-migraphx:latest` on your host. The development image only contains the dependencies for the Inference Server but not the source code. -When you start the container, mounts this directory inside so you can compile it. +When you start the container, mount this directory inside so you can compile it. 3. Get the example models and test data. You can pass the appropriate flag(s) for your backend to get the relevant files or use the `--all` flag to get everything. This command downloads these files and saves them so they can be used for inference. You need `git-lfs` to get test data. -You can install it on your host or run this command from inside the development container which already has it installed. +You can install it on your host or run this command from inside the development container that already has it installed. 
```bash ./amdinfer get --migraphx @@ -61,11 +62,11 @@ You can install it on your host or run this command from inside the development You can start the deployment container with the running server with: ```bash -docker pull amdih/serve:uif1.1_migraphx_amdinfer_0.3.0 -docker run -d --device /dev/kfd --device /dev/dri --volume $(pwd):/workspace/amdinfer:rw --network=host amdih/serve:uif1.1_migraphx_amdinfer_0.3.0 +docker pull amdih/serve:uif1.2_migraphx_amdinfer_0.4.0 +docker run -d --device /dev/kfd --device /dev/dri --volume $(pwd):/workspace/amdinfer:rw --network=host amdih/serve:uif1.2_migraphx_amdinfer_0.4.0 ``` -This will start the server in detached mode, mount the GPU into the container as well as the current inference server repository containing the models. +This starts the server in detached mode and mounts the GPU into the container, along with the current Inference Server repository containing the models. By default, the server uses port 8998 for HTTP requests and it shares the host network for easier networking to remote clients. You can confirm the server is ready by using `curl` on the host to see if the command below succeeds. @@ -98,12 +99,12 @@ You can run the example with: python ./examples/resnet50/migraphx.py --ip --http-port ``` -If the server is running on a different host, pass the IP address to that host or use 127.0.0.1 if the server is running on the same host where you're making the request. -By passing the correct IP and port of the running server, this example script will connect to the running server in the deployment container. +If the server is running on a different host, pass the IP address of that host or use 127.0.0.1 if the server is running on the same host where you are making the request. +By passing the correct IP and port of the running server, this example script connects to the running server in the deployment container. This example script carries out the following steps: -1. 
Starts the server if it is not already started. 
It will print a message if it does this. By passing the right address for it to connect to, the script will not attempt to start a server itself. +1. Starts the server if it is not already started. It prints a message if it does this. By passing the right address for it to connect to, the script does not attempt to start a server itself. 2. Loads a MIGraphX worker to handle the incoming inference request. The worker opens and compiles a ResNet50 ONNX model with MIGraphX. 3. Opens and preprocesses an image of a dog for ResNet50 and uses it to make an inference request. 4. Sends the request over HTTP REST to the server. The server responds with the output of the model. @@ -117,13 +118,13 @@ For example, you can use `--image` to pass a path to your own image to the ResNe The other ResNet50 examples using different backends work similarly but use different workers to implement the functionality. Some of the other examples also demonstrate using different communication protocols like gRPC to communicate with the server instead of HTTP REST. -Changing protocols is easy: just change the client you're using to make inferences. +Changing protocols is easy: just change the client you are using to make inferences. To run other examples, you need a compatible Docker image. You can build a new container with another backend or enable multiple backends in one by passing in multiple flags. -You can see more information about what the example script does in the [Python examples](https://xilinx.github.io/inference-server/0.3.0/example_resnet50_python.html) in the Inference Server documentation. +You can see more information about what the example script does in the [Python examples](https://xilinx.github.io/inference-server/0.4.0/example_resnet50_python.html) in the Inference Server documentation. There are also C++ versions of these examples in the repository. 
-You can see more information about the [C++ examples](https://xilinx.github.io/inference-server/0.3.0/example_resnet50_cpp.html) in the Inference Server documentation. +You can see more information about the [C++ examples](https://xilinx.github.io/inference-server/0.4.0/example_resnet50_cpp.html) in the Inference Server documentation.
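The readiness check that the example performs with `curl` can also be scripted. The following is a minimal sketch using only the Python standard library, assuming the server answers HTTP on the default port 8998 mentioned above. The function name `wait_until_ready` is hypothetical, and the default path `/v2/health/ready` is an assumption based on KServe-style health endpoints; substitute the URL from the `curl` command in the quickstart if it differs.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(host="127.0.0.1", port=8998, path="/v2/health/ready",
                     timeout_s=30.0, interval_s=1.0):
    """Poll an HTTP endpoint until it returns 200 or the timeout expires.

    The default path is an assumption (KServe-style readiness endpoint),
    not something this document confirms.
    """
    url = f"http://{host}:{port}{path}"
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval_s + 1.0) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Server not reachable yet, or it answered with an error status.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_until_ready()` before running `migraphx.py` avoids sending the inference request while the deployment container is still starting. The function returns `False` instead of raising, so the caller decides how to handle a server that never comes up.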
diff --git a/docs/3_run_example/runexample-migraphx.md b/docs/3_run_example/runexample-migraphx.md index fa43623..2abaeee 100644 --- a/docs/3_run_example/runexample-migraphx.md +++ b/docs/3_run_example/runexample-migraphx.md @@ -1,6 +1,6 @@ - diff --git a/docs/3_run_example/runexample-script.md b/docs/3_run_example/runexample-script.md index be94cef..7bc4c17 100644 --- a/docs/3_run_example/runexample-script.md +++ b/docs/3_run_example/runexample-script.md @@ -1,6 +1,6 @@

Unified Inference Frontend (UIF) 1.1 User Guide

+

Unified Inference Frontend (UIF) 1.2 User Guide

- diff --git a/docs/4_deploy_your_own_model/deploy_model/deployingmodel.md b/docs/4_deploy_your_own_model/deploy_model/deployingmodel.md index c41dbc8..cee3bdf 100644 --- a/docs/4_deploy_your_own_model/deploy_model/deployingmodel.md +++ b/docs/4_deploy_your_own_model/deploy_model/deployingmodel.md @@ -1,6 +1,6 @@

Unified Inference Frontend (UIF) 1.1 User Guide

+

Unified Inference Frontend (UIF) 1.2 User Guide

- @@ -33,21 +33,21 @@ ## 4.3.1.1: In-Framework -WeGO (WholeGraph Optimizer) offers a smooth solution to deploy models on cloud DPU by integrating the Vitis™ AI Development kit with TensorFlow 1.x, TensorFlow 2.x, and PyTorch frameworks. +WeGO ( - @@ -28,13 +28,13 @@ # 4.1.1: Pruning -Neural networks are typically over-parameterized with significant redundancy. Pruning is the process of eliminating redundant weights while keeping the accuracy loss as low as possible. Industry research has led to several techniques that serve to reduce the computational cost of neural networks for inference. These techniques include: +Neural networks are typically over-parameterized with significant redundancy. Pruning is the process of eliminating redundant weights while keeping the accuracy loss to a minimum. Industry research has led to several techniques that serve to reduce the computational cost of neural networks for inference. These techniques include: - Fine-grained pruning - Coarse-grained pruning - Neural Architecture Search (NAS) -The simplest form of pruning is called fine-grained pruning and results in sparse matrices (that is, matrices which have many elements equal to zero), which requires the addition of specialized hardware and techniques for weight skipping and compression. Fine-grained pruning is not currently supported by UIF Optimizer. +The simplest form of pruning is called fine-grained pruning and results in sparse matrices (that is, matrices that have many elements equal to zero), which require the addition of specialized hardware and techniques for weight skipping and compression. Fine-grained pruning is not currently supported by UIF Optimizer. UIF Optimizer employs coarse-grained pruning, which eliminates neurons that do not contribute significantly to the accuracy of the network. For convolutional layers, the coarse-grained method prunes the entire 3D kernel and hence is also known as channel pruning. 
Inference acceleration can be achieved without specialized hardware for coarse-grained pruned models. Pruning always reduces the accuracy of the original model. Retraining (fine-tuning) adjusts the remaining weights to recover accuracy. @@ -43,7 +43,7 @@ Coarse-grained pruning works well on large models with common convolutions, such # 4.1.2: UIF Optimizer Overview -Inference in machine learning is computationally intensive and requires high memory bandwidth to meet the low-latency and high-throughput requirements of various applications. UIF Optimizer provides the ability to prune neural network models. It prunes redundant kernels in neural networks thereby reducing the overall computational cost for inference. The pruned models produced by UIF Optimizer are then quantized by UIF Quantizer to be further optimized. +Inference in machine learning is computationally intensive and requires high memory bandwidth to meet the low-latency and high-throughput requirements of various applications. UIF Optimizer provides the ability to prune neural network models. It prunes redundant kernels in neural networks, thereby reducing the overall computational cost for inference. The pruned models produced by UIF Optimizer are then quantized by UIF Quantizer to be further optimized. The following tables show the features that are supported by UIF Optimizer for different frameworks: @@ -62,14 +62,14 @@ The following tables show the features that are supported by UIF Optimizer for d - + - + @@ -150,13 +150,13 @@ model = pruning_runner.prune(removal_ratio=0.2) ``` Run analysis only once for the same model. You can prune the model iteratively without re-running analysis because there is only one pruned model generated for a specific pruning ratio. -The subnetwork obtained by pruning may not be very good because an approximate algorithm is used to generate this unique pruned model according to the analysis result. 
+The subnetwork obtained by pruning may not be perfect because an approximate algorithm is used to generate this unique pruned model according to the analysis result. The one-step pruning method can generate a better subnetwork. #### One-step Pruning -The method also includes two stages: adaptive batch normalization (BN) based searching for pruning strategy, and pruned model generation. +The method also includes two stages: adaptive batch normalization (BN) based searching for pruning strategy and pruned model generation. After searching, a file named `.vai/your_model_name.search` is generated in which the search result (pruning strategies and corresponding evaluation scores) is stored. You can get the final pruned model in one-step. `num_subnet` provides the number of candidate subnetworks satisfying the sparsity requirement to be searched. @@ -186,7 +186,7 @@ The one-step pruning method has several advantages over the iterative approach: - The workflow is simpler because you can obtain the final pruned model in one step without iterations. - Retraining a slim model is faster than a sparse model. -There are two disadvantages to one-step pruning: one is that the random generation of pruning strategy is unstable. The other is that the subnetwork searching must be performed once for every pruning ratio. +There are two disadvantages to one-step pruning: one is that the random generation of pruning strategy is unstable, and the other is that the subnetwork searching must be performed once for every pruning ratio. ### Retraining the Pruned Model @@ -323,7 +323,7 @@ The searching result looks like the following: ``` ### Getting a Subnetwork -Call `get_static_subnet()` to get a specific subnetwork. The `static_subnet` can be used for finetuning and doing quantization. +Call `get_static_subnet()` to get a specific subnetwork. The `static_subnet` can be used for finetuning and quantization. 
```python pareto_global = ofa_pruner.load_subnet_config('pareto_global.txt') @@ -393,15 +393,15 @@ sparse_model = runner.prune(ratio=0.2) ``` **Note:** `ratio` is only an approximate target value and the actual pruning ratio may not be exactly equal to this value. -The returned model from `prune()` is sparse which means the pruned channels are set to zeros and model size remains unchanged. +The returned model from `prune()` is sparse, which means that the pruned channels are set to zeros and model size remains unchanged. The sparse model has been used in the iterative pruning process. The sparse model is converted to a pruned dense model only after pruning is completed. -Besides returning a sparse model, the pruning runner generates a specification file in the `.vai` directory that describes how each layer will be pruned. +Besides returning a sparse model, the pruning runner generates a specification file in the `.vai` directory that describes how each layer is pruned. ### 4.1.4.4: Fine-tuning a Sparse Model -Training a sparse model is no different from training a normal model. The model will maintain sparsity internally. There is no need for any additional actions other than adjusting the hyper-parameters. +Training a sparse model is no different from training a normal model. The model maintains sparsity internally. There is no need for any additional actions other than adjusting the hyper-parameters. ```python sparse_model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"]) @@ -425,7 +425,7 @@ sparse_model = runner.prune(ratio=0.5) ``` ### 4.1.4.6: Getting the Pruned Model -When the iterative pruning is completed, a sparse model is generated which has the same number of parameters as the original model but with many of them now set to zero. +When the iterative pruning is completed, a sparse model is generated, which has the same number of parameters as the original model but with many of them now set to zero. 
Call `get_slim_model()` to remove zeroed parameters from the sparse model and retrieve the pruned model: diff --git a/docs/4_deploy_your_own_model/quantize_model/pt_resnet18_quant.py b/docs/4_deploy_your_own_model/quantize_model/pt_resnet18_quant.py index cf7c20b..86ef9b8 100644 --- a/docs/4_deploy_your_own_model/quantize_model/pt_resnet18_quant.py +++ b/docs/4_deploy_your_own_model/quantize_model/pt_resnet18_quant.py @@ -213,11 +213,14 @@ def quantization(title='optimize', inspector.inspect(quant_model, (input,), device=device) sys.exit() else: - ## new api #################################################################################### + # This function call will create a quantizer object and setup it. + # Eager mode model code will be converted to graph model. + # Quantization is not done here if it needs calibration. quantizer = torch_quantizer( quant_mode, model, (input), device=device, quant_config_file=config_file) + # Get the converted model to be quantized. quant_model = quantizer.quant_model ##################################################################################### @@ -252,16 +255,22 @@ def quantization(title='optimize', acc_org5 = 0.0 loss_org = 0.0 - #register_modification_hooks(model_gen, train=False) + # This function call is to do forward loop for model to be quantized. + # Quantization calibration will be done after it. acc1_gen, acc5_gen, loss_gen = evaluate(quant_model, val_loader, loss_fn) - # logging accuracy + # Logging accuracy + # If quant_mode is 'calib', ignore the accuracy log because it is not the final accuracy can be got. + # Only check the accuray if quant_mode is 'test'. print('loss: %g' % (loss_gen)) print('top-1 / top-5 accuracy: %g / %g' % (acc1_gen, acc5_gen)) - # handle quantization result if quant_mode == 'calib': + # Exporting intermediate files will be used when quant_mode is 'test'. This is must. 
quantizer.export_quant_config() + if quant_mode == 'test': + # Exporting ONNX format quantized model + quantizer.export_onnx_model() if __name__ == '__main__': diff --git a/docs/4_deploy_your_own_model/quantize_model/quantizemodel.md b/docs/4_deploy_your_own_model/quantize_model/quantizemodel.md index c45a1f3..eb2a083 100644 --- a/docs/4_deploy_your_own_model/quantize_model/quantizemodel.md +++ b/docs/4_deploy_your_own_model/quantize_model/quantizemodel.md @@ -1,6 +1,6 @@

Unified Inference Frontend (UIF) 1.1 User Guide

+

Unified Inference Frontend (UIF) 1.2 User Guide

PyTorch | Supports 1.4 - 1.10 -> Supports 1.7 - 1.13 | Yes | Yes | Yes
TensorFlow | Supports 2.3 - 2.8 -> Supports 2.4 - 2.12 | Yes | No | No
- @@ -19,6 +19,13 @@ - [4.2.2.3: vai_q_tensorflow Quantization Aware Training](#4223-vai_q_tensorflow-quantization-aware-training) - [4.2.2.4: vai_q_tensorflow Supported Operations and APIs](#4224-vai_q_tensorflow-supported-operations-and-apis) - [4.2.2.5: vai_q_tensorflow Usage](#4225-vai_q_tensorflow-usage) +- [4.2.3: Quantize ONNX Models](#423-quantize-onnx-models) + - [4.2.3.1 Test Environment](#4231-test-environment) + - [4.2.3.2 Installation](#4232-installation) + - [4.2.3.3 Post Training Quantization](#4233-post-training-quantizationptq---static-quantization) + - [4.2.3.4 Running vai_q_onnx](#4234-running-vai_q_onnx) + - [4.2.3.5 List of Vai_q_onnx Supported Quantized Ops](#4235-list-of-vai_q_onnx-supported-quantized-ops) + - [4.2.3.6 vai_q_onnx APIs](#4236-vai_q_onnx-apis) _Click [here](/README.md#implementing-uif-11) to go back to the UIF User Guide home page._ @@ -77,11 +84,13 @@ Using [pt_resnet18_quant.py](pt_resnet18_quant.py) as an example: ```py if quant_mode == 'calib': quantizer.export_quant_config() + if quant_mode == 'test': + quantizer.export_onnx_model() ``` ### Run and Output Results -Before running commands, let's introduce the log message in vai_q_pytorch. vai_q_pytorch log messages have special colors and the keyword prefix "VAIQ_*". vai_q_pytorch log message types include "error", "warning", and "note". +Before running commands, here is an introduction to the log messages in vai_q_pytorch. vai_q_pytorch log messages have special colors and the keyword prefix "VAIQ_*". vai_q_pytorch log message types include error, warning, and note. Pay attention to vai_q_pytorch log messages to check the flow status.
* Run the command with `--quant_mode calib` to quantize the model. @@ -89,12 +98,12 @@ Pay attention to vai_q_pytorch log messages to check the flow status.
```py python pt_resnet18_quant.py --quant_mode calib --subset_len 200 --config_file ./pt_quant_config.json ``` -When doing calibration forward, the float evaluation flow is borrowed to minimize code changes from the float script, which means that loss and accuracy information is displayed at the end. Because these loss and accuracy values are meaningless at this point in the process, they can be skipped. Attention should be given to the colorful log messages with the special keywords prefix "VAIQ_*". +When doing calibration forward, the float evaluation flow is borrowed to minimize code changes from the float script, which means that loss and accuracy information is displayed at the end. Because these loss and accuracy values are meaningless at this point in the process, they can be skipped. Attention should be given to the colorful log messages with the special keywords prefix VAIQ_*. It is important to control iteration numbers during quantization and evaluation. Generally, 100-1000 images are enough for quantization, and the whole validation set is required for evaluation. The iteration numbers can be controlled in the data loading part. In this case, the argument `subset_len` controls how many images used for network forwarding. However, if the float evaluation script does not have an argument with similar role, it is better to add one, otherwise it should be changed manually. -If this quantization command runs successfully, two important files will be generated under output directory `./quantize_result`. +If this quantization command runs successfully, two important files are generated under output directory `./quantize_result`. ``` ResNet.py: converted vai_q_pytorch format model, @@ -111,9 +120,9 @@ When this command finishes, the displayed accuracy is the right accuracy for qua Sometimes direct quantization accuracy is not high enough, and finetuning of model parameters is necessary to recover accuracy:
-- Fast finetuning is not real training of the model, and only needs a limited number of iterations. For classification models on the Imagenet dataset, 5120 images are enough in general. +- Fast finetuning is not real training of the model and only needs a limited number of iterations. For classification models on the Imagenet dataset, 5120 images are enough in general. - It only requires some modification based on the evaluation model script, and does not require you to set up Optimizer for training. -- A function for model forwarding iteration is needed and will be called as part of fast finetuning. +- A function for model forwarding iteration is needed and is called as part of fast finetuning. - Re-calibration with original inference code is highly recommended. - Example code in [pt_resnet18_quant.py](pt_resnet18_quant.py) is as follows: @@ -149,14 +158,14 @@ Sometimes direct quantization accuracy is not high enough, and finetuning of mod - This mode can be used to finetune a quantized model (loading float model parameters), as well as to do quantization-aware-training (QAT) from scratch. - It is necessary to add some vai_q_pytorch interface functions based on the float model training script. -- The mode requires that the trained model cannot use the +/- operator in model forwarding code. Replace them with the torch.add/torch.sub module. +- The mode requires that the trained model cannot use the +/- operator in the model forwarding code. Replace them with the torch.add/torch.sub module. - For detailed information, refer to the [Vitis AI User Guide](https://docs.xilinx.com/r/en-US/ug1414-vitis-ai/vai_q_pytorch-QAT). # 4.2.2: Quantize TensorFlow Models -vai_q_tensorflow is a Vitis AI quantizer for TensorFlow. It supports FPGA-friendly quantization for TensorFlow models. After quantization, models can be deployed to FPGA devices. 
vai_q_tensorflow is a component of [Vitis AI](https://github.com/Xilinx/Vitis-AI), a development stack for AI inference on Xilinx hardware platforms. +vai_q_tensorflow is a Vitis AI quantizer for TensorFlow. It supports FPGA-friendly quantization for TensorFlow models. After quantization, models can be deployed to FPGA devices. vai_q_tensorflow is a component of [Vitis AI](https://github.com/Xilinx/Vitis-AI), a development stack for AI inference on AMD hardware platforms. -**Note:** You can download the Xilinx prebuilt version in [Vitis AI](https://www.xilinx.com/products/design-tools/vitis.html). See the [Vitis AI User Guide](https://docs.xilinx.com/r/en-US/ug1414-vitis-ai) for further details. +**Note:** You can download the AMD prebuilt version in [Vitis AI](https://www.xilinx.com/products/design-tools/vitis.html). See the [Vitis AI User Guide](https://docs.xilinx.com/r/en-US/ug1414-vitis-ai) for further details. ## 4.2.2.1: Installation @@ -224,7 +233,7 @@ Before running vai_q_tensorflow, prepare the frozen inference TensorFlow model i #### **Generating the Frozen Inference Graph** -Training a model with TensorFlow 1.x creates a folder containing a GraphDef file (usually ending with `a.pb` or `.pbtxt` extension) and a set of checkpoint files. What you need for mobile or embedded deployment is a single GraphDef file that has been “frozen,” or had its variables converted into inline constants, so everything is in one file. To handle the conversion, TensorFlow provides `freeze_graph.py`, which is automatically installed with the vai_q_tensorflow quantizer. +Training a model with TensorFlow 1.x creates a folder containing a GraphDef file (usually ending with `a.pb` or `.pbtxt` extension) and a set of checkpoint files. What you need for mobile or embedded deployment is a single GraphDef file that has been frozen, or had its variables converted into inline constants, so everything is in one file. 
To handle the conversion, TensorFlow provides `freeze_graph.py`, which is automatically installed with the vai_q_tensorflow quantizer. An example of command-line usage is as follows: @@ -262,7 +271,7 @@ $ netron /tmp/inception_v3_inf_graph.pb The calibration set is usually a subset of the training/validation dataset or actual application images (at least 100 images for performance). The input function is a Python importable function to load the calibration dataset and perform data preprocessing. The vai_q_tensorflow quantizer can accept an input_fn to do the preprocessing, which is not saved in the graph. If the preprocessing subgraph is saved into the frozen graph, the input_fn only needs to read the images from dataset and return a feed_dict. The format of input function is `module_name.input_fn_name`, (for example, -`my_input_fn.calib_input`). The `input_fn` takes an int object as input, indicating the calibration step number, and returns a dict ("`placeholder_name, numpy.Array`") object for each call, which is fed into the placeholder nodes of the model when running inference. The `placeholder_name` is always the input node of frozen graph, that is to say, the node receiving input data. The `input_nodes`, in the vai_q_tensorflow options, indicate where quantization starts in the frozen graph. The `placeholder_names` and the `input_nodes` options are sometimes different. For example, when the frozen graph includes in-graph preprocessing, the placeholder_name is the input of the graph, although it is recommended that `input_nodes` be set to the last node of preprocessing. The shape of `numpy.array` must be consistent with the placeholders. See the following pseudo code example: +`my_input_fn.calib_input`). The `input_fn` takes an int object as input, indicating the calibration step number, and returns a dict ("`placeholder_name, numpy.Array`") object for each call, which is fed into the placeholder nodes of the model when running inference. 
The `placeholder_name` is always the input node of the frozen graph, that is to say, the node receiving input data. The `input_nodes`, in the vai_q_tensorflow options, indicate where quantization starts in the frozen graph. The `placeholder_names` and the `input_nodes` options are sometimes different. For example, when the frozen graph includes in-graph preprocessing, the placeholder_name is the input of the graph, although it is recommended that `input_nodes` be set to the last node of preprocessing. The shape of `numpy.array` must be consistent with the placeholders. See the following pseudo code example: ```python $ “my_input_fn.py” @@ -298,7 +307,7 @@ The `input_nodes` and `output_nodes` arguments are the name list of input nodes -It is recommended to set `–input_nodes` to be the last nodes of the preprocessing part and to set *-output_nodes* to be the last nodes of the main graph part because some operations in the pre- and postprocessing parts are not quantizable and might cause errors when compiled by the Vitis AI quantizer if you need to deploy the quantized model to the DPU. +It is recommended to set `–input_nodes` to be the last nodes of the preprocessing part. Similarly, it is advisable to set *-output_nodes* to be the last nodes of the main graph part because some operations in the pre- and postprocessing parts are not quantizable and might cause errors when compiled by the Vitis AI quantizer if you need to deploy the quantized model to the DPU. The input nodes might not be the same as the placeholder nodes of the graph. If no in-graph preprocessing part is present in the frozen graph, the placeholder nodes should be set to input nodes. 
@@ -321,7 +330,7 @@ After the successful execution of the `vai_q_tensorflow` command, one output fil |No.|Name|Description| | :--- | :--- | :--- | |1|deploy_model.pb|Quantized model for the Vitis AI compiler (extended TensorFlow format) for targeting DPUCZDX8G implementations.| -|2|quantize_eval_model.pb|Quantized model for evaluation (also, the Vitis AI compiler input for most DPU architectures, like DPUCAHX8H, and DPUCADF8H).| +|2|quantize_eval_model.pb|Quantized model for evaluation (also, the Vitis AI compiler input for most DPU architectures, like DPUCAHX8H and DPUCADF8H).| ### Evaluating the Quantized Model (Optional) @@ -368,7 +377,7 @@ Quantization aware training (QAT) is similar to float model training/finetuning, |No.|Name|Description| | :--- | :--- | :--- | -|1|Checkpoint files|Floating-point checkpoint files to start from. Omit this if you training the model from scratch.| +|1|Checkpoint files|Floating-point checkpoint files to start from. Omit this if you are training the model from scratch.| |2|Dataset|The training dataset with labels.| |3|Train Scripts|The Python scripts to run float train/finetuning of the model.| @@ -400,7 +409,7 @@ optimizer = tf.train.GradientDescentOptimizer() The `QuantizeConfig` contains the configurations for quantization. -Some basic configurations such as `input_nodes`, `output_nodes`, `input_shapes` need to be set according to your model structure. +Some basic configurations such as `input_nodes`, `output_nodes`, and `input_shapes` need to be set according to your model structure. Other configurations such as `weight_bit`, `activation_bit`, and `method` have default values and can be modified as needed. See [vai_q_tensorflow Usage](#4225-vai_q_tensorflow-usage) for detailed information of all the configurations. @@ -412,7 +421,7 @@ Other configurations such as `weight_bit`, `activation_bit`, and `method` have d 4. 
**Evaluate the quantized model and generate the frozen model:** After QAT, generate the frozen model after evaluating the quantized graph with a checkpoint file. This can be done by calling the following function after building the float evaluation graph. As the freeze process depends on the quantize evaluation graph, they are often called together. - **Note**: Function `decent_q.CreateQuantizeTrainingGraph` and `decent_q.CreateQuantizeEvaluationGraph` will modify the default graph in TensorFlow. They need to be called on different graph phases. `decent_q.CreateQuantizeTrainingGraph` needs to be called on the float training graph while `decent_q.CreateQuantizeEvaluationGraph` needs to be called on the float evaluation graph. `decent_q.CreateQuantizeEvaluationGraph` cannot be called right after calling function `decent_q.CreateQuantizeTrainingGraph`, because the default graph has been converted to a quantize training graph. The correct way is to call it right after the float model creation function. + **Note**: Function `decent_q.CreateQuantizeTrainingGraph` and `decent_q.CreateQuantizeEvaluationGraph` modify the default graph in TensorFlow. They need to be called on different graph phases. `decent_q.CreateQuantizeTrainingGraph` needs to be called on the float training graph while `decent_q.CreateQuantizeEvaluationGraph` needs to be called on the float evaluation graph. `decent_q.CreateQuantizeEvaluationGraph` cannot be called right after calling function `decent_q.CreateQuantizeTrainingGraph`, because the default graph has been converted to a quantize training graph. The correct way is to call it right after the float model creation function. ```python # eval.py @@ -455,7 +464,7 @@ The suffix contains the iteration information from the checkpoint file and the d The following are some tips for QAT. 
-- **Keras Model:** For Keras models, set `backend.set_learning_phase(1)` before creating the float train graph, and set `backend.set_learning_phase(0)` before creating the float evaluation graph. Moreover, `backend.set_learning_phase()` should be called after `backend.clear_session()`. TensorFlow 1.x QAT APIs are designed for TensorFlow native training APIs. Using Keras `model.fit()` APIs in QAT may lead to some "nodes not executed" issues. It is recommended to use QAT APIs in the TensorFlow 2 quantization tool with Keras APIs. +- **Keras Model:** For Keras models, set `backend.set_learning_phase(1)` before creating the float train graph, and set `backend.set_learning_phase(0)` before creating the float evaluation graph. Moreover, `backend.set_learning_phase()` should be called after `backend.clear_session()`. TensorFlow 1.x QAT APIs are designed for TensorFlow native training APIs. Using Keras `model.fit()` APIs in QAT might lead to some "nodes not executed" issues. It is recommended to use QAT APIs in the TensorFlow 2 quantization tool with Keras APIs. - **Dropout**: Experiments show that QAT works better without dropout ops. This tool does not support finetuning with dropouts at the moment and they should be removed or disabled before running QAT. This can be done by setting `is_training=false` when using tf.layers or call `tf.keras.backend.set_learning_phase(0)` when using tf.keras.layers. @@ -679,7 +688,316 @@ $vai_q_tensorflow dump --input_frozen_graph quantize_results/quantize_eval_model ``` Refer to [Vitis AI Model Zoo](https://github.com/Xilinx/Vitis-AI/tree/master/model_zoo) for more TensorFlow model quantization examples. -
+## 4.2.3 Quantize ONNX Models + +The Vitis AI Quantizer for ONNX models is customized based on [Quantization Tool](https://github.com/microsoft/onnxruntime/tree/rel-1.14.0/onnxruntime/python/tools/quantization) in ONNX Runtime. + + +### 4.2.3.1 Test Environment + +* Python 3.7, 3.8 +* ONNX>=1.12.0 +* ONNX Runtime>=1.14.0 +* onnxruntime-extensions>=0.4.2 + +### 4.2.3.2 Installation + +You can install vai_q_onnx as follows: + +#### Install from Source Code with Wheel Package +To build and install vai_q_onnx, run the following commands: +``` +$ sh build.sh +$ pip install pkgs/*.whl +``` + + +### 4.2.3.3 Post Training Quantization (PTQ) + +The static quantization method first runs the model using a set of inputs called calibration data. During these runs, the quantization parameters for each activation are computed. These quantization parameters are written as constants to the quantized model and used for all inputs. Our quantization tool supports the following calibration methods: MinMax, Entropy, Percentile, and MinMSE. + +```python +import vai_q_onnx + +vai_q_onnx.quantize_static( + model_input, + model_output, + calibration_data_reader, + quant_format=vai_q_onnx.VitisQuantFormat.FixNeuron, + calibrate_method=vai_q_onnx.PowerOfTwoMethod.MinMSE) +``` + +**Arguments** + +* **model_input**: (String) Represents the file path of the model to be quantized. +* **model_output**: (String) Represents the file path where the quantized model is saved. +* **calibration_data_reader**: (Object or None) Calibration data reader. It enumerates the calibration data and generates inputs for the original model. If you want to use random data for a quick test, you can set calibration_data_reader to None. The default value is None. +* **quant_format**: (String) Specifies the quantization format of the model. It has the following options: +
**QOperator:** This option quantizes the model directly using quantized operators. +
**QDQ:** This option quantizes the model by inserting QuantizeLinear/DeQuantizeLinear into the tensor. It supports 8-bit quantization only. +
**VitisQuantFormat.QDQ:** This option quantizes the model by inserting VAIQuantizeLinear/VAIDeQuantizeLinear into the tensor. It supports a wider range of bit-widths and configurations. +
**VitisQuantFormat.FixNeuron:** This option quantizes the model by inserting FixNeuron (a combination of QuantizeLinear and DeQuantizeLinear) into the tensor. +* **calibrate_method**: (String) For DPU devices, set calibrate_method to either 'vai_q_onnx.PowerOfTwoMethod.NonOverflow' or 'vai_q_onnx.PowerOfTwoMethod.MinMSE' to apply power-of-2 scale quantization. The PowerOfTwoMethod currently supports two methods: MinMSE and NonOverflow. The default method is MinMSE. + + +### 4.2.3.4 Running vai_q_onnx + + +Quantization in ONNX Runtime refers to the linear quantization of an ONNX model. We have developed the vai_q_onnx tool as a plugin for ONNX Runtime to support more post-training quantization (PTQ) functions for quantizing a deep learning model. PTQ is a technique to convert a pre-trained float model into a quantized model with little degradation in model accuracy. A representative dataset is needed to run a few batches of inference on the float model to obtain the distributions of the activations, which is also called quantized calibration. + + +vai_q_onnx is used as follows: + +#### vai_q_onnx Post-Training Quantization (PTQ) + +Use the following steps to run PTQ with vai_q_onnx: + +1. ##### Preparing the Float Model and Calibration Set + +Before running vai_q_onnx, prepare the float model and calibration set, including the files listed in the following table. + +Table 1. Input files for vai_q_onnx + +| No. | Name | Description | +| ------ | ------ | ----- | +| 1 | float model | Floating-point models in ONNX format. | +| 2 | calibration dataset | A subset of the training dataset or validation dataset to represent the input data distribution. Usually, 100 to 1000 images are enough. | +2. ##### (Recommended) Pre-processing the Float Model + +Pre-processing transforms a float32 model and prepares it for quantization.
It consists of the +following three optional steps: +* Symbolic shape inference: It is best suited for transformer models. +* Model Optimization: It uses the ONNX Runtime native library to rewrite the computation graph, including merging computation nodes and eliminating redundancies, to improve runtime efficiency. +* ONNX shape inference. + +The primary objective of these steps is to enhance quantization quality. The ONNX Runtime quantization tool performs optimally when the tensor's shape is known. Both symbolic shape inference and ONNX shape inference play a crucial role in determining tensor shapes. Symbolic shape inference is particularly effective for transformer-based models, whereas ONNX shape inference works well with other models. +Model optimization performs certain operator fusion, making the quantization tool’s job easier. For instance, a Convolution operator followed by BatchNormalization can be fused into one during the optimization, which enables effective quantization. +ONNX Runtime has a known issue: model optimization cannot output a model larger than 2 GB. As a result, for large models, optimization must be skipped.
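The Convolution plus BatchNormalization fusion mentioned above can be illustrated with plain arithmetic. The following is a minimal sketch, not ONNX Runtime's implementation: it assumes a single-channel 1x1 convolution, so the layer reduces to `weight * x + bias`, and the helper names `fold_batchnorm` and `conv_then_bn` are hypothetical.

```python
import math


def fold_batchnorm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm parameters into the preceding layer's weight and bias.

    Hypothetical helper for illustration only; scalars stand in for one channel.
    """
    scale = gamma / math.sqrt(var + eps)
    return weight * scale, (bias - mean) * scale + beta


def conv_then_bn(x, weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Reference path: scalar 'convolution' followed by BatchNorm."""
    y = weight * x + bias
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


# Arbitrary example values, not taken from any real model.
w, b = 0.5, 0.1
gamma, beta, mean, var = 1.5, -0.2, 0.3, 4.0

fused_w, fused_b = fold_batchnorm(w, b, gamma, beta, mean, var)

x = 2.0
reference = conv_then_bn(x, w, b, gamma, beta, mean, var)
fused = fused_w * x + fused_b

# The single fused layer reproduces the two-layer output.
print(abs(reference - fused) < 1e-9)
```

After folding, only one set of weights remains, so the quantizer has one tensor to calibrate instead of two chained operators.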
+Pre-processing API can be found in the onnxruntime.quantization.shape_inference Python +module inside the quant_pre_process() function: + +```python +from onnxruntime.quantization import shape_inference + +shape_inference.quant_pre_process( + input_model_path: str, + output_model_path: str, + skip_optimization: bool = False, + skip_onnx_shape: bool = False, + skip_symbolic_shape: bool = False, + auto_merge: bool = False, + int_max: int = 2**31 - 1, + guess_output_rank: bool = False, + verbose: int = 0, + save_as_external_data: bool = False, + all_tensors_to_one_file: bool = False, + external_data_location: str = "./", + external_data_size_threshold: int = 1024,) +``` + +**Arguments** + +* **input_model_path**: (String) Specifies the file path of the input model to be pre-processed for quantization. +* **output_model_path**: (String) Specifies the file path where the pre-processed model is saved. +* **skip_optimization**: (Boolean) Indicates whether to skip the model optimization step. If set to True, model optimization is skipped, which may cause ONNX shape inference failure for some models. The default value is False. +* **skip_onnx_shape**: (Boolean) Indicates whether to skip the ONNX shape inference step. The symbolic shape inference is most effective with transformer-based models. Skipping all shape inferences may reduce the effectiveness of quantization because a tensor with an unknown shape cannot be quantized. The default value is False. +* **skip_symbolic_shape**: (Boolean) Indicates whether to skip the symbolic shape inference step. Symbolic shape inference is most effective with transformer-based models. Skipping all shape inferences may reduce the effectiveness of quantization because a tensor with an unknown shape cannot be quantized. The default value is False. +* **auto_merge**: (Boolean) Determines whether to automatically merge symbolic dimensions when a conflict occurs during symbolic shape inference. The default value is False. 
+* **int_max**: (Integer) Specifies the maximum integer value that is to be considered as boundless for operations like slice during symbolic shape inference. The default value is 2**31 - 1. +* **guess_output_rank**: (Boolean) Indicates whether to guess the output rank to be the same as input 0 for unknown operations. The default value is False. +* **verbose**: (Integer) Controls the level of detailed information logged during inference. A value of 0 turns off logging, 1 logs warnings, and 3 logs detailed information. The default value is 0. +* **save_as_external_data**: (Boolean) Determines whether to save the ONNX model to external data. The default value is False. +* **all_tensors_to_one_file**: (Boolean) Indicates whether to save all the external data to one file. The default value is False. +* **external_data_location**: (String) Specifies the file location where the external file is saved. The default value is "./". +* **external_data_size_threshold**: (Integer) Specifies the size threshold for external data. The default value is 1024. + +3. ##### Quantizing Using the vai_q_onnx API +The static quantization method first runs the model using a set of inputs called calibration data. During these runs, the quantization parameters are computed for each activation. These quantization parameters are written as constants to the quantized model and used for all inputs. Vai_q_onnx quantization tool has expanded calibration methods to power-of-2 scale/float scale quantization methods. Float scale quantization methods include MinMax, Entropy, and Percentile. 
Power-of-2 scale quantization methods include MinMax and MinMSE: + +```python + +vai_q_onnx.quantize_static( + model_input, + model_output, + calibration_data_reader, + quant_format=vai_q_onnx.VitisQuantFormat.FixNeuron, + calibrate_method=vai_q_onnx.PowerOfTwoMethod.MinMSE, + input_nodes=[], + output_nodes=[], + extra_options=None,) +``` + + +**Arguments** + +* **model_input**: (String) Specifies the path of the model to be quantized. +* **model_output**: (String) Specifies the file path where the quantized model is saved. +* **calibration_data_reader**: (Object or None) Calibration data reader that enumerates the calibration data and generates inputs for the original model. If you want to use random data for a quick test, you can set calibration_data_reader to None. +* **quant_format**: (Enum) Defines the quantization format for the model. It has the following options: +
**QOperator**: This option quantizes the model directly using quantized operators. +
**QDQ**: This option quantizes the model by inserting QuantizeLinear/DeQuantizeLinear into the tensor. It supports 8-bit quantization. +
**VitisQuantFormat.QDQ**: This option quantizes the model by inserting VAIQuantizeLinear/VAIDeQuantizeLinear into the tensor. It supports a wider range of bit-widths and configurations. +
**VitisQuantFormat.FixNeuron** This option quantizes the model by inserting FixNeuron (a combination of QuantizeLinear and DeQuantizeLinear) into the tensor. This is the default value. +* **calibrate_method**: (Enum) Used to set the power-of-2 scale quantization method for DPU devices. It currently supports two methods: 'vai_q_onnx.PowerOfTwoMethod.NonOverflow' and 'vai_q_onnx.PowerOfTwoMethod.MinMSE'. The default value is 'vai_q_onnx.PowerOfTwoMethod.MinMSE'. +* **input_nodes**: (List of Strings) List of the names of the starting nodes to be quantized. The nodes before these start nodes in the model are not optimized or quantized. For example, this argument can be used to skip some pre-processing nodes or stop quantizing the first node. The default value is []. +* **output_nodes**: (List of Strings) Names of the end nodes to be quantized. The nodes after these nodes in the model are not optimized or quantized. For example, this argument can be used to skip some post-processing nodes or stop quantizing the last node. The default value is []. +* **extra_options**: (Dict or None) Dictionary of additional options that can be passed to the quantization process. If there are no additional options to provide, this can be set to None. The default value is None. + + +4. ##### (Optional) Evaluating the Quantized Model +If you have scripts to evaluate float models, like the models in AMD Model Zoo, you can replace the float model file with the quantized model for evaluation. + +To support the customized FixNeuron op, the vai_dquantize module should be imported. 
The following is an example: + +```python +import onnxruntime as ort +from vai_q_onnx.operators.vai_ops.qdq_ops import vai_dquantize + +so = ort.SessionOptions() +so.register_custom_ops_library(_lib_path()) +sess = ort.InferenceSession(dump_model, so) +input_name = sess.get_inputs()[0].name +results_outputs = sess.run(None, {input_name: input_data}) +``` + +After that, evaluate the quantized model just as you would the float model. + + +5. ##### (Optional) Dumping the Simulation Results +Sometimes, after deploying the quantized model, it is essential to compare the simulation results on the CPU and GPU with the output values on the DPU. +You can use the dump_model API of vai_q_onnx to dump the simulation results with the quantized_model: + +```python +# This function dumps the simulation results of the quantized model, +# including weights and activation results. +vai_q_onnx.dump_model( + model, + dump_data_reader=None, + output_dir='./dump_results', + dump_float=False) +``` + +**Arguments** + +* **model**: (String) Specifies the file path of the quantized model whose simulation results are to be dumped. +* **dump_data_reader**: (Object or None) Data reader that is used for the dumping process. It generates inputs for the original model. +* **output_dir**: (String) Specifies the directory where the dumped simulation results are saved. After successful execution of the function, dump results are generated in this specified directory. The default value is './dump_results'. +* **dump_float**: (Boolean) Determines whether to dump the floating-point value of weights and activation results. If set to True, the float values are dumped. The default value is False. + +**Note**: The batch_size of the dump_data_reader should be set to 1 for DPU debugging. + +After successfully executing the command, the dump results are generated in the output_dir. +Each quantized node's weights and activation results are saved separately in *.bin and *.txt formats.
In cases where the node output is not quantized, such as the softmax node, the float activation results are saved in *_float.bin and *_float.txt formats if the option "dump_float" is set to True. +The following table shows an example of the dump results. + +Table 2. Example of Dumping Results + +| Quantized | Node Name | Saved Weights and Activations| +| ------ | ------ | ----- | +| Yes | resnet_v1_50_conv1 | {output_dir}/dump_results/quant_resnet_v1_50_conv1.bin
{output_dir}/dump_results/quant_resnet_v1_50_conv1.txt| +| Yes | resnet_v1_50_conv1_weights | {output_dir}/dump_results/quant_resnet_v1_50_conv1_weights.bin
{output_dir}/dump_results/quant_resnet_v1_50_conv1_weights.txt | +| No | resnet_v1_50_softmax | {output_dir}/dump_results/quant_resnet_v1_50_softmax_float.bin
{output_dir}/dump_results/quant_resnet_v1_50_softmax_float.txt | + + +### 4.2.3.5 List of Vai_q_onnx Supported Quantized Ops + +The following table lists the supported operations and APIs for vai_q_onnx. + +Table 3. List of Vai_q_onnx Supported Quantized Ops +| supported ops | Comments | +| :-- | :-- | +| Add| | +| Conv| | +| ConvTranspose| | +| Gemm| | +| Concat| | +| Relu| | +| Reshape| | +| Transpose| | +| Resize| | +| MaxPool| | +| GlobalAveragePool| | +| AveragePool| | +| MatMul| | +| Mul| | +| Sigmoid| | +| Softmax| | + + +### 4.2.3.6 vai_q_onnx APIs + +quantize_static Method + +```python +vai_q_onnx.quantize_static( + model_input, + model_output, + calibration_data_reader, + quant_format=vai_q_onnx.VitisQuantFormat.FixNeuron, + calibrate_method=vai_q_onnx.PowerOfTwoMethod.MinMSE, + input_nodes=[], + output_nodes=[], + op_types_to_quantize=None, + per_channel=False, + reduce_range=False, + activation_type=QuantType.QInt8, + weight_type=QuantType.QInt8, + nodes_to_quantize=None, + nodes_to_exclude=None, + optimize_model=True, + use_external_data_format=False, + extra_options=None,) +``` + + +**Arguments** + + +* **model_input**: (String) Specifies the file path of the model that is to be quantized. +* **model_output**: (String) Specifies the file path where the quantized model will be saved. +* **calibration_data_reader**: (Object or None) Calibration data reader that enumerates the calibration data and generates inputs for the original model. If you want to use random data for a quick test, you can set calibration_data_reader to None. +* **quant_format**: (Enum) Defines the quantization format for the model. It has the following options: +
**QOperator** Quantizes the model directly using quantized operators. +
**QDQ** Quantizes the model by inserting QuantizeLinear/DeQuantizeLinear into the tensor. It supports 8-bit quantization only. +
**VitisQuantFormat.QDQ** Quantizes the model by inserting VAIQuantizeLinear/VAIDeQuantizeLinear into the tensor. It supports a wider range of bit-widths and configurations. +
**VitisQuantFormat.FixNeuron** Quantizes the model by inserting FixNeuron (a combination of QuantizeLinear and DeQuantizeLinear) into the tensor. This is the default value. +* **calibrate_method**: (Enum) Used to set the power-of-2 scale quantization method for DPU devices. It currently supports two methods: 'vai_q_onnx.PowerOfTwoMethod.NonOverflow' and 'vai_q_onnx.PowerOfTwoMethod.MinMSE'. The default value is 'vai_q_onnx.PowerOfTwoMethod.MinMSE'. +* **input_nodes**: (List of Strings) Names of the starting nodes to be quantized. Nodes in the model before these nodes will not be quantized. For example, this argument can be used to skip some pre-processing nodes or stop the first node from being quantized. The default value is an empty list ([]). +* **output_nodes**: (List of Strings) Names of the end nodes to be quantized. Nodes in the model after these nodes are not quantized. For example, this argument can be used to skip some post-processing nodes or stop the last node from being quantized. The default value is an empty list ([]). +* **op_types_to_quantize**: (List of Strings or None) If specified, only operators of the given types are quantized (for example, ['Conv'] to only quantize Convolutional layers). By default, all supported operators are quantized. +* **per_channel**: (Boolean) Determines whether weights should be quantized per channel. For DPU devices, this must be set to False as they currently do not support per-channel quantization. +* **reduce_range**: (Boolean) If True, quantizes weights with 7 bits. For DPU devices, this must be set to False as they currently do not support reduced range quantization. +* **activation_type**: (QuantType) Specifies the quantization data type for activations. For DPU devices, this must be set to QuantType.QInt8. For more details on data type selection, refer to the ONNX Runtime quantization documentation. +* **weight_type**: (QuantType) Specifies the quantization data type for weights. 
For DPU devices, this must be set to QuantType.QInt8. +* **nodes_to_quantize**: (List of Strings or None) If specified, only the nodes in this list are quantized. The list should contain the names of the nodes, for example, ['Conv__224', 'Conv__252']. +* **nodes_to_exclude**: (List of Strings or None) If specified, the nodes in this list are excluded from quantization. +* **optimize_model**: (Boolean) If True, optimizes the model before quantization. However, this is not recommended as optimization changes the computation graph, making the debugging of quantization loss difficult. +* **use_external_data_format**: (Boolean) Used for large (>2 GB) models. The default is False. +* **extra_options**: (Dictionary or None) Contains key-value pairs for various options in different cases. Currently used options:
+ **extra.Sigmoid.nnapi = True/False** (Default is False) +
**ActivationSymmetric = True/False**: If True, calibration data for activations is symmetrized. The default is False. When using PowerOfTwoMethod for calibration, this should always be set to True. +
**WeightSymmetric = True/False**: If True, calibration data for weights is symmetrized. The default is True. When using PowerOfTwoMethod for calibration, this should always be set to True. +
**EnableSubgraph = True/False**: If True, the subgraph is quantized. The default is False. +
**ForceQuantizeNoInputCheck = True/False**: + If True, latent operators such as maxpool and transpose always quantize their inputs, generating quantized outputs even if their inputs have not been quantized. The default behavior can be overridden for specific nodes using nodes_to_exclude. +
**MatMulConstBOnly = True/False**: + If True, only MatMul operations with a constant 'B' are quantized. The default is False for static mode. +
**AddQDQPairToWeight = True/False**: + If True, both QuantizeLinear and DeQuantizeLinear nodes are inserted for weight, maintaining its floating-point format. The default is False, which quantizes floating-point weight and feeds it solely to an inserted DeQuantizeLinear node. In the PowerOfTwoMethod calibration method, QDQ should always appear as a pair, hence this should be set to True. +
**OpTypesToExcludeOutputQuantization = list of op type**: + If specified, the output of operators with these types is not quantized. The default is an empty list. +
**DedicatedQDQPair = True/False**: If True, an identical and dedicated QDQ pair is created for each node. The default is False, allowing multiple nodes to share a single QDQ pair as their inputs. +
**QDQOpTypePerChannelSupportToAxis = dictionary**: + Sets the channel axis for specific operator types (e.g., {'MatMul': 1}). This is only effective when per-channel quantization is supported and per_channel is True. If a specific operator type supports per-channel quantization but no channel axis is explicitly specified, the default channel axis is used. For DPU devices, this must be set to {} as per-channel quantization is currently unsupported. +
**CalibTensorRangeSymmetric = True/False**: + If True, the final range of the tensor during calibration is symmetrically set around the central point "0". The default is False. In PowerOfTwoMethod calibration method, this should always be set to True. +
**CalibMovingAverage = True/False**: + If True, the moving average of the minimum and maximum values is computed when the calibration method selected is MinMax. The default is False. In PowerOfTwoMethod calibration method, this should be set to False. +
**CalibMovingAverageConstant = float**: + Specifies the constant smoothing factor to use when computing the moving average of the minimum and maximum values. The default is 0.01. This is only effective when the calibration method selected is MinMax and CalibMovingAverage is set to True. In PowerOfTwoMethod calibration method, this option is unsupported. + [< Previous](/docs/4_deploy_your_own_model/prune_model/prunemodel.md) | [Next >](/docs/4_deploy_your_own_model/deploy_model/deployingmodel.md) diff --git a/docs/4_deploy_your_own_model/serve_model/servingmodelwithinferenceserver.md b/docs/4_deploy_your_own_model/serve_model/servingmodelwithinferenceserver.md index 1bf57bf..3073282 100644 --- a/docs/4_deploy_your_own_model/serve_model/servingmodelwithinferenceserver.md +++ b/docs/4_deploy_your_own_model/serve_model/servingmodelwithinferenceserver.md @@ -1,6 +1,6 @@

Unified Inference Frontend (UIF) 1.1 User Guide

+

Unified Inference Frontend (UIF) 1.2 User Guide

- @@ -16,7 +16,9 @@ For testing, you can use the development image and move to the deployment image In all cases, you must configure your host machines for the appropriate hardware backends as described in the [UIF installation instructions](/docs/1_installation/installation.md), such as installing the ROCm™ platform for GPUs and XRT for FPGAs. There are several methods you can use to serve your models with different benefits and tradeoffs, which are discussed here. -This UIF release uses AMD Inference Server 0.3.0. The full documentation for the server for this release is available [online](https://xilinx.github.io/inference-server/0.3.0/index.html). + +This UIF release uses AMD Inference Server 0.4.0. The full documentation for the server for this release is available [online](https://xilinx.github.io/inference-server/0.4.0/index.html). + The latest version of the server and documentation are available on [GitHub](https://github.com/Xilinx/inference-server). # Table of Contents @@ -32,7 +34,7 @@ The latest version of the server and documentation are available on [GitHub](htt Using the development image to serve your model allows you to have greater control and visibility into its operation, which can be useful for debugging and analyzing. You must build the development image on a Linux host machine before using it. -The process to build and run the development image is documented in the Inference Server's [development quick start guide](https://xilinx.github.io/inference-server/0.3.0/quickstart_development.html). +The process to build and run the development image is documented in the Inference Server's [development quick start guide](https://xilinx.github.io/inference-server/main/quickstart_development.html). The development container mounts your working directory inside so you can place the model you want to serve somewhere in that tree. In the container, you can compile the server in debug or release configuration and start it. 
@@ -43,17 +45,33 @@ At load-time, you can pass worker-specific arguments to configure how it behaves In particular, you pass the path to your model to the worker at load-time. After the load succeeds, the server will respond with an endpoint string that you will need for subsequent requests. +## 4.4.1.1 Naming Format for *.mxr* Files + +In UIF 1.2, the MIGraphX worker requires a naming format for *.mxr files that differs from the names used for Modelzoo in Section 2. The required format is +*\_bXX.mxr* where XX is the compiled model's batch size. For example, *resnet50_b32.mxr*. If you use compiled *.mxr models that come from any source other than what the MIGraphX worker itself compiles, you must rename them to match this format. For example, + +``` + $ cp resnet50.mxr resnet50_b32.mxr +``` + +When requesting a worker, the _bXX suffix must be left out of the requested model name, so for this example the request would contain parameters *batch=32* and *model="resnet50.mxr"* or simply *model="resnet50"*. + +## 4.4.1.2 Sending Server Requests You can send requests to the server from inside or outside the container. -From inside the container, the default address for HTTP and gRPC will be `http://127.0.0.1:8998` and `127.0.0.1:50051`, but this can be changed when starting the server. -From the outside, the easiest approach is to use `docker ps` to list the ports the development container has exposed. -From the same host machine, you can use `http://127.0.0.1:` corresponding to the port listed that maps to 8998 in the container for HTTP requests. -When you have the address and the endpoint returned by the load, you can [make requests](#443-making-requests-with-http-or-grpc) to the server. + +From inside the container, the default address for HTTP and gRPC is `http://127.0.0.1:8998` and `127.0.0.1:50051`. However, users can change the default address when starting the server. 
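
Because the on-disk file name and the requested model name differ (Section 4.4.1.1), it can help to derive one from the other in code before sending a request. The following is a minimal sketch, not part of the Inference Server or MIGraphX: the helper names `mxr_filename` and `request_parameters` are hypothetical, and the sketch only assumes the `_bXX.mxr` suffix convention and the `model`/`batch` request parameters described above.

```python
import re

# Hypothetical helpers illustrating the _bXX.mxr naming convention;
# they are not provided by amdinfer or MIGraphX.
_MXR_PATTERN = re.compile(r"^(?P<name>.+)_b(?P<batch>\d+)\.mxr$")


def mxr_filename(model: str, batch: int) -> str:
    """Return the file name the MIGraphX worker expects on disk."""
    stem = model[:-4] if model.endswith(".mxr") else model
    return f"{stem}_b{batch}.mxr"


def request_parameters(filename: str) -> dict:
    """Split a compiled file name into 'model' and 'batch' request parameters."""
    match = _MXR_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"{filename!r} does not follow the _bXX.mxr format")
    # The _bXX suffix is left out of the requested model name.
    return {"model": match.group("name") + ".mxr", "batch": int(match.group("batch"))}
```

For the example above, `mxr_filename("resnet50", 32)` gives the file name to copy the compiled model to, and `request_parameters("resnet50_b32.mxr")` gives the values to pass when requesting the worker.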
+ +- From the outside, the easiest approach is to use `docker ps` to list the ports the development container has exposed. +- From the same host machine, you can use `http://127.0.0.1:` corresponding to the port listed that maps to 8998 in the container for HTTP requests. + + When you have the address and the endpoint returned by the load, you can [make requests](#443-making-requests-with-http-or-grpc) to the server. # 4.4.2: Using the Deployment Image -The deployment image is a minimal image that contains a precompiled server executable that starts automatically when the container starts. This image is suitable for deployment with Docker, Kubernetes, or [KServe](https://xilinx.github.io/inference-server/0.3.0/kserve.html). With the latter two methods, you need to install and [set up a Kubernetes cluster](https://kubernetes.io/docs/setup/). +The deployment image is a minimal image that contains a precompiled server executable that starts automatically when the container starts. This image is suitable for deployment with Docker, Kubernetes, or [KServe](https://xilinx.github.io/inference-server/0.4.0/kserve.html). With the latter two methods, you need to install and [set up a Kubernetes cluster](https://kubernetes.io/docs/setup/). + +The process to build and run the deployment image is documented in the Inference Server's [deployment guide](https://xilinx.github.io/inference-server/0.4.0/deployment.html) and the [KServe deployment guide](https://xilinx.github.io/inference-server/0.4.0/kserve.html). As with the development image, you will need to load a worker to serve your model and use the endpoint it returns to make requests. -The process to build and run the deployment image is documented in the Inference Server's [Docker deployment guide](https://xilinx.github.io/inference-server/0.3.0/docker.html) and the [KServe deployment guide](https://xilinx.github.io/inference-server/0.3.0/kserve.html). 
As with the development image, you will need to load a worker to serve your model and use the endpoint it returns to make requests. When the container is up, get the address for the server. With Docker, you can use `docker ps` to get the exposed ports. With KServe, there is a [separate process](https://kserve.github.io/website/master/get_started/first_isvc/#3-check-inferenceservice-status) to determine the ingress address. @@ -66,16 +84,23 @@ Making requests to the server is most easily accomplished with the Python librar While you can use `curl` to query the status endpoints, making more complicated inference requests using `curl` can be difficult. In the development container, the Python library is installed as part of the server compilation so you can use it from there. -To use the library elsewhere, you need to [install it](https://xilinx.github.io/inference-server/0.3.0/python.html#install-the-python-library) yourself. -More detailed information and discussion around making requests using Python is in the [Python examples with ResNet50](https://xilinx.github.io/inference-server/0.3.0/example_resnet50_python.html) and the corresponding working [Python scripts in the repository](https://github.com/Xilinx/inference-server/tree/main/examples/resnet50). +To use the library elsewhere, you need to install it with pip: + +``` + $ pip install amdinfer +``` + +More detailed information and discussion around making requests using Python is in the [Python examples with ResNet50](https://xilinx.github.io/inference-server/0.4.0/example_resnet50_python.html) and the corresponding working [Python scripts in the repository](https://github.com/Xilinx/inference-server/tree/main/examples/resnet50). + An outline of the steps is provided here, where it is assumed you have started the server and loaded some models that you want to use for inference. In general, the process to make a request has the following steps: 1. Make a client. -2. Prepare a request. -3. Send the request. -4. 
Check the response. +2. Request a worker from the server. +3. Prepare a request. +4. Send the request. +5. Check the response. You can make an `HttpClient` or a `GrpcClient` depending on which protocol you want to use. As part of the constructor, you provide the address for the server that the client is supposed to use. @@ -144,4 +169,3 @@ UIF is licensed under [Apache License Version 2.0](/LICENSE). Refer to the [LICE Contact uif_support@amd.com for questions, issues, and feedback on UIF. Submit your questions, feature requests, and bug reports on the [GitHub issues](https://github.com/amd/UIF/issues) page. - diff --git a/docs/5_debugging_and_profiling/debugging_and_profiling.md b/docs/5_debugging_and_profiling/debugging_and_profiling.md index 32edd61..4e45d87 100644 --- a/docs/5_debugging_and_profiling/debugging_and_profiling.md +++ b/docs/5_debugging_and_profiling/debugging_and_profiling.md @@ -1,6 +1,6 @@

Unified Inference Frontend (UIF) 1.1 User Guide

+

Unified Inference Frontend (UIF) 1.2 User Guide

- @@ -32,7 +32,7 @@ ROCGDB can do four main kinds of things to help you catch bugs in the act: - Examine what has happened, when your program has stopped. - Change things in your program, so you can experiment with correcting the effects of one bug and go on to learn about another. -For more information about ROCDebugger, refer to the [ROCDebugger User Guide](https://docs.amd.com/category/compilers_and_tools). +For more information about ROCDebugger, refer to the [ROCDebugger User Guide](https://rocm.docs.amd.com/projects/ROCgdb/en/latest/ROCgdb/gdb/doc/gdb/index.html). ## 5.1.2: ROCProfiler @@ -43,13 +43,13 @@ The following user requirements can be fulfilled using rocprof: ### 5.1.2.1: Counters and Metric Collection To collect counters and metrics such as number of VMEM read/write instructions issued, number of SALU instructions issued, and other details, use rocprof with profiling options. -For more details, refer to the chapter on Counter and Metric Collection in the [ROCm Profiling Tools User Guide](https://docs.amd.com/category/compilers_and_tools). +For more details, refer to the chapter on Counter and Metric Collection in the [ROCm Profiling Tools User Guide](https://rocm.docs.amd.com/projects/rocprofiler/en/latest/profiler_home_page.html). ### 5.1.2.2: Application Tracing To retrieve kernel-level traces such as workgroup size, HIP/HSA calls, and so on, use rocprof with tracing options such as hsa-trace, hip-trace, sys-trace, and roctx-trace. -To demonstrate the usage of rocprof with various options, the [ROCm Profiling Tools User Guide](https://docs.amd.com/category/compilers_and_tools) refers to the MatrixTranspose application as an example. +To demonstrate the usage of rocprof with various options, the [ROCm Profiling Tools User Guide](https://rocm.docs.amd.com/projects/rocprofiler/en/latest/profiler_home_page.html) refers to the MatrixTranspose application as an example. 
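
As a rough illustration of the tracing options named above, the sketch below builds a rocprof command line without running it. It assumes that each option is passed as a double-dash flag of the same name (for example `--hip-trace`) and that the application binary is called `./MatrixTranspose`; both are assumptions to check against `rocprof --help` and the user guide for the installed ROCm release. The helper `rocprof_command` is hypothetical.

```python
import shlex

# Tracing options named in the text; assumed to map to flags of the same name.
TRACE_OPTIONS = ("hsa-trace", "hip-trace", "sys-trace", "roctx-trace")


def rocprof_command(app, trace="hip-trace", app_args=()):
    """Build (but do not run) a rocprof command line for one tracing option."""
    if trace not in TRACE_OPTIONS:
        raise ValueError(f"unknown tracing option: {trace}")
    return ["rocprof", f"--{trace}", app, *app_args]


if __name__ == "__main__":
    # Hypothetical binary path; print the command so it can be reviewed first.
    print(shlex.join(rocprof_command("./MatrixTranspose")))
```

Returning the command as a list keeps it ready for `subprocess.run` once the flags have been confirmed for the target system.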
# 5.2: Debug on CPU @@ -87,7 +87,7 @@ System-level profiling for model execution can be done with AMD μProf from AMD # 5.3: Debug on FPGA -This section describes the utility tools available in UIF 1.1 for DPU execution debugging, performance profiling, DPU runtime mode manipulation, and DPU configuration file generation. With these tools, you can conduct DPU debugging and performance profiling independently. +This section describes the utility tools available in UIF 1.2 for DPU execution debugging, performance profiling, DPU runtime mode manipulation, and DPU configuration file generation. With these tools, you can conduct DPU debugging and performance profiling independently. ## 5.3.1: Profiling the Model diff --git a/docs/6_deployment_guide/PyTorch.md b/docs/6_deployment_guide/PyTorch.md index 179b684..2e8ee67 100644 --- a/docs/6_deployment_guide/PyTorch.md +++ b/docs/6_deployment_guide/PyTorch.md @@ -19,7 +19,7 @@ Unified Inference Frontend (UIF) accelerates deep learning inference solutions o ``` - docker pull amdih/uif-pytorch:uif1.1_rocm5.4.1_vai3.0_py3.7_pytorch1.12 + docker pull amdih/uif-pytorch:uif1.2_rocm5.6.1_vai3.5_py3.8_pytorch1.13 ``` @@ -27,8 +27,8 @@ Unified Inference Frontend (UIF) accelerates deep learning inference solutions o The UIF Pytorch Tools Docker container includes: -* ROCm™ 5.4.1 -* ROCm™ Pytorch 1.12 +* ROCm™ 5.6.1 +* ROCm™ Pytorch 1.13 * ROCm™ MIGraphX inference engine * Scripts for downloading pretrained models from the Model Zoo * Sample Application @@ -47,7 +47,7 @@ Use the following instructions to launch the Docker container for the applicatio docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device=/dev/kfd --device=/dev/dri --group-add video --ipc=host --shm-size 8G - amdih/uif-pytorch:uif1.1_rocm5.4.1_vai3.0_py3.7_pytorch1.12 + amdih/uif-pytorch:uif1.2_rocm5.6.1_vai3.5_py3.8_pytorch1.13 ``` diff --git a/docs/6_deployment_guide/Tensorflow.md b/docs/6_deployment_guide/Tensorflow.md index 66647bd..404d83e 
100644 --- a/docs/6_deployment_guide/Tensorflow.md +++ b/docs/6_deployment_guide/Tensorflow.md @@ -21,7 +21,7 @@ For more information about UIF including use of this container, refer to https:/ ``` - docker pull amdih/uif-tensorflow:uif1.1_rocm5.4.1_vai3.0_tf2.10 + docker pull amdih/uif-tensorflow:uif1.2_rocm5.6.1_vai3.5_tensorflow2.12 ``` @@ -29,8 +29,8 @@ For more information about UIF including use of this container, refer to https:/ The UIF Tensorflow Tools Docker container includes: -* ROCm™ 5.4.1 -* ROCm™ Tensorflow 2.10 +* ROCm™ 5.6.1 +* ROCm™ Tensorflow 2.12 * ROCm™ MIGraphX inference engine * Scripts for downloading pretrained models from the Model Zoo * Sample Application @@ -52,7 +52,7 @@ Use the following instructions to launch the Docker container for the applicatio --group-add video --ipc=host --shm-size 8G - amdih/uif-tensorflow:uif1.1_rocm5.4.1_vai3.0_tf2.10 + amdih/uif-tensorflow:uif1.2_rocm5.6.1_vai3.5_tensorflow2.12 ```

Unified Inference Frontend (UIF) 1.1 User Guide

+

Unified Inference Frontend (UIF) 1.2 User Guide