Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Unreleased

Added

  • Allow building Debian package (:commit:`930fab2`)
  • Add modelInferAsync to the API (:commit:`2f4a6c2`)
  • Add inferAsyncOrdered as a client operator for making inferences in parallel (:pr:`66`)
  • Support building Python wheels with cibuildwheel (:pr:`71`)
  • Support XModels with multiple output tensors (:pr:`74`)
  • Add FP16 support (:pr:`76`)
  • Add more documentation (:pr:`85`, :pr:`90`)
  • Add Python bindings for gRPC and Native clients (:pr:`88`)
  • Add tests with KServe (:pr:`90`)
  • Add batch size flag to examples (:pr:`94`)
  • Add Kubernetes test for KServe (:pr:`95`)
  • Use exhale to generate Python API documentation (:pr:`95`)
  • Add OpenAPI spec for REST protocol (:pr:`100`)
  • Use a timer for simpler time measurement (:pr:`104`)
  • Allow building containers with custom backend versions (:pr:`107`)

Changed

  • Refactor pre- and post-processing functions in C++ (:commit:`42cf748`)
  • Templatize Dockerfile for different base images (:pr:`71`)
  • Use multiple HTTP clients internally for parallel HTTP requests (:pr:`66`)
  • Update test asset downloading (:pr:`81`)
  • Reimplement and align examples across platforms (:pr:`85`)
  • Reorganize Python library (:pr:`88`)
  • Rename 'proteus' to 'amdinfer' (:pr:`91`)
  • Use Ubuntu 20.04 by default for Docker (:pr:`97`)
  • Bump up to ROCm 5.4.1 (:pr:`99`)
  • Rename some functions for style consistency (:pr:`102`)
  • Bump up to ZenDNN 4.0 (:pr:`113`)

Deprecated

  • ALL_CAPS style enums for the DataType (:pr:`102`)

Removed

  • Mappings between XIR data types and inference server data types from public API (:pr:`102`)
  • Web GUI (:pr:`110`)

Fixed

  • Use input tensors in requests correctly (:pr:`61`)
  • Fix bug with multiple input tensors (:pr:`74`)
  • Align gRPC responses using non-gRPC-native data types with other input protocols (:pr:`81`)
  • Fix the Manager's destructor (:pr:`88`)
  • Fix using --no-user-config with proteus run (:pr:`89`)
  • Handle assigning user permissions if the host UID is the same as the UID in the container (:pr:`101`)
  • Fix test discovery if some test assets are missing (:pr:`105`)
  • Fix gRPC queue shutdown race condition (:pr:`111`)

Added

Changed

  • Use Pybind11 to create Python API (:pr:`20`)
  • Two logs are created now: server and client
  • Logging macro is now PROTEUS_LOG_*
  • Loading workers is now case-insensitive (:commit:`14ed4ef` and :commit:`90a51ae`)
  • Build AKS from source (:commit:`e04890f`)
  • Use consistent custom exceptions (:issue:`30`)
  • Update Docker build commands to opt-in to all backends (:pr:`43`)
  • Rename 'modelLoad' to 'workerLoad' and change the behavior of 'modelLoad' (:pr:`27`)

Fixed

  • Get the right request size in the batcher when enqueuing with the C++ API (:commit:`d1ad81d`)
  • Construct responses correctly in the XModel worker if there are multiple input buffers (:commit:`d1ad81d`)
  • Populate the right number of offsets in the hard batcher (:commit:`6666142`)
  • Calculate offset values correctly during batching (:commit:`8c7534b`)
  • Get correct library dependencies for production container (:commit:`14ed4ef`)
  • Correctly throw an exception if a worker gets an error during initialization (:pr:`29`)
  • Detect errors in HTTP client during loading (:commit:`99ffc33`)
  • Construct batches with the right sizes (:pr:`57`)

Added

  • Core inference server functionality
  • Batching support
  • Support for running multiple workers simultaneously
  • Support for different batcher and buffer implementations
  • XModel support
  • Logging, metrics and tracing support
  • REST API based on KServe v2 API
  • C++ API
  • Python library for REST
  • Documentation, examples, and some tests
  • Experimental GUI