All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Allow building Debian package (:commit:`930fab2`)
- Add ``modelInferAsync`` to the API (:commit:`2f4a6c2`)
- Add ``inferAsyncOrdered`` as a client operator for making inferences in parallel (:pr:`66`)
- Support building Python wheels with cibuildwheel (:pr:`71`)
- Support XModels with multiple output tensors (:pr:`74`)
- Add FP16 support (:pr:`76`)
- Add more documentation (:pr:`85`, :pr:`90`)
- Add Python bindings for gRPC and Native clients (:pr:`88`)
- Add tests with KServe (:pr:`90`)
- Add batch size flag to examples (:pr:`94`)
- Add Kubernetes test for KServe (:pr:`95`)
- Use exhale to generate Python API documentation (:pr:`95`)
- OpenAPI spec for REST protocol (:pr:`100`)
- Use a timer for simpler time measurement (:pr:`104`)
- Allow building containers with custom backend versions (:pr:`107`)
- Refactor pre- and post-processing functions in C++ (:commit:`42cf748`)
- Templatize Dockerfile for different base images (:pr:`71`)
- Use multiple HTTP clients internally for parallel HTTP requests (:pr:`66`)
- Update test asset downloading (:pr:`81`)
- Reimplement and align examples across platforms (:pr:`85`)
- Reorganize Python library (:pr:`88`)
- Rename 'proteus' to 'amdinfer' (:pr:`91`)
- Use Ubuntu 20.04 by default for Docker (:pr:`97`)
- Bump up to ROCm 5.4.1 (:pr:`99`)
- Some function names changed for style (:pr:`102`)
- Bump up to ZenDNN 4.0 (:pr:`113`)
- ALL_CAPS style enums for the DataType (:pr:`102`)
- Mappings between XIR data types <-> inference server data types from public API (:pr:`102`)
- Web GUI (:pr:`110`)
- Use input tensors in requests correctly (:pr:`61`)
- Fix bug with multiple input tensors (:pr:`74`)
- Align gRPC responses using non-gRPC-native data types with other input protocols (:pr:`81`)
- Fix the Manager's destructor (:pr:`88`)
- Fix using ``--no-user-config`` with ``proteus run`` (:pr:`89`)
- Handle assigning user permissions if the host UID is the same as the UID in the container (:pr:`101`)
- Fix test discovery if some test assets are missing (:pr:`105`)
- Fix gRPC queue shutdown race condition (:pr:`111`)
- HTTP/REST C++ client (:commit:`cbf33b8`)
- gRPC API based on KServe v2 API (:commit:`37a6aad` and others)
- TensorFlow/PyTorch + ZenDNN backend (:pr:`17` and :pr:`21`)
- 'ServerMetadata' endpoint to the API (:commit:`7747911`)
- 'modelList' endpoint to the API (:commit:`7477b7d`)
- Parse JSON data as string in HTTP body (:commit:`694800e`)
- Directory monitoring for model loading (:commit:`6459797`)
- 'ModelMetadata' endpoint to the API (:commit:`22b9d1a`)
- MIGraphX backend (:pr:`34`)
- Pre-commit for style verification (:commit:`048bdd7`)
- Use Pybind11 to create Python API (:pr:`20`)
- Two logs are created now: server and client
- Logging macro is now ``PROTEUS_LOG_*``
- Loading workers is now case-insensitive (:commit:`14ed4ef` and :commit:`90a51ae`)
- Build AKS from source (:commit:`e04890f`)
- Use consistent custom exceptions (:issue:`30`)
- Update Docker build commands to opt-in to all backends (:pr:`43`)
- Renamed 'modelLoad' to 'workerLoad' and changed the behavior for 'modelLoad' (:pr:`27`)
- Get the right request size in the batcher when enqueuing with the C++ API (:commit:`d1ad81d`)
- Construct responses correctly in the XModel worker if there are multiple input buffers (:commit:`d1ad81d`)
- Populate the right number of offsets in the hard batcher (:commit:`6666142`)
- Calculate offset values correctly during batching (:commit:`8c7534b`)
- Get correct library dependencies for production container (:commit:`14ed4ef`)
- Correctly throw an exception if a worker gets an error during initialization (:pr:`29`)
- Detect errors in HTTP client during loading (:commit:`99ffc33`)
- Construct batches with the right sizes (:pr:`57`)
- Core inference server functionality
- Batching support
- Support for running multiple workers simultaneously
- Support for different batcher and buffer implementations
- XModel support
- Logging, metrics and tracing support
- REST API based on KServe v2 API
- C++ API
- Python library for REST
- Documentation, examples, and some tests
- Experimental GUI