You will need a cluster with KServe installed (a quick way to verify this is shown after the list below). If you do not have a cluster, you can create one by following the guide here.
- Docker
- Poetry
- Python 3.10
- kubectl
- pre-commit
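To confirm that KServe is installed on your cluster, check that its controller is running. This assumes the default `kserve` namespace used by a standard install:

```bash
# On a default install, the KServe controller runs in the "kserve" namespace
kubectl get pods -n kserve
```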
This project uses `poetry` for package management. There is also a `requirements.txt` file for those who prefer to use `pip`.
```bash
poetry install
```
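If you prefer `pip`, the same dependencies can be installed from the exported `requirements.txt`:

```bash
pip install -r requirements.txt
```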
This repository contains everything needed to deploy a model using KServe. It implements:
- Continuous integration using GitHub Actions
- Example unit tests
- Example integration tests
- Pre-commit hooks for code quality
- Pre-commit hook that exports `requirements.txt` from `poetry`
- A dummy model wrapped in a KServe custom predictor class
- A notebook demonstrating how to deploy the model
To build the Docker image, use:

```bash
docker build -f custom_predictor/Dockerfile -t demokserve .
```
Then run the container with:

```bash
docker run -p 8080:8080 demokserve
```
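Before sending an inference request, you can sanity-check that the server is up. This assumes the predictor exposes the standard V2 inference protocol health endpoints:

```bash
# Liveness of the inference server (V2 inference protocol)
curl http://localhost:8080/v2/health/live

# Readiness of the model itself (model name taken from the example below)
curl http://localhost:8080/v2/models/kserve-demo-model/ready
```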
You can send a request to the model using the following code:
```python
import base64
import io
import uuid

import requests
from PIL import Image

from src.data_models import InferenceV2, InferenceV2Inputs


def bytes_to_json_serializable(bytes_data: bytes) -> str:
    """Base64-encode raw bytes so they can be carried in a JSON payload."""
    return base64.b64encode(bytes_data).decode("utf-8")


# Read the image and re-encode it as PNG bytes.
with io.BytesIO() as output:
    with Image.open("cat.jpg") as img:
        img.save(output, format="PNG")
        image_size = img.size
    image_bytes_data = bytes_to_json_serializable(output.getvalue())

# Build a V2 inference protocol request.
inputs = InferenceV2Inputs(
    name="input-0",
    shape=list(image_size),
    datatype="BYTES",
    data=[image_bytes_data],
)
inference_request_payload = InferenceV2(
    id=str(uuid.uuid4()),
    inputs=[inputs],
)

model_name = "kserve-demo-model"
url = f"http://localhost:8080/v2/models/{model_name}/infer"
# The Host header mimics the ingress routing used when the model is deployed on a cluster.
request_headers = {"Host": f"{model_name}.kserve.example.com"}
response = requests.post(url, data=inference_request_payload.json(), headers=request_headers)
```
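If the request succeeds, the body should follow the V2 response format, with the predictions in an `outputs` list. A minimal sketch of inspecting it (field names assume the standard V2 protocol):

```python
response.raise_for_status()  # fail loudly on a non-2xx status code
result = response.json()

# Each output in a V2 response carries name/shape/datatype/data fields.
for output in result["outputs"]:
    print(output["name"], output["shape"], output["datatype"])
```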