KServe

KServe provides a Kubernetes Custom Resource Definition for serving machine learning (ML) models on arbitrary frameworks. It aims to solve production model serving use cases by providing performant, high-abstraction interfaces for common ML frameworks like TensorFlow, XGBoost, scikit-learn, PyTorch, and ONNX.

It encapsulates the complexity of autoscaling, networking, health checking, and server configuration to bring cutting-edge serving features like GPU autoscaling, scale to zero, and canary rollouts to your ML deployments. It enables a simple, pluggable, and complete story for production ML serving, including prediction, pre-processing, post-processing, and explainability. KServe is being used across various organizations.

For more details, visit the KServe website.

As of release 0.7, KFServing has been rebranded as KServe. The previous KFServing 0.5.x and 0.6.x releases are still supported; please refer to the corresponding release branch for their docs.

Learn More

To learn more about KServe, how to deploy it as part of Kubeflow, how to use the various supported features, and how to participate in the KServe community, please follow the KServe website documentation. Additionally, we have compiled a list of presentations and demos that dive into the details.

Installation

Standalone Installation

By default, KServe installs Knative for serverless deployment; please follow the Serverless installation guide to install KServe. If you want to install KServe without Knative (this feature is still alpha), please follow the Raw Kubernetes Deployment installation guide.

Quick Install

Please follow the quick install guide to install KServe on your local machine.

Create a test InferenceService

Please follow the getting started guide to create your first InferenceService.
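
To give a rough idea of what an InferenceService looks like before you open the guide, here is a minimal sketch in Python that builds a manifest as a plain dictionary and prints it as JSON. It is an illustration under assumptions, not the official example: the helper `build_inference_service`, the service name `sklearn-iris`, and the storage URI `gs://example-bucket/models/sklearn/iris` are hypothetical placeholders, and the `serving.kserve.io/v1beta1` layout with a `predictor` keyed by framework name is assumed from the v1beta1 API. Check the API docs for the fields your release supports.

```python
import json


def build_inference_service(name, storage_uri, framework="sklearn", namespace="default"):
    """Return a minimal InferenceService manifest as a plain dict.

    Hypothetical helper for illustration only; the field layout is an
    assumption based on the v1beta1 API and should be checked against
    the API reference for your KServe release.
    """
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "predictor": {
                # The predictor is keyed by framework name, e.g. "sklearn".
                framework: {"storageUri": storage_uri},
            }
        },
    }


if __name__ == "__main__":
    # Placeholder name and bucket; substitute your own model location.
    manifest = build_inference_service(
        "sklearn-iris", "gs://example-bucket/models/sklearn/iris"
    )
    print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON manifests as well as YAML, the printed output can be saved to a file and applied with `kubectl apply -f <file>` on a cluster where KServe is installed.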

Roadmap

Roadmap

API Reference

InferenceService v1beta1 API Docs

Developer Guide

Developer Guide

Contributor Guide

Contributor Guide

Adopters

Adopters
