An example of an e2e MLOps pipeline

This repository walks through an example of what an end-to-end MLOps pipeline could look like. It is built on three open source tools:

  • Pachyderm to manage, version and transform data
  • Determined to train a model and manage model versions
  • Seldon to deploy models and request predictions

In practice, this walkthrough uses Pachyderm Enterprise and Seldon Deploy, both because they add some complexity that is worth covering and because they are the products normally found in production.

The overall integration will rely on the following Google Cloud infrastructure:

  • All Pachyderm, Determined and Seldon components will be deployed on a GKE cluster
  • Pachyderm will use a bucket to store the repositories
  • Determined will use a bucket to store the models' checkpoints
  • Seldon will use a bucket to store data for the model drift and outlier detectors
  • Google Container Registry will be used to store the container images for the two Pachyderm pipelines and the Seldon serving image
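
To make the layout above easier to picture, here is a minimal Python sketch that models it as a small data structure. It is an illustration only, not part of this repository: the class `InfraLayout`, the project ID, the cluster name, the image name and the `<project>-<component>` bucket naming scheme are all hypothetical. The only real conventions it assumes are the `gs://` prefix for Cloud Storage buckets and the `gcr.io/<project>/<image>` form for registry images.

```python
from dataclasses import dataclass

# One bucket per component, with the purpose described in the list above.
BUCKET_PURPOSES = {
    "pachyderm": "repositories",
    "determined": "model checkpoints",
    "seldon": "data for the model drift and outlier detectors",
}


@dataclass(frozen=True)
class InfraLayout:
    """Hypothetical description of the Google Cloud resources used here."""

    project: str  # hypothetical GCP project ID
    cluster: str  # hypothetical name of the GKE cluster hosting all components

    def bucket_uri(self, component: str) -> str:
        """Return a bucket URI for a component (naming scheme is an assumption)."""
        if component not in BUCKET_PURPOSES:
            raise ValueError(f"unknown component: {component}")
        return f"gs://{self.project}-{component}"

    def image_uri(self, image: str) -> str:
        """Return the registry path of a container image."""
        return f"gcr.io/{self.project}/{image}"


layout = InfraLayout(project="my-project", cluster="mlops-cluster")
for name, purpose in BUCKET_PURPOSES.items():
    print(f"{layout.bucket_uri(name)}: {purpose}")
print(layout.image_uri("seldon-serving"))
```

Keeping the names in one place like this is only a convenience for reading the steps that follow; the actual resource names depend on your own project.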

To keep the explanation simple, let's break the integration into a series of steps: