An example of an e2e MLOps pipeline

This repository walks through an example of what an end-to-end MLOps pipeline could look like. It uses all open source tools:

- Pachyderm to manage, version and transform data
- Determined to train a model and manage model versions
- Seldon to deploy models and request predictions

In practice, this walkthrough uses Pachyderm Enterprise and Seldon Deploy, both because they introduce some additional complexity worth covering and because these are the products normally found in production.

The overall integration will rely on the following Google Cloud infrastructure:

- All Pachyderm, Determined and Seldon components will be deployed on a GKE cluster
- Pachyderm will use a bucket to store the repositories
- Determined will use a bucket to store the models' checkpoints
- Seldon will use a bucket to store data for the model drift and outlier detectors
- Google Container Registry will be used to store the container images for the two Pachyderm pipelines and the Seldon serving image
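
The storage layout above can be summarized in a small sketch. This is only an illustration of which component owns which bucket, not part of the repository: the `Component` class, the `storage_plan` helper and every `gs://example-...` bucket name are hypothetical placeholders, so substitute the names used in your own project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One tool in the pipeline and the bucket it relies on."""
    name: str
    role: str
    bucket_purpose: str
    bucket: str  # hypothetical bucket name, replace with your own


# All three components run on the same GKE cluster; each has its own bucket.
COMPONENTS = [
    Component("Pachyderm", "manage, version and transform data",
              "repositories", "gs://example-pachyderm-repos"),
    Component("Determined", "train a model and manage model versions",
              "model checkpoints", "gs://example-determined-checkpoints"),
    Component("Seldon", "deploy models and request predictions",
              "drift and outlier detector data", "gs://example-seldon-detectors"),
]


def storage_plan(components):
    """Return a mapping from component name to its bucket."""
    return {c.name: c.bucket for c in components}


if __name__ == "__main__":
    for c in COMPONENTS:
        print(f"{c.name}: {c.bucket_purpose} -> {c.bucket}")
```

Keeping one bucket per component, as the list above describes, means each tool's data can be inspected or cleaned up without touching the others.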

To keep the explanation simple, the integration is broken down into a series of steps:

- General architecture
- Software prerequisites
- Environment setup
- Building the containers
- Examining the pipelines
- Running use cases:
  - Image classification
  - Market sentiment
