
Continuous Delivery using Kubeflow Pipelines #604

Closed · woop opened this issue Dec 31, 2018 · 8 comments

Assignees: Ark-kun
Labels: area/lifecycle, kind/misc, lifecycle/stale, priority/p2

Comments

woop (Member) commented Dec 31, 2018

I'm trying to evaluate whether Kubeflow Pipelines makes sense as a replacement for a CD system. At GOJEK we feel strongly that there should be a dividing line between the production of artifacts (data, config, models, binaries, docker images) and the deployment of those artifacts.

The way I see Kubeflow Pipelines being used right now is end-to-end: pulling data, doing feature engineering, training a model, evaluating it, and possibly deploying it. So it's clear that it can be used for artifact production (in this case models), but would it be a good fit for actually doing deployments to large and varied infrastructures?

Imagine you have many different geographic regions (VPCs) and multiple environments (dev, staging, production). What we would ideally like is a generic deployment pipeline that can deploy the correct combination of artifacts into the respective regions, where each region might behave differently. With many of the existing CD systems (Spinnaker, GoCD, Concourse), it's possible to monitor artifacts and trigger a pipeline run when an artifact changes. The combination of artifacts is a new "version". The pipeline then runs all the respective tests and validations for this new version and gives you an idea of how it will perform. You then have the option to deploy it into the respective region to serve real (or simulated) traffic.

I see the ML-specific outputs that Kubeflow Pipelines has as very valuable for this use case, but the functionality we would need is:

  • Sensing artifact changes: Can we sense configuration changes in git, docker images in a container registry, model pushes to a model database, or data changes in BQ, and then trigger a pipeline run?
  • Manual user actions: Can we have manual actions where a user can evaluate the outputs of a pipeline and then choose to promote that model? This would either add it to the set of models deployed to the respective environment, or make it the primary model.
  • Overview: Given hundreds of models being produced in many different regions and environments, is there a way to compare the different runs, or does it require the user to drill deep into specific pipelines to see outputs?

Is this possible right now, or planned in the future?

@IronPan @Ark-kun

paveldournov (Contributor) commented

Thank you @woop for the detailed description of the scenario. This is a very interesting use case.

One question - what are the artifacts that you are deploying to production? Are those models that are deployed for serving, or more complex pipelines? Are you thinking of deploying models across regions and across clusters?

Re. the functionality you've asked for:

  • Currently there is no support for automatically sensing artifact changes to trigger a pipeline, although this feature was requested earlier and is on the roadmap. One way to work around it today is to have a scheduled pipeline that triggers every X minutes, checks whether the artifact/table/etc. has changed, and uses a condition to either exit the pipeline or continue execution if a change is detected (see the sketch after this list). Once automatic triggering is supported, the first step would no longer be required, but the rest of the pipeline could be reused.

  • It sounds like you are looking for a way to manually validate and approve a model for production deployment? Kubeflow Pipelines does not offer such functionality end-to-end, although you can build the basic blocks for such a CD workflow: have one pipeline that produces the metrics and another pipeline that deploys the model, and then plug them into a system that allows users to approve the model and trigger the second pipeline. One question I'd ask, though, is whether there's a way to encode automatic evaluation of the model instead and have an algorithmic solution for validating the model's readiness for deployment.

  • You can compare runs manually if they are produced in the same cluster. For automatic analysis there is no built-in support; currently you would need to create a component that takes the model metrics from different runs and compares them. This functionality is pretty high on the priority list and is on the roadmap as well.
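
For illustration, here is a minimal sketch of that polling workaround using the KFP v1 DSL. The container images and the change-detection logic are hypothetical placeholders; the relevant pattern is the scheduled run plus a dsl.Condition that either exits or continues.

```python
# Minimal sketch of the polling workaround (not an official sample).
# The images and the "did anything change?" logic are hypothetical;
# the dsl.Condition / scheduled-run pattern is the point.
import kfp
from kfp import dsl


def check_for_change_op():
    # Hypothetical container that writes "true" or "false" to /tmp/changed.txt
    # depending on whether the watched artifact (BQ table, image tag, model,
    # git config, ...) has changed since the last run.
    return dsl.ContainerOp(
        name='check-artifact',
        image='gcr.io/my-project/artifact-checker:latest',  # placeholder image
        file_outputs={'changed': '/tmp/changed.txt'},
    )


def run_cd_steps_op():
    # Hypothetical container that runs the rest of the CD flow
    # (tests, validation, deployment) once a change is detected.
    return dsl.ContainerOp(
        name='run-cd-steps',
        image='gcr.io/my-project/cd-steps:latest',  # placeholder image
    )


@dsl.pipeline(
    name='poll-and-deploy',
    description='Scheduled pipeline that exits early if nothing changed.')
def poll_and_deploy():
    check = check_for_change_op()
    # Continue only when the checker reports a change; otherwise the run
    # ends here, which is the "exit the pipeline" branch of the workaround.
    with dsl.Condition(check.outputs['changed'] == 'true'):
        run_cd_steps_op()


if __name__ == '__main__':
    # Compile, then attach the package to a recurring run (every X minutes)
    # via the UI or kfp.Client().create_recurring_run(...).
    kfp.compiler.Compiler().compile(poll_and_deploy, 'poll_and_deploy.tar.gz')
```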

Thanks!

woop (Member, Author) commented Jan 4, 2019

> Thank you @woop for the detailed description of the scenario. This is a very interesting use case.
>
> One question - what are the artifacts that you are deploying to production? Are those models that are deployed for serving, or more complex pipelines? Are you thinking of deploying models across regions and across clusters?

Yes. Models, data, configuration, binaries, docker images. Many regions, many cluster groups, and many clusters.

> Re. the functionality you've asked for:

> • Currently there is no support for automatically sensing artifact changes to trigger a pipeline, although this feature was requested earlier and is on the roadmap. One way to work around it today is to have a scheduled pipeline that triggers every X minutes, checks whether the artifact/table/etc. has changed, and uses a condition to either exit the pipeline or continue execution if a change is detected. Once automatic triggering is supported, the first step would no longer be required, but the rest of the pipeline could be reused.

Ultimately there will need to be polling if it's a pull-based system. Your suggestion is fine, but the downside is that many pipelines would be instantiated just to monitor dependencies. Ideally this polling would happen in the 5-10 second range, which could lead to a lot of "noise".

> • It sounds like you are looking for a way to manually validate and approve a model for production deployment? Kubeflow Pipelines does not offer such functionality end-to-end, although you can build the basic blocks for such a CD workflow: have one pipeline that produces the metrics and another pipeline that deploys the model, and then plug them into a system that allows users to approve the model and trigger the second pipeline. One question I'd ask, though, is whether there's a way to encode automatic evaluation of the model instead and have an algorithmic solution for validating the model's readiness for deployment.

Yes, I agree with the approach you mentioned. Our consideration is basically whether we have

  • Kubeflow -> Artifact Store -> CD system

or whether we can have

  • Kubeflow -> Artifact Store -> Kubeflow

The latter would make things a lot easier, depending on what we give up.

> • You can compare runs manually if they are produced in the same cluster. For automatic analysis there is no built-in support; currently you would need to create a component that takes the model metrics from different runs and compares them. This functionality is pretty high on the priority list and is on the roadmap as well.

Great, happy to hear that. Can't wait for these feature releases. I think you guys are on the right track.

vicaire (Contributor) commented Feb 13, 2019

Hi @woop,

We would indeed like to support the features needed so that KFP can be used end-to-end for CD.

Here are some details about how we plan to support the features you requested (see our public design doc: https://bit.ly/2WhNT3D):

> • Sensing artifact changes: Can we sense configuration changes in git, docker images in a container registry, model pushes to a model database, or data changes in BQ, and then trigger a pipeline run?

We plan to implement a metadata store to gather metadata about any data artifact that a workflow generates.

We then plan to provide a data-driven workflow Kubernetes resource (an alternate orchestrator to Argo) that can trigger workflow execution based on a query to the metadata store (e.g., "execute a workflow each time there is a new model").

Additionally, it will be possible to record events to the metadata store through the metadata store API. As long as there is an event source (webhook, pub/sub queue, etc.), a decoupled piece of infrastructure can be launched that gathers events from this data source and adds them to the metadata store. Then, the data-driven workflow Kubernetes resource can be configured to trigger workflows based on these events.
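
As a rough illustration of this decoupled pattern (not the planned Kubernetes resource itself), an external watcher could poll the metadata store and launch a pipeline run whenever a new model shows up. The query_new_models helper below is a hypothetical stand-in for the metadata store query; only the kfp.Client calls come from the existing KFP SDK.

```python
# Hypothetical watcher: poll a metadata store for new models and trigger
# a KFP run for each one. Only kfp.Client is a real API here; the
# metadata query is a placeholder for the planned metadata store.
import time
import kfp


def query_new_models(since_timestamp):
    """Placeholder: return URIs of model artifacts registered after
    `since_timestamp` (would be a metadata store query)."""
    raise NotImplementedError


def watch_and_trigger(host, pipeline_package, experiment_name, poll_seconds=60):
    client = kfp.Client(host=host)
    experiment = client.create_experiment(experiment_name)
    last_check = time.time()
    while True:
        for model_uri in query_new_models(last_check):
            # One run per newly registered model; its URI is passed in
            # as a pipeline parameter.
            client.run_pipeline(
                experiment_id=experiment.id,
                job_name='deploy-%d' % int(time.time()),
                pipeline_package_path=pipeline_package,
                params={'model_uri': model_uri},
            )
        last_check = time.time()
        time.sleep(poll_seconds)
```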

> • Manual user actions: Can we have manual actions where a user can evaluate the outputs of a pipeline and then choose to promote that model? This would either add it to the set of models deployed to the respective environment, or make it the primary model.

It should be possible to leverage the metadata store and the data-driven workflow K8s resource to implement something like this (roughly; a sketch follows the list):

  1. Some pipeline creates a model and stores it in an object store (GCS, S3, etc.). Metadata about that model is stored in the metadata store (its location, etc.).
  2. The user queries information about the model from the metadata store. If the model is ready to be promoted, the user registers a "model promotion" event with the metadata store.
  3. A data-driven workflow instance listens for "model promotion" events and promotes the model to production.
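
A toy, self-contained sketch of those three steps follows; none of these helpers are real KFP or metadata-store APIs, only the shape of the flow comes from the steps above.

```python
# In-memory stand-ins for the planned metadata store and its events.
metadata_store = {}   # model_uri -> metadata dict
events = []           # recorded events, consumed by a listener

def record_model(model_uri, metrics):
    # Step 1: the training pipeline registers the model it produced.
    metadata_store[model_uri] = {'metrics': metrics, 'promoted': False}

def review_and_promote(model_uri, min_auc=0.9):
    # Step 2: a user (or review tool) inspects the metadata and, if the
    # model looks good, records a "model promotion" event.
    if metadata_store[model_uri]['metrics']['auc'] >= min_auc:
        events.append({'type': 'model_promotion', 'uri': model_uri})

def promotion_listener():
    # Step 3: a data-driven workflow / listener reacts to promotion events,
    # e.g. by launching the deployment pipeline for that model URI.
    for event in events:
        if event['type'] == 'model_promotion':
            metadata_store[event['uri']]['promoted'] = True
            print('deploying %s to production' % event['uri'])

record_model('gs://bucket/models/v42', metrics={'auc': 0.93})
review_and_promote('gs://bucket/models/v42')
promotion_listener()
```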
> • Overview: Given hundreds of models being produced in many different regions and environments, is there a way to compare the different runs, or does it require the user to drill deep into specific pipelines to see outputs?

Once we have the metadata store, we could decouple the following (a sketch follows the list):

  1. Querying for a set of results and getting references to them (URIs).
  2. Submitting these URIs to a viewer to compare the referenced data.
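
For example (purely a sketch: query_run_metric_uris is a hypothetical metadata store query, and the "viewer" here is just a pandas table):

```python
# Step 1: query for run results and get URIs; step 2: hand the URIs to a
# viewer. Here the viewer is simply a pandas DataFrame of scalar metrics.
import json
import pandas as pd


def query_run_metric_uris(model_name, region):
    """Placeholder: return {run_name: uri_of_metrics_json} from the
    metadata store for a given model and region."""
    raise NotImplementedError


def compare_runs(metric_uris):
    # Each URI is assumed to point at a small JSON file of scalar metrics,
    # e.g. {"auc": 0.93, "logloss": 0.21}.
    rows = {}
    for run_name, uri in metric_uris.items():
        with open(uri) as f:  # swap in a GCS/S3 reader as needed
            rows[run_name] = json.load(f)
    return pd.DataFrame.from_dict(rows, orient='index')

# Example: compare all runs of one model across regions/environments.
# print(compare_runs(query_run_metric_uris('fraud-model', 'asia-southeast1')))
```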

It would be nice to know more about exactly what you would like to compare, what kind of view you would be looking for, and how many results would be compared simultaneously.

kkasravi commented

@vicaire would this use case be addressed by Event-Driven pipelines?

vicaire (Contributor) commented Apr 9, 2019

kkasravi@, deploying a model for serving could be triggered by an event or just be a step of your pipeline. I think that would depend on your use case.

animeshsingh (Contributor) commented

@vicaire @paveldournov given the implementation of the metadata store, where does this fit in the priority list? Also, given the talk about keeping pipelines decoupled from Argo to a large extent, would it make sense to go with a neutral eventing solution like Knative, or are you planning to rely on Argo?

Ark-kun self-assigned this Oct 10, 2019
rmgogogo added the kind/misc label and removed the kind/proposal label Nov 18, 2019
stale bot commented Jun 25, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the lifecycle/stale label Jun 25, 2020
stale bot commented Jul 2, 2020

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

stale bot closed this as completed Jul 2, 2020
Linchin pushed a commit to Linchin/pipelines that referenced this issue Apr 11, 2023
* See kubeflow#595
* This PR creates a reconciler for the auto-deployed clusters.

  * The reconciler compares the list of auto-deployed clusters against
    the master and release branches to determine whether we need a cluster
    based off a newer commit.

  * If we do, we fire off a K8s job to deploy Kubeflow.

* This PR includes some general utilities:

  * assertions.py: some useful utilities for writing test assertions
    to compare lists and dictionaries

  * gcp_util: some common GCP functionality, like an iterator to list
    Deployments.

  * git_repo_manager.py: a wrapper class to manage a local clone
    of a git repo.
HumairAK pushed a commit to red-hat-data-services/data-science-pipelines that referenced this issue Mar 11, 2024