Continuous Delivery using Kubeflow Pipelines #604
Comments
Thank you @woop for the detailed description of the scenario. This is a very interesting use case. One question: what are the artifacts that you are deploying to production? Are those models that are deployed for serving, or more complex pipelines? Are you thinking of deploying models across regions and across clusters? Regarding the functionality you've asked for:
Thanks!
Yes. Models, data, configuration, binaries, docker images. Many regions, many cluster groups, and many clusters.
Ultimately there will need to be polling if it's a pull-based system. Your suggestion is fine, but the downside is that many pipelines could be instantiated just to monitor dependencies. Ideally this polling would happen in the 5-10 second range, which could lead to a lot of "noise".
Yes I agree with the approach you mentioned. Our consideration is basically, do we have
Great, happy to hear that. Can't wait for these feature releases. I think you guys are on the right track.
Hi @woop, we indeed would like to support the features needed so that KFP can be used end-to-end for CD. Here are some details about how we plan to support the features that you requested (see our public design doc for more details: https://bit.ly/2WhNT3D):
* We plan to implement a metadata store to gather metadata about any data artifact that a workflow generates.
* We then plan to provide a data-driven workflow Kubernetes resource (an alternate orchestrator to Argo) that can trigger workflow execution based on a query to the metadata store (e.g. "execute a workflow each time there is a new model").
* Additionally, it will be possible to record events to the metadata store through the metadata store API. As long as there is an event source (webhook, pub/sub queue, etc.), a decoupled piece of infrastructure can be launched that gathers events from this source and adds them to the metadata store. The data-driven workflow Kubernetes resource can then be configured to trigger workflows based on these events.
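For illustration, here is a rough sketch (not the actual design) of how new model artifacts could be recorded and queried, assuming the metadata store is exposed through the ML Metadata (MLMD) Python client. The `Model` artifact type, the SQLite backing store, and both helper functions are hypothetical:

```python
# Sketch only: assumes the metadata store is backed by ML Metadata (MLMD)
# and that a "Model" artifact type is used for trained models.
from ml_metadata.metadata_store import metadata_store
from ml_metadata.proto import metadata_store_pb2

config = metadata_store_pb2.ConnectionConfig()
config.sqlite.filename_uri = "/tmp/kfp_metadata.db"  # placeholder backing store
store = metadata_store.MetadataStore(config)

# Register (or look up) an artifact type for trained models.
model_type = metadata_store_pb2.ArtifactType(
    name="Model",
    properties={"version": metadata_store_pb2.STRING},
)
model_type_id = store.put_artifact_type(model_type)

def record_model(uri: str, version: str) -> int:
    """Record a newly produced model so a data-driven workflow could react to it."""
    artifact = metadata_store_pb2.Artifact()
    artifact.type_id = model_type_id
    artifact.uri = uri
    artifact.properties["version"].string_value = version
    [artifact_id] = store.put_artifacts([artifact])
    return artifact_id

def new_models_since(last_seen_id: int):
    """Hypothetical trigger query: 'execute a workflow each time there is a new model'."""
    return [a for a in store.get_artifacts_by_type("Model") if a.id > last_seen_id]
```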
It should be possible to leverage the metadata store and the data-driven workflow K8s resource to implement something like this (roughly):
Once we have the metadata store, we could decouple:
It would be nice to know more about exactly what you would like to compare, what kind of view you would be looking for, and how many results would be compared simultaneously.
@vicaire would this use case be addressed by Event-Driven pipelines?
@kkasravi, deploying a model for serving could be triggered by an event or just be a step of your pipeline. I think that would depend on your use case.
@vicaire @paveldournov given the implementation of the metadata store, where does this fit in the priority list? Also, given the talk about keeping the pipelines decoupled from Argo to a large extent, would it make sense to use a neutral eventing solution like Knative, or are you planning to rely on Argo?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
* See kubeflow#595
* This PR creates a reconciler for the auto-deployed clusters.
* The reconciler compares the list of auto-deployed clusters against the master and release branches to determine whether we need a cluster based off a newer commit.
* If we do, we fire off a K8s job to deploy Kubeflow.
* This PR also includes some general utilities:
  * assertions.py: useful utilities for writing test assertions that compare lists and dictionaries.
  * gcp_util: common GCP functionality, such as an iterator to list Deployments.
  * git_repo_manager.py: a wrapper class to manage a local clone of a git repo.
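For reference, a minimal sketch of the reconcile step described above, assuming the caller supplies the branch HEAD and the commits of the existing auto-deployed clusters (which the real PR obtains via git_repo_manager.py and gcp_util). The Job submission uses the standard kubernetes Python client; the deployer image and its arguments are placeholders:

```python
# Minimal sketch of the reconcile step: if no auto-deployed cluster was built
# from the newest commit on the branch, launch a K8s Job to deploy Kubeflow.
from typing import Iterable

from kubernetes import client, config


def reconcile(branch: str, head: str, deployed_commits: Iterable[str],
              namespace: str = "auto-deploy") -> None:
    """`head` is the newest commit on `branch`; `deployed_commits` are the
    commits the existing auto-deployed clusters were built from."""
    if head in set(deployed_commits):
        return  # a cluster built from the newest commit already exists

    config.load_kube_config()

    # Fire off a K8s Job that runs the Kubeflow deployment for this commit.
    job = client.V1Job(
        metadata=client.V1ObjectMeta(name=f"deploy-kf-{head[:8]}"),
        spec=client.V1JobSpec(
            template=client.V1PodTemplateSpec(
                spec=client.V1PodSpec(
                    restart_policy="Never",
                    containers=[
                        client.V1Container(
                            name="deploy",
                            image="gcr.io/example/kf-deployer:latest",  # placeholder image
                            args=["--commit", head, "--branch", branch],
                        )
                    ],
                )
            )
        ),
    )
    client.BatchV1Api().create_namespaced_job(namespace=namespace, body=job)
```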
I'm trying to evaluate whether Kubeflow Pipelines makes sense as a replacement for a CD system. At GOJEK we feel strongly that there should be a dividing line between the production of artifacts (data, config, models, binaries, docker images) and the deployment of those artifacts.
The way I see Kubeflow Pipelines being used right now is end-to-end: pulling data, doing feature engineering, training a model, evaluating it, and possibly deploying it. So it's clear that it can be used for artifact production (in this case models), but would it be a good fit for actually doing deployments to large and varied infrastructures?
Imagine you have many different geographic regions (VPCs) and multiple environments (dev, staging, production). What we would ideally like to do is have a generic deployment pipeline that can deploy the correct combination of artifacts into the respective regions, where each region might behave differently. With many of the existing CD tools (Spinnaker, GoCD, Concourse), it's possible to monitor artifacts and trigger a pipeline run when an artifact changes. The combination of artifacts is a new "version". This pipeline then runs all the respective tests and validations for this new version and gives you an idea of how it will perform. You then have the option to deploy it into the respective region to serve real (or simulated) traffic.
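To make the idea concrete, here is a rough sketch of what a generic, region-parameterized deployment pipeline could look like in the KFP v1 DSL. The images, parameters, and URIs are placeholders, and this is not a claim that KFP currently provides the artifact-monitoring trigger described above:

```python
# Rough sketch only: a generic deployment pipeline parameterized by region and
# by the artifact versions that together form a release "version".
import kfp
from kfp import dsl


@dsl.pipeline(
    name="generic-deployment",
    description="Validate a combination of artifacts and deploy it to one region.",
)
def deploy_pipeline(region: str = "asia-southeast1",
                    environment: str = "staging",
                    model_uri: str = "gs://example-bucket/models/v1",
                    config_uri: str = "gs://example-bucket/config/v1"):
    # Run tests/validations against this combination of artifacts.
    validate = dsl.ContainerOp(
        name="validate-version",
        image="gcr.io/example/validator:latest",  # placeholder image
        arguments=["--model", model_uri, "--config", config_uri, "--env", environment],
    )
    # Deploy the validated combination into the target region.
    deploy = dsl.ContainerOp(
        name="deploy-to-region",
        image="gcr.io/example/deployer:latest",  # placeholder image
        arguments=["--model", model_uri, "--config", config_uri, "--region", region],
    )
    deploy.after(validate)


if __name__ == "__main__":
    kfp.compiler.Compiler().compile(deploy_pipeline, "deploy_pipeline.yaml")
```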
I see the ML-specific outputs that Kubeflow Pipelines provides as very valuable for this use case, but the functionality that we would need is:
Is this possible right now, or planned in the future?
@IronPan @Ark-kun