-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pipeline version to job/run integration test so that job/run is c… #3270
Conversation
…reated from pipeline version
/test kubeflow-pipeline-e2e-tes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small comment, otherwise looks good to me
ResourceReferences: []*job_model.APIResourceReference{ | ||
{Key: &job_model.APIResourceKey{Type: job_model.APIResourceTypeEXPERIMENT, ID: helloWorldExperiment.ID}, | ||
Relationship: job_model.APIRelationshipOWNER}, | ||
{Key: &job_model.APIResourceKey{Type: job_model.APIResourceTypePIPELINEVERSION, ID: helloWorldPipelineVersion.ID}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussion:
Shall we keep existing tests and add a new one testing using pipeline version?
So that we can keep test coverage for using pipelines in Job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, we are going to deprecate the usage of pipeline id to create run/job. And all runs and jobs must be from some version. The to-be-deprecated approach to creating from pipeline is actually using its default version as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.
I'm curious why we are deprecating that usage, I thought we are going to be backward-compatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a little bit of history. Once upon a time, we are trying to follow the practice of CMLE api. In that approach, a pipeline is allowed to contain no pipeline versions at all. Then we figure it might be more flexible and more extensible if we always go from versions. No matter we allow pipeline to have empty versions or not, creating from versions works. And moreover, we always have a default version, so from the UI, user open a pipeline, its details page actually shows its default version and when run is created from there, we naturally use default version to create run. And blahblah... and also, if we really really really just want pipeline, it should be specified in resource reference field instead of the pipeline spec field, since that resource reference field is meant and added to have a uniform way to specify dependencies between our objects/resources...
So in short, specifying pipeline in the pipeline spec is going to be deprecated for actually two reasons: (1) we want creation from pipeline versions (2) we want to use resource reference field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, so it will be supported to specify pipeline id from resource references? (is it already supported?)
So it's still a breaking change, but we have a migration path, is that right?
Do you have a tracking list for TODO items like updating SDK to not use the spec field?
e.g. here
pipelines/sdk/python/kfp/_client.py
Lines 333 to 338 in 119e329
spec = kfp_server_api.models.ApiPipelineSpec( | |
pipeline_id=pipeline_id, | |
workflow_manifest=pipeline_json_string, | |
parameters=api_params) | |
run_body = kfp_server_api.models.ApiRun( | |
pipeline_spec=spec, resource_references=resource_references, name=job_name) |
Also, the pipeline spec field also accepts json string of the entire pipeline manifest to run. I think that usage is required for cloning runs. Will we deprecate that?
Looks like you have migrated all UI usages of pipeline_spec with pipeline_id, SGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for the SDK, two cases:
Our own script can be fixed by us
User script will get warning that pipeline id in pipeline spec.
Another important reason why we prefer to refer to version instead of pipeline is that default version can change. When user specify a pipeline, we use the default version since all real manifests are conceptually linked to versions. Later the default version changes, and if GetRun only give back the pipeline id used to create the run, user wouldn't know which version was used at the creation time and they'll have to rely the manifest associated with the run at creation time. It is ok but messy.
So even when we allow the user still uses pipeline id in pipeline spec, internally we'll still record the run with a reference to the default version at the creation time. Hence implement the TODO at
// TODO(jingzhang36): after FE is enabled to use pipeline version to create |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, an opportunity for us to totally remove all legacies is to introduce breaking changes (that is, surface the internal change to the sdk client interface) at KFP 1.0......
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on these, just want to double check if you have put the following in your plan like:
- adding deprecation notification in sdk methods that we will no longer support (I believe we should do this as early as possible, it's not far away from KFP 1.0's final release.)
- confirm with the team that we will have a breaking change (I'm not sure if we all agreed in last design review, at least I didn't know we will make breaking change.)
So even when we allow the user still uses pipeline id in pipeline spec, internally we'll still record the run with a reference to the default version at the creation time. Hence implement the TODO at
For this part, it's totally reasonable, but it doesn't convince me we need to stop people from using pipeline id to create runs. Your mentioned recording pipeline version id at run creation time sounds like the right fix (and is not hacky at all).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things:
(1) the breaking change is optional, we can choose not to. That's my original plan (not to make the sdk client interface change). When we keep the SDK client interface the same, we only have a different implementation underhood (use resource reference instead of pipeline id in spec).
(2) the reason I talked about breaking change is IF we want it, we can leverage KF 1.0 as a proper time to do. Alternatively, as I suggested before, we put warning message without committing any change for a period of time. And the actual commit of the breaking change can wait until 2.0 or never....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussed offline: we will fix sdk to workaround the breaking change
/test kubeflow-pipeline-e2e-test |
/retest |
/test kubeflow-pipeline-e2e-test |
/test kubeflow-pipeline-sample-test |
/test kubeflow-pipeline-e2e-test |
/test kubeflow-pipeline-e2e-test |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
Name: pipelineVersionName, Relationship: job_model.APIRelationshipCREATOR}, | ||
} | ||
assert.Len(t, job.ResourceReferences, 2) | ||
assert.Subset(t, job.ResourceReferences, resourceReferences) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a little searching, is this a better choice: https://godoc.org/github.com/stretchr/testify/assert#Assertions.ElementsMatch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
/lgtm |
@@ -218,6 +218,7 @@ github.com/spf13/pflag v1.0.3 h1:zPAT6CGy6wXeQ7NtTnaTerfKOsV6V6F8agHXFiazDkg= | |||
github.com/spf13/pflag v1.0.3/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4= | |||
github.com/spf13/viper v1.3.0 h1:cO6QlTTeK9RQDhFAbGLV5e3fHXbRpin/Gi8qfL4rdLk= | |||
github.com/spf13/viper v1.3.0/go.mod h1:ZiWeW+zYFKm7srdB9IoDzzZXaJaI5eL9QjNiN/DMA2s= | |||
github.com/stretchr/objx v0.1.0 h1:4G4v2dO3VZwixGIRoQ5Lfboy6nUhCyYzaqnIAPPhYs4= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, why do you need this change? Now this PR requires root OWNERS to approve.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems getting in after using subset or elementsmatch. Not manually added. But it seems that I can merge, so I'll just leave it there.
kubeflow#3270) * Add pipeline version to job/run integration test so that job/run is created from pipeline version * Use set comparison to check array equation in test * address comments to use elements match
…reated from pipeline version
Related issue: #3247
This change is