docs: use cases: add CI/CD for ML #2404
Yes, maybe this needs to be added to #820 (or even replace #862?). Full disclosure: this PR has already had a few iterations with @shcheklein :)
Great stuff! Initial review (of about the first half)
Actual review...
Looks great, @casperdcl! Since this is so open-ended and subjective, I left mostly discussion points rather than requests for changes.
I'm still a little unclear on the scenario this use case addresses. It seems like the examples are focused on CI: regularly merging, training, and validating models. This seems like where CML provides most value.
However, right after the CML example scenario is a paragraph about production deployment, which seems disconnected from the examples. From the list there, it seems like "Sharing & deployment" and "Monitoring & feedback" in particular are not really addressed.
It feels to me like maybe this use case should be limited to CI for ML training/development and leave deployment for a different scenario. Obviously, there is overlap, and deployment is a natural next step, but to me that overlap is making it confusing. For example, scheduling regular jobs that pull in new data could be useful either in a training pipeline (to generate a new model that might get deployed) or in a production pipeline (to make predictions using a deployed model).
configuration. Here are a few feature highlights:

**Models, Data, and Metrics as Code**: DVC removes the need to create versioning
databases, use special file/folder structures, or write bespoke interfacing
This last part is about the solution, right? Structure-wise, I think it's OK to have a list here only if no bullet lists remain in the first half. And even then, maybe it can be a list of brief (1-2 sentence) features.
In any case there would still need to be some more text that completes and concludes the story in some logical way. Probably elaborating on what these features mean from the user's perspective (sell it). Again, no need to go super deep into details so hopefully the whole 2nd half can be about 1/2 or at most 2/3 the current length, based on other use cases.
dfcf3f2 to d8deccb
Co-authored-by: Jorge Orpinel <[email protected]>
6246b34 to fcac0f2
* docs: use cases: add CI/CD for ML
* review comments
* change slug
* redraft after reviews
* minor review comments
* add image placeholders
* misc review updates
* heighten the level (rework)
* mention CML provisioning
* missing CML mention
* split diagram, delay low-level details
* reorganise sections & respond to review comments
* add cross-references
* consistent emboldening, shortening
* fix typo
Co-authored-by: Jorge Orpinel <[email protected]>
* update images
* add transparency
* shrink images
Co-authored-by: Jorge Orpinel <[email protected]>
Use Cases: CI/CD for ML
CI/CD in current docs
Note that this is a "use case" in the purest sense (i.e. advertising/marketing/advocacy material, "features", or just the introductory bit of a "how-to"), so no tutorial.
The aims (in order of priority) are to: