docs: use cases: add CI/CD for ML #2404

Merged: 18 commits, Sep 4, 2021

Contributor

@casperdcl casperdcl commented Apr 21, 2021

  • add Use Cases: CI/CD for ML
  • cross-reference mentions of CI/CD in current docs

Note that this is a "use case" in the purest sense (i.e. advertising/marketing/advocacy material, "features", or just the introductory bit of a "how-to"), so there is no tutorial.

The aims (in order of priority) are to:

  1. explain what MLOps and DataOps mean to people who know about DevOps and Git
  2. mention that DVC can help
  3. mention that CML can help

@casperdcl casperdcl self-assigned this Apr 21, 2021
@casperdcl casperdcl added the "type: discussion" and "A: docs" labels Apr 21, 2021
@casperdcl casperdcl requested a review from shcheklein April 21, 2021 03:11
@casperdcl casperdcl marked this pull request as ready for review April 21, 2021 03:15
@jorgeorpinel
Contributor

Very needed use case!

Quick note: please see #820 and #194, in case this solves them (partially).

@casperdcl
Contributor Author

Yes maybe needs to be added to #820 (or even replace #862?)

full disclosure: this PR has already had a few iterations with @shcheklein :)

@casperdcl casperdcl requested a review from jorgeorpinel April 21, 2021 10:44
Contributor

@jorgeorpinel jorgeorpinel left a comment

Great stuff! Initial review (of about the first half)

Contributor

@jorgeorpinel jorgeorpinel left a comment

Actual review...

9 resolved review threads on content/docs/use-cases/ci-cd-ml.md (outdated)
@casperdcl casperdcl requested a review from dberenbaum April 24, 2021 17:32
Contributor

@dberenbaum dberenbaum left a comment

Looks great, @casperdcl! Since this is so open-ended and subjective, I left mostly discussion points rather than requests for changes.

3 resolved review threads on content/docs/use-cases/ci-cd-for-machine-learning.md (outdated)
Contributor

@dberenbaum dberenbaum left a comment

I'm still a little unclear on the scenario this use case addresses. It seems like the examples are focused on CI: regularly merging, training, and validating models. This seems like where CML provides most value.

However, right after the CML example scenario is a paragraph about production deployment, which seems disconnected from the examples. From the list there, it seems like "Sharing & deployment" and "Monitoring & feedback" in particular are not really addressed.

It feels to me like maybe this use case should be limited to CI for ML training/development and leave deployment for a different scenario. Obviously, there is overlap, and deployment is a natural next step, but to me that overlap is making it confusing. For example, scheduling regular jobs pulling in new data could be useful both in a training pipeline (to generate a new model that might get deployed) or in a production pipeline (to make predictions using a deployed model).

2 resolved review threads on content/docs/use-cases/ci-cd-for-machine-learning.md (outdated)
@shcheklein shcheklein temporarily deployed to dvc-org-docs-ci-cd-2ewexpm4ifu April 30, 2021 23:58 Inactive
@shcheklein

This comment has been minimized.

casperdcl added a commit to iterative/static that referenced this pull request May 1, 2021
Comment on lines +50 to +53
configuration. Here are a few feature highlights:

**Models, Data, and Metrics as Code**: DVC removes the need to create versioning
databases, use special file/folder structures, or write bespoke interfacing
Contributor

@jorgeorpinel jorgeorpinel Aug 4, 2021

This last part is about the solution, right? Structure-wise I think it's OK to have a list here only if no bullet lists remain in the first half. And even then maybe it can be a list of brief (1-2 sentence) features.

In any case there would still need to be some more text that completes and concludes the story in some logical way. Probably elaborating on what these features mean from the user's perspective (sell it). Again, no need to go super deep into details so hopefully the whole 2nd half can be about 1/2 or at most 2/3 the current length, based on other use cases.

@shcheklein shcheklein temporarily deployed to dvc-org-docs-ci-cd-8uuiji0eeg3 September 4, 2021 17:51 Inactive
@shcheklein shcheklein temporarily deployed to dvc-org-docs-ci-cd-8uuiji0eeg3 September 4, 2021 21:55 Inactive
@shcheklein shcheklein merged commit 62dec2c into iterative:master Sep 4, 2021
@casperdcl casperdcl deleted the docs-ci-cd branch September 5, 2021 00:12
karajan1001 pushed a commit to karajan1001/dvc.org that referenced this pull request Sep 29, 2021
* docs: use cases: add CI/CD for ML

* review comments

* change slug

* redraft after reviews

* minor review comments

* add image placeholders

* misc review updates

* heighten the level (rework)

* mention CML provisioning

* missing CML mention

* split diagram, delay low-level details

* reorganise sections & respond to review comments

* add cross-references

* consistent emboldening, shortening

* fix typo

Co-authored-by: Jorge Orpinel <[email protected]>

* update images

* add transparency

* shrink images

Co-authored-by: Jorge Orpinel <[email protected]>
Labels
A: docs Area: user documentation (gatsby-theme-iterative)
7 participants