docs: add key terms to use case intros/tutorial and what is dvc? docs [SEO] #1806
Changes from 10 commits
4296de0
1039aff
f71ef34
916ca5f
5f1708e
508da28
9da95fa
10d3ff3
960db41
6edacb2
d599f1a
dd880e4
19211b1
7605dad
d9e8228
450c87f
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,8 +1,8 @@ | ||
- # Tutorial: Versioning | ||
+ # Tutorial: Data & Model Versioning | ||
|
||
The goal of this example is to give you some hands-on experience with a basic | ||
- machine learning version control scenario: working with multiple versions of | ||
- datasets and ML models using DVC commands. We'll work with a | ||
+ machine learning version control scenario: managing multiple dataset and ML | ||
+ model versions using DVC commands. We'll work with a | ||
jorgeorpinel marked this conversation as resolved.
|
||
[tutorial](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html) | ||
that [François Chollet](https://twitter.com/fchollet) put together to show how | ||
to build a powerful image classifier using a pretty small dataset. | ||
|
@@ -237,9 +237,9 @@ $ git commit -m "Second model, trained with 2000 images" | |
$ git tag -a "v2.0" -m "model v2.0, 2000 images" | ||
``` | ||
|
||
- That's it! We have tracked a second dataset, model, and metrics versioned DVC, | ||
- and the DVC-files that point to them committed with Git. Let's now look at how | ||
- DVC can help us go back to the previous version if we need to. | ||
+ That's it! We've tracked the second version of the dataset, model, and metrics | ||
+ in DVC and committed the DVC-files that point to them with Git. Now let's look | ||
Thanks! "a second version" is better. Made the change. Reverted to "Let's now": I tried some other versions of the second sentence when I rewrote the first one, and unintentionally switched them. (IMO "Let's now" is a better usage for following a result with a new instruction. 😉)
||
+ at how DVC can help us go back to the previous version if we need to. | ||
|
||
## Switching between workspace versions | ||
|
||
|
@@ -338,15 +338,15 @@ changed. For example, when we added new images to built the second version of | |
our model, that was a dependency change. It also updates outputs and puts them | ||
into the <abbr>cache</abbr>. | ||
|
||
- To make things a little simpler: if `dvc add` and `dvc checkout` provide a basic | ||
- mechanism to version control large data files or models, `dvc run` and | ||
- `dvc repro` provide a build system for ML models, which is similar to | ||
+ To make things a little simpler: `dvc add` and `dvc checkout` provide a basic | ||
+ mechanism for model and large dataset versioning. `dvc run` and `dvc repro` | ||
+ provide a build system for machine learning models, which is similar to | ||
[Make](https://www.gnu.org/software/make/) in software build automation. | ||
|
||
## What's next? | ||
|
||
- In this example, our focus was on giving you hands-on experience with versioning | ||
- ML models and datasets. We specifically looked at the `dvc add` and | ||
+ In this example, our focus was on giving you hands-on experience with dataset | ||
+ and ML model versioning. We specifically looked at the `dvc add` and | ||
`dvc checkout` commands. We'd also like to outline some topics and ideas you | ||
might be interested to try next to learn more about DVC and how it makes | ||
managing ML projects simpler. | ||
|
Yes there is an SEO motivation here: the search term is "data science use cases".
I see! Going fwd if you can make some notes in the PR file changes on terms each change is for, or a list of terms in the PR description at least, that would be helpful for reviews 😃
👍 Definitely. That makes a lot of sense and I'll do that.
Does it matter that probably users looking for "data science use cases" are not looking for DVC use cases? I don't want to assume what 1000s of people want, but it sounds like a basic data science question rather than anything to do with structuring DS projects (e.g. using DVC).
So maybe changes like this will bring more traffic but also up the bounce rate. We'll have to try and see, I guess!
It's true that the term is not a perfect match, but it is related to the primary subject area (data science). Most non-brand terms are going to be partially related but inexact, as searches for discovery are imprecise (because they don't know what DVC is yet).
The search engine is trying to fill in the gaps, so we want to expand on terms that are showing interest within the correct subject area in order to meet them halfway. This article already has some impressions for "use cases", including ML and data science so that's the motivation for this change.
OK, cool! Keeping unresolved for future reference.