This repository has been archived by the owner on Oct 16, 2024. It is now read-only.
Misc. docs improvements #68
Merged
9 commits
9f3d85f nav update (jorgeorpinel)
9b60b8c remoge datasets and remote-repos guides (jorgeorpinel)
2f85b77 term: .mlem/ dir vs .mlem file (jorgeorpinel)
ff2c49c Lint (jorgeorpinel)
c88802f term: metadata vs. metafile (jorgeorpinel)
117b03a Merge branch 'main' into docs-content (jorgeorpinel)
6e601dc Update content/docs/use-cases/dvc.md (jorgeorpinel)
63d58b2 cases: simplify DVC integration (jorgeorpinel)
e48e5d1 Merge branch 'main' into docs-content (aguschin)

@@ -1,6 +1,6 @@
 # import

-Create a MLEM model or dataset metadata from a file/directory.
+Create a `.mlem` metafile for a model or dataset in any file or directory.

 ## Synopsis

@@ -14,10 +14,10 @@ TARGET Path to save MLEM object [required]

 ## Description

-Use `import` on an existing datasets or model files (or directories) to
-auto-generate the necessary MLEM metadata (`.mlem`) files for them. This is
-useful to quickly make existing datasets and model files compatible with MLEM,
-which can then be used in future operations such as `mlem apply`.
+Use `import` on an existing datasets or model files (or directories) to generate
+the necessary `.mlem` metafiles for them. This is useful to quickly make
+existing datasets and model files compatible with MLEM, which can then be used
[Review comment] Suggested change: ?
+in future operations such as `mlem apply`.

 This command provides a quick and easy alternative to writing python code to
 load those models/datasets into object for subsequent usage in MLEM context.

@@ -61,9 +61,9 @@ details.

 </admon>

-What we see here is that model was saved along with some metadata about it: `rf`
-containing the model binary and `.mlem` file containing metadata. Let's take a
-look at it:
+The model was saved along with some metadata about it: `rf` containing the model
+binary and a `.mlem` metafile containing information about it. Let's take a look
[Review comment] Suggested change
+at it:

 <details>


@@ -49,7 +49,7 @@ $ mlem config set default_storage.type dvc
 ```

 Also, let’s add `.mlem` files to `.dvcignore` so that metafiles are ignored by
-DVC
[Review comment] This needs to be updated due to #79
+DVC.

 ```cli
 $ echo "/**/?*.mlem" > .dvcignore
@@ -66,15 +66,18 @@ $ git rm -r --cached .mlem/
 $ python train.py
 ```

-Finally, let’s add new metafiles to Git and artifacts to DVC respectively,
-commit and push them
+Finally, let’s add and commit new metafiles to Git and artifacts to DVC,
+respectively:

 ```cli
 $ dvc add .mlem/model/rf .mlem/dataset/*.csv
 $ git add .mlem
 $ git commit -m "Switch to dvc storage"
+...
+
 $ dvc push -r myremote
 $ git push
+...
 ```

 Now, you can load MLEM objects from your repo even though there are no actual
@@ -89,18 +92,10 @@ DVC pipelines are the useful DVC mechanism to build data pipelines, in which you
 can process your data and train your model. You may be already training your ML
 models in them and what to start using MLEM to save those models.

-MLEM could be easily plug in into existing DVC pipelines. If you already added
-`.mlem` files to `.dvcignore`, you are good to go for most of the cases. Since
-DVC will ignore `.mlem` files, you don't need to add them as outputs and mark
-them with `cache: false`.
-
-It becomes a bit more complicated when you need to add them as outputs, because
-you want to use them as inputs to next stages. The case may be when model binary
-doesn't change for you, but model metadata does. That may happen if you change
-things like model description or labels.
-
-To work with that, you'll need to remove `.mlem` files from `.dvcignore` and
-mark your outputs in DVC Pipeline with `cache: false`.
+MLEM can be easily plugged into existing DVC pipelines. If you already added
+`.mlem` files to `.dvcignore`, you are good to go. Otherwise you'll need to
+mark `.mlem` files as `cache: false` [outputs] of a pipelines stage.
+[outputs]: https://dvc.org/doc/user-guide/project-structure/pipelines-files#output-subfields

 ## Example

@@ -118,7 +113,8 @@ stages:
 ```

 Next step would be to start saving your models with MLEM. Since MLEM saves both
-**binary** and **metadata** you need to have both of them in DVC pipeline:
+the binary and corresponding `.mlem` metafile, you need to have both of them in
+the DVC pipeline:

 ```yaml
 # dvc.yaml
@@ -133,9 +129,8 @@ stages:
 cache: false
 ```

-Since binary was already captured before, we don't need to add anything for it.
-For metadata, we've added two rows to capture it and specify `cache: false`
-since we want the metadata to be committed to Git, and not be pushed to DVC
-remote.
+The binary was already in, so there's no need to add it again. For the metafile,
+we've added two rows and specify `cache: false` to track it with DVC while
+storing it in Git.

 Now MLEM is ready to be used in your DVC pipeline!
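
The rule this hunk describes is that a `.mlem` metafile listed as a stage output should carry `cache: false`, so that it is tracked by DVC but stored in Git. It can be illustrated with a small check over an already-parsed `dvc.yaml`. This is only a sketch under assumptions. The helper `uncached_metafile_issues`, the stage name `train`, and the paths `models/rf` and `models/rf.mlem` are hypothetical and come from neither the PR nor DVC/MLEM. The dict mirrors the usual shape of `outs`, where an entry is either a plain path or a one-key mapping from path to sub-fields.

```python
from typing import Any


def uncached_metafile_issues(pipeline: dict[str, Any]) -> list[str]:
    """Return '<stage>: <path>' for every .mlem output not marked cache: false.

    Hypothetical helper for illustration; it is not part of DVC or MLEM.
    """
    issues = []
    for stage_name, stage in pipeline.get("stages", {}).items():
        for out in stage.get("outs", []):
            if isinstance(out, str):
                # Plain path entry: no sub-fields, so DVC caching stays on.
                path, fields = out, {}
            else:
                # Mapping entry of the form {path: {sub-fields}}.
                path, fields = next(iter(out.items()))
                fields = fields or {}
            if path.endswith(".mlem") and fields.get("cache", True) is not False:
                issues.append(f"{stage_name}: {path}")
    return issues


# Assumed example: what a parsed dvc.yaml might look like (paths are made up).
example_pipeline = {
    "stages": {
        "train": {
            "cmd": "python train.py",
            "outs": [
                "models/rf",                           # model binary, cached by DVC
                {"models/rf.mlem": {"cache": False}},  # metafile, kept in Git
            ],
        }
    }
}

print(uncached_metafile_issues(example_pipeline))
```

Checking the parsed structure rather than the raw YAML text keeps the sketch free of third-party dependencies. In practice the dict would come from whatever YAML loader the project already uses.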
This file was deleted.

This is more correct, since with `external=True` it doesn't take into account the MLEM project at all. Also, it's not repo, it's project now.

Ah you should've felt free to commit your fixes too 🙂 but anyway, we'll get to it... ⌛