iterative · aguschin · May 30, 2022 · May 24, 2022 · May 24, 2022 · May 24, 2022
diff --git a/content/docs/api-reference/import_object.md b/content/docs/api-reference/import_object.md
@@ -58,8 +58,8 @@ command.
   'pandas']. Defaults to auto-infer.
 - `copy_data` (optional) - Whether to create a copy of file in target location
   or just link existing file. Defaults to True.
-- `external` (optional) - Save result not in `.mlem`, but directly in repo
-- `index` (optional) - Whether to index output in `.mlem` directory
+- `external` (optional) - Save result directly to `target` (not in `.mlem/`)
+- `index` (optional) - Whether to index output in `.mlem/` directory
 
 ## Exceptions
 

diff --git a/content/docs/api-reference/init.md b/content/docs/api-reference/init.md
@@ -1,6 +1,6 @@
 # mlem.api.init()
 
-Creates `.mlem/` directory in `path`
+Creates and populates the `.mlem/` directory in `path`.
 
 ```py
 def init(path: str = ".") -> None

diff --git a/content/docs/api-reference/save.md b/content/docs/api-reference/save.md
@@ -40,7 +40,7 @@ systems (eg: `S3`). The function returns and saves the object as a
 - `repo` (optional) - path to mlem repo
 - `sample_data` (optional) - If the object is a model or function, you can
   provide input data sample, so MLEM will include it's schema in the model's
-  metadata
+  metafile
 - `fs` (optional) - FileSystem for the `path` argument
 - `index` (optional) - Whether to add object to mlem repo index
 - `external` (optional) - if obj is saved to repo, whether to put it outside of

diff --git a/content/docs/command-reference/create.md b/content/docs/command-reference/create.md
@@ -1,7 +1,7 @@
 # create
 
 Creates a new [MLEM Object](/doc/user-guide/basic-concepts#mlem-objects)
-metafile from conf args and config files.
+metafile from config args and config files.
 
 ## Synopsis
 
@@ -16,10 +16,9 @@ PATH         Where to save object  [required]
 
 ## Description
 
-Metadata files (with `.mlem` file extension) can be created for
+`.mlem` metafiles can be created for
 [MLEM Objects](/doc/user-guide/basic-concepts#mlem-objects) using this command.
-This is particularly useful in filling up configuration values for environments
-and deployments.
+This is particularly useful for configuring environments and deployments.
 
 Each MLEM Object, along with its subtype (which represents a particular
 implementation), will accept different configuration arguments. The list of
@@ -38,18 +37,18 @@ check out the last example [here](/doc/command-reference/types#examples)
 
 ## Examples
 
-Create an environment metafile with a config key
+Create an environment object metafile with a config key:
 
 ```cli
-# Fetch all config arguments which can be passed for a heroku env
+# Fetch all available config args for a heroku env
 $ mlem types env heroku
 [not required] api_key: str = None
 
 # Create the heroku env
 $ mlem create env heroku production --conf api_key="mlem_heroku_staging"
 💾 Saving env to .mlem/env/staging.mlem
 
-# print the contents of the saved metafile for the heroku env
+# Print the contents of the new heroku env metafile
 $ cat .mlem/env/staging.mlem
 api_key: mlem_heroku_staging
 object_type: env

diff --git a/content/docs/command-reference/deploy/index.md b/content/docs/command-reference/deploy/index.md
@@ -25,7 +25,7 @@ serving a specific model, using a specific environment definition, and running
 on a target platform.
 
 MLEM deployments allow `applying` methods and even whole datasets on models.
-Each model lists its supported methods in its metafile, and those are
+Each model lists its supported methods in its `.mlem` metafile, and those are
 automatically used by MLEM to wire and expose endpoints on the application
 server upon deployment. Applying datasets on the deployment is a very handy
 shortcut of bulk inferring data on the served model.

diff --git a/content/docs/command-reference/import.md b/content/docs/command-reference/import.md
@@ -1,6 +1,6 @@
 # import
 
-Create a MLEM model or dataset metadata from a file/directory.
+Create a `.mlem` metafile for a model or dataset in any file or directory.
 
 ## Synopsis
 
@@ -14,10 +14,10 @@ TARGET  Path to save MLEM object  [required]
 
 ## Description
 
-Use `import` on an existing datasets or model files (or directories) to
-auto-generate the necessary MLEM metadata (`.mlem`) files for them. This is
-useful to quickly make existing datasets and model files compatible with MLEM,
-which can then be used in future operations such as `mlem apply`.
+Use `import` on existing datasets or model files (or directories) to generate
+the necessary `.mlem` metafiles for them. This is useful to quickly make
+existing datasets and model files compatible with MLEM, which then can be used
+in future operations such as `mlem apply`.
 
 This command provides a quick and easy alternative to writing python code to
 load those models/datasets into object for subsequent usage in MLEM context.

diff --git a/content/docs/command-reference/init.md b/content/docs/command-reference/init.md
@@ -13,7 +13,7 @@ arguments: [PATH] Target path to workspace
 ## Description
 
 The `init` command (without given `path`) defaults to the current directory for
-the path argument. This creates a `.mlem` directory and an empty `config.yaml`
+the path argument. This creates a `.mlem/` directory and an empty `config.yaml`
 file inside it.
 
 Although we recommend using MLEM within a Git repository to track changes using

diff --git a/content/docs/command-reference/pprint.md b/content/docs/command-reference/pprint.md
@@ -15,8 +15,8 @@ arguments: PATH Path to object [required]
 ## Description
 
 All MLEM objects can be printed to view their metadata. This includes generic
-metadata information such as requirements, type of object, hash, size, as well
-as object specific information such as `methods` for a `model` or `reader` for a
+information such as requirements, type of object, hash, size, as well as
+object-specific information such as `methods` for a `model` or `reader` for a
 `dataset`.
 
 Since only one specific object is printed, a `PATH` to the specific MLEM object

diff --git a/content/docs/get-started/saving.md b/content/docs/get-started/saving.md
@@ -55,9 +55,9 @@ $ tree .mlem/model/
 > changed, see [project structure](/doc/user-guide/project-structure) for
 > reference.
 
-What we see here is that model was saved along with some metadata about it: `rf`
-containing the model binary and `.mlem` file containing metadata. Let's take a
-look at it:
+The model was saved along with some metadata about it: `rf` containing the model
+binary and a `rf.mlem` metafile containing information about it. Let's take a look
+at it:
 
 <details>
 

diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
@@ -74,21 +74,11 @@
         "label": "Basic concepts",
         "source": "user-guide/basic-concepts.md"
       },
-      {
-        "slug": "datasets",
-        "label": "Working with datasets",
-        "source": "user-guide/datasets.md"
-      },
       {
         "slug": "project-structure",
         "label": "Project structure",
         "source": "user-guide/project-structure.md"
       },
-      {
-        "slug": "remote-repos",
-        "label": "Working with repositories and remote objects",
-        "source": "user-guide/remote-repos.md"
-      },
       {
         "slug": "configuration",
         "label": "Configuration",

diff --git a/content/docs/use-cases/dvc.md b/content/docs/use-cases/dvc.md
@@ -49,7 +49,7 @@ $ mlem config set default_storage.type dvc
 ```
 
 Also, let’s add `.mlem` files to `.dvcignore` so that metafiles are ignored by
-DVC
+DVC.
 
 ```cli
 $ echo "/**/?*.mlem" > .dvcignore
@@ -66,15 +66,18 @@ $ git rm -r --cached .mlem/
 $ python train.py
 ```
 
-Finally, let’s add new metafiles to Git and artifacts to DVC respectively,
-commit and push them
+Finally, let’s add and commit new metafiles to Git and artifacts to DVC,
+respectively:
 
 ```cli
 $ dvc add .mlem/model/rf .mlem/dataset/*.csv
 $ git add .mlem
 $ git commit -m "Switch to dvc storage"
+...
+
 $ dvc push -r myremote
 $ git push
+...
 ```
 
 Now, you can load MLEM objects from your repo even though there are no actual
@@ -89,18 +92,16 @@ DVC pipelines are the useful DVC mechanism to build data pipelines, in which you
 can process your data and train your model. You may be already training your ML
 models in them and what to start using MLEM to save those models.
 
-MLEM could be easily plug in into existing DVC pipelines. If you already added
-`.mlem` files to `.dvcignore`, you are good to go for most of the cases. Since
-DVC will ignore `.mlem` files, you don't need to add them as outputs and mark
-them with `cache: false`.
+MLEM can be easily plugged into existing DVC pipelines. If you already added
+`.mlem` files to `.dvcignore`, you are good to go in most cases.
 
-It becomes a bit more complicated when you need to add them as outputs, because
-you want to use them as inputs to next stages. The case may be when model binary
-doesn't change for you, but model metadata does. That may happen if you change
-things like model description or labels.
+It becomes a bit more complicated when you need to add them as inputs to
+pipeline stages. For example, a model binary may stay the same while its
+metadata (e.g. model description or labels) changes, and the next stages
+still need to see that change.
 
 To work with that, you'll need to remove `.mlem` files from `.dvcignore` and
-mark your outputs in DVC Pipeline with `cache: false`.
+make them `cache: false` outputs in the pipeline.
 
 ## Example
 
@@ -118,7 +119,8 @@ stages:
 ```
 
 Next step would be to start saving your models with MLEM. Since MLEM saves both
-**binary** and **metadata** you need to have both of them in DVC pipeline:
+the binary and corresponding `.mlem` metafile, you need to have both of them in
+the DVC pipeline:
 
 ```yaml
 # dvc.yaml
@@ -133,9 +135,8 @@ stages:
           cache: false
 ```
 
-Since binary was already captured before, we don't need to add anything for it.
-For metadata, we've added two rows to capture it and specify `cache: false`
-since we want the metadata to be committed to Git, and not be pushed to DVC
-remote.
+The binary was already captured, so there's no need to add it again. For the
+metafile, we've added two rows and specified `cache: false` to track it with DVC
+while storing it in Git.
 
 Now MLEM is ready to be used in your DVC pipeline!
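
Review note: the split that this page describes (metafiles go to Git, artifacts go to DVC) can be sketched in plain Python. This is only an illustration under stated assumptions: `split_mlem_files` is a hypothetical helper, not part of MLEM or DVC, and it treats every file that does not end in `.mlem` as an artifact, which is a simplification (a `config.yaml` inside `.mlem/` would need to be excluded by hand).

```python
from pathlib import Path


def split_mlem_files(root):
    """Split regular files under `root` into (metafiles, artifacts).

    Hypothetical helper for illustration only. Both lists hold sorted
    POSIX-style paths relative to `root`. A file counts as a metafile
    when its last suffix is `.mlem`; everything else is assumed to be
    an artifact.
    """
    metafiles, artifacts = [], []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue  # skip directories
        rel = path.relative_to(root).as_posix()
        if path.suffix == ".mlem":
            metafiles.append(rel)
        else:
            artifacts.append(rel)
    return sorted(metafiles), sorted(artifacts)
```

Under these assumptions, the first list would be the candidates for `git add` and the second the candidates for `dvc add`.
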
diff --git a/content/docs/user-guide/basic-concepts.md b/content/docs/user-guide/basic-concepts.md
@@ -12,13 +12,13 @@ datasets and other types you can read about below.
 > Also, MLEM Objects can be created with
 > [`mlem create`](/doc/command-reference/create) CLI command
 
-MLEM Objects are saved as `.mlem` files in `yaml` format. Sometimes they can
-have other files attached to them, in that case we call `.mlem` file as a
-"metadata file" or "metafile" and all the other files we call "artifacts".
+MLEM Objects are saved as special _metafiles_ in YAML format with the `.mlem`
+extension. These may or may not have _artifacts_ (other files or directories)
+associated.
 
-Typically, if **MLEM Object** have only one artifact, it will have the same name
-without `.mlem` extension, for example `model.mlem` + `model`, or `data.csv` +
-`data.csv.mlem`.
+Typically, if a **MLEM Object** has only one artifact, the artifact has the same
+file name without the `.mlem` extension, for example `model.mlem` and `model`,
+or `data.csv` and `data.csv.mlem`.
 
 If **MLEM Object** have multiple artifacts, they will be stored in a directory
 with the same name, for example `model.mlem` + `model/data.pkl` +
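
Review note: the naming convention in this hunk (the artifact, or the directory of artifacts, sits at the metafile path minus the `.mlem` extension) can be sketched as follows. `artifact_path_for` is a hypothetical helper written for this note, not part of MLEM's API, and it only encodes the convention as worded here.

```python
from pathlib import PurePosixPath

MLEM_EXT = ".mlem"


def artifact_path_for(metafile: str) -> PurePosixPath:
    """Return the conventional artifact location for a `.mlem` metafile.

    Hypothetical helper: it strips only the final `.mlem` suffix, so
    `data.csv.mlem` maps to `data.csv`.
    """
    path = PurePosixPath(metafile)
    if path.suffix != MLEM_EXT:
        raise ValueError(f"not a {MLEM_EXT} metafile: {metafile}")
    return path.with_suffix("")


print(artifact_path_for("model.mlem"))     # model
print(artifact_path_for("data.csv.mlem"))  # data.csv
```

For an object with several artifacts, the same path would name the directory that holds them.
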

diff --git a/content/docs/user-guide/datasets.md b/content/docs/user-guide/datasets.md
diff --git a/content/docs/user-guide/mlem-abcs.md b/content/docs/user-guide/mlem-abcs.md
@@ -123,7 +123,7 @@ will be pickled, and NN will be saved using `torch_io`
 
 ## DatasetType
 
-Hold metadata about dataset, like type, dimensions, column names etc.
+Holds metadata about a dataset, like type, dimensions, column names, etc.
 
 **Base class**: `mlem.core.dataset_type.DatasetType`
 

diff --git a/content/docs/user-guide/project-structure.md b/content/docs/user-guide/project-structure.md
@@ -8,7 +8,8 @@ To create one, use [`mlem init`](/doc/command-reference/init) or
 `config.yaml` (see [Configuration](/doc/user-guide/configuration)).
 
 > Some API and CLI commands like `mlem ls` and `mlem config` require this
-> execution context. But in general, MLEM can work with `.mlem` files anywhere.
+> execution context. But in general, MLEM can work with `.mlem` metafiles
+> anywhere.
 
 A common place to initialize MLEM is a data science Git repository. _MLEM
 repositories_ help you better structure and easily address existing data