Regular updates (March 10) (#1040)
* term: flag -> option  as much as possible

* term: review usage of "option" up til status cmd ref
jorgeorpinel authored Mar 14, 2020
1 parent 0d7e99e commit 01ed81c
Showing 19 changed files with 91 additions and 88 deletions.
2 changes: 1 addition & 1 deletion public/static/docs/changelog/0.18.md
@@ -32,7 +32,7 @@ really excited to share the progress with you:
computation): ![](/static/img/0.18-progress.gif)

- Pipeline visualization via command line. Just run `dvc pipeline show` with
`ascii` option and a target: ![](/static/img/0.18-pipeline.gif)
option `--ascii` and a target: ![](/static/img/0.18-pipeline.gif)

- Many hidden gems: `dvc repro` dry and interactive modes, improved overall
commands verbosity and revised commands help.
16 changes: 9 additions & 7 deletions public/static/docs/changelog/0.35.md
@@ -8,9 +8,10 @@ Now, let's **highlight the changes** (not including bug fixes, and minor
improvements) we have done in the last few months:

- 🏷 We received a lot of feedback that using Git branches is not always an
optimal way to manage experiments. We have added an option to **support Git
tags** (Git commits are coming). The new option `-T` or `--all-tags` is
supported by all DVC commands that support`-a` or `--all-branches`.
optimal way to manage experiments. We have added an option to **use all Git
tags** (Git commits are coming), `-T` or `--all-tags`, which is supported by
all DVC commands that also have `-a` or `--all-branches` (use all Git
branches).

- 📖 The [Get Started](/doc/get-started/agenda) section has been simplified
(e.g. to use tags instead of branches) and extended. We have also prepared a
@@ -39,10 +40,11 @@ improvements) we have done in the last few months:
1 file not changed, 0 files modified, 1 file added, 0 files deleted, size was increased by 15.3 MB
```

- We've introduced the DVC commit command and `dvc run/repro/add --no-commit`
flag to give a way to **avoid uncontrolled cache growth** and as a way to save
some runs of `dvc repro`. In the future we plan to have “do-not-cache-my-data”
as a default mode for `dvc run`, `dvc add` and `dvc repro`.
- We've introduced the `dvc commit` command and the
`dvc run/repro/add --no-commit` option to give a way to **avoid uncontrolled
cache growth** and as a way to save some runs of `dvc repro`. In the future we
plan to have “do-not-cache-my-data” as a default mode for `dvc run`, `dvc add`
and `dvc repro`.

- **SSH remotes (data storage) support** - config options to set port, key
files, timeouts, password, etc + improved stability and Windows support!
2 changes: 1 addition & 1 deletion public/static/docs/command-reference/add.md
@@ -69,7 +69,7 @@ to work with directory hierarchies with `dvc add`:
1. With `dvc add --recursive`, the hierarchy is traversed and every file is
added individually as described above. This means every file has its own
DVC-file, and a corresponding cached file is created (unless the
`--no-commit` flag is used).
`--no-commit` option is used).
2. When not using `--recursive` a DVC-file is created for the top of the
directory (with default name `dirname.dvc`). Every file in the hierarchy is
added to the cache (unless the `--no-commit` option is used), but DVC does
2 changes: 1 addition & 1 deletion public/static/docs/command-reference/cache/dir.md
@@ -38,7 +38,7 @@ for the config file.
private locations, etc).

- `-u`, `--unset` - remove the `cache.dir` config option from the config file.
Don't provide a `value` when using this flag.
Don't provide a `value` argument when employing this flag.

- `-h`, `--help` - prints the usage/help message, and exit.

13 changes: 7 additions & 6 deletions public/static/docs/command-reference/commit.md
@@ -131,7 +131,7 @@ $ dvc pull --all-branches --all-tags
## Example: Rapid iterations

Sometimes we want to iterate through multiple changes to configuration, code, or
data, trying multiple options to improve the output of a stage. To avoid filling
data, trying different ways to improve the output of a stage. To avoid filling
the <abbr>cache</abbr> with undesired intermediate results, we can run a single
stage with `dvc run --no-commit`, or reproduce an entire pipeline using
`dvc repro --no-commit`. This prevents data from being pushed to cache. When
@@ -140,17 +140,18 @@ files in the cache.

In the `featurize.dvc` stage, `src/featurize.py` is executed. A useful change to
make is adjusting a parameter to `CountVectorizer` in that script. Namely,
adjusting the `max_features` option in this line changes the resulting model:
adjusting the `max_features` value in the line below changes the resulting
model:

```python
bag_of_words = CountVectorizer(stop_words='english',
max_features=6000, ngram_range=(1, 2))
```

This option not only changes the trained model, it also introduces a change that
would cause the `featurize.dvc`, `train.dvc` and `evaluate.dvc` stages to
execute if we ran `dvc repro`. But if we want to try several values for this
option and save only the best result to the cache, we can execute as so:
This edit introduces a change that would cause the `featurize.dvc`, `train.dvc`
and `evaluate.dvc` stages to execute if we ran `dvc repro`. But if we want to
try several values for `max_features` and save only the best result to the
cache, we can run it like this:

```dvc
$ dvc repro --no-commit evaluate.dvc
30 changes: 15 additions & 15 deletions public/static/docs/command-reference/config.md
@@ -23,15 +23,15 @@ This command reads and updates the DVC configuration files. By default (if none
of `--local`, `--global`, or `--system` is provided) a project's config
(`.dvc/config`) file is read or modified. This file is by default meant to be
tracked by Git and should not contain sensitive and/or user-specific information
(passwords, SSH keys, etc). Use `--local`, `--global`, or `--system` options
instead to override project's settings, for sensitive, or user-specific
settings.
(passwords, SSH keys, etc). Use `--local`, `--global`, or `--system` command
options (flags) instead to override project's settings, for sensitive, or
user-specific settings.

If the config option `value` is not provided and `--unset` option is not used,
this command returns the current value of the config option, if found in the
corresponding config file.
If the config option `value` is not provided and the `--unset` command option is
not used, this command returns the current value of the config option, if found
in the corresponding config file.
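
The read/write/unset behavior described above can be sketched in a few lines. This is only an illustration on an in-memory INI-style config using Python's standard `configparser`, not DVC's actual implementation; the function name `dvc_config_sketch` and the rule that a value must not accompany `unset` are assumptions made for the example.

```python
import configparser


def dvc_config_sketch(parser, name, value=None, unset=False):
    """Hypothetical helper mimicking the described `dvc config` behavior."""
    section, option = name.split(".", 1)
    if unset:
        # --unset: remove the option (assumed: no value may be passed with it).
        if value is not None:
            raise ValueError("don't provide a value together with unset")
        if parser.has_section(section):
            parser.remove_option(section, option)
        return None
    if value is None:
        # No value and no --unset: return the current value, if found.
        return parser.get(section, option, fallback=None)
    # A value was given: store it, creating the section when needed.
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, option, value)
    return None


config = configparser.ConfigParser()
dvc_config_sketch(config, "core.remote", "myremote")   # write
print(dvc_config_sketch(config, "core.remote"))        # read: prints "myremote"
dvc_config_sketch(config, "core.remote", unset=True)   # remove
print(dvc_config_sketch(config, "core.remote"))        # read: prints "None"
```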

## Options
## Command options (flags)

- `-u`, `--unset` - remove a specified config option from a config file.

@@ -69,7 +69,7 @@ This is the main section with the general config options:

- `core.interactive` - whether to always ask for confirmation before reproducing
each [stage](/doc/command-reference/run) in `dvc repro`. (Normally, this
behavior requires the use of option `-i` in that command.) Accepts values:
behavior requires using the `-i` option of that command.) Accepts values:
`true` and `false`.

- `core.analytics` - used to turn off
@@ -124,7 +124,7 @@ for more details.) This section contains the following options:
option on forces you to run `dvc unprotect` before updating a file, providing
an additional layer of security to your data.

We highly recommend enabling this option when `cache.type` is set to
We highly recommend enabling `cache.protected` when `cache.type` is set to
`hardlink` or `symlink`.

- `cache.type` - link type that DVC should use to link data files from cache to
@@ -137,16 +137,16 @@ for more details.) This section contains the following options:

⚠️ If you manually set `cache.type` to `hardlink` or `symlink`, **you will
corrupt the cache** if you modify tracked data files in the workspace. See the
`cache.protected` config option above and corresponding `dvc unprotect`
command to modify files safely.
`cache.protected` option above, and corresponding `dvc unprotect` command to
modify files safely.

There are pros and cons to different link types. Refer to
[File link types](/doc/user-guide/large-dataset-optimization#file-link-types-for-the-dvc-cache)
for a full explanation of each one.

To apply changes to this option in the workspace, by restoring all file
links/copies from cache, please use `dvc checkout --relink`. See
[checkout options](/doc/command-reference/checkout#options) for more details.
To apply changes to this config option in the workspace, by restoring all file
links/copies from cache, please use `dvc checkout --relink`. See that
command's [options](/doc/command-reference/checkout#options) for more details.

- `cache.slow_link_warning` - used to turn off the warnings about having a slow
cache link type. These warnings are thrown by `dvc pull` and `dvc checkout`
@@ -223,7 +223,7 @@ $ dvc config core.remote myremote
```

> Note that this is equivalent to using `dvc remote add` with the
> `-d`/`--default` option.
> `-d`/`--default` flag.
## Example: Default remotes

6 changes: 3 additions & 3 deletions public/static/docs/command-reference/diff.md
@@ -79,9 +79,9 @@ Preparing to download data from 'https://remote.dvc.org/get-started'
...
```

The `-T` flag passed to `dvc fetch` makes sure we have all the data files
related to all existing tags in the repo. You may see the available tags of our
example repo [here](https://github.com/iterative/example-get-started/tags).
With the `-T` option, `dvc fetch` makes sure that we have all the data files
related to all existing Git tags in the repo. You may see the available tags of
our example repo [here](https://github.com/iterative/example-get-started/tags).

</details>

2 changes: 1 addition & 1 deletion public/static/docs/command-reference/fetch.md
@@ -243,7 +243,7 @@ $ dvc status -c

One could do a simple `dvc fetch` to get all the data, but what if you only want
to retrieve the data up to our third stage, `train.dvc`? We can use the
`--with-deps` (or `-d`) flag:
`--with-deps` (or `-d`) option:

```dvc
$ dvc fetch --with-deps train.dvc
7 changes: 4 additions & 3 deletions public/static/docs/command-reference/get-url.md
@@ -96,10 +96,11 @@ same file name:
$ dvc get-url s3://bucket/path
```

By default DVC expects your AWS CLI is already
By default, DVC expects that AWS CLI is already
[configured](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html).
DVC will be using default AWS credentials file to access S3. To override some of
these settings, you could the options described in `dvc remote modify`.

DVC will use the AWS credentials file to access S3. To override the
configuration, you can use the parameters described in `dvc remote modify`.

> We use the `boto3` library to communicate with AWS. The following API
> methods may be performed:
12 changes: 6 additions & 6 deletions public/static/docs/command-reference/init.md
@@ -14,8 +14,8 @@ DVC works on top of a Git repository by default. This enables all features,
providing the most value. It means that `dvc init` (without flags) expects to
run in a Git repository root (a `.git/` directory should be present).

The command options can be used to start an alternative workflow for advanced
scenarios like monorepos, automation, etc:
The command [options](#options) can be used to start an alternative workflow for
advanced scenarios:

- [Initializing DVC in subdirectories](#initializing-dvc-in-subdirectories) -
support for monorepos, nested <abbr>DVC projects</abbr>, etc.
@@ -38,7 +38,7 @@ single Git repository providing isolation and granular project management.

#### When is this useful?

This option is mostly used in the scenario of a
`--subdir` is mostly used in the scenario of a
[monorepo](https://en.wikipedia.org/wiki/Monorepo), but also can be used in
other workflows when such isolation and/or advanced granularity is needed.

@@ -118,8 +118,8 @@ won't download or checkout data for the `data-B.dvc` file.

### Initializing DVC without Git

In rare cases, `--no-scm` option might be used to initialize DVC in a directory
that is not part of a Git repository, or to make DVC ignore Git. Examples
In rare cases, the `--no-scm` option might be desirable: to initialize DVC in a
directory that is not part of a Git repo, or to make DVC ignore Git. Examples
include:

- SCM other than Git is being used. Even though there are DVC features that
@@ -137,7 +137,7 @@ e.g. managing `.gitignore` files on `dvc add` or `dvc run` to avoid committing
DVC-tracked files into Git, or `dvc diff` and `dvc metrics diff` that accept
Git-revisions to compare, etc.

DVC sets the `core.no_scm` option value to `true` in the DVC
DVC sets the `core.no_scm` config option value to `true` in the DVC
[config](/doc/command-reference/config) when it is initialized this way. It
means that even if the project was Git-tracked already or Git is initialized in
it later, DVC keeps operating in the detached from Git mode.
2 changes: 1 addition & 1 deletion public/static/docs/command-reference/metrics/add.md
@@ -19,7 +19,7 @@ defines the given `path` as an <abbr>output</abbr>, marking `path` as a metric
file to track.

Note that outputs can also be marked as metrics via the `-m` or `-M` options of
the `dvc run` command.
`dvc run`.

While any text file can be tracked as a metric file, we recommend using TSV,
CSV, or JSON formats. DVC provides a way to parse those formats to get to a
12 changes: 6 additions & 6 deletions public/static/docs/command-reference/metrics/index.md
@@ -23,8 +23,8 @@ positional arguments:
## Description

DVC has the ability to mark a certain stage <abbr>outputs</abbr> as files
containing metrics to track. (See `--metrics` option of `dvc run`.) Metrics are
project-specific numeric values e.g. `AUC`, `ROC`, etc. DVC itself does not
containing metrics to track. (See the `--metrics` option of `dvc run`.) Metrics
are project-specific numeric values e.g. `AUC`, `ROC`, etc. DVC itself does not
ascribe any specific meaning for these numbers. Usually these numbers are
produced by the model evaluation script and serve as a way to compare and pick
the best performing experiment.
@@ -54,10 +54,10 @@ $ dvc run -d code/evaluate.py -M data/eval.json \
python code/evaluate.py
```

> `-M|--metrics-no-cache` is telling DVC to mark `data/eval.json` as a metric
> file. Using this option is equivalent to using `-O|--outs-no-cache` and then
> running `dvc metrics add data/eval.json` to explicitly mark `data/eval.json`
> as a metric file.
> `-M` (`--metrics-no-cache`) is telling DVC to mark `data/eval.json` as a
> metric file. Using this option is equivalent to using `-O` (`--outs-no-cache`)
> and then running `dvc metrics add data/eval.json` to explicitly mark
> `data/eval.json` as a metric file.
Now let's print metric values that we are tracking in this <abbr>project</abbr>:

9 changes: 5 additions & 4 deletions public/static/docs/command-reference/pull.md
@@ -32,7 +32,8 @@ The `dvc pull` command allows one to retrieve data from remote storage.
immediately after that.

The default remote is used (see `dvc config core.remote`) unless the `--remote`
option is used.
option is used. See `dvc remote` for more information on how to configure a
remote.

With no arguments, just `dvc pull` or `dvc pull --remote REMOTE`, it downloads
only the files (or directories) missing from the workspace by searching all
@@ -134,8 +135,8 @@ default remote. The only files considered in this case are what is listed in the

## Example: With dependencies

Demonstrating the `--with-deps` flag requires a larger example. First, assume a
[pipeline](/doc/command-reference/pipeline) has been setup with these
Demonstrating the `--with-deps` option requires a larger example. First, assume
a [pipeline](/doc/command-reference/pipeline) has been set up with these
[stages](/doc/command-reference/run):

```dvc
@@ -184,5 +185,5 @@ and searched backwards through the pipeline for data files to download. Because
the `model.p.dvc` stage occurs later, its data was not pulled.

Then we ran `dvc pull` specifying the last stage, `model.p.dvc`, and its data
was downloaded. Finally, we ran `dvc pull` with no options to make sure that all
was downloaded. Finally, we ran `dvc pull` with no flags to make sure that all
data was already pulled with the previous commands.
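
The backward search through the pipeline described above can be sketched as a small graph walk. This is an illustration only, not DVC's code: the `with_deps` helper is hypothetical, and apart from `model.p.dvc` the stage names are made up for the example.

```python
def with_deps(target, deps):
    """Return the target stage plus every stage it depends on, upstream first."""
    ordered, seen = [], set()

    def visit(stage):
        if stage in seen:
            return
        seen.add(stage)
        # Walk backwards: handle the stages this one depends on first.
        for parent in deps.get(stage, []):
            visit(parent)
        ordered.append(stage)

    visit(target)
    return ordered


# Hypothetical linear pipeline; only model.p.dvc is named in the text above.
pipeline = {
    "prepare.dvc": [],
    "features.dvc": ["prepare.dvc"],
    "model.p.dvc": ["features.dvc"],
}

# Targeting a middle stage leaves the later model.p.dvc stage out.
print(with_deps("features.dvc", pipeline))  # ['prepare.dvc', 'features.dvc']
```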
12 changes: 5 additions & 7 deletions public/static/docs/command-reference/push.md
@@ -36,7 +36,7 @@ Under the hood a few actions are taken:

- The push command by default uses all
[DVC-files](/doc/user-guide/dvc-file-format) in the <abbr>workspace</abbr>.
The command line options listed below will either limit or expand the set of
The command options listed below will either limit or expand the set of
DVC-files to consult.

- For each <abbr>output</abbr> referenced from each selected DVC-file, DVC finds
@@ -49,9 +49,7 @@ Under the hood a few actions are taken:
The DVC `push` command always works with a remote storage, and it is an error if
none are specified on the command line nor in the configuration. The default
remote is used (see `dvc config core.remote`) unless the `--remote` option is
used. See `dvc remote`, `dvc config` and this
[example](/doc/get-started/configure) for more information on how to configure a
remote.
used. See `dvc remote` for more information on how to configure a remote.

With no arguments, just `dvc push` or `dvc push --remote REMOTE`, it uploads
only the files (or directories) that are new in the local repository to remote
@@ -136,8 +134,8 @@ $ dvc push data.zip.dvc

## Example: With dependencies

Demonstrating the `--with-deps` flag requires a larger example. First, assume a
[pipeline](/doc/command-reference/pipeline) has been setup with these
Demonstrating the `--with-deps` option requires a larger example. First, assume
a [pipeline](/doc/command-reference/pipeline) has been set up with these
[stages](/doc/command-reference/run):

```dvc
@@ -190,7 +188,7 @@ and searched backwards through the pipeline for data files to upload. Because
the `model.p.dvc` stage occurs later, its data was not pushed.

Then we ran `dvc push` specifying the last stage, `model.p.dvc`, and its data
was uploaded. Finally, we ran `dvc push` and `dvc status` with no options to
was uploaded. Finally, we ran `dvc push` and `dvc status` with no flags to
double check that all data had been uploaded.

## Example: What happens in the cache
10 changes: 5 additions & 5 deletions public/static/docs/command-reference/remote/modify.md
@@ -26,9 +26,9 @@ positional arguments:

## Description

Remote `name` and `option` name are required. Option names are remote type
specific. See `dvc remote add` and
[Available settings](#available-settings-per-storage-type) section below for a
Remote `name` and `option` name are required. Option names are specific to the
remote type. See `dvc remote add` and the
[available settings](#available-settings-per-storage-type) section below for a
list of remote storage types.

This command modifies a `remote` section in the project's
@@ -37,8 +37,8 @@ manual editing could be used to change the configuration.

## Options

- `-u`, `--unset` - delete configuration value for given `option`. Don't provide
a `value` when using this flag.
- `-u`, `--unset` - delete configuration value for the given `option`. Don't
provide a `value` when employing this flag.

- `--global` - save remote configuration to the global config (e.g.
`~/.config/dvc/config`) instead of `.dvc/config`.