guide: hotfix pipelines links (#3946)
* guide: hotfix pipelines links

* guide: add Unexpected behavior section (tmp)
and fix links
jorgeorpinel authored Sep 14, 2022
1 parent 1367692 commit 2643b33
Showing 18 changed files with 64 additions and 35 deletions.
2 changes: 1 addition & 1 deletion content/docs/command-reference/dag.md
@@ -25,7 +25,7 @@ the `dvc.yaml` files found in the <abbr>project</abbr>. Provide a `target` stage
name to show the pipeline up to that point.

[directed acyclic graph]:
/doc/user-guide/data-pipelines/defining-pipelines#directed-acyclic-graph-dag
/doc/user-guide/pipelines/defining-pipelines#directed-acyclic-graph-dag

### Paginating the output

7 changes: 5 additions & 2 deletions content/docs/command-reference/exp/index.md
@@ -49,8 +49,11 @@ science/ machine learning experiments.
📖 See [Experiment Management](/doc/user-guide/experiment-management) for more
info.

> ⚠️ Note that DVC assumes that experiments are deterministic (see **Avoiding
> unexpected behavior** in `dvc stage add`).
> ⚠️ Note that DVC assumes that experiments are deterministic (see [Avoiding
> unexpected behavior]).
[avoiding unexpected behavior]:
/doc/user-guide/project-structure/dvcyaml-files#avoiding-unexpected-behavior

## Options

2 changes: 1 addition & 1 deletion content/docs/command-reference/exp/init.md
@@ -97,7 +97,7 @@ See the [Pipelines guide] for more on that topic.
/doc/user-guide/project-structure/dvcyaml-files#stage-commands
[checkpoints]: /doc/user-guide/experiment-management/checkpoints
[dvc experiments]: /doc/user-guide/experiment-management/experiments-overview
[pipelines guide]: /doc/user-guide/data-pipelines/defining-pipelines
[pipelines guide]: /doc/user-guide/pipelines/defining-pipelines

## Options

2 changes: 1 addition & 1 deletion content/docs/command-reference/move.md
@@ -93,7 +93,7 @@ Often the output of a stage is a dependency in another stage, creating a
[dependency graph]. In this case, you may want to also update the `path` in the
`deps` field of `dvc.yaml`.

[dependency graph]: /doc/user-guide/data-pipelines/defining-pipelines
[dependency graph]: /doc/user-guide/pipelines/defining-pipelines

</admon>

2 changes: 1 addition & 1 deletion content/docs/command-reference/params/index.md
@@ -75,7 +75,7 @@ is outdated upon `dvc repro` (or `dvc status`).
[hyperparameters]:
/doc/user-guide/experiment-management/running-experiments#tuning-hyperparameters
[use the same params file]:
/doc/user-guide/data-pipelines/defining-pipelines#parameter-dependencies
/doc/user-guide/pipelines/defining-pipelines#parameter-dependencies
[more details]: /doc/user-guide/project-structure/dvcyaml-files#parameters
[templating]: /doc/user-guide/project-structure/dvcyaml-files#templating
[stage commands]: /doc/user-guide/project-structure/dvcyaml-files#stage-commands
21 changes: 11 additions & 10 deletions content/docs/command-reference/repro.md
@@ -68,7 +68,7 @@ It stores all the data files, intermediate or final results in the
hash values of changed dependencies and outputs in the `dvc.lock` and `.dvc`
files.

[dependency graph]: /doc/user-guide/data-pipelines/defining-pipelines
[dependency graph]: /doc/user-guide/pipelines/defining-pipelines
[always changed]: /doc/command-reference/status#local-workspace-status

### Parallel stage execution
@@ -160,10 +160,8 @@ up-to-date and only execute the final stage.
option, as all possible targets are already included.

- `--no-run-cache` - execute stage command(s) even if they have already been run
with the same dependencies and outputs (see the
[details](/doc/user-guide/project-structure/internal-files#run-cache)). Useful
for example if the stage command/s is/are non-deterministic
([not recommended](/doc/user-guide/data-pipelines/defining-pipelines#avoiding-unexpected-behavior)).
with the same dependencies and outputs (see the [details]). Useful for example
if the stage command/s is/are non-deterministic ([not recommended]).

- `--force-downstream` - in cases like `... -> A (changed) -> B -> C` it will
reproduce `A` first and then `B`, even if `B` was previously executed with the
@@ -185,11 +183,8 @@ up-to-date and only execute the final stage.
corresponding pipelines, including the target stages themselves. This option
has no effect if `targets` are not provided.

- `--pull` - attempts to download outputs of stages found in the
[run-cache](/doc/user-guide/project-structure/internal-files#run-cache) during
reproduction. Uses the
[default remote storage](/doc/command-reference/remote/default). See also
`dvc pull`
- `--pull` - attempts to download outputs of stages found in the [run-cache]
during reproduction. Uses the [default remote storage]. See also `dvc pull`

- `-h`, `--help` - prints the usage/help message, and exit.

@@ -200,6 +195,12 @@ up-to-date and only execute the final stage.

- `-v`, `--verbose` - displays detailed tracing information.

[details]: /doc/user-guide/project-structure/internal-files#run-cache
[not recommended]:
/doc/user-guide/project-structure/dvcyaml-files#avoiding-unexpected-behavior
[run-cache]: /doc/user-guide/project-structure/internal-files#run-cache
[default remote storage]: /doc/command-reference/remote/default

## Examples

> To get hands-on experience with data science and machine learning pipelines,
15 changes: 9 additions & 6 deletions content/docs/command-reference/run.md
@@ -107,7 +107,7 @@ Relevant notes:
[manual process](/doc/command-reference/move#renaming-stage-outputs) to update
`dvc.yaml` and the project's cache accordingly.

[dependency graph]: /doc/user-guide/data-pipelines/defining-pipelines
[dependency graph]: /doc/user-guide/pipelines/defining-pipelines

### For displaying and comparing data science experiments

@@ -216,10 +216,8 @@ data science experiments.
asking for confirmation.

- `--no-run-cache` - execute the stage command(s) even if they have already been
run with the same dependencies and outputs (see the
[details](/doc/user-guide/project-structure/internal-files#run-cache)). Useful
for example if the stage command/s is/are non-deterministic
([not recommended](/doc/user-guide/data-pipelines/defining-pipelines#avoiding-unexpected-behavior)).
run with the same dependencies and outputs (see the [details]). Useful for
example if the stage command/s is/are non-deterministic ([not recommended]).

- `--no-commit` - do not store the outputs of this execution in the cache
(`dvc.yaml` and `dvc.lock` are still created or updated); useful to avoid
@@ -231,7 +229,7 @@ data science experiments.
when reproducing the pipeline.

- `--external` - allow writing outputs outside of the DVC repository. See
[Managing External Data](/doc/user-guide/managing-external-data).
[Managing External Data].

- `--desc <text>` - user description of the stage (optional). This doesn't
affect any DVC operations.
@@ -243,6 +241,11 @@ data science experiments.

- `-v`, `--verbose` - displays detailed tracing information.

[details]: /doc/user-guide/project-structure/internal-files#run-cache
[not recommended]:
/doc/user-guide/project-structure/dvcyaml-files#avoiding-unexpected-behavior
[managing external data]: /doc/user-guide/managing-external-data

## Examples

Let's create a stage (that counts the number of lines in a `test.txt` file):
4 changes: 2 additions & 2 deletions content/docs/command-reference/stage/add.md
@@ -46,7 +46,7 @@ graph] and execute them.
See the guide on [defining pipeline stages] for more details.

[defining pipeline stages]:
/doc/user-guide/data-pipelines/defining-pipelines#pipelines
/doc/user-guide/pipelines/defining-pipelines#pipelines

</admon>

@@ -111,7 +111,7 @@ Relevant notes:
[manual process](/doc/command-reference/move#renaming-stage-outputs) to update
`dvc.yaml` and the project's cache accordingly.

[dependency graph]: /doc/user-guide/data-pipelines/defining-pipelines
[dependency graph]: /doc/user-guide/pipelines/defining-pipelines

### For displaying and comparing data science experiments

2 changes: 1 addition & 1 deletion content/docs/command-reference/stage/index.md
@@ -26,4 +26,4 @@ organize data science projects, or build detailed machine learning pipelines.
examine `dvc.yaml` files manually.

Learn more about
[defining stages](/doc/user-guide/data-pipelines/defining-pipelines#stages).
[defining stages](/doc/user-guide/pipelines/defining-pipelines#stages).
2 changes: 1 addition & 1 deletion content/docs/start/data-management/pipelines.md
@@ -171,7 +171,7 @@ $ dvc stage add -n featurize \

The `dvc.yaml` file is updated automatically and should include two stages now.

[dag]: /doc/user-guide/data-pipelines/defining-pipelines
[dag]: /doc/user-guide/pipelines/defining-pipelines

<details id="pipeline-expand-to-see-what-happens-under-the-hood">

3 changes: 1 addition & 2 deletions content/docs/user-guide/basic-concepts/pipeline.md
@@ -6,6 +6,5 @@ tooltip: >-
YAML format ([`dvc.yaml`](/doc/user-guide/project-structure/dvcyaml-files)).
This guarantees DVC can reproduce them consistently. DVC also helps automate
their execution and caches their results. See [Defining
Pipelines](/doc/user-guide/data-pipelines/defining-pipelines) for more
details.
Pipelines](/doc/user-guide/pipelines/defining-pipelines) for more details.
---
2 changes: 1 addition & 1 deletion content/docs/user-guide/basic-concepts/stage.md
@@ -6,5 +6,5 @@ tooltip: >-
some milestone as part of your project's workflow. For example, `python
train.py` may generate a machine learning model. DVC stages include data
input(s) and resulting output(s), if any. [Learn
more](/doc/user-guide/data-pipelines/defining-pipelines#stages).
more](/doc/user-guide/pipelines/defining-pipelines#stages).
---
@@ -44,7 +44,7 @@ once.
> 📖 `dvc exp run` is an experiment-specific alternative to `dvc repro`.
[reproduction targets]: /doc/command-reference/repro#options
[dependency graph]: /doc/user-guide/data-pipelines/defining-pipelines
[dependency graph]: /doc/user-guide/pipelines/defining-pipelines

## Tuning (hyper)parameters

2 changes: 1 addition & 1 deletion content/docs/user-guide/pipelines/index.md
@@ -16,4 +16,4 @@ consistent to reproduce.
See [Get Started: Data Pipelines](/doc/start/data-management/pipelines) for a
hands-on introduction to this topic.

[define]: /doc/user-guide/data-pipelines/defining-pipelines
[define]: /doc/user-guide/pipelines/defining-pipelines
23 changes: 23 additions & 0 deletions content/docs/user-guide/project-structure/dvcyaml-files.md
@@ -94,6 +94,29 @@ parametrize `cmd` strings.

</admon>

<details>

### 💡 Avoiding unexpected behavior

We don't want to tell anyone how to write their code or what programs to use!
However, please be aware that in order to prevent unexpected results when DVC
reproduces pipeline stages, the underlying code should ideally follow these
rules:

- Read/write exclusively from/to the specified <abbr>dependencies</abbr> and
<abbr>outputs</abbr> (including parameters files, metrics, and plots).
- Completely rewrite outputs. Do not append or edit.
- Stop reading and writing files when the `command` exits.

Also, if your pipeline reproducibility goals include consistent output data, its
code should be
[deterministic](https://en.wikipedia.org/wiki/Deterministic_algorithm) (produce
the same output for any given input): avoid code that increases
[entropy](https://en.wikipedia.org/wiki/Software_entropy) (e.g. random numbers,
time functions, hardware dependencies, etc.).

</details>
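
As a rough illustration of the rules above (not part of the DVC documentation), a stage command written in Python might look like the sketch below. The function name `run_stage`, the file paths, and the `seed` parameter are hypothetical; the point is only that the script reads its declared dependency, overwrites its output in full, and uses a fixed seed so that repeated runs give the same result.

```python
import random
from pathlib import Path


def run_stage(dep_path: Path, out_path: Path, seed: int = 0) -> None:
    """Hypothetical stage command: shuffle the lines of dep_path into out_path."""
    # Read only from the declared dependency.
    lines = dep_path.read_text().splitlines()

    # A local generator with a fixed seed keeps the shuffle deterministic:
    # the same input and seed always produce the same order.
    rng = random.Random(seed)
    rng.shuffle(lines)

    # write_text() truncates the file first, so the output is completely
    # rewritten rather than appended to, and the file is closed on return.
    out_path.write_text("\n".join(lines) + "\n")
```

A script like this could serve as a stage's `cmd`, with the input file listed under `deps` and the output file under `outs` in `dvc.yaml`.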

### Parameters

<abbr>Parameters</abbr> are simple key/value pairs consumed by the `command`
@@ -168,4 +168,4 @@ run-cache to remote storage for sharing and/or as a back up.
> [Avoiding unexpected behavior]).
[avoiding unexpected behavior]:
/doc/user-guide/data-pipelines/defining-pipelines#avoiding-unexpected-behavior
/doc/user-guide/project-structure/dvcyaml-files#avoiding-unexpected-behavior
4 changes: 2 additions & 2 deletions content/docs/user-guide/related-technologies.md
@@ -78,7 +78,7 @@ _Luigi_, etc.
- See also our sister project, [CML](https://cml.dev/), that helps fill some of
these gaps.

[dependency graphs]: /doc/user-guide/data-pipelines/defining-pipelines
[dependency graphs]: /doc/user-guide/pipelines/defining-pipelines

## Experiment management software

@@ -133,4 +133,4 @@ _Luigi_, etc.
> technical details (Linux).
[directed acyclic graph]:
/doc/user-guide/data-pipelines/defining-pipelines#directed-acyclic-graph-dag
/doc/user-guide/pipelines/defining-pipelines#directed-acyclic-graph-dag
2 changes: 1 addition & 1 deletion content/docs/user-guide/what-is-dvc.md
@@ -51,7 +51,7 @@ can version experiments, manage large datasets, and make projects reproducible.
[free]: https://github.com/iterative/dvc/blob/master/LICENSE
[vs code extension]: /doc/vs-code-extension
[command line]: /doc/command-reference
[pipelines]: /doc/user-guide/data-pipelines
[pipelines]: /doc/user-guide/pipelines

## DVC does not replace Git!
