diff --git a/content/docs/command-reference/checkout.md b/content/docs/command-reference/checkout.md index 5da65adcd6..49f9f25c7a 100644 --- a/content/docs/command-reference/checkout.md +++ b/content/docs/command-reference/checkout.md @@ -112,9 +112,7 @@ pipeline stages, such as the DVC project created for the [Get Started](/doc/start). Then we can see what happens with `git checkout` and `dvc checkout` as we switch from tag to tag. -
- -### Click and expand to set up the project +
Start by cloning our example repo if you don't already have it: diff --git a/content/docs/command-reference/diff.md b/content/docs/command-reference/diff.md index de0865360e..5a377e11be 100644 --- a/content/docs/command-reference/diff.md +++ b/content/docs/command-reference/diff.md @@ -82,9 +82,7 @@ for example when `dvc init` was used with the `--no-scm` option. For these examples we can use the [Get Started](/doc/start) project. -
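For reference, the hidden body of these setup sections opens with the usual clone-and-enter steps. A minimal sketch (the repository URL is the one linked from the Get Started project):

```dvc
$ git clone https://github.com/iterative/example-get-started
$ cd example-get-started
```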
- -### Click and expand to set up the project to run examples +
Start by cloning our example repo if you don't already have it: @@ -119,9 +117,7 @@ $ dvc diff ## Example: Comparing workspace with arbitrary commits -
- -### Click and expand to set up the example +
Let's check out the
[2-track-data](https://github.com/iterative/example-get-started/releases/tag/2-track-data)
@@ -149,9 +145,7 @@ files summary: 1 added, 0 deleted, 0 modified

## Example: Comparing tags or branches

-<details>
- -### Click and expand to set up the example +
Our example repository has the `baseline-experiment` and `bigrams-experiment`
[tags](https://github.com/iterative/example-get-started/tags), which
@@ -223,9 +217,7 @@ It outputs:

## Example: Renamed files

-<details>
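Once both tags exist locally, the comparison itself is one command; a sketch using the two tags named above:

```dvc
$ dvc diff baseline-experiment bigrams-experiment
```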
- -### Click and expand to set up the example +
Having followed the previous examples' setup, move into the `example-get-started/` directory. Then make sure that you have the latest code diff --git a/content/docs/command-reference/exp/run.md b/content/docs/command-reference/exp/run.md index 0844b93c0c..14da82d28d 100644 --- a/content/docs/command-reference/exp/run.md +++ b/content/docs/command-reference/exp/run.md @@ -126,9 +126,7 @@ committing them to the Git repo. Unnecessary ones can be [cleared] with > This is based on our [Get Started](/doc/start/experiments), where you can find > the actual source code. -
- -### Expand to prepare the example ML project +
Clone the DVC repo and download the data it depends on: diff --git a/content/docs/command-reference/fetch.md b/content/docs/command-reference/fetch.md index 17e2f872dd..1ac3ddc372 100644 --- a/content/docs/command-reference/fetch.md +++ b/content/docs/command-reference/fetch.md @@ -117,7 +117,7 @@ pipeline stages, such as the DVC project created for the [Get Started](/doc/start). Then we can see what `dvc fetch` does in different scenarios. -
+
### Click and expand to set up the project diff --git a/content/docs/command-reference/get-url.md b/content/docs/command-reference/get-url.md index 031911669e..39cd79614c 100644 --- a/content/docs/command-reference/get-url.md +++ b/content/docs/command-reference/get-url.md @@ -79,9 +79,7 @@ $ wget https://example.com/path/to/data.csv ## Examples -
- -### Amazon S3 +
This command will copy an S3 object into the current working directory with the same file name: @@ -106,9 +104,7 @@ configuration, you can use the parameters described in `dvc remote modify`.
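As a quick sketch (bucket and key are placeholders), the object lands in the current directory under its original file name:

```dvc
$ dvc get-url s3://mybucket/path/to/data.csv
$ ls
data.csv
```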
-
- -### Google Cloud Storage +
```dvc $ dvc get-url gs://bucket/path file @@ -118,9 +114,7 @@ The above command downloads the `/path` file (or directory) into `./file`.
-
- -### SSH +
```dvc $ dvc get-url ssh://user@example.com/path/to/data @@ -131,9 +125,7 @@ directory).
-
- -### HDFS +
```dvc $ dvc get-url hdfs://user@example.com/path/to/file @@ -141,9 +133,7 @@ $ dvc get-url hdfs://user@example.com/path/to/file
-
- -### HTTP +
> Both HTTP and HTTPS protocols are supported. @@ -153,9 +143,7 @@ $ dvc get-url https://example.com/path/to/file
-
- -### WebHDFS +
```dvc $ dvc get-url webhdfs://user@example.com/path/to/file @@ -163,9 +151,7 @@ $ dvc get-url webhdfs://user@example.com/path/to/file
-
- -### local +
```dvc $ dvc get-url /local/path/to/data diff --git a/content/docs/command-reference/import-url.md b/content/docs/command-reference/import-url.md index 0134d03db8..3045fb6368 100644 --- a/content/docs/command-reference/import-url.md +++ b/content/docs/command-reference/import-url.md @@ -168,9 +168,7 @@ produces a regular stage in `dvc.yaml`. To illustrate these examples we will be using the project explained in the [Get Started](/doc/start). -
- -### Click and expand to set up example +
Start by cloning our example repo if you don't already have it. Then move into
the repo and check out the
@@ -285,9 +283,7 @@ $ unzip code.zip
$ rm -f code.zip
```

-<details>
- -### Click and expand to set up the environment +
Let's install the requirements. But before we do that, we **strongly** recommend creating a diff --git a/content/docs/command-reference/install.md b/content/docs/command-reference/install.md index 1986134f4c..2b12f47bc2 100644 --- a/content/docs/command-reference/install.md +++ b/content/docs/command-reference/install.md @@ -136,9 +136,7 @@ pipeline stages, such as the DVC project created in our [Get Started](/doc/start) section. Then we can see what happens with `dvc install` in different situations. -
- -### Click and expand to set up the project +
Start by cloning our example repo if you don't already have it: diff --git a/content/docs/command-reference/plots/show.md b/content/docs/command-reference/plots/show.md index 23e3c29b41..f78b7c21a3 100644 --- a/content/docs/command-reference/plots/show.md +++ b/content/docs/command-reference/plots/show.md @@ -97,9 +97,7 @@ We'll use tabular metrics file `train.json` for this example: } ``` -
- -### Expand for YAML format +
Here's a corresponding `train.yaml` metrics file: @@ -154,9 +152,7 @@ epoch,accuracy,loss,val_accuracy,val_loss 7,0.9954,0.01396906608727198,0.9802,0.07247738889862157 ``` -
- -### Expand for TSV format +
Here's a corresponding `train.tsv` metrics file: diff --git a/content/docs/command-reference/pull.md b/content/docs/command-reference/pull.md index f33a59c7c6..740b52251f 100644 --- a/content/docs/command-reference/pull.md +++ b/content/docs/command-reference/pull.md @@ -139,9 +139,7 @@ Let's employ a simple workspace with some data, code, ML models, pipeline stages, such as the DVC project created for the [Get Started](/doc/start). Then we can see what happens with `dvc pull`. -
- -### Click and expand to set up the project +
Start by cloning our example repo if you don't already have it: diff --git a/content/docs/command-reference/remote/add.md b/content/docs/command-reference/remote/add.md index 9242b1500a..a0485d4822 100644 --- a/content/docs/command-reference/remote/add.md +++ b/content/docs/command-reference/remote/add.md @@ -88,9 +88,7 @@ DVC will determine the [type of remote](#supported-storage-types) based on the The following are the types of remote storage (protocols) supported: -
- -### Amazon S3 +
> 💡 Before adding an S3 remote, be sure to > [Create a Bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html). @@ -113,9 +111,7 @@ methods that are performed by DVC (`list_objects_v2` or `list_objects`,
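For example, one of the `dvc remote modify` parameters mentioned above can pin the bucket's region (the value here is illustrative):

```dvc
$ dvc remote modify myremote region us-east-2
```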
-
- -### S3-compatible storage +
For object storage that supports an S3-compatible API (e.g. [Minio](https://min.io/), @@ -141,9 +137,7 @@ they're effective depends on each storage platform.
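A typical setup adds the remote with an S3-style URL and then points DVC at the service's endpoint; a sketch with placeholder values:

```dvc
$ dvc remote add -d myremote s3://mybucket/path
$ dvc remote modify myremote endpointurl https://object-storage.example.com
```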
-
- -### Microsoft Azure Blob Storage +
```dvc $ dvc remote add -d myremote azure://mycontainer/path @@ -163,9 +157,7 @@ To use a custom authentication method, use the parameters described in
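For instance, connection-string authentication is normally supplied through the Git-ignored local config (the value is a placeholder):

```dvc
$ dvc remote modify --local myremote connection_string 'mysecret'
```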
-
- -### Google Drive +
To start using a GDrive remote, first add it with a [valid URL format](/doc/user-guide/setup-google-drive-remote#url-format). Then @@ -197,9 +189,7 @@ modified.
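A sketch of the add-then-configure flow (the folder ID and OAuth values are placeholders):

```dvc
$ dvc remote add -d myremote gdrive://0AIac4JZqHhKmUk9PDA/dvcstore
$ dvc remote modify myremote gdrive_client_id 'client-id'
$ dvc remote modify myremote gdrive_client_secret 'client-secret'
```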
-
- -### Google Cloud Storage +
> 💡 Before adding a GC Storage remote, be sure to > [Create a storage bucket](https://cloud.google.com/storage/docs/creating-buckets). @@ -219,9 +209,7 @@ parameters, use the parameters described in `dvc remote modify`.
-
- -### Aliyun OSS +
First you need to set up OSS storage on Aliyun Cloud. Then, use an S3 style URL for OSS storage, and configure the @@ -261,9 +249,7 @@ $ export OSS_ACCESS_KEY_SECRET='mysecret'
-
- -### SSH +
```dvc $ dvc remote add -d myremote ssh://user@example.com/path @@ -279,9 +265,7 @@ Please check that you are able to connect both ways with tools like `ssh` and
-
- -### HDFS +
⚠ī¸ Using HDFS with a Hadoop cluster might require additional setup. Our assumption is that the client is set up to use it. Specifically, [`libhdfs`] @@ -301,9 +285,7 @@ $ dvc remote add -d myremote hdfs://user@example.com/path
-
- -### WebHDFS +
⚠ī¸ Using WebHDFS requires to enable REST API access in the cluster: set the config property `dfs.webhdfs.enabled` to `true` in `hdfs-site.xml`. @@ -329,9 +311,7 @@ active kerberos session.
-
- -### HTTP +
```dvc $ dvc remote add -d myremote https://example.com/path @@ -341,9 +321,7 @@ $ dvc remote add -d myremote https://example.com/path
-
- -### WebDAV +
```dvc $ dvc remote add -d myremote \ @@ -362,9 +340,7 @@ $ dvc remote add -d myremote \
-
- -### local remote +
A "local remote" is a directory in the machine's file system. Not to be confused with the `--local` option of `dvc remote` (and other config) commands! diff --git a/content/docs/command-reference/remote/index.md b/content/docs/command-reference/remote/index.md index b67fb78978..db8237b55c 100644 --- a/content/docs/command-reference/remote/index.md +++ b/content/docs/command-reference/remote/index.md @@ -74,9 +74,7 @@ manually. ## Example: Add a default local remote -
-
-### What is a "local remote" ?
+<details title='What is a "local remote" ?'>
While the term may seem contradictory, it doesn't have to be. The "local" part refers to the type of location where the storage is: another directory in the diff --git a/content/docs/command-reference/remote/list.md b/content/docs/command-reference/remote/list.md index 5930d146ec..240f2b3672 100644 --- a/content/docs/command-reference/remote/list.md +++ b/content/docs/command-reference/remote/list.md @@ -40,9 +40,7 @@ local config files (in that order). For simplicity, let's add a default local remote: -
-
-### What is a "local remote" ?
+<details title='What is a "local remote" ?'>
While the term may seem contradictory, it doesn't have to be. The "local" part refers to the type of location where the storage is: another directory in the diff --git a/content/docs/command-reference/remote/modify.md b/content/docs/command-reference/remote/modify.md index 3cc0021dc9..4d3a5c0b12 100644 --- a/content/docs/command-reference/remote/modify.md +++ b/content/docs/command-reference/remote/modify.md @@ -103,9 +103,7 @@ The following config options are available for all remote types: The following are the types of remote storage (protocols) and their config options: -
- -### Amazon S3 +
- `url` - remote location, in the `s3://<bucket>/<path>` format:
@@ -331,9 +329,7 @@ For more on the supported env vars, please see the

</details>
-
- -### S3-compatible storage +
- `endpointurl` - URL to connect to the S3-compatible storage server or service (e.g. [Minio](https://min.io/), @@ -350,9 +346,7 @@ storage. Whether they're effective depends on each storage platform.
-
- -### Microsoft Azure Blob Storage +
> If any values given to the parameters below contain sensitive user info, add > them with the `--local` option, so they're written to a Git-ignored config @@ -382,9 +376,7 @@ a Microsoft application. [default credential]: https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential -
- -#### For Windows users +
When using default authentication, you may need to enable some of these exclusion parameters depending on your setup @@ -533,9 +525,7 @@ can propagate from an Azure configuration file (typically managed with
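A sketch of toggling one such exclusion; the parameter name mirrors the `DefaultAzureCredential` exclusion options and is an assumption here:

```dvc
# assumed parameter name; check `dvc remote modify` for your DVC version
$ dvc remote modify --local myremote exclude_environment_credential true
```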
-
- -### Google Drive +
> If any values given to the parameters below contain sensitive user info, add > them with the `--local` option, so they're written to a Git-ignored config @@ -636,9 +626,7 @@ more information.
-
- -### Google Cloud Storage +
> If any values given to the parameters below contain sensitive user info, add > them with the `--local` option, so they're written to a Git-ignored config @@ -683,9 +671,7 @@ $ export GOOGLE_APPLICATION_CREDENTIALS='.../project-XXX.json'
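The config-file counterpart of that environment variable is the `credentialpath` parameter (path elided as in the export above):

```dvc
$ dvc remote modify --local myremote \
      credentialpath '.../project-XXX.json'
```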
-
- -### Aliyun OSS +
> If any values given to the parameters below contain sensitive user info, add > them with the `--local` option, so they're written to a Git-ignored config @@ -729,9 +715,7 @@ $ export OSS_ENDPOINT='endpoint'
-
- -### SSH +
> If any values given to the parameters below contain sensitive user info, add > them with the `--local` option, so they're written to a Git-ignored config @@ -822,9 +806,7 @@ $ export OSS_ENDPOINT='endpoint'
-
- -### HDFS +
💡 Using an HDFS cluster as remote storage is also supported via the WebHDFS
API. Read more about it by expanding the WebHDFS section in
@@ -856,9 +838,7 @@ Read more about by expanding the WebHDFS section in

</details>
-
- -### WebHDFS +
💡 WebHDFS serves as an alternative for using the same remote storage supported
by HDFS. Read more about it by expanding the WebHDFS section in
@@ -945,9 +925,7 @@ by HDFS. Read more about by expanding the WebHDFS section in

</details>
-
- -### HTTP +
> If any values given to the parameters below contain sensitive user info, add > them with the `--local` option, so they're written to a Git-ignored config @@ -1037,9 +1015,7 @@ by HDFS. Read more about by expanding the WebHDFS section in
-
- -### WebDAV +
> If any values given to the parameters below contain sensitive user info, add > them with the `--local` option, so they're written to a Git-ignored config diff --git a/content/docs/command-reference/run.md b/content/docs/command-reference/run.md index 3f550cb399..8a6ed08604 100644 --- a/content/docs/command-reference/run.md +++ b/content/docs/command-reference/run.md @@ -38,9 +38,7 @@ the required [`command` argument](#the-command-argument). `dvc run` executes stage commands, unless the `--no-exec` option is used. -
- -### 💡 Avoiding unexpected behavior +
We don't want to tell anyone how to write their code or what programs to use! However, please be aware that in order to prevent unexpected results when DVC diff --git a/content/docs/command-reference/stage/add.md b/content/docs/command-reference/stage/add.md index 42deffba2f..338372a052 100644 --- a/content/docs/command-reference/stage/add.md +++ b/content/docs/command-reference/stage/add.md @@ -36,9 +36,7 @@ Stages whose dependencies are outputs from other stages form [pipelines](/doc/command-reference/dag). `dvc repro` can be used to rebuild their dependency graph, and execute them. -
- -### 💡 Avoiding unexpected behavior +
We don't want to tell anyone how to write their code or what programs to use! However, please be aware that in order to prevent unexpected results when DVC diff --git a/content/docs/install/linux.md b/content/docs/install/linux.md index fa704b10e8..ed139329a5 100644 --- a/content/docs/install/linux.md +++ b/content/docs/install/linux.md @@ -22,9 +22,7 @@ plan to use, you might need to install optional dependencies: `[s3]`, `[gdrive]`, `[gs]`, `[azure]`, `[ssh]`, `[hdfs]`, `[webdav]`, `[oss]`. Use `[all]` to include them all. -
- -### Example: with support for Amazon S3 storage +
```dvc $ pip install "dvc[s3]" @@ -53,9 +51,7 @@ Depending on the type of the [remote storage](/doc/command-reference/remote) you plan to use, you might need to install optional dependencies: `dvc-s3`, `dvc-azure`, `dvc-gdrive`, `dvc-gs`, `dvc-oss`, `dvc-ssh`. -
- -### Example: with support for Amazon S3 storage +
```dvc $ conda install -c conda-forge mamba @@ -79,9 +75,7 @@ $ snap install --classic dvc ## Install from repository -
- -### On Debian/Ubuntu +
```dvc $ sudo wget \ @@ -94,9 +88,7 @@ $ sudo apt install dvc
-
- -### On Fedora/CentOS +
```dvc $ sudo wget \ @@ -115,9 +107,7 @@ Get the binary package from the big "Download" button on the [home page](/), or from the [release page](https://github.com/iterative/dvc/releases/) on GitHub. Then install it with the following command. -
- -### On Debian/Ubuntu +
```dvc $ sudo apt install ./dvc_0.62.1_amd64.deb @@ -125,9 +115,7 @@ $ sudo apt install ./dvc_0.62.1_amd64.deb
-
- -### On Fedora/CentOS +
```dvc $ sudo yum install dvc-0.62.1-1.x86_64.rpm diff --git a/content/docs/install/macos.md b/content/docs/install/macos.md index 186996f38a..14ab4d9d66 100644 --- a/content/docs/install/macos.md +++ b/content/docs/install/macos.md @@ -43,9 +43,7 @@ plan to use, you might need to install optional dependencies: `[s3]`, `[gdrive]`, `[gs]`, `[azure]`, `[ssh]`, `[hdfs]`, `[webdav]`, `[oss]`. Use `[all]` to include them all. -
- -### Example: with support for Amazon S3 storage +
```dvc $ pip install "dvc[s3]" @@ -69,9 +67,7 @@ Depending on the type of the [remote storage](/doc/command-reference/remote) you plan to use, you might need to install optional dependencies: `dvc-s3`, `dvc-azure`, `dvc-gdrive`, `dvc-gs`, `dvc-oss`, `dvc-ssh`. -
- -### Example: with support for Amazon S3 storage +
```dvc $ conda install -c conda-forge mamba # installs much faster than conda diff --git a/content/docs/install/windows.md b/content/docs/install/windows.md index 471fefb2a7..c84393ae72 100644 --- a/content/docs/install/windows.md +++ b/content/docs/install/windows.md @@ -35,9 +35,7 @@ Depending on the type of the [remote storage](/doc/command-reference/remote) you plan to use, you might need to install optional dependencies: `dvc-s3`, `dvc-azure`, `dvc-gdrive`, `dvc-gs`, `dvc-oss`, `dvc-ssh`. -
- -### Example: with support for Amazon S3 storage +
```dvc $ conda install -c conda-forge mamba # installs much faster than conda @@ -61,9 +59,7 @@ Depending on the type of the [remote storage](/doc/command-reference/remote) you plan to use, you might need to install optional dependencies: `[s3]`, `[azure]`, `[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Use `[all]` to include them all. -
- -### Example: with support for Amazon S3 storage +
```dvc
$ pip install "dvc[s3]"
```

diff --git a/content/docs/start/data-and-model-access.md b/content/docs/start/data-and-model-access.md
index 1fe2092183..c2360e6a54 100644
--- a/content/docs/start/data-and-model-access.md
+++ b/content/docs/start/data-and-model-access.md
@@ -73,9 +73,7 @@ This is similar to `dvc get` + `dvc add`, but the resulting
`.dvc` file includes metadata to track changes in the source repository. This
allows you to bring in changes from the data source later using `dvc update`.

-<details>
- -#### 💡 Expand to see what happens under the hood. +
> Note that the > [dataset registry](https://github.com/iterative/dataset-registry) repository diff --git a/content/docs/start/data-and-model-versioning.md b/content/docs/start/data-and-model-versioning.md index 3534c643dd..fd8730cc69 100644 --- a/content/docs/start/data-and-model-versioning.md +++ b/content/docs/start/data-and-model-versioning.md @@ -19,9 +19,7 @@ or watch our video to learn about versioning data with DVC! https://youtu.be/kLKBcPonMYw -
- -### ⚙ī¸ Expand to get an example dataset. +
Having initialized a project in the previous section, we can get the data file (which we'll be using later) like this: @@ -59,9 +57,7 @@ $ git commit -m "Add raw data" The data, meanwhile, is listed in `.gitignore`. -
- -### 💡 Expand to see what happens under the hood. +
`dvc add` moved the data to the project's cache, and linked it back to the workspace. The `.dvc/cache` @@ -104,9 +100,7 @@ $ git commit -m "Configure remote storage" > Drive, Azure Blob Storage, and HDFS. See `dvc remote add` for more details and > examples. -
- -### ⚙ī¸ Expand to set up remote storage. +
DVC remotes let you store a copy of the data tracked by DVC outside of the local cache (usually a cloud storage service). For simplicity, let's set up a _local @@ -145,9 +139,7 @@ $ dvc push Usually, we also want to `git commit` and `git push` the corresponding `.dvc` files. -
- -### 💡 Expand to see what happens under the hood. +
`dvc push` copied the data cached locally to the remote storage we set up earlier. The remote storage directory should look like this: @@ -166,9 +158,7 @@ Having DVC-tracked data and models stored remotely, it can be downloaded when needed in other copies of this project with `dvc pull`. Usually, we run it after `git clone` and `git pull`. -
- -### ⚙ī¸ Expand to delete locally cached data. +
If you've run `dvc push`, you can delete the cache (`.dvc/cache`) and `data/data.xml` to experiment with `dvc pull`: @@ -205,9 +195,7 @@ $ dvc pull When you make a change to a file or directory, run `dvc add` again to track the latest version: -
- -### ⚙ī¸ Expand to make some changes. +
Let's say we obtained more data from some external source. We can pretend this is the case by doubling the dataset: @@ -254,9 +242,7 @@ $ git checkout <...> $ dvc checkout ``` -
- -### ⚙ī¸ Expand to get the previous version of the dataset. +
Let's go back to the original version of the data: diff --git a/content/docs/start/data-pipelines.md b/content/docs/start/data-pipelines.md index 436172037d..4efc9f633e 100644 --- a/content/docs/start/data-pipelines.md +++ b/content/docs/start/data-pipelines.md @@ -28,9 +28,7 @@ with Git) which form the steps of a _pipeline_. Stages also connect code to its corresponding data _input_ and _output_. Let's transform a Python script into a [stage](/doc/command-reference/run): -
- -### ⚙ī¸ Expand to download example code. +
Get the sample code like this: @@ -79,9 +77,7 @@ DVC uses these metafiles to track the data used and produced by the stage, so there's no need to use `dvc add` on `data/prepared` [manually](/doc/start/data-and-model-versioning). -
- -### 💡 Expand to see what happens under the hood. +
The command options used above mean the following: @@ -173,9 +169,7 @@ $ dvc run -n featurize \ The `dvc.yaml` file is updated automatically and should include two stages now. -
- -### 💡 Expand to see what happens under the hood. +
The changes to the `dvc.yaml` should look like this: @@ -205,9 +199,7 @@ The changes to the `dvc.yaml` should look like this:
-
- -### ⚙ī¸ Expand to add more stages. +
Let's add the training itself. Nothing new this time; just the same `dvc run` command with the same set of options: @@ -236,9 +228,7 @@ reproduce a pipeline: $ dvc repro ``` -
- -### ⚙ī¸ Expand to have some fun with it. +
Let's play with it a little bit. First, let's change one of the parameters for
the training stage:
@@ -272,9 +262,7 @@ it also doesn't rerun `train`! The previous run with the same set of inputs

</details>
-
- -### 💡 Expand to see what happens under the hood. +
`dvc repro` relies on the DAG definition from `dvc.yaml`, and uses `dvc.lock` to determine what exactly needs to be run. diff --git a/content/docs/start/experiments.md b/content/docs/start/experiments.md index eddbff5fe1..14d3a1c554 100644 --- a/content/docs/start/experiments.md +++ b/content/docs/start/experiments.md @@ -19,9 +19,7 @@ the [`example-dvc-experiments`][ede] project. [ede]: https://github.com/iterative/example-dvc-experiments -
- -## ⚙ī¸ Initializing a project with DVC experiments +
If you already have a DVC project, that's great. You can start to use `dvc exp` commands right away to run experiments in your project. (See the [User Guide] @@ -68,9 +66,7 @@ models, plots, and metrics in the respective directories. The experiment is then associated with the values found in the parameters file (`params.yaml`) and other dependencies, as well as the metrics produced. -
- -### ℹī¸ More information about (Hyper)parameters +
It's pretty common for data science projects to include configuration files that define adjustable parameters to train a model, adjust model architecture, do @@ -127,9 +123,7 @@ Experiment results have been applied to your workspace. ... ``` -
- -### ⚙ī¸ Run multiple experiments in parallel +
Instead of running the experiments one-by-one, we can define them to run in a batch. This is especially handy when you have long running experiments. @@ -206,9 +200,7 @@ $ dvc exp show --drop 'Created|train|loss' ─────────────────────────────────────────────────────────────── ``` -
- -### ℹī¸ More information about metrics +
Metrics are what you use to evaluate your models. DVC associates metrics to experiments for later comparison. Any scalar value can be used as a metric. You diff --git a/content/docs/start/index.md b/content/docs/start/index.md index d25ae77fd4..6da05c8683 100644 --- a/content/docs/start/index.md +++ b/content/docs/start/index.md @@ -9,9 +9,7 @@ manage experiments.' Assuming DVC is already [installed](/doc/install), let's initialize it by running `dvc init` inside a Git project: -
- -### ⚙ī¸ Expand to prepare the project. +
In expandable sections that start with the ⚙ī¸ emoji, we'll be providing more information for those trying to run the commands. It's up to you to pick the diff --git a/content/docs/start/metrics-parameters-plots.md b/content/docs/start/metrics-parameters-plots.md index fc6cd0927f..e24adcf290 100644 --- a/content/docs/start/metrics-parameters-plots.md +++ b/content/docs/start/metrics-parameters-plots.md @@ -29,9 +29,7 @@ $ dvc run -n evaluate \ data/features scores.json prc.json roc.json ``` -
- -### 💡 Expand to see what happens under the hood. +
The `-M` option here specifies a metrics file, while `--plots-no-cache` specifies a plots file (produced by this stage) which will not be @@ -164,9 +162,7 @@ featurize: - data/features ``` -
- -### ⚙ī¸ Expand to recall how it was generated. +
The `featurize` stage [was created](/doc/start/data-pipelines#dependency-graphs-dags) with this diff --git a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md index 0d807cd230..a196d1fbdb 100644 --- a/content/docs/use-cases/versioning-data-and-model-files/tutorial.md +++ b/content/docs/use-cases/versioning-data-and-model-files/tutorial.md @@ -63,9 +63,7 @@ $ source .env/bin/activate $ pip install -r requirements.txt ``` -
- -### Expand to learn about DVC internals +
The repository you cloned is already DVC-initialized. It contains a `.dvc/`
directory with the `config` and `.gitignore` files. These and other
@@ -160,9 +158,7 @@ $ git commit -m "First model, trained with 1000 images"
$ git tag -a "v1.0" -m "model v1.0, 1000 images"
```

-<details>
- -### Expand to learn more about how DVC works +
As we mentioned briefly, DVC does not commit the `data/` directory and `model.h5` file with Git. Instead, `dvc add` stores them in the diff --git a/content/docs/user-guide/contributing/core.md b/content/docs/user-guide/contributing/core.md index df4ae71f05..d2c122e2bc 100644 --- a/content/docs/user-guide/contributing/core.md +++ b/content/docs/user-guide/contributing/core.md @@ -174,9 +174,7 @@ If one of your colleagues has already gone through this guide, you could just ask for their `remotes_env` file and Google Cloud credentials, and skip any env manipulations below. -
- -### Amazon S3 +
Install [aws cli](https://docs.aws.amazon.com/en_us/cli/latest/userguide/cli-chap-install.html) @@ -193,9 +191,7 @@ $ export DVC_TEST_AWS_REPO_BUCKET="...TEST-S3-BUCKET..."
-
- -### Microsoft Azure Blob Storage +
Install [Node.js](https://nodejs.org/en/download/) and then install and run Azurite: @@ -215,9 +211,7 @@ $ export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=http;AccountN
-
- -### Google Drive +
> 💡 Please remember that Google Drive access tokens are personal credentials
> and should not be shared with anyone, as that risks unauthorized usage of
@@ -236,9 +230,7 @@ $ export GDRIVE_USER_CREDENTIALS_DATA='mysecret'

</details>
-
- -### Google Cloud Storage +
Go through the [quick start](https://cloud.google.com/sdk/docs/quickstarts) for your OS. After that, you should have the `gcloud` command line tool available, @@ -276,9 +268,7 @@ may use different names.
-
- -### HDFS +
Tests currently only work on Linux. First you need to set up passwordless SSH auth to localhost: diff --git a/content/docs/user-guide/experiment-management/checkpoints.md b/content/docs/user-guide/experiment-management/checkpoints.md index 82b975b762..0289dfce7e 100644 --- a/content/docs/user-guide/experiment-management/checkpoints.md +++ b/content/docs/user-guide/experiment-management/checkpoints.md @@ -19,9 +19,7 @@ them using the `--rev` and `--reset` options of `dvc exp run` (see also the [execution]: /doc/user-guide/experiment-management/running-experiments#checkpoint-experiments -
- -### ⚙ī¸ How are checkpoints captured? +
Instead of a single reference like [regular experiments], checkpoint experiments have multiple commits under the custom Git reference (in `.git/refs/exps`), @@ -41,9 +39,7 @@ dataset. https://youtu.be/PcDo-hCvYpw -
- -### ⚙ī¸ Setting up the project +
You can follow along with the steps here or you can clone the repo directly from GitHub and play with it. To clone the repo, run the following commands. diff --git a/content/docs/user-guide/experiment-management/experiments-overview.md b/content/docs/user-guide/experiment-management/experiments-overview.md index 0eff5fbe53..3066616a4f 100644 --- a/content/docs/user-guide/experiment-management/experiments-overview.md +++ b/content/docs/user-guide/experiment-management/experiments-overview.md @@ -11,9 +11,7 @@ with temporary commits and branches. [run]: /doc/user-guide/experiment-management/running-experiments -
- -### ⚙ī¸ How does DVC track experiments? +
Experiments are custom [Git references](/blog/experiment-refs) (found in `.git/refs/exps`) with one or more commits based on `HEAD`. These commits are diff --git a/content/docs/user-guide/experiment-management/running-experiments.md b/content/docs/user-guide/experiment-management/running-experiments.md index 7269573e8e..c70afdc46b 100644 --- a/content/docs/user-guide/experiment-management/running-experiments.md +++ b/content/docs/user-guide/experiment-management/running-experiments.md @@ -112,9 +112,7 @@ $ dvc exp run --queue -S units=256 Queued experiment '4109ead' for future execution. ``` -
- -### How are experiments queued? +
Queued experiments are created similarly to
[Git stash](https://www.git-scm.com/docs/git-stash). The last experiment queued
@@ -138,9 +136,7 @@ Their execution happens outside your workspace in temporary
directories for isolation, so each experiment is derived from the workspace at
the time it was queued.

-<details>
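Building on the `-S units=256` example above, a queued batch can then be executed in parallel; a sketch (parameter values are illustrative):

```dvc
$ dvc exp run --queue -S units=64
$ dvc exp run --queue -S units=128
$ dvc exp run --run-all --jobs 2
```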
- -### How are experiments isolated? +
DVC creates a copy of the experiment's original workspace in `.dvc/tmp/exps/` and runs it there. All workspaces share the single project cache, diff --git a/content/docs/user-guide/external-dependencies.md b/content/docs/user-guide/external-dependencies.md index 15809de746..2b0f381eb3 100644 --- a/content/docs/user-guide/external-dependencies.md +++ b/content/docs/user-guide/external-dependencies.md @@ -40,9 +40,7 @@ downloads a file from an external location, on all the supported location types. > See the [Remote alias example](#example-using-dvc-remote-aliases) for info. on > using remote locations that require manual authentication setup. -
- -### Amazon S3 +
```dvc $ dvc run -n download_file \ @@ -53,9 +51,7 @@ $ dvc run -n download_file \
-
- -### Microsoft Azure Blob Storage +
```dvc $ dvc run -n download_file \ @@ -70,9 +66,7 @@ $ dvc run -n download_file \
-
- -### Google Cloud Storage +
```dvc $ dvc run -n download_file \ @@ -83,9 +77,7 @@ $ dvc run -n download_file \
-
- -### SSH +
```dvc $ dvc run -n download_file \ @@ -102,9 +94,7 @@ Please check that you are able to connect both ways with tools like `ssh` and
-
- -### HDFS +
```dvc $ dvc run -n download_file \ @@ -116,9 +106,7 @@ $ dvc run -n download_file \
-
- -### HTTP +
> Including HTTPS

```dvc
$ dvc run -n download_file \
@@ -131,9 +119,7 @@ $ dvc run -n download_file \

</details>
-
- -### local file system paths +
```dvc $ dvc run -n download_file \ diff --git a/content/docs/user-guide/managing-external-data.md b/content/docs/user-guide/managing-external-data.md index 63621dc9e5..0fda05edd5 100644 --- a/content/docs/user-guide/managing-external-data.md +++ b/content/docs/user-guide/managing-external-data.md @@ -92,9 +92,7 @@ types: > setup, use the special `remote://` URL format in step 2. For example: > `dvc add --external remote://myxcache/existing-data`. -
- -### Amazon S3 +
```dvc $ dvc remote add s3cache s3://mybucket/cache @@ -110,9 +108,7 @@ $ dvc run -d data.txt \
-
- -### SSH +
```dvc $ dvc remote add sshcache ssh://user@example.com/cache @@ -134,9 +130,7 @@ Please check that you are able to connect both ways with tools like `ssh` and
-
- -### HDFS +
```dvc $ dvc remote add hdfscache hdfs://user@example.com/cache @@ -157,9 +151,7 @@ it. So systems like Hadoop, Hive, and HBase are supported!
-
- -### WebHDFS +
```dvc $ dvc remote add webhdfscache webhdfs://user@example.com/cache @@ -176,9 +168,7 @@ $ dvc run -d data.txt \
-
- -### local file system paths +
The default cache is in `.dvc/cache`, so there is no need to set a custom cache location for local paths outside of your project. diff --git a/plugins/gatsby-theme-iterative-docs/src/components/Documentation/Markdown/index.tsx b/plugins/gatsby-theme-iterative-docs/src/components/Documentation/Markdown/index.tsx index 27fd63d154..4c72a00f38 100644 --- a/plugins/gatsby-theme-iterative-docs/src/components/Documentation/Markdown/index.tsx +++ b/plugins/gatsby-theme-iterative-docs/src/components/Documentation/Markdown/index.tsx @@ -24,37 +24,56 @@ import { useLocation } from '@reach/router' import GithubSlugger from 'github-slugger' -const Details: React.FC<{ slugger: GithubSlugger }> = ({ +const Details: React.FC<{ slugger: GithubSlugger; title: string }> = ({ + title, slugger, children }) => { const [isOpen, setIsOpen] = useState(false) const location = useLocation() + let trigger - const filteredChildren: ReactNode[] = ( - children as Array<{ props: { children: ReactNode } } | string> - ).filter(child => child !== '\n') - const firstChild = filteredChildren[0] as JSX.Element + if (!title) { + const filteredChildren: ReactNode[] = ( + children as Array<{ props: { children: ReactNode } } | string> + ).filter(child => child !== '\n') + const firstChild = filteredChildren[0] as JSX.Element - if (!/^h.$/.test(firstChild.type)) { - throw new Error('The first child of a details element must be a heading!') - } + if (!/^h.$/.test(firstChild.type)) { + throw new Error( + 'Either provide title as props to details element or the first child of a details element must be a heading!' + ) + } + + /* + To work around auto-linked headings, the last child of the heading node + must be removed. The only way around this is the change the autolinker, + which we currently have as an external package. + */ + const triggerChildren: ReactNode[] = firstChild.props.children.slice( + 0, + firstChild.props.children.length - 1 + ) as ReactNode[] - /* - To work around auto-linked headings, the last child of the heading node - must be removed. The only way around this is the change the autolinker, - which we currently have as an external package. - */ - const triggerChildren: ReactNode[] = firstChild.props.children.slice( - 0, - firstChild.props.children.length - 1 - ) as ReactNode[] + title = (triggerChildren as any[]).reduce((acc, cur) => { + return (acc += + typeof cur === 'string' + ? cur + : typeof cur === 'object' + ? cur?.props?.children?.toString() + : '') + }, '') - let slug = slugger.slug(triggerChildren.toString()) - if (slug[0] === 'ī¸') { - slug = slug.slice(1) + trigger = triggerChildren + children = filteredChildren.slice(1) + } else { + title = title.trim() + trigger = title } - const id = slug.startsWith('-') ? slug.slice(1) : slug + + let slug = slugger.slug(title) + slug = slug.startsWith('-') ? slug.slice(1) : slug + const id = slug.endsWith('-') ? slug.slice(0, -1) : slug useEffect(() => { if (location.hash === `#${id}`) { @@ -66,25 +85,17 @@ const Details: React.FC<{ slugger: GithubSlugger }> = ({ } }, [location.hash]) - /* - Collapsible's trigger type wants ReactElement, so we force a TS cast from - ReactNode here. - */ return (
- + - {filteredChildren.slice(1)} + {children}
)
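With this change, a docs page can pass the trigger text explicitly or keep the legacy heading form; a hypothetical sketch of both (section names are illustrative):

```md
<!-- New form: the title attribute drives the trigger text and the slug -->
<details title="Click and expand to set up the project">

Body content here.

</details>

<!-- Legacy form: the first child must still be a heading -->
<details>

### Click and expand to set up the project

Body content here.

</details>
```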