ref: create Remote Reference (config) (#4264)
* ref: start Remote Reference (config)

* Restyled by prettier (#4265)

Co-authored-by: Restyled.io <[email protected]>

* guide: move Remote Storage ref into Data Mgmt

* start: links to new Remotes guide and some typo fixes

* guide: finalize S3 storage page and remove repeated content from cmd refs (link to guide)

* guide: move "local remotes" to Remotes (index page) and update admonitions and links

* ref: remove S3 examples

* guide: emphasize that remotes use regular cloud storage config

* Update content/docs/user-guide/data-management/remote-storage/amazon-s3.md

* guide: drop `worktree` cloud versioning from Remotes Config

per #4264 (comment)

* guide: move cloud versioning near the top of Remote Config

per #4264 (review)

* Update content/docs/user-guide/data-management/remote-storage/amazon-s3.md

* Update content/docs/user-guide/data-management/remote-storage/index.md

* Restyled by prettier (#4331)

Co-authored-by: Restyled.io <[email protected]>

* Update content/docs/user-guide/data-management/remote-storage/index.md

---------

Co-authored-by: restyled-io[bot] <32688539+restyled-io[bot]@users.noreply.github.com>
Co-authored-by: Restyled.io <[email protected]>
Co-authored-by: Dave Berenbaum <[email protected]>
4 people authored Feb 23, 2023
1 parent e68f3fa commit 9bf61df
Showing 10 changed files with 472 additions and 602 deletions.
20 changes: 13 additions & 7 deletions content/docs/command-reference/config.md
@@ -250,9 +250,8 @@ location. A [DVC remote](/doc/command-reference/remote) name is used (instead of
the URL) because often it's necessary to configure authentication or other
connection settings, and configuring a remote is the way that can be done.

- `cache.local` - name of a _local remote_ to use as external cache (refer to
`dvc remote` for more info. on "local remotes".) This will overwrite the value
in `cache.dir` (see `dvc cache dir`).
- `cache.local` - name of a [local remote] to use as external cache. This will
overwrite the value in `cache.dir` (see `dvc cache dir`).

- `cache.s3` - name of an Amazon S3 remote to use as external cache.

@@ -265,10 +264,17 @@ connection settings, and configuring a remote is the way that can be done.
- `cache.webhdfs` - name of an HDFS remote with WebHDFS enabled to use as
external cache.

> ⚠️ Avoid using the same [remote storage](/doc/command-reference/remote) used
> for `dvc push` and `dvc pull` as external cache, because it may cause file
> hash overlaps: the hash of an external <abbr>output</abbr> could collide with
> that of a local file with different content.
<admon type="warn">

Avoid using the same [remote storage](/doc/command-reference/remote) used for
`dvc push` and `dvc pull` as external cache, because it may cause file hash
overlaps: the hash of an external <abbr>output</abbr> could collide with that
of a local file with different content.

</admon>

[local remote]:
/doc/user-guide/data-management/remote-storage#file-systems-local-remotes

### exp

146 changes: 7 additions & 139 deletions content/docs/command-reference/remote/add.md
@@ -44,7 +44,7 @@ A [default remote] is expected by `dvc push`, `dvc pull`, `dvc status`,
</admon>

The remote `name` (required) is used to identify the remote and must be unique.
DVC will determine the [type of remote](#supported-storage-types) based on the
DVC will determine the [storage type](#supported-storage-types) based on the
provided `url` (also required), a URL or path for the location.

<admon type="info">
@@ -121,60 +121,15 @@ $ pip install "dvc[s3]"

## Supported storage types

The following are the types of remote storage (protocols) supported:
The following are the supported types of storage protocols and platforms.

<details>

### Amazon S3

> 💡 Before adding an S3 remote, be sure to
> [Create a Bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html).
```cli
$ dvc remote add -d myremote s3://mybucket/path
```

By default, DVC authenticates using your AWS CLI
[configuration](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html)
(if set). This uses the default AWS credentials file. To use a custom
authentication method, use the parameters described in `dvc remote modify`.

Make sure you have the following permissions enabled: `s3:ListBucket`,
`s3:GetObject`, `s3:PutObject`, `s3:DeleteObject`. This enables the S3 API
methods that are performed by DVC (`list_objects_v2` or `list_objects`,
`head_object`, `upload_file`, `download_file`, `delete_object`, `copy`).

> See `dvc remote modify` for a full list of S3 parameters.
</details>
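
The four permissions listed in the removed S3 section can be written as an IAM policy document. The following is a minimal sketch, not taken from the DVC docs: the helper name `s3_policy` is hypothetical, `mybucket` is the example bucket from the snippet above, and the policy assumes `s3:ListBucket` is granted on the bucket ARN while the object-level actions are granted on the objects inside it.

```python
import json


def s3_policy(bucket: str) -> dict:
    """Build an IAM policy granting the four permissions named in the text.

    Hypothetical helper for illustration; adjust the resources to your setup.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reading, writing, and deleting apply to objects in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


policy = s3_policy("mybucket")
print(json.dumps(policy, indent=2))
```

Splitting the statements this way keeps the bucket-level and object-level grants separate; a narrower object resource (for example a key prefix) may be preferable if the remote only uses part of the bucket.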

<details>

### S3-compatible storage

For object storage that supports an S3-compatible API (e.g.
[Minio](https://min.io/),
[DigitalOcean Spaces](https://www.digitalocean.com/products/spaces/),
[IBM Cloud Object Storage](https://www.ibm.com/cloud/object-storage) etc.),
configure the `endpointurl` parameter. For example, let's set up a DigitalOcean
"space" (equivalent to a bucket in S3) called `mystore` that uses the `nyc3`
region:

```cli
$ dvc remote add -d myremote s3://mystore/path
$ dvc remote modify myremote endpointurl \
https://nyc3.digitaloceanspaces.com
```
### Cloud providers

By default, DVC authenticates using your AWS CLI
[configuration](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html)
(if set). This uses the default AWS credentials file. To use a custom
authentication method, use the parameters described in `dvc remote modify`.
- [Amazon S3] (AWS) and [S3-compatible] e.g. MinIO

Any other S3 parameter can also be set for S3-compatible storage. Whether
they're effective depends on each storage platform.

</details>
[amazon s3]: /doc/user-guide/data-management/remote-storage/amazon-s3
[s3-compatible]:
/doc/user-guide/data-management/remote-storage/amazon-s3#s3-compatible-servers-non-amazon

<details>

@@ -396,90 +351,3 @@ $ dvc remote add -d myremote \
> See `dvc remote modify` for a full list of WebDAV parameters.
</details>

<details>

### local remote

A "local remote" is a directory in the machine's file system. Not to be confused
with the `--local` option of `dvc remote` (and other config) commands!

> While the term may seem contradictory, it doesn't have to be. The "local" part
> refers to the type of location where the storage is: another directory in the
> same file system. "Remote" is how we call storage for <abbr>DVC
> projects</abbr>. It's essentially a local backup for data tracked by DVC.
Using an absolute path (recommended):

```cli
$ dvc remote add -d myremote /tmp/dvcstore
$ cat .dvc/config
...
['remote "myremote"']
url = /tmp/dvcstore
...
```

> Note that the absolute path `/tmp/dvcstore` is saved as is.
Using a relative path. It will be resolved against the current working
directory, but saved **relative to the config file location**:

```cli
$ dvc remote add -d myremote ../dvcstore
$ cat .dvc/config
...
['remote "myremote"']
url = ../../dvcstore
...
```

> Note that `../dvcstore` has been resolved relative to the `.dvc/` dir,
> resulting in `../../dvcstore`.
</details>
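
The path handling described in the removed "local remote" section (absolute paths saved as is, relative paths resolved against the working directory and then stored relative to the config file location) can be reproduced with plain path arithmetic. This is a rough sketch of the described behavior, not DVC's actual implementation: the function name `saved_url` and the project root `/home/user/project` are assumptions made for illustration.

```python
import posixpath


def saved_url(url: str, cwd: str, config_dir: str) -> str:
    """Return the value that would be stored in the config file for `url`."""
    if posixpath.isabs(url):
        # Absolute paths are stored unchanged.
        return url
    # Resolve the relative path against the current working directory...
    resolved = posixpath.normpath(posixpath.join(cwd, url))
    # ...then express it relative to the directory that holds the config file.
    return posixpath.relpath(resolved, start=config_dir)


project = "/home/user/project"  # hypothetical project root
config_dir = project + "/.dvc"  # where .dvc/config lives

print(saved_url("/tmp/dvcstore", project, config_dir))  # /tmp/dvcstore
print(saved_url("../dvcstore", project, config_dir))    # ../../dvcstore
```

The second call matches the example above: `../dvcstore` given from the project root becomes `../../dvcstore` once it is expressed relative to `.dvc/`.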

## Example: Customize an S3 remote

Add an Amazon S3 remote as the _default_ (via the `-d` option), and modify its
region.

> 💡 Before adding an S3 remote, be sure to
> [Create a Bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html).
```cli
$ dvc remote add -d myremote s3://mybucket/path
Setting 'myremote' as a default remote.
$ dvc remote modify myremote region us-east-2
```

The <abbr>project</abbr>'s config file (`.dvc/config`) now looks like this:

```ini
['remote "myremote"']
url = s3://mybucket/path
region = us-east-2
[core]
remote = myremote
```

The list of remotes should now be:

```cli
$ dvc remote list
myremote s3://mybucket/path
```

You can overwrite existing remotes using `-f` with `dvc remote add`:

```cli
$ dvc remote add -f myremote s3://mybucket/another-path
```

List remotes again to view the updated remote:

```cli
$ dvc remote list
myremote s3://mybucket/another-path
```
12 changes: 4 additions & 8 deletions content/docs/command-reference/remote/index.md
@@ -58,16 +58,12 @@ default). Alternatively, the config files can be edited manually.

## Example: Add a default local remote

<details>
<admon type="tip">

### What is a "local remote" ?
Learn more about
[local remotes](/doc/user-guide/data-management/remote-storage#file-systems-local-remotes).

While the term may seem contradictory, it doesn't have to be. The "local" part
refers to the type of location where the storage is: another directory in the
same file system. "Remote" is what we call storage for <abbr>DVC
projects</abbr>. It's essentially a local backup for data tracked by DVC.

</details>
</admon>

We use the `-d` (`--default`) option of `dvc remote add` for this:

16 changes: 4 additions & 12 deletions content/docs/command-reference/remote/list.md
@@ -40,18 +40,7 @@ and local config files (in that order).

## Examples

For simplicity, let's add a default local remote:

<details>

### What is a "local remote" ?

While the term may seem contradictory, it doesn't have to be. The "local" part
refers to the type of location where the storage is: another directory in the
same file system. "Remote" is how we call storage for <abbr>DVC projects</abbr>.
It's essentially a local backup for data tracked by DVC.

</details>
For simplicity, let's add a default [local remote]:

```cli
$ dvc remote add -d myremote /path/to/remote
@@ -66,3 +55,6 @@ myremote /path/to/remote
```

The list will also include any previously added remotes.

[local remote]:
/doc/user-guide/data-management/remote-storage#file-systems-local-remotes