guide: remote storage #4058
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,13 @@ | ||
# remote add | ||
|
||
Add a new [data remote](/doc/command-reference/remote). | ||
Register a new [DVC remote](/doc/user-guide/data-management/remote-storage). | ||
|
||
> Depending on your storage type, you may also need `dvc remote modify` to | ||
> provide credentials and/or configure other remote parameters. | ||
<admon type="tip"> | ||
|
||
|
||
Depending on your storage type, you may also need `dvc remote modify` to provide | ||
credentials and/or configure other remote parameters. | ||
|
||
</admon> | ||
|
||
## Synopsis | ||
|
||
|
@@ -26,9 +30,9 @@ for the first remote): | |
|
||
```ini | ||
['remote "myremote"'] | ||
url = /tmp/dvcstore | ||
url = /tmp/dvcstore | ||
[core] | ||
remote = myremote | ||
remote = myremote | ||
``` | ||
|
||
> 💡 Default remotes are expected by commands that accept a `-r`/`--remote` | ||
|
@@ -379,10 +383,10 @@ Using an absolute path (recommended): | |
```cli | ||
$ dvc remote add -d myremote /tmp/dvcstore | ||
$ cat .dvc/config | ||
... | ||
['remote "myremote"'] | ||
url = /tmp/dvcstore | ||
... | ||
... | ||
['remote "myremote"'] | ||
url = /tmp/dvcstore | ||
... | ||
``` | ||
|
||
> Note that the absolute path `/tmp/dvcstore` is saved as is. | ||
|
@@ -393,10 +397,10 @@ directory, but saved **relative to the config file location**: | |
```cli | ||
$ dvc remote add -d myremote ../dvcstore | ||
$ cat .dvc/config | ||
... | ||
['remote "myremote"'] | ||
url = ../../dvcstore | ||
... | ||
... | ||
['remote "myremote"'] | ||
url = ../../dvcstore | ||
... | ||
``` | ||
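As a rough illustration of the session above, the saved `url` can be reproduced with plain path arithmetic. This is only a sketch under stated assumptions, not DVC's actual implementation: the helper `saved_remote_path` and the `/home/user/project` location are hypothetical, and POSIX-style paths are assumed.

```python
import posixpath


def saved_remote_path(user_path, cwd, config_dir):
    """Return the path as it would be written to the config file.

    Absolute paths are kept as is; relative paths are interpreted from
    `cwd` and re-expressed relative to `config_dir` (hypothetical helper).
    """
    if posixpath.isabs(user_path):
        return user_path
    absolute = posixpath.normpath(posixpath.join(cwd, user_path))
    return posixpath.relpath(absolute, start=config_dir)


project = "/home/user/project"  # assumed project root, for illustration only
config_dir = posixpath.join(project, ".dvc")

relative_result = saved_remote_path("../dvcstore", project, config_dir)
absolute_result = saved_remote_path("/tmp/dvcstore", project, config_dir)
print(relative_result)
print(absolute_result)
```

The relative case gains one extra `..` because the config file lives one level below the project root, in `.dvc/`.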
|
||
> Note that `../dvcstore` has been resolved relative to the `.dvc/` dir, | ||
|
@@ -423,10 +427,10 @@ The <abbr>project</abbr>'s config file (`.dvc/config`) now looks like this: | |
|
||
```ini | ||
['remote "myremote"'] | ||
url = s3://mybucket/path | ||
region = us-east-2 | ||
url = s3://mybucket/path | ||
region = us-east-2 | ||
[core] | ||
remote = myremote | ||
remote = myremote | ||
``` | ||
|
||
The list of remotes should now be: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -125,9 +125,10 @@ | |
"source": false, | ||
"children": [ | ||
"large-dataset-optimization", | ||
"remote-storage", | ||
"cloud-versioning", | ||
"importing-external-data", | ||
"managing-external-data", | ||
"cloud-versioning" | ||
"managing-external-data" | ||
Review comment on lines 126 to +131: Reordered this a little bit. |
||
] | ||
}, | ||
{ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,113 @@ | ||
# Remote Storage | ||
|
||
_DVC remotes_ provide optional, additional storage to back up and share your data | ||
and ML models. For example, you can download data artifacts created by colleagues | ||
without spending time and resources to regenerate them locally. See `dvc push` | ||
and `dvc pull`. | ||
|
||
<admon type="info"> | ||
|
||
DVC remotes are similar to [Git remotes], but for <abbr>cached</abbr> data. | ||
|
||
[git remotes]: https://git-scm.com/book/en/v2/Git-Basics-Working-with-Remotes | ||
|
||
</admon> | ||
|
||
This is somewhat like GitHub or GitLab providing hosting for source code | ||
repositories. However, DVC does not provide or recommend a specific storage | ||
service. Instead, it adopts a bring-your-own-platform approach, supporting a | ||
wide variety of [storage types](#supported-storage-types). | ||
|
||
The main uses of remote storage are: | ||
|
||
- Synchronize DVC-tracked data (previously <abbr>cached</abbr>). | ||
- Centralize or distribute large file storage for sharing and collaboration. | ||
- Back up different versions of your data and models. | ||
- Save space in your working environment (by deleting pushed files/directories). | ||
|
||
## Configuration | ||
|
||
You can set up one or more remote storage locations, mainly with the | ||
`dvc remote add` and `dvc remote modify` commands. These read and write to the | ||
[`remote`] section of the project's configuration file (`.dvc/config`), which | ||
you could edit manually as well. | ||
|
||
Typically, you'll first register a DVC remote by adding its name and URL (or | ||
file path), e.g.: | ||
|
||
```cli | ||
$ dvc remote add mybucket s3://my-bucket | ||
``` | ||
|
||
Then, you'll usually need or want to configure the remote's authentication | ||
credentials or other properties. For example: | ||
|
||
```cli | ||
$ dvc remote modify --local \ | ||
mybucket credentialpath ~/.aws/alt | ||
|
||
$ dvc remote modify mybucket connect_timeout 300 | ||
``` | ||
|
||
<admon type="warn"> | ||
|
||
Make sure to use the `--local` flag when writing secrets to configuration. This | ||
writes them to a second config file, `.dvc/config.local`, which is ignored by Git, | ||
so your secrets never reach the repository. See `dvc config` for more info. | ||
|
||
This also means each copy of the <abbr>DVC repository</abbr> may have to | ||
re-configure remote storage authentication. | ||
|
||
</admon> | ||
|
||
<details> | ||
|
||
### Click to see the resulting config files. | ||
|
||
```ini | ||
# .dvc/config | ||
['remote "mybucket"'] | ||
url = s3://my-bucket | ||
connect_timeout = 300 | ||
``` | ||
|
||
```ini | ||
# .dvc/config.local | ||
['remote "mybucket"'] | ||
credentialpath = ~/.aws/alt | ||
``` | ||
|
||
```ini | ||
# .gitignore | ||
.dvc/config.local | ||
``` | ||
|
||
</details> | ||
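To make the split between the two files concrete, the following sketch layers the local file over the project file and shows the combined settings for `mybucket`. It is an illustration only: it uses Python's standard `configparser` rather than DVC's own config loader, and the helper `effective_config` is a hypothetical name. The file contents are copied from the example above.

```python
import configparser

# Contents of .dvc/config (committed to Git)
PROJECT_CONFIG = """
['remote "mybucket"']
url = s3://my-bucket
connect_timeout = 300
"""

# Contents of .dvc/config.local (ignored by Git)
LOCAL_CONFIG = """
['remote "mybucket"']
credentialpath = ~/.aws/alt
"""


def effective_config(*layers):
    """Merge config texts in order; later layers override earlier ones."""
    parser = configparser.ConfigParser(interpolation=None)
    for text in layers:
        parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


merged = effective_config(PROJECT_CONFIG, LOCAL_CONFIG)
# configparser keeps the quotes as part of the section name
remote = merged["'remote \"mybucket\"'"]
print(remote)
```

A fresh clone only receives the first layer, which is why authentication may need to be configured again in each copy of the repository.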
|
||
Finally, you can `git commit` the changes to share the general configuration of | ||
your remote (`.dvc/config`) via the Git repo. | ||
|
||
[`remote`]: /doc/command-reference/config#remote | ||
|
||
## Supported storage types | ||
|
||
> See more [details](/doc/command-reference/remote/add#supported-storage-types). | ||
|
||
### Cloud providers | ||
|
||
- Amazon S3 (AWS) | ||
- S3-compatible, e.g. MinIO | ||
- Microsoft Azure Blob Storage | ||
- Google Drive | ||
- Google Cloud Storage (GCP) | ||
- Aliyun OSS | ||
|
||
### Self-hosted / On-premises | ||
|
||
- SSH servers; like `scp` | ||
- HDFS & WebHDFS | ||
- HTTP | ||
- WebDAV | ||
- Local directories, mounted drives; like `rsync` | ||
> Includes network resources, e.g. network-attached storage (NAS) or other | ||
> external devices |
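The storage type is normally implied by the URL or path given to `dvc remote add`. The sketch below maps a remote URL to one of the types listed above. It is a hedged illustration, not part of DVC: `storage_type` and `SCHEME_TO_STORAGE` are hypothetical names, and the URL schemes are assumed from the `dvc remote add` reference rather than taken from this guide (only `s3://` appears here).

```python
from urllib.parse import urlsplit

# Assumed URL schemes for each storage type listed in this guide
SCHEME_TO_STORAGE = {
    "s3": "Amazon S3 or S3-compatible",
    "azure": "Microsoft Azure Blob Storage",
    "gdrive": "Google Drive",
    "gs": "Google Cloud Storage",
    "oss": "Aliyun OSS",
    "ssh": "SSH server",
    "hdfs": "HDFS",
    "webhdfs": "WebHDFS",
    "http": "HTTP",
    "https": "HTTP",
    "webdav": "WebDAV",
    "webdavs": "WebDAV",
}


def storage_type(url):
    """Classify a remote URL or file path by its scheme."""
    scheme = urlsplit(url).scheme
    if not scheme:
        # No scheme: treat as a local directory or mounted drive
        return "Local directory or mounted drive"
    return SCHEME_TO_STORAGE.get(scheme, "Unknown")


for example in ("s3://my-bucket", "/tmp/dvcstore", "../dvcstore"):
    print(example, "->", storage_type(example))
```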
Review comment: Updated `core.remote` description, linked to/from Remotes guide, and added a simple example.