guide: more links to Remote Storage page #4262

Merged
merged 113 commits into from
Feb 20, 2023
7350938
guide: draft structure of Data Mgmt and
jorgeorpinel Oct 13, 2022
203f6a6
guide: full text for draft intro to DM
jorgeorpinel Oct 14, 2022
90eaa5d
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Oct 17, 2022
eb246bb
guide: hide cloud versioning info
jorgeorpinel Oct 17, 2022
a3687ec
guide: clarify Data Mgmt parts and
jorgeorpinel Oct 18, 2022
fad0bad
guide: add figure drafts to Data Mgmt
jorgeorpinel Oct 19, 2022
4e3c3da
guide: SCM->VC (Data Mgmt)
jorgeorpinel Oct 19, 2022
7f02c15
guide: update 2 figs and add 1 more (Data Mgmt)
jorgeorpinel Oct 19, 2022
f41d16e
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Oct 20, 2022
3a9a045
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Oct 20, 2022
df40521
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Oct 20, 2022
adc13ee
Merge branch 'guide/data-mgmt-flows' into guide/data-mgmt/remote-config
jorgeorpinel Oct 21, 2022
c0b92f1
guide: roll back unrelated changes
jorgeorpinel Oct 21, 2022
636872a
Merge branch 'guide/data-mgmt-flows' into guide/data-mgmt/remote-config
jorgeorpinel Oct 22, 2022
c2303c0
guide: mention clouds first (DM) and
jorgeorpinel Oct 22, 2022
62997ab
guide: flatten DM index
jorgeorpinel Oct 22, 2022
fc74c53
guide: udpates to DM/ DV
jorgeorpinel Oct 22, 2022
8c40a03
guide: add DM/ Data Versioning page
jorgeorpinel Oct 22, 2022
1a8ca61
guide: update outdated link
jorgeorpinel Oct 22, 2022
27be87f
guide: revert more unrelatedly chaqnged files
jorgeorpinel Oct 22, 2022
aaee7af
guide: remove unused ref link
jorgeorpinel Oct 22, 2022
dd99f21
Merge branch 'guide/data-mgmt-flows' into guide/data-mgmt/remote-config
jorgeorpinel Oct 22, 2022
118e3eb
guide: DM/ Remote Storage (not just Setup) and
jorgeorpinel Oct 22, 2022
24c331a
guide: remove a comment
jorgeorpinel Oct 22, 2022
ff85dcc
Merge branch 'guide/data-mgmt-flows' into guide/data-mgmt/remote-config
jorgeorpinel Oct 22, 2022
266a8f7
guide: draft for DM/ Remote Storage content
jorgeorpinel Oct 22, 2022
b04f20a
ref: expand config.remote and link to/from Remotes guide
jorgeorpinel Oct 23, 2022
1c77de4
ref: fix remote config file examples
jorgeorpinel Oct 23, 2022
8e7c320
guide: complete Remote Config section and
jorgeorpinel Oct 23, 2022
bc9f588
ref: rewrite remote add and modify Descs
jorgeorpinel Oct 24, 2022
9b904f5
guide: complete list of supported storage types
jorgeorpinel Oct 24, 2022
c80f8ed
Merge branch 'guide/data-mgmt/remote-config' into guide/data-mgmt/rem…
jorgeorpinel Oct 24, 2022
33f46fc
ref: rewrite remote index page from
jorgeorpinel Oct 24, 2022
3b5e520
guide: clarify `remote modify` phrase in
jorgeorpinel Oct 24, 2022
abf3a87
Merge branch 'guide/data-mgmt/remote-config' into guide/data-mgmt/rem…
jorgeorpinel Oct 24, 2022
73e2f55
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Oct 27, 2022
7fc7fa3
Merge branch 'guide/data-mgmt-flows' into guide/data-mgmt/remote-config
jorgeorpinel Oct 27, 2022
d619d6b
Merge branch 'guide/data-mgmt/remote-config' into guide/data-mgmt/rem…
jorgeorpinel Oct 27, 2022
ff7e666
Update content/docs/user-guide/data-management/data-versioning.md
Oct 27, 2022
c0026fc
guide: update versioning config
jorgeorpinel Oct 27, 2022
71b599c
guide: don't call remote storage "additional" here
jorgeorpinel Oct 27, 2022
9774855
guide: pull -> download (DM/ RS intro)
jorgeorpinel Oct 27, 2022
e5c6f13
guide: remove "optional" from Remote Storage nav & title
jorgeorpinel Oct 27, 2022
ec1af6d
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Oct 28, 2022
2f31bb6
guide: splits and notes around Data Mgmt index page
jorgeorpinel Oct 28, 2022
a84c442
guide: Data Mgmt intro + note updates
jorgeorpinel Oct 29, 2022
ab55389
guide: draft of all contents +
jorgeorpinel Oct 29, 2022
31d5288
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Nov 1, 2022
a13f989
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Nov 2, 2022
601c99e
guide: small impros to Data Mgmt
jorgeorpinel Nov 2, 2022
a8bad84
guide: rewrite Data Mgmt index in before/after form
jorgeorpinel Nov 3, 2022
c8cc17b
guide: add draft figure for Data Mgmt
jorgeorpinel Nov 4, 2022
3cb84cb
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Nov 8, 2022
a13cb0f
guide: simplify/refocus data mgmt index
jorgeorpinel Nov 8, 2022
e3ba70b
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Nov 17, 2022
c29d9ec
work around commented header bug
jorgeorpinel Nov 17, 2022
875fba3
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Nov 23, 2022
831ad1d
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Nov 25, 2022
8ddda9c
guide: drop DM/ DV page
jorgeorpinel Nov 25, 2022
28322e5
guide: rewrite DM intro and
jorgeorpinel Nov 25, 2022
179d172
guide: use DM table instead of figure for now
jorgeorpinel Nov 25, 2022
d979a5e
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Nov 30, 2022
74bc156
guide: rewrite Data Mgmt story
jorgeorpinel Nov 30, 2022
e138096
guide: add draft figures to Data Mgmt
jorgeorpinel Nov 30, 2022
f904038
guide: simplify Data Mgmt story and benefits
jorgeorpinel Dec 1, 2022
e1772ea
guide: remove unused images (DM)
jorgeorpinel Dec 1, 2022
cc0390e
guide: update Data Mgmt figures (v1)
jorgeorpinel Dec 2, 2022
4ee3223
guide: rewrite text of Data Mgmt index
jorgeorpinel Dec 8, 2022
149599b
Merge branch 'main' of github.com:iterative/dvc.org into guide/data-m…
rogermparent Dec 8, 2022
f2acb66
guide: update Data Mgmt figures
jorgeorpinel Dec 8, 2022
723eb50
guide: iterate on Data Mgmt again
jorgeorpinel Dec 14, 2022
4b67b64
guide: update Data Mgmt figs
jorgeorpinel Dec 14, 2022
9eb7143
guide: more supporting info about Data Mgmt
jorgeorpinel Dec 18, 2022
e598839
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Dec 21, 2022
dd4466e
guide: update figures (much more concrete) and
jorgeorpinel Dec 21, 2022
d637179
guide: edits to How it works (Data Mgmt)
jorgeorpinel Dec 21, 2022
c007817
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Dec 22, 2022
5a0fd57
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Dec 22, 2022
3eb81ff
guide: update Data Mgmt figures
jorgeorpinel Dec 22, 2022
98e73ff
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Dec 23, 2022
67b1717
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Dec 27, 2022
f3af183
guide: emphaisze dataset versions in UG fig 1
jorgeorpinel Dec 27, 2022
206ce77
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Jan 4, 2023
075aaf3
guide: update Data Mgmt figures (with notes),
jorgeorpinel Jan 5, 2023
7377500
guide: more updates to text and figure styles,
jorgeorpinel Jan 5, 2023
baf5b4c
guide: update figures and text (Data Mgmt) ...
jorgeorpinel Jan 9, 2023
fb35df5
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Jan 11, 2023
4475f78
guide: Data Management text (section 1)
jorgeorpinel Jan 11, 2023
20fbaae
guide: Data Management (main text)
jorgeorpinel Jan 11, 2023
1da7b8a
guide: Data Management (secondary text)
jorgeorpinel Jan 12, 2023
61e2865
Merge branch 'guide/data-mgmt-flows' of github.com:iterative/dvc.org …
jorgeorpinel Jan 12, 2023
ed63127
guide: add DVC data mgmt technical diagram &
jorgeorpinel Jan 12, 2023
0109cf3
guide: update Data Mgmt text
jorgeorpinel Jan 18, 2023
77330cc
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Jan 18, 2023
956b03d
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Jan 19, 2023
7152ad3
guide: udpate text and 2nd figure (Data Mgmt)
jorgeorpinel Jan 19, 2023
f29da1e
guide: draft 2nd and 3rd figures
jorgeorpinel Jan 19, 2023
8f49a72
guide: rewrite Data Mgmt/ How it works &
jorgeorpinel Jan 20, 2023
f876c17
guide: update drafts of Data Mgmt figures 2, 3
jorgeorpinel Jan 20, 2023
ee3f721
guide: Data Mgmt improvements and
jorgeorpinel Jan 24, 2023
061a918
Merge branch 'main' into guide/data-mgmt-flows
jorgeorpinel Jan 24, 2023
ac50c94
Merge branch 'guide/data-mgmt-flows' into guide/data-mgmt/remote-config
jorgeorpinel Jan 24, 2023
d781fdd
guide: separate from Data Mgmt work
jorgeorpinel Jan 24, 2023
00f0993
Merge branch 'guide/data-mgmt/remote-config' into guide/data-mgmt/rem…
jorgeorpinel Jan 24, 2023
91a7384
Apply suggestions from code review
jorgeorpinel Jan 24, 2023
b07f81c
Merge branch main +
jorgeorpinel Jan 24, 2023
a10ac26
Merge branch 'main' into guide/data-mgmt/remote-storage-types
jorgeorpinel Jan 24, 2023
c2311dd
other: links to Remotes guide
jorgeorpinel Jan 25, 2023
2ba220d
install: Remote Storage guide links
jorgeorpinel Jan 25, 2023
6c59be0
start: Remote Storage guide links +
jorgeorpinel Jan 25, 2023
a4d934b
guide: links to Remote Storage page
jorgeorpinel Jan 25, 2023
2faf000
Merge branch 'main' into guide/data-mgmt/remote-storage-links-more
jorgeorpinel Feb 20, 2023
ffad035
Restyled by prettier (#4323)
restyled-io[bot] Feb 20, 2023
15 changes: 8 additions & 7 deletions content/docs/install/linux.md
@@ -29,10 +29,11 @@ Note that Python 3.8+ is needed to get the latest version of DVC.
$ pip install dvc
```

Depending on the type of the [remote storage](/doc/command-reference/remote) you
plan to use, you might need to install optional dependencies: `[s3]`,
`[gdrive]`, `[gs]`, `[azure]`, `[ssh]`, `[hdfs]`, `[webdav]`, `[oss]`. Use
`[all]` to include them all.
Depending on the type of the [remote storage] you plan to use, you might need to
install optional dependencies: `[s3]`, `[gdrive]`, `[gs]`, `[azure]`, `[ssh]`,
`[hdfs]`, `[webdav]`, `[oss]`. Use `[all]` to include them all.

[remote storage]: /doc/user-guide/data-management/remote-storage
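
As a minimal sketch of how the extras listed above combine into a single pip requirement (the helper name `dvc_requirement` is hypothetical and not part of DVC or pip), the snippet below only prints the install command rather than running it:

```shell
# Hypothetical helper: join the chosen extras with commas, e.g. "s3 ssh" -> dvc[s3,ssh].
# The subshell body keeps the IFS change local to this function.
dvc_requirement() (
  IFS=,
  printf 'dvc[%s]\n' "$*"
)

# Print the install command instead of running it. The requirement is quoted
# so the shell does not interpret the square brackets as a glob pattern.
echo "pip install \"$(dvc_requirement s3 ssh)\""
```

Passing `all` as the only argument would yield the `[all]` variant mentioned in the text.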

<details id="example-pip-with-support-for-amazon-s3-storage">

@@ -65,9 +66,9 @@ $ conda install -c conda-forge mamba # installs much faster than conda
$ mamba install -c conda-forge dvc
```

Depending on the type of the [remote storage](/doc/command-reference/remote) you
plan to use, you might need to install optional dependencies: `dvc-s3`,
`dvc-azure`, `dvc-gdrive`, `dvc-gs`, `dvc-oss`, `dvc-ssh`.
Depending on the type of the [remote storage] you plan to use, you might need to
install optional dependencies: `dvc-s3`, `dvc-azure`, `dvc-gdrive`, `dvc-gs`,
`dvc-oss`, `dvc-ssh`.

<details id="example-conda-with-support-for-amazon-s3-storage">

15 changes: 8 additions & 7 deletions content/docs/install/macos.md
@@ -59,10 +59,11 @@ Note that Python 3.8+ is needed to get the latest version of DVC.
$ pip install dvc
```

Depending on the type of the [remote storage](/doc/command-reference/remote) you
plan to use, you might need to install optional dependencies: `[s3]`,
`[gdrive]`, `[gs]`, `[azure]`, `[ssh]`, `[hdfs]`, `[webdav]`, `[oss]`. Use
`[all]` to include them all.
Depending on the type of the [remote storage] you plan to use, you might need to
install optional dependencies: `[s3]`, `[gdrive]`, `[gs]`, `[azure]`, `[ssh]`,
`[hdfs]`, `[webdav]`, `[oss]`. Use `[all]` to include them all.

[remote storage]: /doc/user-guide/data-management/remote-storage

<details id="example-pip-with-support-for-amazon-s3-storage">

@@ -90,9 +91,9 @@ $ conda install -c conda-forge mamba # installs much faster than conda
$ mamba install -c conda-forge dvc
```

Depending on the type of the [remote storage](/doc/command-reference/remote) you
plan to use, you might need to install optional dependencies: `dvc-s3`,
`dvc-azure`, `dvc-gdrive`, `dvc-gs`, `dvc-oss`, `dvc-ssh`.
Depending on the type of the [remote storage] you plan to use, you might need to
install optional dependencies: `dvc-s3`, `dvc-azure`, `dvc-gdrive`, `dvc-gs`,
`dvc-oss`, `dvc-ssh`.

<details id="example-conda-with-support-for-amazon-s3-storage">

14 changes: 8 additions & 6 deletions content/docs/install/windows.md
@@ -42,9 +42,11 @@ $ conda install -c conda-forge mamba # installs much faster than conda
$ mamba install -c conda-forge dvc
```

Depending on the type of the [remote storage](/doc/command-reference/remote) you
plan to use, you might need to install optional dependencies: `dvc-s3`,
`dvc-azure`, `dvc-gdrive`, `dvc-gs`, `dvc-oss`, `dvc-ssh`.
Depending on the type of the [remote storage] you plan to use, you might need to
install optional dependencies: `dvc-s3`, `dvc-azure`, `dvc-gdrive`, `dvc-gs`,
`dvc-oss`, `dvc-ssh`.

[remote storage]: /doc/user-guide/data-management/remote-storage

<details id="example-conda-with-support-for-amazon-s3-storage">

@@ -81,9 +83,9 @@ Note that Python 3.8+ is needed to get the latest version of DVC.
$ pip install dvc
```

Depending on the type of the [remote storage](/doc/command-reference/remote) you
plan to use, you might need to install optional dependencies: `[s3]`, `[azure]`,
`[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Use `[all]` to include them all.
Depending on the type of the [remote storage] you plan to use, you might need to
install optional dependencies: `[s3]`, `[azure]`, `[gdrive]`, `[gs]`, `[oss]`,
`[ssh]`. Use `[all]` to include them all.

<details id="example-pip-with-support-for-amazon-s3-storage">

17 changes: 12 additions & 5 deletions content/docs/start/data-management/data-and-model-access.md
@@ -22,10 +22,12 @@ a specific version of a model? Or reuse datasets across different projects?
<admon type="tip">

These questions tend to come up when you browse the files that DVC saves to
remote storage (e.g.
[remote storage] (e.g.
`s3://dvc-public/remote/get-started/fb/89904ef053f04d64eafcc3d70db673` 😱
instead of the original file name such as `model.pkl` or `data.xml`).

[remote storage]: /doc/user-guide/data-management/remote-storage

</admon>
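
The remote path in the tip above appears to be derived from the file's content hash rather than its name. A minimal sketch, assuming the layout visible in that example URL (the first two hex characters form a directory and the remaining characters form the file name; the variable names are illustrative):

```shell
# Content hash taken from the example URL above (directory and file name joined).
hash="fb89904ef053f04d64eafcc3d70db673"

# Assumed layout: the first 2 characters become a directory, the rest the file name.
prefix=$(printf '%s' "$hash" | cut -c1-2)
rest=$(printf '%s' "$hash" | cut -c3-)

remote_path="s3://dvc-public/remote/get-started/$prefix/$rest"
echo "$remote_path"
```

This layout may vary between DVC versions, so treat it as an illustration of content-addressed naming rather than a guaranteed path scheme.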

Remember those `.dvc` files `dvc add` generates? Those files (and `dvc.lock`,
@@ -86,10 +88,15 @@ bring in changes from the data source later using `dvc update`.

### 💡 Expand to see what happens under the hood.

> Note that the
> [dataset registry](https://github.com/iterative/dataset-registry) repository
> doesn't actually contain a `get-started/data.xml` file. Like `dvc get`,
> `dvc import` downloads from [remote storage](/doc/command-reference/remote).
<admon type="info">

The [dataset registry] repository doesn't actually contain a
`get-started/data.xml` file. Like `dvc get`, `dvc import` downloads from [remote
storage].

[dataset registry]: https://github.com/iterative/dataset-registry

</admon>

`.dvc` files created by `dvc import` have special fields, such as the data
source `repo` and `path` (under `deps`):
21 changes: 15 additions & 6 deletions content/docs/start/data-management/data-versioning.md
@@ -95,19 +95,28 @@ outs:
## Storing and sharing

You can upload DVC-tracked data or model files with `dvc push`, so they're
safely stored [remotely](/doc/command-reference/remote). This also means they
can be retrieved on other environments later with `dvc pull`. First, we need to
set up a remote storage location:
safely stored [remotely]. This also means they can be retrieved on other
environments later with `dvc pull`. First, we need to set up a remote storage
location:

[remotely]: /doc/user-guide/data-management/remote-storage

```cli
$ dvc remote add -d storage s3://mybucket/dvcstore
$ git add .dvc/config
$ git commit -m "Configure remote storage"
```

> DVC supports many remote storage types, including Amazon S3, SSH, Google
> Drive, Azure Blob Storage, and HDFS. See `dvc remote add` for more details and
> examples.
<admon type="info">

DVC supports many [remote storage types], including Amazon S3, SSH, Google
Drive, Azure Blob Storage, and HDFS. See `dvc remote add` for more details and
examples.

[remote storage types]:
/doc/user-guide/data-management/remote-storage#supported-storage-types

</admon>
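
To make the `git add .dvc/config` step above more concrete, the sketch below recreates, in a scratch directory, what that file is expected to contain after `dvc remote add -d storage s3://mybucket/dvcstore`. The file contents are an assumption based on DVC's usual config layout and may differ between versions; the scratch directory and the `sed` lookup are purely illustrative.

```shell
# Work in a throwaway directory so no real project is touched.
tmp=$(mktemp -d)
mkdir -p "$tmp/.dvc"

# Assumed contents of .dvc/config after adding "storage" as the default (-d) remote.
cat > "$tmp/.dvc/config" <<'EOF'
[core]
    remote = storage
['remote "storage"']
    url = s3://mybucket/dvcstore
EOF

# Read back the default remote name from the [core] section.
default_remote=$(sed -n 's/^ *remote = //p' "$tmp/.dvc/config")
echo "default remote: $default_remote"

rm -rf "$tmp"
```

Because this file is plain text tracked by Git, collaborators who clone the repository would get the same remote configuration.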

<details>

@@ -54,10 +54,11 @@ you want to visualize in Iterative Studio.
### Data remotes (cloud/remote storage)

The metrics and parameters that you want to include in the project may also be
present in a [data remote](/doc/command-reference/remote#description) (cloud
storage or another location outside the Git repo). If you want to include such
data in your projects, then you will have to grant Iterative Studio access to
the data remote.
present in a [data remote] (cloud storage or another location outside the Git
repo). If you want to include such data in your projects, then you will have to
grant Iterative Studio access to the data remote.

[data remote]: /doc/user-guide/data-management/remote-storage

## Configuring project settings

@@ -82,9 +83,8 @@ which you are trying to connect.

### Data remotes / cloud storage credentials

If you need to provide credentials for
[DVC data remotes](/doc/command-reference/remote#description), you will need to
do it after your project has been created. First, create your project without
If you need to provide credentials for a [data remote], you will need to do it
after your project has been created. First, create your project without
specifying the data remotes. Once your project is created, open its settings.
Open the `Data remotes / cloud storage credentials` section. The data remotes
that are used in your DVC repo will be listed.
@@ -93,8 +93,8 @@ that are used in your DVC repo will be listed.

Now, click on `Add new credentials`. In the form that opens up, select the
provider (Amazon S3, GCP, etc.). For details on what types of remote storage
(protocols) are supported, refer to the DVC documentation on
[supported storage types](/doc/command-reference/remote/add#supported-storage-types).
(protocols) are supported, refer to the DVC documentation on [supported storage
types].

Depending on the provider, you will be asked for more details such as the
credentials name, username, password etc. Note that for each supported storage
@@ -103,17 +103,19 @@ type, the required details may be different.
![](https://static.iterative.ai/img/studio/s3_remote_settings_v2.png)

You will also have to ensure that the credentials you enter have the required
permissions on the cloud / remote storage. In the DVC documentation on
[supported storage types](/doc/command-reference/remote/add#supported-storage-types),
expand the section for the storage type you want to add. There, you will find
the details of the permissions that you need to grant to the account
(credentials) that you are configuring on Iterative Studio.
permissions on the cloud / remote storage. Refer to the [DVC Remote config
parameters] for more details about this.

Note that Iterative Studio uses the credentials only to read plots/metrics files
if they are not saved into Git. It does not access any other data in your remote
storage. And you do not need to provide the credentials if no DVC data remote
is used in your Git repository.

[supported storage types]:
/doc/user-guide/data-management/remote-storage#supported-storage-types
[dvc remote config parameters]:
/doc/command-reference/remote/modify#available-parameters-per-storage-type

### Mandatory columns

##### (Tracking scope)
@@ -47,12 +47,11 @@ job:
example below).

```yaml
...
steps:
- name: Train model
env:
STUDIO_TOKEN: ${{ secrets.STUDIO_TOKEN }}
...
---
steps:
- name: Train model
env:
STUDIO_TOKEN: ${{ secrets.STUDIO_TOKEN }}
```

2. `STUDIO_REPO_URL`: If you are running the experiment locally, you do not
11 changes: 6 additions & 5 deletions content/docs/use-cases/ci-cd-for-machine-learning.md
@@ -52,11 +52,12 @@ configuration. Here are a few feature highlights:
**Models, Data, and Metrics as Code**: DVC removes the need to create versioning
databases, use special file/folder structures, or write bespoke interfacing
code. Instead, DVC stores meta-information in Git ("codifying" data and ML
models) while pushing the actual data content to
[cloud storage](/doc/command-reference/remote). DVC also provides metrics-driven
navigation in Git repositories --
[tabulating and plotting](/doc/start/data-management/metrics-parameters-plots)
model metrics changes across commits.
models) while pushing the actual data content to [cloud storage]. DVC also
provides metrics-driven navigation in Git repositories -- [tabulating and
plotting] model metrics changes across commits.

[cloud storage]: /doc/user-guide/data-management/remote-storage
[tabulating and plotting]: /doc/start/data-management/metrics-parameters-plots

**Low friction**: Our sister project CML provides
[lightweight machine resource orchestration](https://cml.dev/doc/self-hosted-runners)
2 changes: 1 addition & 1 deletion content/docs/use-cases/data-registry/index.md
@@ -33,7 +33,7 @@ cloud storage. Advantages:

[ci/cd for your data and models lifecycle]:
/doc/use-cases/ci-cd-for-machine-learning
[remote storage]: /doc/command-reference/remote
[remote storage]: /doc/user-guide/data-management/remote-storage

👩‍💻 Intrigued? Try our [registry tutorial] to learn how DVC looks and feels
firsthand.
2 changes: 1 addition & 1 deletion content/docs/use-cases/model-registry.md
@@ -58,7 +58,7 @@ with software engineering methods such as continuous integration (CI/CD), which
can sync with the state of the artifacts in your registry.

[modeling process]: /doc/start/data-management/data-pipelines
[remote storage]: /doc/command-reference/remote
[remote storage]: /doc/user-guide/data-management/remote-storage
[sharing]: /doc/start/data-management/data-and-model-access
[via cml]: https://cml.dev/doc/cml-with-dvc
[gitops]: https://www.gitops.tech/
22 changes: 13 additions & 9 deletions content/docs/use-cases/versioning-data-and-models/index.md
@@ -55,17 +55,21 @@ Benefits of our approach include:
editing these in source code.

- **Efficient data management**: Use a familiar and cost-effective storage
solution for your data and models (e.g. SFTP, S3, HDFS,
[etc.](/doc/command-reference/remote/add#supported-storage-types)) — free from
Git hosting
[constraints](https://docs.github.com/en/free-pro-team@latest/github/managing-large-files/what-is-my-disk-quota).
DVC [optimizes](/doc/user-guide/data-management/large-dataset-optimization)
storing and transferring large files.
solution for your data and models (e.g. SFTP, S3, HDFS, [etc.]) — free from
Git hosting [constraints]. DVC [optimizes] storing and transferring large
files.

[etc.]: /doc/user-guide/data-management/remote-storage#supported-storage-types
[constraints]:
https://docs.github.com/en/free-pro-team@latest/github/managing-large-files/what-is-my-disk-quota
[optimizes]: /doc/user-guide/data-management/large-dataset-optimization

- **Collaboration**: Easily distribute your project development and share its
data [internally](/doc/user-guide/how-to/share-a-dvc-cache) and
[remotely](/doc/command-reference/remote), or
[reuse](/doc/start/data-management/data-and-model-access) it in other places.
data [internally] and [remotely], or [reuse] it in other places.

[remotely]: /doc/user-guide/data-management/remote-storage
[internally]: /doc/user-guide/how-to/share-a-dvc-cache
[reuse]: /doc/start/data-management/data-and-model-access

- **Data compliance**: Review data modification attempts as Git
[pull requests](https://www.dummies.com/web-design-development/what-are-github-pull-requests/).
18 changes: 12 additions & 6 deletions content/docs/use-cases/versioning-data-and-models/tutorial.md
@@ -86,12 +86,18 @@ $ unzip -q data.zip
$ rm -f data.zip
```

> `dvc get` can download any file or directory tracked in a <abbr>DVC
> repository</abbr> (and [stored remotely](/doc/command-reference/remote)). It's
> like `wget`, but for DVC or Git repos. In this case we use our
> [dataset registry](https://github.com/iterative/dataset-registry) repo as the
> data source (refer to [Data Registry](/doc/use-cases/data-registry) for more
> info.)
<admon type="info">

`dvc get` can download any file or directory tracked in a <abbr>DVC
repository</abbr> (and stored [remotely]). It's like `wget`, but for DVC or Git
repos. In this case we use our [dataset registry] repo as the data source (refer
to [Data Registry] for more info.)

[remotely]: /doc/user-guide/data-management/remote-storage
[dataset registry]: https://github.com/iterative/dataset-registry
[data registry]: /doc/use-cases/data-registry

</admon>

This command downloads and extracts our raw dataset, consisting of 1000 labeled
images for training and 800 labeled images for validation. In total, it's a 43
@@ -29,8 +29,13 @@ types/protocols:
- HTTP
- Local files and directories outside the <abbr>workspace</abbr>

> Note that [remote storage](/doc/command-reference/remote) is a different
> feature.
<admon type="info">

[Remote storage] is a different feature.

[remote storage]: /doc/user-guide/data-management/remote-storage

</admon>

## Examples

@@ -151,8 +156,8 @@ be managed independently. This is useful if the connection requires
authentication, if multiple dependencies (or stages) reuse the same location, or
if the URL is likely to change in the future.

[DVC remotes](/doc/command-reference/remote) can do just this. You may use
`dvc remote add` to define them, and then use a special URL with format
[DVC remotes][remote storage] can do just this. You may use `dvc remote add` to
define them, and then use a special URL with format
`remote://{remote_name}/{path}` (remote alias) to define the external
dependency.
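
A minimal sketch of how a `remote://{remote_name}/{path}` alias breaks down into its parts. The remote name `storage`, its URL, and the file path are hypothetical, and the final step (appending the path to the remote's configured URL) is an assumption about how the alias resolves, not a statement of DVC's internals:

```shell
# Hypothetical values: a remote named "storage" whose configured URL is s3://mybucket/dvcstore.
remote_url="s3://mybucket/dvcstore"
alias_url="remote://storage/data/raw.csv"

rest="${alias_url#remote://}"   # drop the scheme -> storage/data/raw.csv
remote_name="${rest%%/*}"       # text before the first slash -> storage
path="${rest#*/}"               # text after the first slash -> data/raw.csv

# Assumed resolution: append the path to the remote's configured URL.
resolved="${remote_url%/}/$path"
echo "$remote_name -> $resolved"
```

With the alias in place, changing the remote's URL in one spot would update every dependency that refers to it by name.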

@@ -8,7 +8,7 @@

[to-cache]: /doc/command-reference/add#example-transfer-to-an-external-cache
[to-remote]: /doc/command-reference/add#example-transfer-to-remote-storage
[remote storage]: /doc/command-reference/remote
[remote storage]: /doc/user-guide/data-management/remote-storage

There are cases when data is so large, or its processing is organized in such a
way, that it's impossible to handle it on the local machine's disk. For example
2 changes: 1 addition & 1 deletion content/docs/user-guide/data-management/remote-storage.md
@@ -13,7 +13,7 @@ DVC remotes are similar to [Git remotes], but for <abbr>cached</abbr> data.

</admon>

This is somehow like GitHub or GitLab providing hosting for source code
This is somewhat like GitHub or GitLab providing hosting for source code
repositories. However, DVC does not provide or recommend a specific storage
service. Instead, it adopts a bring-your-own-platform approach, supporting a
wide variety of [storage types](#supported-storage-types).
@@ -26,8 +26,9 @@ $ git push origin my-branch
If you only need to share code and metadata like parameters and metrics, then
pushing to Git is often enough. However, you may also have data, models, etc.
that are tracked and <abbr>cached</abbr> by DVC. If you need to share these
files, you can push them to [remote storage](/doc/command-reference/remote)
(e.g. Amazon S3 or Google Drive).
files, you can push them to [remote storage] (e.g. Amazon S3 or Google Drive).

[remote storage]: /doc/user-guide/data-management/remote-storage

```cli
$ dvc push