Commit
ref: get/import/list updates
extracted from #2302
jorgeorpinel committed Mar 18, 2021
1 parent d5f284a commit 415073a
Showing 4 changed files with 31 additions and 28 deletions.
6 changes: 3 additions & 3 deletions content/docs/command-reference/get.md
@@ -24,13 +24,13 @@ repository (e.g. source code, small image/other files). `dvc get` copies the
target file or directory (found at `path` in `url`) to the current working
directory. (Analogous to `wget`, but for repos.)

> See `dvc list` for a way to browse repository contents to find files or
> directories to download.
> Note that unlike `dvc import`, this command does not track the downloaded
> files (does not create a `.dvc` file). For that reason, it doesn't require an
> existing DVC project to run in.
> See `dvc list` for a way to browse repository contents to find files or
> directories to download.
The `url` argument specifies the address of the DVC or Git repository containing
the data source. Both HTTP and SSH protocols are supported (e.g.
`[user@]server:project.git`). `url` can also be a local file system path
7 changes: 5 additions & 2 deletions content/docs/command-reference/import-url.md
@@ -51,8 +51,11 @@ changed (see `dvc update`).
💡 The `--to-remote` option lets you store an import on a
[DVC remote](/doc/command-reference/remote) without using the local file system.

> Note that the imported data can be [pushed](/doc/command-reference/push) to
> remote storage normally.
> Note that data imported from external locations can be
> [pushed](/doc/command-reference/push) and
> [pulled](/doc/command-reference/pull) to/from
> [remote storage](/doc/command-reference/remote) normally (unlike for
> `dvc import`).
`.dvc` files support references to data in an external location, see
[External Dependencies](/doc/user-guide/external-dependencies). In such an
12 changes: 6 additions & 6 deletions content/docs/command-reference/import.md
@@ -27,20 +27,20 @@ target file or directory (found at `path` in `url`), and tracks it in the local
project. This makes it possible to update the import later, if the data source
has changed (see `dvc update`).

> Note that `dvc get` corresponds to the first step this command performs (just
> download the data).
> See `dvc list` for a way to browse repository contents to find files or
> directories to import.
> Note that `dvc get` corresponds to the first step this command performs (just
> download the data).
The imported data is <abbr>cached</abbr>, and linked (or copied) to the current
working directory with its original file name e.g. `data.txt` (or to a location
provided with `--out`). An _import `.dvc` file_ is created in the same location
e.g. `data.txt.dvc` – similar to using `dvc add` after downloading the data.

(ℹ️) DVC won't push or pull imported data to/from
[remote storage](/doc/command-reference/remote), it will rely on its original
source.
(ℹ️) DVC won't push or pull data imported from other DVC repos to/from
[remote storage](/doc/command-reference/remote). `dvc pull` will download from
the original source instead.

The `url` argument specifies the address of the DVC or Git repository containing
the data source. Both HTTP and SSH protocols are supported (e.g.
34 changes: 17 additions & 17 deletions content/docs/command-reference/list.md
@@ -1,7 +1,9 @@
# list

List repository contents, including files, models, and directories tracked by
DVC (as <abbr>outputs</abbr>) and by Git.
List project contents, including files, models, and directories tracked by DVC
and by Git.

> Useful to find data to `dvc get`, `dvc import`, or for `dvc.api` functions.
## Synopsis

@@ -16,17 +18,15 @@ positional arguments:

## Description

A side-effect of DVC is that it hides actual data paths, by effectively
replacing files and directories with <abbr>DVC files</abbr>. So you don't see
data files/dirs when you browse a <abbr>DVC repository</abbr> on Git hosting
(e.g. GitHub), you just see the `dvc.yaml` and `.dvc` files. This can make it
hard to navigate the project, for example to find files or directories for use
with `dvc get`, `dvc import`, or `dvc.api` functions.
Produces a view of a <abbr>DVC repository</abbr> (usually online), listing data
files and directories tracked by DVC alongside the remaining Git repo contents.
This is useful because when you browse a hosted repository (e.g. on GitHub or
with `git ls-remote`), you only see the `dvc.yaml` and `.dvc` files with your
code (files tracked by Git).

This command produces a view of a DVC repository, as if files and directories
tracked by DVC were found directly in the Git repo. Its output is equivalent to
cloning the repo and [pulling](/doc/command-reference/pull) the data (except
that nothing is downloaded by `dvc list`), like this:
This command's output is equivalent to cloning the repo and
[pulling](/doc/command-reference/pull) the data (except that nothing is
downloaded), like this:

```dvc
$ git clone <url> example
@@ -35,17 +35,17 @@ $ dvc pull
$ ls <path>
```

Only the root directory is listed by default, but the `-R` option can be used to
list files recursively.

The `url` argument specifies the address of the DVC or Git repository containing
the data source. Both HTTP and SSH protocols are supported (e.g.
`[user@]server:project.git`). `url` can also be a local file system path
(including the current project e.g. `.`).

The optional `path` argument is used to specify a directory to list within the
source repository at `url` (including paths inside tracked directories). It's
similar to providing a path to list to commands such as `ls` or `aws s3 ls`.
Git repo at `url` (including paths inside tracked directories). It's similar to
providing a path to list to commands such as `ls` or `aws s3 ls`.

Only the root directory is listed by default, but the `-R` option can be used to
list files recursively.

Please note that `dvc list` doesn't check whether the listed data (tracked by
DVC) actually exists in remote storage, so it's not guaranteed whether it can be
