ref: get/import/list updates #2317

Merged (3 commits), Mar 19, 2021
6 changes: 3 additions & 3 deletions content/docs/command-reference/get.md
@@ -24,13 +24,13 @@ repository (e.g. source code, small image/other files). `dvc get` copies the
target file or directory (found at `path` in `url`) to the current working
directory. (Analogous to `wget`, but for repos.)

> See `dvc list` for a way to browse repository contents to find files or
> directories to download.

> Note that unlike `dvc import`, this command does not track the downloaded
> files (does not create a `.dvc` file). For that reason, it doesn't require an
> existing DVC project to run in.

> See `dvc list` for a way to browse repository contents to find files or
> directories to download.
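
As an illustration of the `dvc get <url> <path>` form described above, here is a minimal Python sketch that only assembles the command line. The helper name `build_get_command` and the repository URL are hypothetical, and the sketch assumes the `--out` option to choose the destination; nothing is downloaded unless the resulting list is passed to `subprocess.run` on a machine with DVC installed.

```python
import shlex
from typing import List, Optional


def build_get_command(url: str, path: str, out: Optional[str] = None) -> List[str]:
    """Return the argv for `dvc get <url> <path> [--out <out>]`."""
    cmd = ["dvc", "get", url, path]
    if out is not None:
        # Assumed option: --out sets where the downloaded copy is written.
        cmd += ["--out", out]
    return cmd


# Hypothetical repository URL and path, for illustration only.
cmd = build_get_command("https://example.com/user/project.git", "data/data.txt")
print(shlex.join(cmd))
# To actually download: subprocess.run(cmd, check=True)
```

Because `dvc get` does not create a `.dvc` file, this command can be run from any directory, not only from inside a DVC project.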

The `url` argument specifies the address of the DVC or Git repository containing
the data source. Both HTTP and SSH protocols are supported (e.g.
`[user@]server:project.git`). `url` can also be a local file system path
7 changes: 5 additions & 2 deletions content/docs/command-reference/import-url.md
@@ -54,8 +54,11 @@ An _import `.dvc` file_ is created in the same location e.g. `data.txt.dvc` –
similar to using `dvc add` after downloading the data. This makes it possible to
update the import later, if the data source has changed (see `dvc update`).

> Note that the imported data can be [pushed](/doc/command-reference/push) to
> remote storage normally.
> Note that data imported from external locations can be
> [pushed](/doc/command-reference/push) and
> [pulled](/doc/command-reference/pull) to/from
> [remote storage](/doc/command-reference/remote) normally (unlike for
> `dvc import`).

`.dvc` files support references to data in an external location, see
[External Dependencies](/doc/user-guide/external-dependencies). In such an
12 changes: 6 additions & 6 deletions content/docs/command-reference/import.md
@@ -27,20 +27,20 @@ target file or directory (found at `path` in `url`), and tracks it in the local
project. This makes it possible to update the import later, if the data source
has changed (see `dvc update`).

> Note that `dvc get` corresponds to the first step this command performs (just
> download the data).

> See `dvc list` for a way to browse repository contents to find files or
> directories to import.

> Note that `dvc get` corresponds to the first step this command performs (just
> download the data).

The imported data is <abbr>cached</abbr>, and linked (or copied) to the current
working directory with its original file name e.g. `data.txt` (or to a location
provided with `--out`). An _import `.dvc` file_ is created in the same location
e.g. `data.txt.dvc` – similar to using `dvc add` after downloading the data.

(ℹ️) DVC won't push or pull imported data to/from
[remote storage](/doc/command-reference/remote), it will rely on its original
source.
(ℹ️) DVC won't push data imported from other DVC repos to
[remote storage](/doc/command-reference/remote). `dvc pull` will download from
the original source.
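
The import-then-update workflow described above can be sketched as follows. This is a rough illustration rather than DVC's own tooling: the helper names and the repository URL are hypothetical, and the commands are only assembled, not executed.

```python
import shlex
from typing import List, Optional


def build_import_command(url: str, path: str, out: Optional[str] = None) -> List[str]:
    """Return the argv for `dvc import <url> <path> [--out <out>]`."""
    cmd = ["dvc", "import", url, path]
    if out is not None:
        cmd += ["--out", out]  # destination for the imported data
    return cmd


def build_update_command(dvc_file: str) -> List[str]:
    """Return the argv for `dvc update <import .dvc file>`."""
    return ["dvc", "update", dvc_file]


# Hypothetical source repository; data.txt.dvc is the import .dvc file
# that `dvc import` would create next to the imported data.txt.
steps = [
    build_import_command("https://example.com/user/project.git", "data.txt"),
    build_update_command("data.txt.dvc"),
]
for step in steps:
    print(shlex.join(step))
# To execute a step: subprocess.run(step, check=True), inside a DVC project.
```

Unlike `dvc get`, both steps need an existing DVC project, since the import `.dvc` file is what records the original source for later updates.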

The `url` argument specifies the address of the DVC or Git repository containing
the data source. Both HTTP and SSH protocols are supported (e.g.
34 changes: 17 additions & 17 deletions content/docs/command-reference/list.md
@@ -1,7 +1,9 @@
# list

List repository contents, including files, models, and directories tracked by
DVC (as <abbr>outputs</abbr>) and by Git.
List project contents, including files, models, and directories tracked by DVC
and by Git.

> Useful to find data to `dvc get`, `dvc import`, or for `dvc.api` functions.

## Synopsis

@@ -16,17 +18,15 @@ positional arguments:

## Description

A side-effect of DVC is that it hides actual data paths, by effectively
replacing files and directories with <abbr>DVC files</abbr>. So you don't see
data files/dirs when you browse a <abbr>DVC repository</abbr> on Git hosting
(e.g. GitHub), you just see the `dvc.yaml` and `.dvc` files. This can make it
hard to navigate the project, for example to find files or directories for use
with `dvc get`, `dvc import`, or `dvc.api` functions.
Produces a view of a <abbr>DVC repository</abbr> (usually online), listing data
files and directories tracked by DVC alongside the remaining Git repo contents.
This is useful because when you browse a hosted repository (e.g. on GitHub or
with `git ls-remote`), you only see the `dvc.yaml` and `.dvc` files with your
code (files tracked by Git).

This command produces a view of a DVC repository, as if files and directories
tracked by DVC were found directly in the Git repo. Its output is equivalent to
cloning the repo and [pulling](/doc/command-reference/pull) the data (except
that nothing is downloaded by `dvc list`), like this:
This command's output is equivalent to cloning the repo and
[pulling](/doc/command-reference/pull) the data (except that nothing is
downloaded), like this:

```dvc
$ git clone <url> example
@@ -35,17 +35,17 @@ $ dvc pull
$ ls <path>
```

Only the root directory is listed by default, but the `-R` option can be used to
list files recursively.

The `url` argument specifies the address of the DVC or Git repository containing
the data source. Both HTTP and SSH protocols are supported (e.g.
`[user@]server:project.git`). `url` can also be a local file system path
(including the current project e.g. `.`).

The optional `path` argument is used to specify a directory to list within the
source repository at `url` (including paths inside tracked directories). It's
similar to providing a path to list to commands such as `ls` or `aws s3 ls`.
Git repo at `url` (including paths inside tracked directories). It's similar to
providing a path to list to commands such as `ls` or `aws s3 ls`.

Only the root directory is listed by default, but the `-R` option can be used to
list files recursively.
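
To make the `url`, `path`, and `-R` arguments concrete, here is a small Python sketch that assembles a `dvc list` invocation. The helper name `build_list_command` and the repository URL are hypothetical; the sketch only builds the command line and does not contact any repository.

```python
import shlex
from typing import List, Optional


def build_list_command(
    url: str, path: Optional[str] = None, recursive: bool = False
) -> List[str]:
    """Return the argv for `dvc list <url> [<path>] [-R]`."""
    cmd = ["dvc", "list", url]
    if path is not None:
        cmd.append(path)  # directory to list within the repo at `url`
    if recursive:
        cmd.append("-R")  # list files recursively instead of one level
    return cmd


# Hypothetical repository URL; "." would target the current project instead.
root_only = build_list_command("https://example.com/user/project.git")
nested = build_list_command(
    "https://example.com/user/project.git", "data", recursive=True
)
print(shlex.join(root_only))
print(shlex.join(nested))
# To execute: subprocess.run(nested, check=True) with DVC installed.
```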

Please note that `dvc list` doesn't check whether the listed data (tracked by
DVC) actually exists in remote storage, so it's not guaranteed whether it can be