diff --git a/content/docs/command-reference/get-url.md b/content/docs/command-reference/get-url.md index 031911669e..2f2243a1a6 100644 --- a/content/docs/command-reference/get-url.md +++ b/content/docs/command-reference/get-url.md @@ -31,6 +31,13 @@ while `out` can be used to specify the directory and/or file name desired for the downloaded data. If an existing directory is specified, then the file or directory will be placed inside. + + +See `dvc list-url` for a way to browse the external location for files and +directories to download. + + + DVC supports several types of (local or) remote data sources (protocols): | Type | Description | `url` format example | diff --git a/content/docs/command-reference/import-url.md b/content/docs/command-reference/import-url.md index 5b4f1e3230..0cc35697ce 100644 --- a/content/docs/command-reference/import-url.md +++ b/content/docs/command-reference/import-url.md @@ -52,6 +52,13 @@ The imported data is cached, and linked (or copied) to the current working directory with its original file name e.g. `data.txt` (or to a location provided with `out`). + + +See `dvc list-url` for a way to browse the external location for files and +directories to download. + + + An _import `.dvc` file_ is created in the same location e.g. `data.txt.dvc` – similar to using `dvc add` after downloading the data. This makes it possible to update the import later, if the data source has changed (see `dvc update`). diff --git a/content/docs/command-reference/list-url.md b/content/docs/command-reference/list-url.md new file mode 100644 index 0000000000..02f3d499a1 --- /dev/null +++ b/content/docs/command-reference/list-url.md @@ -0,0 +1,97 @@ +# list-url + + + +Aliased to `dvc ls-url` + + + +List contents from a supported URL (for example `s3://`, `ssh://`, and other +protocols). + + + +Useful to find data to `dvc get-url` or `dvc import-url`. 
+ + + +## Synopsis + +```usage +usage: dvc list-url [-h] [-q | -v] [-R] url + +positional arguments: + url (See supported URLs in the description) +``` + +## Description + +Lists files and directories from an external location. `dvc list-url` provides a +uniform interface to browse the contents of an external location using any +protocol that is understood by `dvc get-url` or `dvc import-url`. For example, +it is roughly equivalent to `aws s3 ls` when using the `s3://` protocol, or +`ssh user@host ls -a` when using `ssh://`. + +The `url` argument specifies the location of the data to be listed. It supports +several kinds of external data sources: + +| Type | Description | `url` format example | +| ------- | ---------------------------- | ------------------------------------- | +| `s3` | Amazon S3 | `s3://bucket/data` | +| `azure` | Microsoft Azure Blob Storage | `azure://container/data` | +| `gs` | Google Cloud Storage | `gs://bucket/data` | +| `ssh` | SSH server | `ssh://user@example.com/path/to/data` | +| `local` | Local path | `/path/to/local/data` | + + + +If you installed DVC via `pip` and plan to access cloud services as external +data sources, you might need to install these optional dependencies: `[s3]`, +`[azure]`, `[gs]`, `[oss]`, `[ssh]`. Alternatively, use `[all]` to include them +all. The command should look like this: `pip install "dvc[s3]"`. (This example +installs the `boto3` library along with DVC to support S3 storage.) + + + +Only the root directory is listed by default, but the `-R` option can be used to +list files recursively. + +## Options + +- `-R`, `--recursive` - recursively list files in all subdirectories. + +- `-h`, `--help` - prints the usage/help message and exits. + +- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no + problems arise, otherwise a non-zero value. + +- `-v`, `--verbose` - displays detailed tracing information. 
+ +## Example: Amazon S3 + +This command will list objects and common prefixes under the specified path: + +```dvc +$ dvc list-url s3://bucket/path +``` + +DVC expects that AWS CLI is already +[configured](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html). +DVC will use the AWS credentials file to access S3. + +## Example: SSH + +```dvc +$ dvc list-url ssh://user@example.com/path/to/data +``` + +Using default SSH credentials, the above command lists files and directories +inside `data`. + +## Example: local file system + +```dvc +$ dvc list-url /local/path/to/data +``` + +The above command will list the `/local/path/to/data` directory. diff --git a/content/docs/command-reference/list.md b/content/docs/command-reference/list.md index 6f63afee7b..e004dee7c4 100644 --- a/content/docs/command-reference/list.md +++ b/content/docs/command-reference/list.md @@ -58,7 +58,7 @@ accessed with `dvc get`, `dvc import`, or `dvc.api`. ## Options -- `-R`, `--recursive` - recursively prints contents of all subdirectories. +- `-R`, `--recursive` - recursively list files in all subdirectories. - `--dvc-only` - show only DVC-tracked files and directories (outputs). 
@@ -111,7 +111,7 @@ We can now, for example, download the model file with: $ dvc get https://github.com/iterative/example-get-started model.pkl ``` -## Example: List all files and directories in a data registry +## Example: List all files in a data registry Let's imagine a DVC repo used as a [data registry](/doc/use-cases/data-registry#using-registries), structured with diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json index 457d724a9f..30cd1fce0d 100644 --- a/content/docs/sidebar.json +++ b/content/docs/sidebar.json @@ -323,6 +323,10 @@ "label": "install", "slug": "install" }, + { + "label": "list-url", + "slug": "list-url" + }, { "label": "list", "slug": "list" diff --git a/content/linked-terms.js b/content/linked-terms.js index e0c6f05f55..140e564faa 100644 --- a/content/linked-terms.js +++ b/content/linked-terms.js @@ -23,6 +23,10 @@ module.exports = [ matches: 'dvc experiments', url: '/doc/command-reference/exp' }, + { + matches: 'dvc ls-url', + url: '/doc/command-reference/list-url' + }, { matches: 'dvc ls', url: '/doc/command-reference/list'
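
Note on the `url` table added in `list-url.md`: the mapping from a `url` value to its "Type" column can be sketched with a small shell helper. This is only an illustration under stated assumptions. The function name `url_type` is hypothetical and not part of DVC, and the prefix match is a simplification of whatever DVC does internally. It treats any value without one of the listed schemes as a local path.

```shell
#!/bin/sh
# Hypothetical helper (not part of DVC): guess the "Type" column of the
# list-url.md table from the scheme prefix of a URL.
url_type() {
  case "$1" in
    s3://*)    echo "s3" ;;
    azure://*) echo "azure" ;;
    gs://*)    echo "gs" ;;
    ssh://*)   echo "ssh" ;;
    # Assumption: anything without a listed scheme is a local path.
    *)         echo "local" ;;
  esac
}

url_type "s3://bucket/data"       # prints "s3"
url_type "/path/to/local/data"    # prints "local"
```

The returned type also hints at which optional `pip` extra may be needed, for example `dvc[s3]` for an `s3://` URL, as described in the patch.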