remote: use consistent order and terminology for remote types
jorgeorpinel committed Dec 9, 2019
1 parent d9ab97f commit 63a4ac9
Showing 17 changed files with 221 additions and 220 deletions.
7 changes: 4 additions & 3 deletions pages/features.js
@@ -53,9 +53,10 @@ export default function FeaturesPage() {
</Icon>
<Name>Storage agnostic</Name>
<Description>
Use S3, Azure, Google Drive, GCP, SSH, SFTP, Aliyun OSS rsync or
any network-attached storage to store data. The list of supported
protocols is constantly expanding.
Use Amazon S3, Microsoft Azure Blob Storage, Google Drive, Google
Cloud Storage, Aliyun OSS, SSH/SFTP, HDFS, HTTP, network-attached
storage, or rsync to store data. The list of supported remote
storage is constantly expanding.
</Description>
</Feature>
<Feature>
4 changes: 2 additions & 2 deletions src/Diagram/index.js
@@ -41,8 +41,8 @@ const ColumnOne = () => (
<Description fullWidth>
<p>
Version control machine learning models, data sets and intermediate
files. DVC connects them with code and uses S3, Azure, Google Drive,
GCP, SSH, Aliyun OSS or to store file contents.
files. DVC connects them with code, and uses cloud storage, SSH, NAS,
etc. to store file contents.
</p>
<p>
Full code and data provenance help track the complete evolution of every
2 changes: 1 addition & 1 deletion static/docs/command-reference/config.md
@@ -164,7 +164,7 @@ for more details.)
- `cache.hdfs` - name of an
[HDFS remote to use as external cache](/doc/user-guide/managing-external-data#hdfs).

- `cache.azure` - name of an Azure remote to use as
- `cache.azure` - name of a Microsoft Azure Blob Storage remote to use as
[external cache](/doc/user-guide/managing-external-data).
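
For example, an external cache on Azure could be wired up like this (a sketch;
the remote name `azcache` and container path are illustrative):

```dvc
$ dvc remote add azcache azure://my-container/path
$ dvc config cache.azure azcache
```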

### state
9 changes: 5 additions & 4 deletions static/docs/command-reference/get-url.md
@@ -45,10 +45,11 @@ DVC supports several types of (local or) remote locations (protocols):
| `hdfs` | HDFS | `hdfs://[email protected]/path/to/data.csv` |
| `http` | HTTP to file | `https://example.com/path/to/data.csv` |

> Depending on the remote locations type you plan to download data from you
> might need to specify one of the optional dependencies: `[s3]`, `[ssh]`,
> `[gs]`, `[azure]`, `[gdrive]`, and `[oss]` (or `[all]` to include them all)
> when [installing DVC](/doc/install) with `pip`.
> If you installed DVC via `pip` and plan to use cloud services as remote
> storage, you might need to install these optional dependencies: `[s3]`,
> `[azure]`, `[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Alternatively, use `[all]` to
> include them all. The command should look like this: `pip install "dvc[s3]"`.
> (This example installs the `boto3` library along with DVC to support S3 storage.)
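
For instance, using the `[s3]` and `[all]` extras mentioned above:

```dvc
$ pip install "dvc[s3]"    # DVC plus boto3, for Amazon S3 remotes
$ pip install "dvc[all]"   # DVC plus all optional remote dependencies
```
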
Another way to understand the `dvc get-url` command is as a tool for downloading
data files.
9 changes: 5 additions & 4 deletions static/docs/command-reference/import-url.md
@@ -58,10 +58,11 @@ DVC supports several types of (local or) remote locations (protocols):
| `http` | HTTP to file with _strong ETag_ (see explanation below) | `https://example.com/path/to/data.csv` |
| `remote` | Remote path (see explanation below) | `remote://myremote/path/to/file` |

> Depending on the remote locations type you plan to download data from you
> might need to specify one of the optional dependencies: `[s3]`, `[ssh]`,
> `[gs]`, `[azure]`, `[gdrive]`, and `[oss]` (or `[all]` to include them all)
> when [installing DVC](/doc/install) with `pip`.
> If you installed DVC via `pip` and plan to use cloud services as remote
> storage, you might need to install these optional dependencies: `[s3]`,
> `[azure]`, `[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Alternatively, use `[all]` to
> include them all. The command should look like this: `pip install "dvc[s3]"`.
> (This example installs the `boto3` library along with DVC to support S3 storage.)

<!-- Separate MD quote: -->
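
As a sketch of the `remote://` form from the table above (assuming a remote
named `myremote` is already configured; paths are illustrative):

```dvc
$ dvc import-url remote://myremote/path/to/file data/file
```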

190 changes: 95 additions & 95 deletions static/docs/command-reference/remote/add.md
@@ -23,20 +23,20 @@ positional arguments:

## Description

`name` and `url` are required. `url` specifies a location to store your data. It
can be an SSH, S3 path, Azure, Google Drive path, Google Cloud path, Aliyun OSS,
local directory, etc. (See all the supported remote storage types in the
examples below.) If `url` is a local relative path, it will be resolved relative
to the current working directory but saved **relative to the config file
location** (see LOCAL example below). Whenever possible DVC will create a remote
directory if it doesn't exists yet. It won't create an S3 bucket though and will
rely on default access settings.

> If you installed DVC via `pip`, depending on the remote storage type you plan
> to use you might need to install optional dependencies: `[s3]`, `[ssh]`,
> `[gs]`, `[azure]`, `[gdrive]`, and `[oss]`; or `[all]` to include them all.
> The command should look like this: `pip install "dvc[s3]"`. This installs
> `boto3` library along with DVC to support Amazon S3 storage.
`name` and `url` are required. `url` specifies a location (path, address,
endpoint) to store your data. It can represent a cloud storage service, an SSH
server, network-attached storage, or even a directory in the local file system.
(See all the supported remote storage types in the examples below.) If `url` is
a relative path, it will be resolved against the current working directory, but
saved **relative to the config file location** (see LOCAL example below).
Whenever possible, DVC will create a remote directory if it doesn't exist yet.
(It won't create an S3 bucket though, and will rely on default access settings.)

> If you installed DVC via `pip` and plan to use cloud services as remote
> storage, you might need to install these optional dependencies: `[s3]`,
> `[azure]`, `[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Alternatively, use `[all]` to
> include them all. The command should look like this: `pip install "dvc[s3]"`.
> (This example installs the `boto3` library along with DVC to support S3 storage.)

This command creates a section in the <abbr>DVC project</abbr>'s
[config file](/doc/command-reference/config) and optionally assigns a default
@@ -89,46 +89,6 @@ These are the possible remote storage (protocols) DVC can work with:

<details>

### Click for local remote

A "local remote" is a directory in the machine's file system.

> While the term may seem contradictory, it doesn't have to be. The "local" part
> refers to the machine where the project is stored, so it can be any directory
> accessible to the same system. The "remote" part refers specifically to the
> project/repository itself.

Using an absolute path (recommended):

```dvc
$ dvc remote add myremote /tmp/my-dvc-storage
$ cat .dvc/config
...
['remote "myremote"']
url = /tmp/my-dvc-storage
...
```

> Note that the absolute path `/tmp/my-dvc-storage` is saved as is.

Using a relative path:

```dvc
$ dvc remote add myremote ../my-dvc-storage
$ cat .dvc/config
...
['remote "myremote"']
url = ../../my-dvc-storage
...
```

> Note that `../my-dvc-storage` has been resolved relative to the `.dvc/` dir,
> resulting in `../../my-dvc-storage`.

</details>

<details>

### Click for Amazon S3

> **Note!** Before adding a new remote be sure to login into AWS services and
@@ -196,7 +156,7 @@ For more information about the variables DVC supports, please visit

<details>

### Click for Azure
### Click for Microsoft Azure Blob Storage

```dvc
$ dvc remote add myremote azure://my-container-name/path
@@ -282,15 +242,70 @@ $ dvc remote add myremote gs://bucket/path

<details>

### Click for Aliyun OSS

First you need to setup OSS storage on Aliyun Cloud and then use an S3 style URL
for OSS storage and make the endpoint value configurable. An example is shown
below:

```dvc
$ dvc remote add myremote oss://my-bucket/path
```

To set key id, key secret and endpoint you need to use `dvc remote modify`.
Example usage is shown below. Make sure to use the `--local` option to avoid
committing your secrets into Git:

```dvc
$ dvc remote modify myremote --local oss_key_id my-key-id
$ dvc remote modify myremote --local oss_key_secret my-key-secret
$ dvc remote modify myremote oss_endpoint endpoint
```

You can also provide these values through the following environment variables:

```dvc
$ export OSS_ACCESS_KEY_ID="my-key-id"
$ export OSS_ACCESS_KEY_SECRET="my-key-secret"
$ export OSS_ENDPOINT="endpoint"
```

#### Test your OSS storage using Docker

Start a container running an OSS emulator.

```dvc
$ git clone https://github.com/nanaya-tachibana/oss-emulator.git
$ docker image build -t oss:1.0 oss-emulator
$ docker run --detach -p 8880:8880 --name oss-emulator oss:1.0
```

Set up the environment variables.

```dvc
$ export OSS_BUCKET='my-bucket'
$ export OSS_ENDPOINT='localhost:8880'
$ export OSS_ACCESS_KEY_ID='AccessKeyID'
$ export OSS_ACCESS_KEY_SECRET='AccessKeySecret'
```

> The emulator uses a default key id and key secret when they are not given,
> which gives read access to public-read and public buckets.
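
With the emulator running and the variables above exported, a minimal smoke
test might add the emulated bucket as a remote and push to it (a sketch; the
remote name `ossremote` is illustrative):

```dvc
$ dvc remote add ossremote oss://my-bucket/path
$ dvc remote modify ossremote oss_endpoint localhost:8880
$ dvc push -r ossremote
```
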
</details>

<details>

### Click for SSH

```dvc
$ dvc remote add myremote ssh://[email protected]/path/to/dir
```

> **Note!** DVC requires both SSH and SFTP access to work with SSH remote
> storage. Please check that you are able to connect to the remote location with
> tools like `ssh` and `sftp` (GNU/Linux).
> storage. Please check that you are able to connect both ways to the remote
> location, with tools like `ssh` and `sftp` (GNU/Linux).
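
For instance, a quick connectivity check for both protocols, using the
placeholder address from the example above:

```dvc
$ ssh [email protected]     # interactive SSH login should succeed
$ sftp [email protected]    # the SFTP subsystem should be reachable too
```
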
<!-- Separate MD quote: -->

@@ -336,56 +351,41 @@ $ dvc remote add myremote https://example.com/path/to/dir

<details>

### Click for Aliyun OSS

First you need to setup OSS storage on Aliyun Cloud and then use an S3 style URL
for OSS storage and make the endpoint value configurable. An example is shown
below:

```dvc
$ dvc remote add myremote oss://my-bucket/path
```

To set key id, key secret and endpoint you need to use `dvc remote modify`.
Example usage is show below. Make sure to use the `--local` option to avoid
committing your secrets into Git:

```dvc
$ dvc remote modify myremote --local oss_key_id my-key-id
$ dvc remote modify myremote --local oss_key_secret my-key-secret
$ dvc remote modify myremote oss_endpoint endpoint
```

You can also set environment variables and use them later, to set environment
variables use following environment variables:

```dvc
$ export OSS_ACCESS_KEY_ID="my-key-id"
$ export OSS_ACCESS_KEY_SECRET="my-key-secret"
$ export OSS_ENDPOINT="endpoint"
```

#### Test your OSS storage using docker

Start a container running an OSS emulator.

```dvc
$ git clone https://github.com/nanaya-tachibana/oss-emulator.git
$ docker image build -t oss:1.0 oss-emulator
$ docker run --detach -p 8880:8880 --name oss-emulator oss:1.0
```

Setup environment variables.

```dvc
$ export OSS_BUCKET='my-bucket'
$ export OSS_ENDPOINT='localhost:8880'
$ export OSS_ACCESS_KEY_ID='AccessKeyID'
$ export OSS_ACCESS_KEY_SECRET='AccessKeySecret'
```

> Uses default key id and key secret when they are not given, which gives read
> access to public read bucket and public bucket.

### Click for local remote

A "local remote" is a directory in the machine's file system.

> While the term may seem contradictory, it doesn't have to be. The "local" part
> refers to the machine where the project is stored, so it can be any directory
> accessible to the same system. The "remote" part refers specifically to the
> project/repository itself.

Using an absolute path (recommended):

```dvc
$ dvc remote add myremote /tmp/my-dvc-storage
$ cat .dvc/config
...
['remote "myremote"']
url = /tmp/my-dvc-storage
...
```

> Note that the absolute path `/tmp/my-dvc-storage` is saved as is.

Using a relative path:

```dvc
$ dvc remote add myremote ../my-dvc-storage
$ cat .dvc/config
...
['remote "myremote"']
url = ../../my-dvc-storage
...
```

> Note that `../my-dvc-storage` has been resolved relative to the `.dvc/` dir,
> resulting in `../../my-dvc-storage`.
</details>

10 changes: 5 additions & 5 deletions static/docs/command-reference/remote/index.md
@@ -37,11 +37,11 @@ DVC supports several types of remote storage: local file system, SSH, Amazon S3,
Google Cloud Storage, HTTP, HDFS, among others. Refer to `dvc remote add` for
more details.

> If you installed DVC via `pip`, depending on the remote storage type you plan
> to use you might need to install optional dependencies: `[s3]`, `[ssh]`,
> `[gs]`, `[azure]`, `[gdrive]`, and `[oss]`; or `[all]` to include them all.
> The command should look like this: `pip install "dvc[s3]"`. This installs
> `boto3` library along with DVC to support S3 storage.
> If you installed DVC via `pip` and plan to use cloud services as remote
> storage, you might need to install these optional dependencies: `[s3]`,
> `[azure]`, `[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Alternatively, use `[all]` to
> include them all. The command should look like this: `pip install "dvc[s3]"`.
> (This example installs the `boto3` library along with DVC to support S3 storage.)

Using DVC with a remote data storage is optional. By default, DVC is configured
to use a local data storage only (usually the `.dvc/cache` directory). This