diff --git a/pages/features.js b/pages/features.js index 12d4d87b44..6e4d8b8117 100644 --- a/pages/features.js +++ b/pages/features.js @@ -53,9 +53,10 @@ export default function FeaturesPage() { Storage agnostic - Use S3, Azure, Google Drive, GCP, SSH, SFTP, Aliyun OSS rsync or - any network-attached storage to store data. The list of supported - protocols is constantly expanding. + Use Amazon S3, Microsoft Azure Blob Storage, Google Drive, Google + Cloud Storage, Aliyun OSS, SSH/SFTP, HDFS, HTTP, network-attached + storage, or rsync to store data. The list of supported remote + storage is constantly expanding. diff --git a/src/Diagram/index.js b/src/Diagram/index.js index d027ae3f0d..7a0ea4ae12 100644 --- a/src/Diagram/index.js +++ b/src/Diagram/index.js @@ -41,8 +41,8 @@ const ColumnOne = () => (

Version control machine learning models, data sets and intermediate - files. DVC connects them with code and uses S3, Azure, Google Drive, - GCP, SSH, Aliyun OSS or to store file contents. + files. DVC connects them with code, and uses cloud storage, SSH, NAS, + etc. to store file contents.

Full code and data provenance help track the complete evolution of every diff --git a/static/docs/command-reference/config.md b/static/docs/command-reference/config.md index afb1415cd6..0bdd86116a 100644 --- a/static/docs/command-reference/config.md +++ b/static/docs/command-reference/config.md @@ -164,7 +164,7 @@ for more details.) - `cache.hdfs` - name of an [HDFS remote to use as external cache](/doc/user-guide/managing-external-data#hdfs). -- `cache.azure` - name of an Azure remote to use as +- `cache.azure` - name of a Microsoft Azure Blob Storage remote to use as [external cache](/doc/user-guide/managing-external-data). ### state diff --git a/static/docs/command-reference/get-url.md b/static/docs/command-reference/get-url.md index 2f6ef0a027..b0f6359562 100644 --- a/static/docs/command-reference/get-url.md +++ b/static/docs/command-reference/get-url.md @@ -45,10 +45,11 @@ DVC supports several types of (local or) remote locations (protocols): | `hdfs` | HDFS | `hdfs://user@example.com/path/to/data.csv` | | `http` | HTTP to file | `https://example.com/path/to/data.csv` | -> Depending on the remote locations type you plan to download data from you -> might need to specify one of the optional dependencies: `[s3]`, `[ssh]`, -> `[gs]`, `[azure]`, `[gdrive]`, and `[oss]` (or `[all]` to include them all) -> when [installing DVC](/doc/install) with `pip`. +> If you installed DVC via `pip` and plan to use cloud services as remote +> storage, you might need to install these optional dependencies: `[s3]`, +> `[azure]`, `[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Alternatively, use `[all]` to +> include them all. The command should look like this: `pip install "dvc[s3]"`. +> (This example installs `boto3` library along with DVC to support S3 storage.) Another way to understand the `dvc get-url` command is as a tool for downloading data files. diff --git a/static/docs/command-reference/import-url.md b/static/docs/command-reference/import-url.md index 3de23da252..15854a0fd1 100644 --- a/static/docs/command-reference/import-url.md +++ b/static/docs/command-reference/import-url.md @@ -58,10 +58,11 @@ DVC supports several types of (local or) remote locations (protocols): | `http` | HTTP to file with _strong ETag_ (see explanation below) | `https://example.com/path/to/data.csv` | | `remote` | Remote path (see explanation below) | `remote://myremote/path/to/file` | -> Depending on the remote locations type you plan to download data from you -> might need to specify one of the optional dependencies: `[s3]`, `[ssh]`, -> `[gs]`, `[azure]`, `[gdrive]`, and `[oss]` (or `[all]` to include them all) -> when [installing DVC](/doc/install) with `pip`. +> If you installed DVC via `pip` and plan to use cloud services as remote +> storage, you might need to install these optional dependencies: `[s3]`, +> `[azure]`, `[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Alternatively, use `[all]` to +> include them all. The command should look like this: `pip install "dvc[s3]"`. +> (This example installs `boto3` library along with DVC to support S3 storage.) diff --git a/static/docs/command-reference/remote/add.md b/static/docs/command-reference/remote/add.md index 891d150db0..05f617b7ae 100644 --- a/static/docs/command-reference/remote/add.md +++ b/static/docs/command-reference/remote/add.md @@ -23,20 +23,20 @@ positional arguments: ## Description -`name` and `url` are required. `url` specifies a location to store your data. It -can be an SSH, S3 path, Azure, Google Drive path, Google Cloud path, Aliyun OSS, -local directory, etc. 
(See all the supported remote storage types in the
-examples below.) If `url` is a local relative path, it will be resolved relative
-to the current working directory but saved **relative to the config file
-location** (see LOCAL example below). Whenever possible DVC will create a remote
-directory if it doesn't exists yet. It won't create an S3 bucket though and will
-rely on default access settings.
-
-> If you installed DVC via `pip`, depending on the remote storage type you plan
-> to use you might need to install optional dependencies: `[s3]`, `[ssh]`,
-> `[gs]`, `[azure]`, `[gdrive]`, and `[oss]`; or `[all]` to include them all.
-> The command should look like this: `pip install "dvc[s3]"`. This installs
-> `boto3` library along with DVC to support Amazon S3 storage.
+`name` and `url` are required. `url` specifies a location (path, address,
+endpoint) to store your data. It can represent a cloud storage service, an SSH
+server, network-attached storage, or even a directory in the local file system.
+(See all the supported remote storage types in the examples below.) If `url` is
+a relative path, it will be resolved against the current working directory, but
+saved **relative to the config file location** (see LOCAL example below).
+Whenever possible, DVC will create a remote directory if it doesn't exist yet.
+(It won't create an S3 bucket though, and will rely on default access settings.)
+
+> If you installed DVC via `pip` and plan to use cloud services as remote
+> storage, you might need to install these optional dependencies: `[s3]`,
+> `[azure]`, `[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Alternatively, use `[all]` to
+> include them all. The command should look like this: `pip install "dvc[s3]"`.
+> (This example installs `boto3` library along with DVC to support S3 storage.)

This command creates a section in the DVC project's
[config file](/doc/command-reference/config) and optionally assigns a default
@@ -89,46 +89,6 @@ These are the possible remote storage (protocols) DVC can work with:

-### Click for local remote - -A "local remote" is a directory in the machine's file system. - -> While the term may seem contradictory, it doesn't have to be. The "local" part -> refers to the machine where the project is stored, so it can be any directory -> accessible to the same system. The "remote" part refers specifically to the -> project/repository itself. - -Using an absolute path (recommended): - -```dvc -$ dvc remote add myremote /tmp/my-dvc-storage -$ cat .dvc/config - ... - ['remote "myremote"'] - url = /tmp/my-dvc-storage - ... -``` - -> Note that the absolute path `/tmp/my-dvc-storage` is saved as is. - -Using a relative path: - -```dvc -$ dvc remote add myremote ../my-dvc-storage -$ cat .dvc/config - ... - ['remote "myremote"'] - url = ../../my-dvc-storage - ... -``` - -> Note that `../my-dvc-storage` has been resolved relative to the `.dvc/` dir, -> resulting in `../../my-dvc-storage`. - -
- -
- ### Click for Amazon S3 > **Note!** Before adding a new remote be sure to login into AWS services and @@ -196,7 +156,7 @@ For more information about the variables DVC supports, please visit
-### Click for Azure +### Click for Microsoft Azure Blob Storage ```dvc $ dvc remote add myremote azure://my-container-name/path @@ -282,6 +242,61 @@ $ dvc remote add myremote gs://bucket/path
+### Click for Aliyun OSS
+
+First you need to set up OSS storage on Aliyun Cloud. Then use an S3-style URL
+for the OSS storage and make the endpoint value configurable. An example is
+shown below:
+
+```dvc
+$ dvc remote add myremote oss://my-bucket/path
+```
+
+To set the key id, key secret and endpoint you need to use `dvc remote modify`.
+Example usage is shown below. Make sure to use the `--local` option to avoid
+committing your secrets into Git:
+
+```dvc
+$ dvc remote modify myremote --local oss_key_id my-key-id
+$ dvc remote modify myremote --local oss_key_secret my-key-secret
+$ dvc remote modify myremote oss_endpoint endpoint
+```
+
+You can also provide these settings with the following environment variables:
+
+```dvc
+$ export OSS_ACCESS_KEY_ID="my-key-id"
+$ export OSS_ACCESS_KEY_SECRET="my-key-secret"
+$ export OSS_ENDPOINT="endpoint"
+```
+
+#### Test your OSS storage using Docker
+
+Start a container running an OSS emulator:
+
+```dvc
+$ git clone https://github.com/nanaya-tachibana/oss-emulator.git
+$ docker image build -t oss:1.0 oss-emulator
+$ docker run --detach -p 8880:8880 --name oss-emulator oss:1.0
+```
+
+Set up the environment variables:
+
+```dvc
+$ export OSS_BUCKET='my-bucket'
+$ export OSS_ENDPOINT='localhost:8880'
+$ export OSS_ACCESS_KEY_ID='AccessKeyID'
+$ export OSS_ACCESS_KEY_SECRET='AccessKeySecret'
+```
+
+> If the key id and key secret are not given, default values are used, which
+> give read access to public-read and public buckets.
+
+
+ +
+ ### Click for SSH ```dvc @@ -289,8 +304,8 @@ $ dvc remote add myremote ssh://user@example.com/path/to/dir ``` > **Note!** DVC requires both SSH and SFTP access to work with SSH remote -> storage. Please check that you are able to connect to the remote location with -> tools like `ssh` and `sftp` (GNU/Linux). +> storage. Please check that you are able to connect both ways to the remote +> location, with tools like `ssh` and `sftp` (GNU/Linux). @@ -336,56 +351,41 @@ $ dvc remote add myremote https://example.com/path/to/dir
-### Click for Aliyun OSS - -First you need to setup OSS storage on Aliyun Cloud and then use an S3 style URL -for OSS storage and make the endpoint value configurable. An example is shown -below: - -```dvc -$ dvc remote add myremote oss://my-bucket/path -``` +### Click for local remote -To set key id, key secret and endpoint you need to use `dvc remote modify`. -Example usage is show below. Make sure to use the `--local` option to avoid -committing your secrets into Git: +A "local remote" is a directory in the machine's file system. -```dvc -$ dvc remote modify myremote --local oss_key_id my-key-id -$ dvc remote modify myremote --local oss_key_secret my-key-secret -$ dvc remote modify myremote oss_endpoint endpoint -``` +> While the term may seem contradictory, it doesn't have to be. The "local" part +> refers to the machine where the project is stored, so it can be any directory +> accessible to the same system. The "remote" part refers specifically to the +> project/repository itself. -You can also set environment variables and use them later, to set environment -variables use following environment variables: +Using an absolute path (recommended): ```dvc -$ export OSS_ACCESS_KEY_ID="my-key-id" -$ export OSS_ACCESS_KEY_SECRET="my-key-secret" -$ export OSS_ENDPOINT="endpoint" +$ dvc remote add myremote /tmp/my-dvc-storage +$ cat .dvc/config + ... + ['remote "myremote"'] + url = /tmp/my-dvc-storage + ... ``` -#### Test your OSS storage using docker - -Start a container running an OSS emulator. - -```dvc -$ git clone https://github.com/nanaya-tachibana/oss-emulator.git -$ docker image build -t oss:1.0 oss-emulator -$ docker run --detach -p 8880:8880 --name oss-emulator oss:1.0 -``` +> Note that the absolute path `/tmp/my-dvc-storage` is saved as is. -Setup environment variables. +Using a relative path: ```dvc -$ export OSS_BUCKET='my-bucket' -$ export OSS_ENDPOINT='localhost:8880' -$ export OSS_ACCESS_KEY_ID='AccessKeyID' -$ export OSS_ACCESS_KEY_SECRET='AccessKeySecret' +$ dvc remote add myremote ../my-dvc-storage +$ cat .dvc/config + ... + ['remote "myremote"'] + url = ../../my-dvc-storage + ... ``` -> Uses default key id and key secret when they are not given, which gives read -> access to public read bucket and public bucket. +> Note that `../my-dvc-storage` has been resolved relative to the `.dvc/` dir, +> resulting in `../../my-dvc-storage`.
diff --git a/static/docs/command-reference/remote/index.md b/static/docs/command-reference/remote/index.md
index ec1c504230..e02052d7f2 100644
--- a/static/docs/command-reference/remote/index.md
+++ b/static/docs/command-reference/remote/index.md
@@ -37,11 +37,11 @@ DVC supports several types of remote storage: local file system, SSH, Amazon
S3, Google Cloud Storage, HTTP, HDFS, among others. Refer to `dvc remote add`
for more details.

-> If you installed DVC via `pip`, depending on the remote storage type you plan
-> to use you might need to install optional dependencies: `[s3]`, `[ssh]`,
-> `[gs]`, `[azure]`, `[gdrive]`, and `[oss]`; or `[all]` to include them all.
-> The command should look like this: `pip install "dvc[s3]"`. This installs
-> `boto3` library along with DVC to support S3 storage.
+> If you installed DVC via `pip` and plan to use cloud services as remote
+> storage, you might need to install these optional dependencies: `[s3]`,
+> `[azure]`, `[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Alternatively, use `[all]` to
+> include them all. The command should look like this: `pip install "dvc[s3]"`.
+> (This example installs `boto3` library along with DVC to support S3 storage.)

Using DVC with a remote data storage is optional. By default, DVC is configured
to use a local data storage only (usually the `.dvc/cache` directory). This
diff --git a/static/docs/command-reference/remote/modify.md b/static/docs/command-reference/remote/modify.md
index 094679466a..d88e7480ae 100644
--- a/static/docs/command-reference/remote/modify.md
+++ b/static/docs/command-reference/remote/modify.md
@@ -27,8 +27,8 @@ positional arguments:
## Description

Remote `name` and `option` name are required. Option names are remote type
-specific. See below examples and a list of remote storage types: Amazon S3,
-Google Cloud, Azure, Google Drive, SSH, ALiyun OSS, among others.
+specific. See `dvc remote add` and the **Available settings** section below for
+a list of remote storage types.

This command modifies a `remote` section in the project's
[config file](/doc/command-reference/config). Alternatively, `dvc config` or
@@ -64,7 +64,7 @@ The following are the types of remote storage (protocols) supported:
-### Click for Amazon S3 available options +### Click for Amazon S3 options By default DVC expects your AWS CLI is already [configured](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html). @@ -132,7 +132,7 @@ these settings, you could use the following options:
-### Click for S3 API compatible storage available options +### Click for S3 API compatible storage options To communicate with a remote object storage that supports an S3 compatible API (e.g. [Minio](https://min.io/), @@ -162,7 +162,7 @@ For more information about the variables DVC supports, please visit
-### Click for Azure available options +### Click for Microsoft Azure Blob Storage options - `url` - remote location URL. @@ -187,7 +187,7 @@ For more information on configuring Azure Storage connection strings, visit
-### Click for Google Drive available options +### Click for Google Drive options - `url` - remote location URL. @@ -211,7 +211,7 @@ For more information on configuring Azure Storage connection strings, visit
-### Click for Google Cloud Storage available options +### Click for Google Cloud Storage options - `projectname` - project name to use. @@ -236,7 +236,31 @@ For more information on configuring Azure Storage connection strings, visit
-### Click for SSH available options
+### Click for Aliyun OSS options
+
+- `oss_key_id` - OSS key id to use to access a remote.
+
+  ```dvc
+  $ dvc remote modify myremote --local oss_key_id my-key-id
+  ```
+
+- `oss_key_secret` - OSS secret key for authorizing access to a remote.
+
+  ```dvc
+  $ dvc remote modify myremote --local oss_key_secret my-key-secret
+  ```
+
+- `oss_endpoint` - OSS endpoint value for accessing the remote container.
+
+  ```dvc
+  $ dvc remote modify myremote oss_endpoint endpoint
+  ```
+
+
+ +
+ +### Click for SSH options - `url` - remote location URL. @@ -304,7 +328,7 @@ For more information on configuring Azure Storage connection strings, visit
-### Click for HDFS available options +### Click for HDFS options - `user` - username to use to access a remote. @@ -314,30 +338,6 @@ For more information on configuring Azure Storage connection strings, visit
-
- -### Click for Aliyun OSS available options - -- `oss_key_id` - OSS key id to use to access a remote. - - ```dvc - $ dvc remote modify myremote --local oss_key_id my-key-id - ``` - -- `oss_key_secret` - OSS secret key for authorizing access into a remote. - - ```dvc - $ dvc remote modify myremote --local oss_key_secret my-key-secret - ``` - -- `oss_endpoint endpoint` - OSS endpoint values for accessing remote container. - - ```dvc - $ dvc remote modify myremote oss_endpoint endpoint - ``` - -
-

## Example: Customize an S3 remote

Let's first set up a _default_ S3 remote:
diff --git a/static/docs/get-started/configure.md b/static/docs/get-started/configure.md
index 1d84e22092..09de1420cd 100644
--- a/static/docs/get-started/configure.md
+++ b/static/docs/get-started/configure.md
@@ -31,23 +31,23 @@ $ git commit .dvc/config -m "Configure local remote"
> to use DVC. For most [use cases](/doc/use-cases), other "more remote" types of
> remotes will be required.

-Adding a remote should be specified by both its type (protocol) and its path.
-DVC currently supports seven types of remotes:
+[Adding a remote](/doc/command-reference/remote/add) requires specifying both
+its type (protocol) and its path. DVC currently supports these types of remotes:

-- `local`: Local Directory
- `s3`: Amazon Simple Storage Service
-- `gs`: Google Cloud Storage
-- `azure`: Azure Blob Storage
+- `azure`: Microsoft Azure Blob Storage
- `gdrive` : Google Drive
-- `ssh`: Secure Shell
+- `gs`: Google Cloud Storage
+- `ssh`: Secure Shell (requires SFTP)
- `hdfs`: Hadoop Distributed File System
- `http`: HTTP and HTTPS protocols
+- `local`: Directory in the local file system

-> If you installed DVC via `pip`, depending on the remote type you plan to use
-> you might need to install optional dependencies: `[s3]`, `[ssh]`, `[gs]`,
-> `[azure]`, `[gdrive]`, and `[oss]`; or `[all]` to include them all. The
-> command should look like this: `pip install "dvc[s3]"`. This installs `boto3`
-> library along with DVC to support Amazon S3 storage.
+> If you installed DVC via `pip` and plan to use cloud services as remote
+> storage, you might need to install these optional dependencies: `[s3]`,
+> `[azure]`, `[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Alternatively, use `[all]` to
+> include them all. The command should look like this: `pip install "dvc[s3]"`.
+> (This example installs `boto3` library along with DVC to support S3 storage.)

For example, to setup an S3 remote we would use something like this (make sure
that `mybucket` exists):
diff --git a/static/docs/install/linux.md b/static/docs/install/linux.md
index 5d779f03ba..297389845b 100644
--- a/static/docs/install/linux.md
+++ b/static/docs/install/linux.md
@@ -13,8 +13,8 @@ $ pip install dvc
```

Depending on the type of the [remote storage](/doc/command-reference/remote) you
-plan to use, you might need to install optional dependencies: `[s3]`, `[ssh]`,
-`[gs]`, `[azure]`, `[gdrive]`, and `[oss]`. Use `[all]` to include them all.
+plan to use, you might need to install optional dependencies: `[s3]`, `[azure]`,
+`[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Use `[all]` to include them all.
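+
+For example, a command to include support for Amazon S3 might look like this
+(it also pulls in the `boto3` library):
+
+```dvc
+$ pip install "dvc[s3]"
+```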
diff --git a/static/docs/install/macos.md b/static/docs/install/macos.md index 428a26816a..c5ef6274c0 100644 --- a/static/docs/install/macos.md +++ b/static/docs/install/macos.md @@ -36,8 +36,8 @@ $ pip install dvc ``` Depending on the type of the [remote storage](/doc/command-reference/remote) you -plan to use, you might need to install optional dependencies: `[s3]`, `[ssh]`, -`[gs]`, `[azure]`, `[gdrive]`, and `[oss]`. Use `[all]` to include them all. +plan to use, you might need to install optional dependencies: `[s3]`, `[azure]`, +`[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Use `[all]` to include them all.
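+
+For example, S3 support could be installed with a command along these lines
+(which also installs the `boto3` library):
+
+```dvc
+$ pip install "dvc[s3]"
+```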
diff --git a/static/docs/install/windows.md b/static/docs/install/windows.md index 47fffd24a9..95a799edb4 100644 --- a/static/docs/install/windows.md +++ b/static/docs/install/windows.md @@ -37,8 +37,8 @@ $ pip install dvc ``` Depending on the type of the [remote storage](/doc/command-reference/remote) you -plan to use, you might need to install optional dependencies: `[s3]`, `[ssh]`, -`[gs]`, `[azure]`, `[gdrive]`, and `[oss]`. Use `[all]` to include them all. +plan to use, you might need to install optional dependencies: `[s3]`, `[azure]`, +`[gdrive]`, `[gs]`, `[oss]`, `[ssh]`. Use `[all]` to include them all.
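+
+For example, a command such as the following would add S3 support (installing
+the `boto3` library along with DVC):
+
+```dvc
+$ pip install "dvc[s3]"
+```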
diff --git a/static/docs/understanding-dvc/core-features.md b/static/docs/understanding-dvc/core-features.md
index 8b0fa2be09..4e3abcdc7c 100644
--- a/static/docs/understanding-dvc/core-features.md
+++ b/static/docs/understanding-dvc/core-features.md
@@ -15,5 +15,5 @@
- It's **Open-source** and **Self-serve**: DVC is free and doesn't require any
  additional services.

-- DVC supports cloud storage (Amazon S3, Azure Blob Storage, Google Drive, and
-  Google Cloud Storage) for **data sources and pre-trained model sharing**.
+- DVC supports cloud storage (Amazon S3, Microsoft Azure Blob Storage, Google
+  Cloud Storage, etc.) for **data sources and pre-trained model sharing**.
diff --git a/static/docs/understanding-dvc/how-it-works.md b/static/docs/understanding-dvc/how-it-works.md
index 32a525e763..2701aff59f 100644
--- a/static/docs/understanding-dvc/how-it-works.md
+++ b/static/docs/understanding-dvc/how-it-works.md
@@ -73,7 +73,7 @@
  ```

- The cache of a DVC project can be shared with colleagues through Amazon S3,
-  Azure Blob Storage, Google Drive, and Google Cloud Storage, among others:
+  Microsoft Azure Blob Storage, Google Cloud Storage, among others:

  ```dvc
  $ git push
diff --git a/static/docs/use-cases/sharing-data-and-model-files.md b/static/docs/use-cases/sharing-data-and-model-files.md
index c351fc3519..ce16ceadf8 100644
--- a/static/docs/use-cases/sharing-data-and-model-files.md
+++ b/static/docs/use-cases/sharing-data-and-model-files.md
@@ -5,10 +5,9 @@ easy to consistently get all your data files and directories into any
machine, along with matching source code. All you need to do is to setup
[remote storage](/doc/command-reference/remote) for your DVC project, and push
the data there, so others can reach it. Currently DVC
-supports Amazon S3, Google Cloud Storage, Microsoft Azure Blob Storage, Google
-Drive, SSH, HDFS, and other remote locations, and the list is constantly
-growing. (For a complete list and configuration instructions, take a look at the
-examples in `dvc remote add`.)
+supports Amazon S3, Microsoft Azure Blob Storage, Google Drive, Google Cloud
+Storage, SSH, HDFS, and other remote locations. The list is constantly growing.
+(For a complete list and configuration instructions, refer to `dvc remote add`.)

![](/static/img/model-sharing-digram.png)
diff --git a/static/docs/use-cases/versioning-data-and-model-files.md b/static/docs/use-cases/versioning-data-and-model-files.md
index 24e5be449e..00fcee8a36 100644
--- a/static/docs/use-cases/versioning-data-and-model-files.md
+++ b/static/docs/use-cases/versioning-data-and-model-files.md
@@ -19,9 +19,9 @@ In this basic scenario, DVC is a better replacement for `git-lfs` (see
[Related Technologies](/doc/understanding-dvc/related-technologies)) and for
ad-hoc scripts on top of Amazon S3 (or any other cloud) used to manage ML data
artifacts like raw data, models, etc. Unlike `git-lfs`, DVC
-doesn't require installing a dedicated server; It can be used on-premises (NAS,
-SSH, for example) or with any major cloud provider (S3, Google Cloud, Azure,
-Google Drive).
+doesn't require installing a dedicated server; it can be used on-premises (e.g.
+SSH, NAS) or with any major cloud storage provider (Amazon S3, Microsoft Azure
+Blob Storage, Google Drive, Google Cloud Storage, etc.).
Let's say you already have a Git repository that uses a bunch of images stored in the `images/` directory and has a `model.pkl` file – a model file deployed to diff --git a/static/docs/user-guide/contributing/core.md b/static/docs/user-guide/contributing/core.md index 0fdfa99df6..22a748c4de 100644 --- a/static/docs/user-guide/contributing/core.md +++ b/static/docs/user-guide/contributing/core.md @@ -153,10 +153,9 @@ Install requirements for whatever remotes you are going to test: ```dvc $ pip install -e ".[s3]" -$ pip install -e ".[gs]" $ pip install -e ".[azure]" $ pip install -e ".[gdrive]" -$ pip install -e ".[ssh]" +$ pip install -e ".[gs]" # or $ pip install -e ".[all]" ``` @@ -182,7 +181,7 @@ manipulations below.
-### Click for S3 testing instructions +### Click for Amazon S3 instructions Install [aws cli](https://docs.aws.amazon.com/en_us/cli/latest/userguide/cli-chap-install.html) @@ -201,47 +200,7 @@ $ export DVC_TEST_AWS_REPO_BUCKET="...TEST-S3-BUCKET..."
-### Click for Google Cloud Storage testing instructions - -Go through the [quick start](https://cloud.google.com/sdk/docs/quickstarts) for -your OS. After that you should have `gcloud` command line tool available and -authenticated with your google account. - -You then need to create a bucket, a service account and get its credentials. You -can do this via web UI or terminal. Then you need to put your keys to -`scripts/ci/gcp-creds.json` and add these to your env vars: - -```dvc -$ export GOOGLE_APPLICATION_CREDENTIALS=".gcp-creds.json" -$ export GCP_CREDS="yes" -$ export DVC_TEST_GCP_REPO_BUCKET="dvc-test-xyz" -``` - -Here are some command examples to do this: - -```dvc -# This name needs to be globally unique -$ export GCP_NAME="dvc-test-xyz" -$ gcloud projects create $GCP_NAME -$ gcloud iam service-accounts create $GCP_NAME --project=$GCP_NAME -$ gcloud iam service-accounts keys create \ - scripts/ci/gcp-creds.json \ - --iam-account=$GCP_NAME@$GCP_NAME.iam.gserviceaccount.com - -$ gcloud auth activate-service-account \ - --key-file=scripts/ci/gcp-creds.json -$ gcloud config set project $GCP_NAME -$ gsutil mb gs://$GCP_NAME/ -``` - -I used the same name for project, service account and bucket for simplicity. You -may use different names. - -
- -
- -### Click for Azure testing instructions +### Click for Microsoft Azure Blob Storage instructions Install [Node.js](https://nodejs.org/en/download/) and then install and run Azurite: @@ -263,7 +222,7 @@ $ export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=http;AccountN
-### Click for Google Drive testing instructions +### Click for Google Drive instructions > Please remember that Google Drive access tokens are personal credentials and > should not be shared with anyone, otherwise risking unauthorized usage of the @@ -284,7 +243,47 @@ $ export GDRIVE_USER_CREDENTIALS_DATA='CONTENT_of_gdrive-user-credentials.json'
-### Click for HDFS testing instructions
+### Click for Google Cloud Storage instructions
+
+Go through the [quick start](https://cloud.google.com/sdk/docs/quickstarts) for
+your OS. After that you should have the `gcloud` command line tool available and
+authenticated with your Google account.
+
+You then need to create a bucket and a service account, and get its credentials.
+You can do this via the web UI or the terminal. Then you need to put your keys
+in `scripts/ci/gcp-creds.json` and add these to your env vars:
+
+```dvc
+$ export GOOGLE_APPLICATION_CREDENTIALS=".gcp-creds.json"
+$ export GCP_CREDS="yes"
+$ export DVC_TEST_GCP_REPO_BUCKET="dvc-test-xyz"
+```
+
+Here are some command examples to do this:
+
+```dvc
+# This name needs to be globally unique
+$ export GCP_NAME="dvc-test-xyz"
+$ gcloud projects create $GCP_NAME
+$ gcloud iam service-accounts create $GCP_NAME --project=$GCP_NAME
+$ gcloud iam service-accounts keys create \
+    scripts/ci/gcp-creds.json \
+    --iam-account=$GCP_NAME@$GCP_NAME.iam.gserviceaccount.com
+
+$ gcloud auth activate-service-account \
+    --key-file=scripts/ci/gcp-creds.json
+$ gcloud config set project $GCP_NAME
+$ gsutil mb gs://$GCP_NAME/
+```
+
+The same name is used for the project, service account, and bucket for
+simplicity; you may use different names.
+
+ +
+ +### Click for HDFS instructions Tests currently only work on Linux. First you need to set up passwordless ssh access to localhost: