[ROCm] Fix azcopy issue on ROCm ci pipeline (#13365)
### Description

Use a SAS token to fix the error `failed to perform copy command due to error:
no SAS token or OAuth token is present and the resource is not public`.

Generate a SAS token for the target data, add it to Key Vault, and expose it to
the pipeline as a pipeline variable.
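
The token handling described above can be sketched as follows. This is a minimal illustration, not the code in this commit: the helper name `append_sas_token` and the placeholder token are hypothetical. Unlike plain `url + "?" + token`, it merges the token into any query string the URL already has (for example a `?snapshot=...` parameter).

```python
from urllib.parse import urlsplit, urlunsplit


def append_sas_token(blob_url, sas_token):
    """Return blob_url with sas_token merged into its query string.

    Hypothetical helper for illustration; not part of the repository.
    """
    if not sas_token:
        # No token configured: leave the URL unchanged.
        return blob_url
    # Tolerate tokens copied with a leading "?".
    token = sas_token.lstrip("?")
    parts = urlsplit(blob_url)
    # Keep any existing query (e.g. snapshot=...) and append the token.
    query = parts.query + "&" + token if parts.query else token
    return urlunsplit(parts._replace(query=query))


if __name__ == "__main__":
    # Placeholder token; a real one would come from AZURE_BLOB_SAS_TOKEN.
    url = "https://onnxruntimetestdata.blob.core.windows.net/training/onnxruntime_training_data.zip"
    print(append_sas_token(url, "sv=PLACEHOLDER&sig=PLACEHOLDER"))
```

A URL that includes the real token is a secret, so it is best kept out of log output.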


### Motivation and Context

Co-authored-by: peixuanzuo <peixuanzuo@linmif39a000004.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>
2 people authored and snnn committed Oct 25, 2022
1 parent 8816bce commit 861125c
Showing 3 changed files with 14 additions and 5 deletions.
8 changes: 7 additions & 1 deletion orttraining/tools/ci_test/download_azure_blob_archive.py
@@ -57,7 +57,13 @@ def main():
     with tempfile.TemporaryDirectory() as temp_dir, get_azcopy() as azcopy_path:
         archive_path = os.path.join(temp_dir, "archive.zip")
         print("Downloading archive from '{}'...".format(args.azure_blob_url))
-        _download(azcopy_path, args.azure_blob_url, archive_path)
+
+        azure_blob_url = args.azure_blob_url
+        azure_blob_sas_token = os.getenv("AZURE_BLOB_SAS_TOKEN", None)
+        if azure_blob_sas_token and azure_blob_sas_token != "":
+            azure_blob_url = azure_blob_url + "?" + azure_blob_sas_token
+
+        _download(azcopy_path, azure_blob_url, archive_path)
         if args.archive_sha256_digest:
             _check_file_sha256_digest(archive_path, args.archive_sha256_digest)
         print("Extracting to '{}'...".format(args.target_dir))
@@ -20,11 +20,12 @@ jobs:
eval "$('/home/ciagent/conda/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
echo "Selecting GPU based on HIP_VISIBLE_DEVICES=$HIP_VISIBLE_DEVICES"
displayName: 'Initialize environment'
# update these if the E2E test data changes
- script: |-
export AZURE_BLOB_SAS_TOKEN="$(onnxruntimetestdata-storage-training-container-sas-token)"
python orttraining/tools/ci_test/download_azure_blob_archive.py \
--azure_blob_url https://onnxruntimetestdata.blob.core.windows.net/training/onnxruntime_training_data.zip?snapshot=2020-06-15T23:17:35.8314853Z \
--azure_blob_url https://onnxruntimetestdata.blob.core.windows.net/training/onnxruntime_training_data.zip \
--target_dir training_e2e_test_data \
--archive_sha256_digest B01C169B6550D1A0A6F1B4E2F34AE2A8714B52DBB70AC04DA85D371F691BDFF9
displayName: 'Download onnxruntime_training_data.zip data'
@@ -66,7 +67,7 @@ jobs:
--gpu_sku MI100_32G
displayName: 'Run C++ BERT-L performance test'
condition: succeededOrFailed() # ensure all tests are run
- script: |-
python orttraining/tools/ci_test/run_convergence_test.py \
--binary_dir build/RelWithDebInfo \
@@ -278,11 +278,13 @@ jobs:
- task: CmdLine@2
inputs:
script: |-
export AZURE_BLOB_SAS_TOKEN="$(onnxruntimetestdata-storage-training-container-sas-token)"
python orttraining/tools/ci_test/download_azure_blob_archive.py \
--azure_blob_url https://onnxruntimetestdata.blob.core.windows.net/training/onnxruntime_training_data.zip?snapshot=2020-06-15T23:17:35.8314853Z \
--azure_blob_url https://onnxruntimetestdata.blob.core.windows.net/training/onnxruntime_training_data.zip \
--target_dir training_e2e_test_data \
--archive_sha256_digest B01C169B6550D1A0A6F1B4E2F34AE2A8714B52DBB70AC04DA85D371F691BDFF9
condition: and(succeededOrFailed(), eq(variables.onnxruntimeBuildSucceeded, 'true')) # ensure all tests are run when the build successed
retryCountOnTaskFailure: 2
displayName: 'Download onnxruntime_training_data.zip data'

- task: CmdLine@2