CI: Consider creating a standalone action for fetching CTK components #281

Closed
leofang opened this issue Dec 9, 2024 · 0 comments · Fixed by #302
Labels
CI/CD (CI/CD infrastructure), P1 (Medium priority - Should do)

Comments

leofang commented Dec 9, 2024

Contrary to the direction in #278, here's one good example of something that could be turned into a (composite) action:

- name: Set up CTK cache variable
  shell: bash --noprofile --norc -xeuo pipefail {0}
  run: |
    echo "CTK_CACHE_KEY=mini-ctk-${{ inputs.cuda-version }}-${{ inputs.host-platform }}" >> $GITHUB_ENV
    echo "CTK_CACHE_FILENAME=mini-ctk-${{ inputs.cuda-version }}-${{ inputs.host-platform }}.tar.gz" >> $GITHUB_ENV
- name: Download CTK cache
  id: ctk-get-cache
  uses: actions/cache/restore@v4
  continue-on-error: true
  with:
    key: ${{ env.CTK_CACHE_KEY }}
    path: ./${{ env.CTK_CACHE_FILENAME }}
- name: Get CUDA components
  if: ${{ steps.ctk-get-cache.outputs.cache-hit != 'true' }}
  shell: bash --noprofile --norc -xeuo pipefail {0}
  run: |
    CUDA_PATH="./cuda_toolkit"
    mkdir $CUDA_PATH
    # The binary archives (redist) are guaranteed to be updated as part of the release posting.
    CTK_BASE_URL="https://developer.download.nvidia.com/compute/cuda/redist/"
    CTK_JSON_URL="$CTK_BASE_URL/redistrib_${{ inputs.cuda-version }}.json"
    if [[ "${{ inputs.host-platform }}" == linux* ]]; then
      if [[ "${{ inputs.host-platform }}" == "linux-x64" ]]; then
        CTK_SUBDIR="linux-x86_64"
      elif [[ "${{ inputs.host-platform }}" == "linux-aarch64" ]]; then
        CTK_SUBDIR="linux-sbsa"
      fi
      function extract() {
        tar -xvf $1 -C $CUDA_PATH --strip-components=1
      }
    elif [[ "${{ inputs.host-platform }}" == "win-x64" ]]; then
      CTK_SUBDIR="windows-x86_64"
      function extract() {
        _TEMP_DIR_=$(mktemp -d)
        unzip $1 -d $_TEMP_DIR_
        cp -r $_TEMP_DIR_/*/* $CUDA_PATH
        rm -rf $_TEMP_DIR_
      }
    fi
    function populate_cuda_path() {
      # take the component name as an argument
      function download() {
        curl -kLSs $1 -o $2
      }
      CTK_COMPONENT=$1
      CTK_COMPONENT_REL_PATH="$(curl -s $CTK_JSON_URL |
        python -c "import sys, json; print(json.load(sys.stdin)['${CTK_COMPONENT}']['${CTK_SUBDIR}']['relative_path'])")"
      CTK_COMPONENT_URL="${CTK_BASE_URL}/${CTK_COMPONENT_REL_PATH}"
      CTK_COMPONENT_COMPONENT_FILENAME="$(basename $CTK_COMPONENT_REL_PATH)"
      download $CTK_COMPONENT_URL $CTK_COMPONENT_COMPONENT_FILENAME
      extract $CTK_COMPONENT_COMPONENT_FILENAME
      rm $CTK_COMPONENT_COMPONENT_FILENAME
    }
    # Get headers and shared libraries in place
    # Note: the existing artifact would need to be manually deleted (ex: through web UI)
    # if this list is changed, as the artifact actions do not offer any option for us to
    # invalidate the artifact.
    populate_cuda_path cuda_nvcc
    populate_cuda_path cuda_cudart
    populate_cuda_path cuda_nvrtc
    populate_cuda_path cuda_profiler_api
    populate_cuda_path libnvjitlink
    ls -l $CUDA_PATH
    # Prepare the cache
    # Note: try to escape | and > ...
    tar -czvf ${CTK_CACHE_FILENAME} ${CUDA_PATH}
    # Note: the headers will be copied into the cibuildwheel manylinux container,
    # so setting the CUDA_PATH env var here is meaningless.
- name: Upload CTK cache
  if: ${{ always() &&
    steps.ctk-get-cache.outputs.cache-hit != 'true' }}
  uses: actions/cache/save@v4
  with:
    key: ${{ env.CTK_CACHE_KEY }}
    path: ./${{ env.CTK_CACHE_FILENAME }}
- name: Restore CTK cache
  if: ${{ steps.ctk-get-cache.outputs.cache-hit == 'true' }}
  shell: bash --noprofile --norc -xeuo pipefail {0}
  run: |
    ls -l
    CUDA_PATH="./cuda_toolkit"
    tar -xzvf $CTK_CACHE_FILENAME
    ls -l $CUDA_PATH
    if [ ! -d "$CUDA_PATH/include" ]; then
      exit 1
    fi

which can then be called in both build and test jobs.
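
For illustration, here is a rough sketch of how the steps above might be packaged and called. The directory name `.github/actions/fetch_ctk`, the action name, the input descriptions, the job name, and the example version `12.6.2` are all assumptions made for this sketch, not something settled in this issue. Only the first step is reproduced; the rest would move over unchanged.

```yaml
# .github/actions/fetch_ctk/action.yml  (hypothetical path and name)
name: Fetch mini CTK
description: Fetch a minimal set of CUDA Toolkit components, or restore them from cache

inputs:
  cuda-version:
    description: CUDA Toolkit version, used to select redistrib_<version>.json
    required: true
  host-platform:
    description: One of linux-x64, linux-aarch64, win-x64
    required: true

runs:
  using: composite
  steps:
    # Every `run:` step in a composite action must declare `shell:` explicitly,
    # which the steps quoted above already do.
    - name: Set up CTK cache variable
      shell: bash --noprofile --norc -xeuo pipefail {0}
      run: |
        echo "CTK_CACHE_KEY=mini-ctk-${{ inputs.cuda-version }}-${{ inputs.host-platform }}" >> $GITHUB_ENV
        echo "CTK_CACHE_FILENAME=mini-ctk-${{ inputs.cuda-version }}-${{ inputs.host-platform }}.tar.gz" >> $GITHUB_ENV
    # ... remaining steps (cache restore, download, cache save, unpack) go here ...
```

A build or test job could then call it like this:

```yaml
jobs:
  build:                      # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      # A local action can only be resolved after the repository is checked out.
      - uses: actions/checkout@v4
      - name: Fetch CTK components
        uses: ./.github/actions/fetch_ctk
        with:
          cuda-version: "12.6.2"      # example value only
          host-platform: linux-x64
```

Because the inputs keep the names `cuda-version` and `host-platform`, the `${{ inputs.* }}` expressions in the existing steps should not need to change.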

One thing to investigate before committing to this is whether the existing action https://github.com/Jimver/cuda-toolkit can already do what we need (and whether it performs well). If it works equally well, or is only missing some features that we can upstream, then we should just use it and help improve it instead of rolling out our own solution.
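
A throwaway job along these lines could be a starting point for that evaluation. This is only a sketch: the version tag, the CUDA version, and the sub-package names are assumptions that should be checked against the action's README, and it is not confirmed here that the action covers every component listed above (for example `libnvjitlink` or `cuda_profiler_api`).

```yaml
jobs:
  try-cuda-toolkit-action:            # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - name: Install selected CUDA components
        id: cuda-toolkit
        uses: Jimver/cuda-toolkit@v0.2.19     # assumed tag; pin to whatever is current
        with:
          cuda: "12.5.0"                      # assumed to be a version the action supports
          method: network                     # sub-package selection needs the network installer
          sub-packages: '["nvcc", "cudart", "nvrtc"]'
      - name: Inspect what was installed
        shell: bash
        run: |
          # Report where the toolkit landed, without failing if the variable is unset.
          echo "CUDA_PATH=${CUDA_PATH:-unset}"
          command -v nvcc || echo "nvcc not found on PATH"
```

Comparing the wall-clock time of this job against the cached download approach above would give a first read on the perf question.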

leofang added the CI/CD (CI/CD infrastructure) and P1 (Medium priority - Should do) labels on Dec 9, 2024