[T145005253] Improvements to OSS builds and the Release Process #1627

Closed
47 changes: 31 additions & 16 deletions .github/scripts/setup_env.bash
@@ -14,6 +14,7 @@ print_exec () {
echo "+ $*"
echo ""
"$@"
echo ""
}

exec_with_retries () {
@@ -238,6 +239,30 @@ free_disk_space () {
# Info Functions
################################################################################

print_gpu_info () {
echo "################################################################################"
echo "[INFO] Check GPU info ..."
install_system_packages lshw
print_exec sudo lshw -C display

echo "################################################################################"
echo "[INFO] Check NVIDIA GPU info ..."

if [[ "${ENFORCE_NVIDIA_GPU}" ]]; then
# Ensure that nvidia-smi is available and returns GPU entries
if ! nvidia-smi; then
echo "[CHECK] NVIDIA driver is required, but does not appear to have been installed. This will cause FBGEMM_GPU installation to fail!"
return 1
fi

else
if which nvidia-smi; then
# If nvidia-smi is installed on a machine without GPUs, it will return an error
(print_exec nvidia-smi) || true
fi
fi
}

print_system_info () {
echo "################################################################################"
echo "# Print System Info"
@@ -264,17 +289,6 @@ print_system_info () {
print_exec uname -a
print_exec cat /proc/version
print_exec cat /etc/os-release

echo "################################################################################"
echo "[INFO] Check GPU info ..."
install_system_packages lshw
print_exec sudo lshw -C display

if which nvidia-smi; then
echo "################################################################################"
echo "[INFO] Check NVIDIA GPU info ..."
print_exec nvidia-smi
fi
}

print_ec2_info () {
@@ -335,7 +349,7 @@ setup_miniconda () {
print_exec . ~/.bashrc

echo "[SETUP] Updating Miniconda base packages ..."
print_exec conda update -n base -c defaults -y conda
(exec_with_retries conda update -n base -c defaults -y conda) || return 1

# Print Conda info
print_exec conda info
@@ -369,12 +383,12 @@ create_conda_environment () {
(exec_with_retries conda create -y --name "${env_name}" python="${python_version}") || return 1

echo "[SETUP] Upgrading PIP to latest ..."
print_exec conda run -n "${env_name}" pip install --upgrade pip
(exec_with_retries conda run -n "${env_name}" pip install --upgrade pip) || return 1

# The pyOpenSSL and cryptography package versions need to line up for PyPI publishing to work
# https://stackoverflow.com/questions/74981558/error-updating-python3-pip-attributeerror-module-lib-has-no-attribute-openss
echo "[SETUP] Upgrading pyOpenSSL ..."
print_exec conda run -n "${env_name}" python -m pip install "pyOpenSSL>22.1.0"
(exec_with_retries conda run -n "${env_name}" python -m pip install "pyOpenSSL>22.1.0") || return 1

# This test fails with load errors if the pyOpenSSL and cryptography package versions don't align
echo "[SETUP] Testing pyOpenSSL import ..."
@@ -886,7 +900,7 @@ prepare_fbgemm_gpu_build () {
git submodule update --init --recursive

echo "[BUILD] Installing other build dependencies ..."
print_exec conda run -n "${env_name}" python -m pip install -r requirements.txt
(exec_with_retries conda run -n "${env_name}" python -m pip install -r requirements.txt) || return 1

(test_python_import "${env_name}" numpy) || return 1
(test_python_import "${env_name}" skbuild) || return 1
@@ -1095,7 +1109,7 @@ install_fbgemm_gpu_package () {
print_exec sha1sum "${package_name}"

echo "[INSTALL] Installing FBGEMM-GPU wheel: ${package_name} ..."
conda run -n "${env_name}" python -m pip install "${package_name}"
(exec_with_retries conda run -n "${env_name}" python -m pip install "${package_name}") || return 1

echo "[INSTALL] Checking imports ..."
(test_python_import "${env_name}" fbgemm_gpu) || return 1
@@ -1217,4 +1231,5 @@ publish_to_pypi () {
"${package_name}"

echo "[PUBLISH] Successfully published package(s) to PyPI: ${package_name}"
echo "[PUBLISH] NOTE: The publish command is a successful no-op if the wheel version already exists on PyPI; please double check!"
}
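
Several hunks above replace a bare `print_exec ...` call with `(exec_with_retries ...) || return 1`, presumably so that flaky `conda` and `pip` invocations are retried and a persistent failure aborts the calling function. The body of `exec_with_retries` is not part of this diff, so the following is only a minimal sketch of the pattern. The name `retry_sketch`, its attempt-count argument, and the `RETRY_DELAY_SECONDS` variable are assumptions made for illustration and do not come from `setup_env.bash`.

```shell
# Hypothetical retry helper; NOT the exec_with_retries from setup_env.bash.
# Usage: retry_sketch <max_attempts> <command> [args...]
retry_sketch () {
  local max_attempts="$1"
  shift
  local attempt=1

  while true; do
    echo "+ $* (attempt ${attempt}/${max_attempts})"
    if "$@"; then
      return 0
    fi
    if [ "${attempt}" -ge "${max_attempts}" ]; then
      echo "[RETRY] Giving up after ${attempt} attempt(s): $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    # Assumed fixed delay between attempts; defaults to 1 second
    sleep "${RETRY_DELAY_SECONDS:-1}"
  done
}

# Same call shape as in the diff: run in a subshell, bail out on failure
install_step_sketch () {
  (retry_sketch 3 "$@") || return 1
  echo "[SETUP] Step succeeded"
}
```

Running the helper in a subshell, as the diff does, keeps any `cd` or variable changes made by the wrapped command from leaking into the caller, while `|| return 1` still propagates the failure.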
6 changes: 6 additions & 0 deletions .github/workflows/fbgemm_gpu_ci.yml
@@ -38,6 +38,9 @@ jobs:
- name: Display System Info
run: . $PRELUDE; print_system_info

- name: Display GPU Info
run: . $PRELUDE; print_gpu_info

- name: Free Disk Space
run: . $PRELUDE; free_disk_space

@@ -150,6 +153,9 @@ jobs:
- name: Display System Info
run: . $PRELUDE; print_system_info

- name: Display GPU Info
run: . $PRELUDE; print_gpu_info

- name: Setup Miniconda
run: |
. $PRELUDE; setup_miniconda $HOME/miniconda
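
The hunks shown for this workflow add `print_gpu_info` steps but do not set `ENFORCE_NVIDIA_GPU`, so the function would take its lenient branch here. Note that `[[ "${ENFORCE_NVIDIA_GPU}" ]]` tests only whether the variable is non-empty, so a value of `0` still selects the strict branch. The sketch below illustrates that behavior with the equivalent `[ -n ... ]` test; the function name `gpu_check_mode_sketch` and the labels `strict` and `lenient` are hypothetical.

```shell
# Hypothetical helper mirroring the branch condition in print_gpu_info.
# [ -n "$x" ] is true exactly when [[ "$x" ]] is true: the string is non-empty.
gpu_check_mode_sketch () {
  if [ -n "${ENFORCE_NVIDIA_GPU:-}" ]; then
    echo "strict"    # nvidia-smi must succeed, otherwise return 1
  else
    echo "lenient"   # nvidia-smi is run only if present; errors are ignored
  fi
}

# Each case runs in a subshell so the assignments do not leak
(ENFORCE_NVIDIA_GPU=1; gpu_check_mode_sketch)       # strict
(ENFORCE_NVIDIA_GPU=0; gpu_check_mode_sketch)       # strict: "0" is non-empty
(unset ENFORCE_NVIDIA_GPU; gpu_check_mode_sketch)   # lenient
```

To disable enforcement in a job, the variable would therefore have to be left unset or set to the empty string rather than to `0`.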
18 changes: 12 additions & 6 deletions .github/workflows/fbgemm_nightly_build.yml
@@ -46,7 +46,7 @@ jobs:
matrix:
os: [ linux.12xlarge ]
python-version: [ "3.8", "3.9", "3.10" ]
cuda-version: [ "11.7.1" ]
cuda-version: [ "11.7.1", "11.8.0" ]

steps:
- name: Checkout the Repository
@@ -57,6 +57,9 @@ jobs:
- name: Display System Info
run: . $PRELUDE; print_system_info

- name: Display GPU Info
run: . $PRELUDE; print_gpu_info

- name: Setup Miniconda
run: |
. $PRELUDE; setup_miniconda $HOME/miniconda
@@ -103,12 +106,15 @@ jobs:
env:
PRELUDE: .github/scripts/setup_env.bash
BUILD_ENV: build_binary
ENFORCE_NVIDIA_GPU: 1
strategy:
fail-fast: false
matrix:
os: [ linux.g5.4xlarge.nvidia.gpu ]
python-version: [ "3.8", "3.9", "3.10" ]
cuda-version: [ "11.7.1" ]
cuda-version: [ "11.7.1", "11.8.0" ]
# Specify exactly ONE CUDA version for artifact publish
cuda-version-publish: [ "11.7.1" ]
needs: build_artifact

steps:
@@ -118,10 +124,10 @@ jobs:
submodules: true

- name: Display System Info
run: . $PRELUDE; print_system_info
run: . $PRELUDE; print_system_info; print_ec2_info

- name: Display EC2 Info
run: . $PRELUDE; print_ec2_info
- name: Display GPU Info
run: . $PRELUDE; print_gpu_info

- name: Setup Miniconda
run: |
@@ -157,7 +163,7 @@ jobs:
run: . $PRELUDE; cd fbgemm_gpu/test; run_fbgemm_gpu_tests $BUILD_ENV

- name: Push FBGEMM_GPU Nightly Binary to PYPI
if: ${{ github.event_name != 'pull_request' && github.event_name != 'push' }}
if: ${{ github.event_name != 'pull_request' && github.event_name != 'push' && matrix.cuda-version == matrix.cuda-version-publish }}
env:
PYPI_TOKEN: ${{ secrets.PYPI_TOKEN }}
run: . $PRELUDE; publish_to_pypi $BUILD_ENV fbgemm_gpu_nightly-*.whl "$PYPI_TOKEN"
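
The publish step's `if:` now also requires `matrix.cuda-version == matrix.cuda-version-publish`. Because `cuda-version-publish` holds a single value, it does not multiply the number of matrix jobs; it only marks which `cuda-version` job is allowed to upload, presumably because wheels built against different CUDA versions would otherwise compete for the same package name and version on PyPI. The sketch below evaluates the same three-part condition in shell. The function name `should_publish_sketch` and the `schedule` event used in the loop are assumptions for illustration; the workflow's actual triggers are not shown in this diff.

```shell
# Hypothetical helper reproducing the publish step's `if:` condition.
# Args: <github.event_name> <matrix.cuda-version> <matrix.cuda-version-publish>
should_publish_sketch () {
  local event_name="$1"
  local cuda_version="$2"
  local cuda_version_publish="$3"

  [ "${event_name}" != "pull_request" ] &&
    [ "${event_name}" != "push" ] &&
    [ "${cuda_version}" = "${cuda_version_publish}" ]
}

# Walk the CUDA dimension of the matrix for an assumed scheduled run
for cuda in 11.7.1 11.8.0; do
  if should_publish_sketch schedule "${cuda}" 11.7.1; then
    echo "cuda ${cuda}: publish"
  else
    echo "cuda ${cuda}: skip"
  fi
done
```

With the matrix values from the diff, only the `11.7.1` jobs reach the upload, while the `11.8.0` jobs still build and test.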
9 changes: 6 additions & 3 deletions .github/workflows/fbgemm_nightly_build_cpu.yml
@@ -56,6 +56,9 @@ jobs:
- name: Display System Info
run: . $PRELUDE; print_system_info

- name: Display GPU Info
run: . $PRELUDE; print_gpu_info

- name: Setup Miniconda
run: |
. $PRELUDE; setup_miniconda $HOME/miniconda
@@ -110,10 +113,10 @@ jobs:
submodules: true

- name: Display System Info
run: . $PRELUDE; print_system_info
run: . $PRELUDE; print_system_info; print_ec2_info

- name: Display EC2 Info
run: . $PRELUDE; print_ec2_info
- name: Display GPU Info
run: . $PRELUDE; print_gpu_info

- name: Setup Miniconda
run: |
18 changes: 12 additions & 6 deletions .github/workflows/fbgemm_release_build.yml
@@ -38,7 +38,7 @@ jobs:
matrix:
os: [ linux.12xlarge ]
python-version: [ "3.8", "3.9", "3.10" ]
cuda-version: [ "11.7.1" ]
cuda-version: [ "11.7.1", "11.8.0" ]

steps:
- name: Checkout the Repository
@@ -49,6 +49,9 @@ jobs:
- name: Display System Info
run: . $PRELUDE; print_system_info

- name: Display GPU Info
run: . $PRELUDE; print_gpu_info

- name: Setup Miniconda
run: |
. $PRELUDE; setup_miniconda $HOME/miniconda
@@ -95,12 +98,15 @@ jobs:
env:
PRELUDE: .github/scripts/setup_env.bash
BUILD_ENV: build_binary
ENFORCE_NVIDIA_GPU: 1
strategy:
fail-fast: false
matrix:
os: [ linux.g5.4xlarge.nvidia.gpu ]
python-version: [ "3.8", "3.9", "3.10" ]
cuda-version: [ "11.7.1" ]
cuda-version: [ "11.7.1", "11.8.0" ]
# Specify exactly ONE CUDA version for artifact publish
cuda-version-publish: [ "11.7.1" ]
needs: build_artifact
steps:
- name: Checkout the Repository
@@ -109,10 +115,10 @@ jobs:
submodules: true

- name: Display System Info
run: . $PRELUDE; print_system_info
run: . $PRELUDE; print_system_info; print_ec2_info

- name: Display EC2 Info
run: . $PRELUDE; print_ec2_info
- name: Display GPU Info
run: . $PRELUDE; print_gpu_info

- name: Setup Miniconda
run: |
@@ -148,7 +154,7 @@ jobs:
run: . $PRELUDE; cd fbgemm_gpu/test; run_fbgemm_gpu_tests $BUILD_ENV

- name: Push FBGEMM_GPU Binary to PYPI
if: ${{ github.event_name != 'pull_request' && github.event_name != 'push' }}
if: ${{ github.event_name != 'pull_request' && github.event_name != 'push' && matrix.cuda-version == matrix.cuda-version-publish }}
env:
PYPI_TOKEN: ${{ secrets.PYPI_TOKEN }}
run: . $PRELUDE; publish_to_pypi $BUILD_ENV fbgemm_gpu-*.whl "$PYPI_TOKEN"
9 changes: 6 additions & 3 deletions .github/workflows/fbgemm_release_build_cpu.yml
@@ -48,6 +48,9 @@ jobs:
- name: Display System Info
run: . $PRELUDE; print_system_info

- name: Display GPU Info
run: . $PRELUDE; print_gpu_info

- name: Setup Miniconda
run: |
. $PRELUDE; setup_miniconda $HOME/miniconda
@@ -102,10 +105,10 @@ jobs:
submodules: true

- name: Display System Info
run: . $PRELUDE; print_system_info
run: . $PRELUDE; print_system_info; print_ec2_info

- name: Display EC2 Info
run: . $PRELUDE; print_ec2_info
- name: Display GPU Info
run: . $PRELUDE; print_gpu_info

- name: Setup Miniconda
run: |
3 changes: 2 additions & 1 deletion fbgemm_gpu/CMakeLists.txt
@@ -328,7 +328,8 @@ if(NOT FBGEMM_CPU_ONLY)
src/merge_pooled_embeddings_gpu.cpp
src/topology_utils.cpp)
else()
message(STATUS "Could not find NVML_LIB_PATH; will NOT include certain sources into the build!")
message(STATUS
"Could not find NVML_LIB_PATH; certain sources will NOT be included in the build")
endif()
endif()

1 change: 1 addition & 0 deletions fbgemm_gpu/requirements.txt
@@ -4,3 +4,4 @@ jinja2
ninja
numpy
scikit-build
setuptools_git_versioning
52 changes: 29 additions & 23 deletions fbgemm_gpu/setup.py
@@ -7,29 +7,41 @@
import argparse
import os
import random
import re
import subprocess
import sys

from datetime import date
from typing import List, Optional

import setuptools_git_versioning as gitversion
import torch
from skbuild import setup


def get_version():
# get version string from version.py
# TODO: ideally the version.py should be generated when setup is run
version_file = os.path.join(os.path.dirname(__file__), "version.py")
version_regex = r"__version__ = ['\"]([^'\"]*)['\"]"
with open(version_file, "r") as f:
version = re.search(version_regex, f.read(), re.M).group(1)
return version
def generate_package_version(package_name: str):
print("[SETUP.PY] Generating the package version ...")

if "nightly" in package_name:
# Use date stamp for nightly versions
print("[SETUP.PY] Package is for NIGHTLY; using timestamp for the versioning")
today = date.today()
version = f"{today.year}.{today.month}.{today.day}"

elif "test" in package_name:
# Use a random patch number for test versions
print("[SETUP.PY] Package is for TEST: using random number for the versioning")
version = f"0.0.{random.randint(0, 1000)}"

else:
# Use git tag / branch / commit info to generate a PEP-440-compliant version string
print("[SETUP.PY] Package is for RELEASE: using git info for the versioning")
print(
f"[SETUP.PY] TAG: {gitversion.get_tag()}, BRANCH: {gitversion.get_branch()}, SHA: {gitversion.get_sha()}"
)
version = gitversion.version_from_git()

def get_nightly_version():
today = date.today()
return f"{today.year}.{today.month}.{today.day}"
print(f"[SETUP.PY] Setting the package version: {version}")
return version


def get_cxx11_abi():
@@ -170,23 +182,15 @@ def main(argv: List[str]) -> None:
if args.nvml_lib_path:
cmake_args.append(f"-DNVML_LIB_PATH={args.nvml_lib_path}")

name = args.package_name
print("name: ", name)
is_nightly = "nightly" in name
is_test = "test" in name

version = get_nightly_version() if is_nightly else get_version()
if is_test:
version = (f"0.0.{random.randint(0, 1000)}",)
print(f"-- {name} building version: {version}")
package_version = generate_package_version(args.package_name)

# Repair command line args for setup.
sys.argv = [sys.argv[0]] + unknown

setup(
# Metadata
name=name,
version=version,
name=args.package_name,
version=package_version,
author="FBGEMM Team",
author_email="[email protected]",
long_description=long_description,
@@ -210,6 +214,8 @@ def main(argv: List[str]) -> None:
"License :: OSI Approved :: BSD License",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Topic :: Scientific/Engineering :: Artificial Intelligence",
],
)
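
`generate_package_version()` picks one of three versioning schemes from the package name, checking for `nightly` first, then `test`, and falling back to git metadata. The sketch below renders only that branch order and the unpadded date format in shell, purely for illustration. The function names `version_scheme_sketch` and `nightly_version_sketch` are hypothetical, and `fbgemm_gpu_test` is an assumed package name used only to exercise the middle branch.

```shell
# Hypothetical shell rendering of the branch order in generate_package_version()
version_scheme_sketch () {
  case "$1" in
    *nightly*) echo "date" ;;     # checked first, as in setup.py
    *test*)    echo "random" ;;   # 0.0.<random integer>
    *)         echo "git" ;;      # version derived from git tag/branch/commit
  esac
}

# Date-based version without zero padding, matching
# f"{today.year}.{today.month}.{today.day}" in setup.py.
# Args: <year> <month> <day>, where month and day may be zero-padded
nightly_version_sketch () {
  printf '%d.%d.%d\n' "$1" "${2#0}" "${3#0}"
}

version_scheme_sketch fbgemm_gpu_nightly
version_scheme_sketch fbgemm_gpu_test
version_scheme_sketch fbgemm_gpu
nightly_version_sketch 2023 03 05
```

Because `nightly` is tested first, a package name containing both `nightly` and `test` would get a date-based version.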