[ci] [python-package] check distributions with pydistcheck #5838

Merged (12 commits) on May 4, 2023
31 changes: 31 additions & 0 deletions .ci/check_python_dists.sh
@@ -17,4 +17,35 @@ if { test "${TASK}" = "bdist" || test "${METHOD}" = "wheel"; }; then
    check-wheel-contents ${DIST_DIR}/*.whl || exit -1
fi

PY_MINOR_VER=$(python -c "import sys; print(sys.version_info.minor)")
if [ $PY_MINOR_VER -gt 7 ]; then
    echo "pydistcheck..."
    pip install pydistcheck
    if [ $TASK == "CUDA" ] && [ $METHOD == "wheel" ]; then
        pydistcheck \
            --inspect \
            --ignore 'compiled-objects-have-debug-symbols,max-allowed-size-compressed' \
            --max-allowed-size-uncompressed '60M' \
            --max-allowed-files 800 \
            ${DIST_DIR}/* || exit -1
Collaborator Author

@jameslamb jameslamb Apr 21, 2023

This if branch accounts for the fact that the CUDA 11.x wheel currently has the following warnings from pydistcheck:

1. [compiled-objects-have-debug-symbols] Found compiled object containing debug symbols. For details, extract the distribution contents and run 'nm --debug-syms "lightgbm/lib_lightgbm.so"'.
2. [distro-too-large-compressed] Compressed size 35.1M is larger than the allowed size (5.0M).
3. [distro-too-large-uncompressed] Uncompressed size 54.8M is larger than the allowed size (15.0M).
errors found while checking: 3

(build link)

I'm not sure why the CUDA build is so much larger than the non-CUDA one, or where those debug symbols are coming from. Maybe we are statically linking to CUDA libraries? Maybe the debug symbols come from the use of the -lineinfo compiler flag?

set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -O3 -lineinfo")

Either way, I'm proposing that we not spend time investigating that right now, and instead just add this check so we can detect whether future changes in this project make the wheel significantly larger (which could be a problem in storage-sensitive environments like AWS Lambda).
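
If someone does pick up that investigation later, a starting point might look like the sketch below. It unpacks a wheel and runs the same `nm --debug-syms` command that the pydistcheck warning suggests against every shared object inside. The function name `list_wheel_debug_syms` is hypothetical and not part of this PR, and the sketch assumes `python` and GNU binutils' `nm` are on `PATH`.

```shell
# Hypothetical helper (not part of this PR): unpack a wheel and list the
# debug symbols found in every shared object it contains.
# Assumes 'python' and GNU binutils' 'nm' are available on PATH.
list_wheel_debug_syms() {
    wheel_path="$1"
    if [ ! -f "${wheel_path}" ]; then
        echo "no such wheel: ${wheel_path}" >&2
        return 1
    fi
    extract_dir="$(mktemp -d)"
    # a wheel is a zip archive, so the standard-library zipfile CLI can unpack it
    python -m zipfile -e "${wheel_path}" "${extract_dir}" || return 1
    # same command the pydistcheck warning suggests, applied to each .so
    find "${extract_dir}" -name '*.so' -exec nm --debug-syms {} \;
    rm -rf "${extract_dir}"
}
```

For example, `list_wheel_debug_syms "${DIST_DIR}"/*.whl | head` would show the first few symbols, which may hint at whether they originate from CUDA objects.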

    elif [ $ARCH == "aarch64" ]; then
        pydistcheck \
            --inspect \
            --ignore 'compiled-objects-have-debug-symbols' \
            --max-allowed-size-compressed '5M' \
            --max-allowed-size-uncompressed '15M' \
            --max-allowed-files 800 \
            ${DIST_DIR}/* || exit -1
Collaborator Author

The aarch64 wheel we produce triggers one error from pydistcheck:

1. [compiled-objects-have-debug-symbols] Found compiled object containing debug symbols. For details, extract the distribution contents and run 'nm --debug-syms "lightgbm/lib_lightgbm.so"'.
errors found while checking: 1

(build link)

I'm not sure where those are coming from, but I'm proposing ignoring them for now. The aarch64 linux integrated OpenCL wheel has a very similar compressed size (2.3MB) to the arm64 linux wheels (1.6M).
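
To compare compressed sizes like these locally, the on-disk size of each distribution file is enough, since a wheel is already a compressed archive. A minimal sketch follows; the `dist_sizes` helper is hypothetical and simply prints byte counts for every file in a directory.

```shell
# Hypothetical helper (not part of this PR): print "<bytes> <filename>"
# for each regular file in a directory of built distributions.
dist_sizes() {
    dist_dir="$1"
    for f in "${dist_dir}"/*; do
        # skip directories and the unexpanded glob of an empty directory
        [ -f "${f}" ] || continue
        # arithmetic expansion strips the padding some 'wc' implementations add
        size_bytes=$(( $(wc -c < "${f}") ))
        echo "${size_bytes} $(basename "${f}")"
    done
}
```

Running `dist_sizes "${DIST_DIR}"` after each build would give numbers that can be compared directly across platforms.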

    else
        pydistcheck \
            --inspect \
            --max-allowed-size-compressed '5M' \
            --max-allowed-size-uncompressed '15M' \
            --max-allowed-files 800 \
            ${DIST_DIR}/* || exit -1
    fi
else
    echo "skipping pydistcheck (does not support Python 3.${PY_MINOR_VER})"
fi

echo "done checking Python package distributions"
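
The three `pydistcheck` calls in this diff differ only in which checks are ignored and which size limits apply. One possible follow-up, sketched here and not part of this PR, is to compute the options once and invoke `pydistcheck` in a single place. The `pydistcheck_args` function is hypothetical; the options and values it prints are copied from the branches in the diff.

```shell
# Hypothetical helper (not part of this PR): print the pydistcheck options
# for a given TASK / METHOD / ARCH combination, mirroring the three branches
# in the diff. Note: plain sh has no 'local', so these variables are global.
pydistcheck_args() {
    task="$1"
    method="$2"
    arch="$3"
    if [ "${task}" = "CUDA" ] && [ "${method}" = "wheel" ]; then
        ignore="compiled-objects-have-debug-symbols,max-allowed-size-compressed"
        max_compressed=""
        max_uncompressed="60M"
    elif [ "${arch}" = "aarch64" ]; then
        ignore="compiled-objects-have-debug-symbols"
        max_compressed="5M"
        max_uncompressed="15M"
    else
        ignore=""
        max_compressed="5M"
        max_uncompressed="15M"
    fi
    args="--inspect"
    if [ -n "${ignore}" ]; then
        args="${args} --ignore ${ignore}"
    fi
    if [ -n "${max_compressed}" ]; then
        args="${args} --max-allowed-size-compressed ${max_compressed}"
    fi
    args="${args} --max-allowed-size-uncompressed ${max_uncompressed}"
    args="${args} --max-allowed-files 800"
    echo "${args}"
}
```

The call site could then shrink to something like `pydistcheck $(pydistcheck_args "${TASK}" "${METHOD}" "${ARCH}") ${DIST_DIR}/* || exit 1`, relying on word splitting of the unquoted command substitution. Quoting the variables also avoids a `[` syntax error if `TASK`, `METHOD`, or `ARCH` is unset.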