Contributes to rapidsai/build-planning#57.

Similar to rapidsai/ucxx#226, proposes using the new UCX wheels from https://github.com/rapidsai/ucx-wheels, instead of vendoring system versions of `libuc{m,p,s,t}.so`.

## Benefits of these changes

Allows users of `ucx-py` to avoid needing system installations of the UCX libraries.

Shrinks the `ucx-py` wheels by 6.7 MB compressed (77%) and 19.1 MB uncompressed (73%).

<details><summary>how I calculated that (click me)</summary>

Mounting in a directory with a wheel built from this branch...

```shell
docker run \
    --rm \
    -v $(pwd)/final_dist:/opt/work \
    -it python:3.10 \
    bash

pip install pydistcheck
pydistcheck --inspect /opt/work/*.whl
```

```text
----- package inspection summary -----
file size
  * compressed size: 2.0M
  * uncompressed size: 7.0M
  * compression space saving: 71.3%
contents
  * directories: 10
  * files: 38 (2 compiled)
size by extension
  * .so - 6.9M (97.7%)
  * .py - 0.1M (2.0%)
  * .pyx - 9.3K (0.1%)
  * no-extension - 7.1K (0.1%)
  * .pyi - 3.9K (0.1%)
  * .c - 1.7K (0.0%)
  * .txt - 39.0B (0.0%)
largest files
  * (5.3M) ucp/_libs/ucx_api.cpython-310-x86_64-linux-gnu.so
  * (1.6M) ucp/_libs/arr.cpython-310-x86_64-linux-gnu.so
  * (36.3K) ucp/core.py
  * (20.3K) ucp/benchmarks/cudf_merge.py
  * (12.1K) ucp/benchmarks/send_recv.py
```

Compared to a recent nightly release.

```shell
pip download \
    -d /tmp/delete-me \
    --prefer-binary \
    --extra-index-url=https://pypi.anaconda.org/rapidsai-wheels-nightly/simple \
    'ucx-py-cu12>=0.38.0a'

pydistcheck --inspect /tmp/delete-me/*.whl
```

```text
----- package inspection summary -----
file size
  * compressed size: 8.7M
  * uncompressed size: 26.1M
  * compression space saving: 66.8%
contents
  * directories: 11
  * files: 65 (21 compiled)
size by extension
  * .0 - 14.4M (55.4%)
  * .so - 8.4M (32.2%)
  * .a - 1.8M (6.7%)
  * .140 - 0.7M (2.5%)
  * .12 - 0.7M (2.5%)
  * .py - 0.1M (0.5%)
  * .pyx - 9.3K (0.0%)
  * no-extension - 7.3K (0.0%)
  * .la - 4.2K (0.0%)
  * .pyi - 3.9K (0.0%)
  * .c - 1.7K (0.0%)
  * .txt - 39.0B (0.0%)
largest files
  * (8.7M) ucx_py_cu12.libs/libucp-5720f0c9.so.0.0.0
  * (5.3M) ucp/_libs/ucx_api.cpython-310-x86_64-linux-gnu.so
  * (2.0M) ucx_py_cu12.libs/libucs-3c3009f0.so.0.0.0
  * (1.6M) ucp/_libs/arr.cpython-310-x86_64-linux-gnu.so
  * (1.5M) ucx_py_cu12.libs/libuct-2a15b69b.so.0.0.0
```

</details>

## Notes for Reviewers

Left some comments on the diff describing specific design choices.

### The libraries from the `libucx` wheel are only used if a system installation isn't available

Built a wheel in a container using the same image used here in CI.

```shell
docker run \
    --rm \
    --gpus 1 \
    --env-file "${HOME}/.aws/creds.env" \
    --env CI=true \
    -v $(pwd):/opt/work \
    -w /opt/work \
    -it rapidsai/ci-wheel:cuda12.2.2-rockylinux8-py3.10 \
    bash

ci/build_wheel.sh
```

Found that the libraries from the `libucx` wheel are correctly found at build time, and are later found at import time.

<details><summary>using 'rapidsai/citestwheel' image and LD_DEBUG (click me)</summary>

```shell
# run a RAPIDS wheel-testing container, mount in the directory with the built wheel
docker run \
    --rm \
    --gpus 1 \
    -v $(pwd)/final_dist:/opt/work \
    -w /opt/work \
    -it rapidsai/citestwheel:cuda12.2.2-ubuntu22.04-py3.10 \
    bash
```

`rapidsai/citestwheel` does NOT have the UCX libraries installed at `/usr/lib*`.

```shell
find /usr -name 'libucm.so*'
# (empty)
```

Installed the `ucx-py` wheel.

```shell
# install the wheel
pip install ./*.whl

# now libuc{m,p,s,t} are found in site-packages
find /usr -name 'libucm.so*'
# (empty)

find /pyenv -name 'libucm.so*'
# /pyenv/versions/3.10.14/lib/python3.10/site-packages/libucx/lib/libucm.so.0.0.0
# /pyenv/versions/3.10.14/lib/python3.10/site-packages/libucx/lib/libucm.so.0
# /pyenv/versions/3.10.14/lib/python3.10/site-packages/libucx/lib/libucm.so

# try importing ucx-py and track where 'ld' finds the ucx libraries
LD_DEBUG="files,libs" LD_DEBUG_OUTPUT=out.txt \
    python -c "from ucp._libs import arr"

# 'ld' creates multiple files... combine them to 1 for easier searching
cat out.txt.* > out-full.txt
```

In that output, saw that `ld` was finding `libucs.so` first. It searched all the system paths before finally finding it in the `libucx` wheel.

```text
1037: file=libucs.so [0]; dynamically loaded by /pyenv/versions/3.10.14/lib/python3.10/lib-dynload/_ctypes.cpython-310-x86_64-linux-gnu.so [0]
1037: find library=libucs.so [0]; searching
1037: search path= (LD_LIBRARY_PATH)
1037: search path=/pyenv/versions/3.10.14/lib (RUNPATH from file /pyenv/versions/3.10.14/bin/python)
1037: trying file=/pyenv/versions/3.10.14/lib/libucs.so
1037: search cache=/etc/ld.so.cache
1037: search path=/lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v3:/lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v2:/lib/x86_64-linux-gnu/tls/haswell/x86_64:/lib/x86_64-linux-gnu/tls/haswell:/lib/x86_64-linux-gnu/tls/x86_64:/lib/x86_64-linux-gnu/tls:/lib/x86_64-linux-gnu/haswell/x86_64:/lib/x86_64-linux-gnu/haswell:/lib/x86_64-linux-gnu/x86_64:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v3:/usr/lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v2:/usr/lib/x86_64-linux-gnu/tls/haswell/x86_64:/usr/lib/x86_64-linux-gnu/tls/haswell:/usr/lib/x86_64-linux-gnu/tls/x86_64:/usr/lib/x86_64-linux-gnu/tls:/usr/lib/x86_64-linux-gnu/haswell/x86_64:/usr/lib/x86_64-linux-gnu/haswell:/usr/lib/x86_64-linux-gnu/x86_64:/usr/lib/x86_64-linux-gnu:/lib/glibc-hwcaps/x86-64-v3:/lib/glibc-hwcaps/x86-64-v2:/lib/tls/haswell/x86_64:/lib/tls/haswell:/lib/tls/x86_64:/lib/tls:/lib/haswell/x86_64:/lib/haswell:/lib/x86_64:/lib:/usr/lib/glibc-hwcaps/x86-64-v3:/usr/lib/glibc-hwcaps/x86-64-v2:/usr/lib/tls/haswell/x86_64:/usr/lib/tls/haswell:/usr/lib/tls/x86_64:/usr/lib/tls:/usr/lib/haswell/x86_64:/usr/lib/haswell:/usr/lib/x86_64:/usr/lib (system search path)
1037: trying file=/lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v3/libucs.so
1037: trying file=/lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v2/libucs.so
1037: trying file=/lib/x86_64-linux-gnu/tls/haswell/x86_64/libucs.so
1037: trying file=/lib/x86_64-linux-gnu/tls/haswell/libucs.so
1037: trying file=/lib/x86_64-linux-gnu/tls/x86_64/libucs.so
1037: trying file=/lib/x86_64-linux-gnu/tls/libucs.so
1037: trying file=/lib/x86_64-linux-gnu/haswell/x86_64/libucs.so
1037: trying file=/lib/x86_64-linux-gnu/haswell/libucs.so
1037: trying file=/lib/x86_64-linux-gnu/x86_64/libucs.so
1037: trying file=/lib/x86_64-linux-gnu/libucs.so
1037: trying file=/usr/lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v3/libucs.so
1037: trying file=/usr/lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v2/libucs.so
1037: trying file=/usr/lib/x86_64-linux-gnu/tls/haswell/x86_64/libucs.so
1037: trying file=/usr/lib/x86_64-linux-gnu/tls/haswell/libucs.so
1037: trying file=/usr/lib/x86_64-linux-gnu/tls/x86_64/libucs.so
1037: trying file=/usr/lib/x86_64-linux-gnu/tls/libucs.so
1037: trying file=/usr/lib/x86_64-linux-gnu/haswell/x86_64/libucs.so
1037: trying file=/usr/lib/x86_64-linux-gnu/haswell/libucs.so
1037: trying file=/usr/lib/x86_64-linux-gnu/x86_64/libucs.so
1037: trying file=/usr/lib/x86_64-linux-gnu/libucs.so
1037: trying file=/lib/glibc-hwcaps/x86-64-v3/libucs.so
1037: trying file=/lib/glibc-hwcaps/x86-64-v2/libucs.so
1037: trying file=/lib/tls/haswell/x86_64/libucs.so
1037: trying file=/lib/tls/haswell/libucs.so
1037: trying file=/lib/tls/x86_64/libucs.so
1037: trying file=/lib/tls/libucs.so
1037: trying file=/lib/haswell/x86_64/libucs.so
1037: trying file=/lib/haswell/libucs.so
1037: trying file=/lib/x86_64/libucs.so
1037: trying file=/lib/libucs.so
1037: trying file=/usr/lib/glibc-hwcaps/x86-64-v3/libucs.so
1037: trying file=/usr/lib/glibc-hwcaps/x86-64-v2/libucs.so
1037: trying file=/usr/lib/tls/haswell/x86_64/libucs.so
1037: trying file=/usr/lib/tls/haswell/libucs.so
1037: trying file=/usr/lib/tls/x86_64/libucs.so
1037: trying file=/usr/lib/tls/libucs.so
1037: trying file=/usr/lib/haswell/x86_64/libucs.so
1037: trying file=/usr/lib/haswell/libucs.so
1037: trying file=/usr/lib/x86_64/libucs.so
1037: trying file=/usr/lib/libucs.so
1037:
1037: file=/pyenv/versions/3.10.14/lib/python3.10/site-packages/libucx/lib/libucs.so [0]; dynamically loaded by /pyenv/versions/3.10.14/lib/python3.10/lib-dynload/_ctypes.cpython-310-x86_64-linux-gnu.so [0]
1037: file=/pyenv/versions/3.10.14/lib/python3.10/site-packages/libucx/lib/libucs.so [0]; generating link map
1037: dynamic: 0x00007f4ce42d7c80 base: 0x00007f4ce427e000 size: 0x000000000006fda0
1037: entry: 0x00007f4ce4290ce0 phdr: 0x00007f4ce427e040 phnum: 1
```

Then the others were found via the RPATH entries on `libucs.so`.

`libucm.so.0`:

```text
196: file=libucm.so.0 [0]; needed by /pyenv/versions/3.10.14/lib/python3.10/site-packages/libucx/lib/libucs.so [0]
196: find library=libucm.so.0 [0]; searching
196: search path=...redacted...:/pyenv/versions/3.10.14/lib/python3.10/site-packages/libucx/lib (RPATH from file /pyenv/versions/3.10.14/lib/python3.10/site-packages/libucx/lib/libucs.so)
...
```

</details>

However, the libraries from the `libucx` wheel appear to be **the last place `ld` searches**. That means that if you use these wheels on a system with a system installation of `libuc{m,p,s,t}`, that system installation's libraries will be loaded instead.

<details><summary>using 'rapidsai/ci-wheel' image and LD_DEBUG (click me)</summary>

```shell
docker run \
    --rm \
    --gpus 1 \
    -v $(pwd)/final_dist:/opt/work \
    -w /opt/work \
    -it rapidsai/ci-wheel:cuda12.2.2-rockylinux8-py3.10 \
    bash
```

`rapidsai/ci-wheel` has the UCX libraries installed at `/usr/lib64`.

```shell
find /usr/ -name 'libucm.so*'
# /usr/lib64/libucm.so.0.0.0
# /usr/lib64/libucm.so.0
# /usr/lib64/libucm.so
```

Installed a wheel and tried to import from it.

```shell
pip install ./*.whl

LD_DEBUG="files,libs" LD_DEBUG_OUTPUT=out.txt \
    python -c "from ucp._libs import arr"

cat out.txt.* > out-full.txt
```

In that situation, I saw the system libraries found before the one from the wheel.

```text
226: file=libucs.so [0]; dynamically loaded by /pyenv/versions/3.10.14/lib/python3.10/lib-dynload/_ctypes.cpython-310-x86_64-linux-gnu.so [0]
226: find library=libucs.so [0]; searching
226: search path=/pyenv/versions/3.10.14/lib (RPATH from file /pyenv/versions/3.10.14/bin/python)
226: trying file=/pyenv/versions/3.10.14/lib/libucs.so
226: search path=/pyenv/versions/3.10.14/lib (RPATH from file /pyenv/versions/3.10.14/bin/python)
226: trying file=/pyenv/versions/3.10.14/lib/libucs.so
226: search path=/opt/rh/gcc-toolset-11/root/usr/lib64/tls:/opt/rh/gcc-toolset-11/root/usr/lib64:/opt/rh/gcc-toolset-11/root/usr/lib (LD_LIBRARY_PATH)
226: trying file=/opt/rh/gcc-toolset-11/root/usr/lib64/tls/libucs.so
226: trying file=/opt/rh/gcc-toolset-11/root/usr/lib64/libucs.so
226: trying file=/opt/rh/gcc-toolset-11/root/usr/lib/libucs.so
226: search cache=/etc/ld.so.cache
226: trying file=/usr/lib64/libucs.so
```

In this case, when the system libraries are available, `site-packages/libucx/lib` isn't even searched.

</details>

To avoid any RAPIDS-specific stuff tricking me, I tried in a generic `python:3.10` image. Found that the library could be loaded and all the `libuc{m,p,s,t}` libraries from the `libucx` wheel are found 🎉.

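The LD_DEBUG checks in this section were done by reading the combined log by hand. A minimal sketch of automating that check follows, assuming the `generating link map` line format shown in the `rapidsai/citestwheel` log. The names `LINK_MAP`, `resolved_ucx_libs`, and `origin` are hypothetical helpers written for this illustration and are not part of `ucx-py` or `libucx`.

```python
import re

# Pattern for the loader's "generating link map" line, as it appears in the
# LD_DEBUG output quoted in this message. The wheel's directory name 'libucx'
# deliberately does not match 'libuc[mpst]'.
LINK_MAP = re.compile(
    r"file=(?P<path>/\S*?(?P<name>libuc[mpst])\.so\S*) \[\d+\];\s+generating link map"
)


def resolved_ucx_libs(ld_debug_text):
    """Map each UCX library name to the first path the loader mapped it from."""
    found = {}
    for line in ld_debug_text.splitlines():
        match = LINK_MAP.search(line)
        if match:
            found.setdefault(match.group("name"), match.group("path"))
    return found


def origin(path):
    """Label a resolved path: inside the libucx wheel, or anywhere else.

    Assumption: anything outside site-packages/libucx is treated as 'system'.
    """
    return "libucx wheel" if "/site-packages/libucx/" in path else "system"


# Three lines copied from the citestwheel log in this message.
SAMPLE = """\
1037: file=libucs.so [0]; dynamically loaded by /pyenv/versions/3.10.14/lib/python3.10/lib-dynload/_ctypes.cpython-310-x86_64-linux-gnu.so [0]
1037: trying file=/usr/lib/libucs.so
1037: file=/pyenv/versions/3.10.14/lib/python3.10/site-packages/libucx/lib/libucs.so [0]; generating link map
"""

for name, path in resolved_ucx_libs(SAMPLE).items():
    print(f"{name}: {path} ({origin(path)})")
```

To run it against a real log, replace `SAMPLE` with the contents of `out-full.txt`. Only lines containing `generating link map` count as a successful load, so the many `trying file=` lines for paths that do not exist are ignored.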
<details><summary>using 'python:3.10' image (click me)</summary>

```shell
docker run \
    --rm \
    --gpus 1 \
    -v $(pwd)/final_dist:/opt/work \
    -w /opt/work \
    -it python:3.10 \
    bash

pip install \
    --extra-index-url=https://pypi.anaconda.org/rapidsai-wheels-nightly/simple \
    ./*.whl

LD_DEBUG="files,libs" LD_DEBUG_OUTPUT=out.txt \
    python -c "from ucp._libs import arr"
```

💥

```text
16: opening file=/usr/local/lib/python3.10/site-packages/libucx/lib/libucm.so.0 [0]; direct_opencount=1
16:
16: opening file=/usr/local/lib/python3.10/site-packages/libucx/lib/libucs.so [0]; direct_opencount=1
```

</details>

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Peter Andreas Entschev (https://github.com/pentschev)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Ray Douglass (https://github.com/raydouglass)

URL: #1041