[ci] simplify CI configurations, parallelize compilation, test CUDA on Ubuntu 22.04 #6458
Merged
jameslamb changed the title from "WIP: [ci] reduce duplication of LGB_VER across CI configs" to "WIP: [ci] simplify some CI configurations" (May 17, 2024)
jameslamb changed the title from "WIP: [ci] simplify some CI configurations" to "WIP: [ci] simplify CI configurations, parallelize compilation for more builds" (May 18, 2024)
jameslamb changed the title from "WIP: [ci] simplify CI configurations, parallelize compilation for more builds" to "WIP: [ci] simplify CI configurations, parallelize compilation, test CUDA on Ubuntu 22.04" (May 18, 2024)
jameslamb changed the title from "WIP: [ci] simplify CI configurations, parallelize compilation, test CUDA on Ubuntu 22.04" to "[ci] simplify CI configurations, parallelize compilation, test CUDA on Ubuntu 22.04" (May 18, 2024)
jameslamb requested review from guolinke, shiyu1994, jmoralez and borchero as code owners (May 18, 2024 07:06)
borchero approved these changes (May 21, 2024)
Thank you for spending so much effort to improve the CI here @jameslamb! 🙏🏼
Sure, happy to do it! Thanks for all the reviews!
This was referenced May 23, 2024
Proposes the following changes to the CI setup:
In the CUDA jobs:
- [...] `nvidia-docker` and restart the docker daemon" on every CUDA build
- [...] `docker run` in a `script:` block
- [...] `/tmp` instead of a directory that's mounted in from the self-hosted runner
- [...] `/home/github/miniforge` already exists" on the next build
- [...] `actions/checkout` from `v1` to `v3`
- [...] `v4` yet because GLIBC in the container images used in this job isn't new enough
On most of the CI jobs:
- [...] `GITHUB_ACTIONS=true` [...] `GITHUB_ACTIONS` anyway (docs)
- [...] `VERSION.txt`" into the 2 CI scripts that need it, instead of having it defined as inline shell code across most of the CI configs
- [...] `CMAKE_BUILD_PARALLEL_LEVEL=4` environment variable (see Notes)
If any of these generate a lot of discussion, I'll split this up into smaller PRs. But I thought the sum total was small enough to do as a single PR.
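As a rough illustration of the `VERSION.txt` item above, a CI script could read the version itself rather than relying on inline shell code in each CI config. This is only a sketch: `read_version` is a hypothetical helper name, and the assumption that the version sits on the first line of `VERSION.txt` is mine, not something confirmed by this PR.

```shell
#!/bin/sh
# read_version is a hypothetical helper, not taken from the LightGBM repo.
# It prints the first line of a version file (default: VERSION.txt in the
# current directory) and fails if the file does not exist.
read_version() {
    version_file="${1:-VERSION.txt}"
    if [ ! -f "${version_file}" ]; then
        echo "version file not found: ${version_file}" >&2
        return 1
    fi
    head -n 1 "${version_file}"
}

# Usage inside a CI script, assuming it runs from the repository root:
#   LGB_VER="$(read_version)"
```

Keeping the lookup inside the scripts that need it means the CI configs no longer have to define `LGB_VER` themselves.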
Notes for Reviewers
Why set `CMAKE_BUILD_PARALLEL_LEVEL`?
This environment variable is the equivalent of passing e.g. `-j4` to `cmake --build` or `make`. It tells the build tool (Ninja, in most of our builds here) to compile multiple objects at a time.
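A minimal sketch of how the variable takes effect, assuming CMake 3.12 or newer (the first release that reads it). `build_with_cmake` is a hypothetical helper used only for illustration and does not exist in the CI scripts.

```shell
#!/bin/sh
# Export once so that every child process (cmake itself, or a wrapper script
# that eventually calls `cmake --build`) inherits the setting.
export CMAKE_BUILD_PARALLEL_LEVEL=4

# build_with_cmake is a hypothetical helper for illustration only.
# "$1" is an already-configured CMake build directory.
build_with_cmake() {
    # No -j / --parallel flag here: CMake falls back to
    # CMAKE_BUILD_PARALLEL_LEVEL for the number of parallel jobs.
    cmake --build "$1"
}
```

Because the value is exported, it should also reach builds that never call `cmake` directly from the CI config, as long as something further down eventually runs `cmake --build`.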
We set that in builds that separately invoke `cmake`, like here:
LightGBM/.ci/test.sh, Line 56 in 3e9ab53
But currently any builds that are just running `sh build-python.sh` or `Rscript build_r.R` are performing serial compilation. Setting this to a value greater than `1` should speed up builds.
I chose `4` because we're already using `-j4` in lots of places, and it seems to be working well.
References:
- `scikit-build-core` docs recommending this (link)
- [...] `CMAKE_BUILD_PARALLEL_LEVEL` (link)
Why update Ubuntu versions?
It helps with the GitHub Actions Node 16/20 situation: #6453 (comment).
But more importantly, I think it's more likely to match the set of operating systems and library versions that `lightgbm` users are using in their environments.
Ubuntu 22.04 has been available for 2 years (Ubuntu release history) and all of RAPIDS CI uses Ubuntu 20.04 and 22.04:
https://github.com/rapidsai/shared-workflows/blob/19d17957e59cf81574f214e043adf8cff7db9447/.github/workflows/wheels-test.yaml#L81-L85
Other References
Some related PRs explaining the history of the CUDA jobs: