Evaluate fanout for rust-lang/rust CI #175

Open
2 of 13 tasks
MarcoIeni opened this issue Oct 25, 2024 · 0 comments
MarcoIeni commented Oct 25, 2024

To optimize our CI and move work off the large runners, we want to evaluate how much fanout would help.

Tasks

Is fanout worth it?

Explain how much time we could save by using fanout instead of building the stage 1 compiler for every target:

  • Understand which platforms we can fan out.
  • Understand whether fanout is worth it.
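
A back-of-the-envelope way to frame the "is it worth it" question, as a minimal sketch. It assumes that one stage 1 build can be shared by every job in a group, and that the only overhead is one upload plus one download per job. The function names and all numbers are hypothetical, not measurements from rust-lang/rust CI.

```python
def runner_minutes_saved(num_jobs, stage1_min, upload_min, download_min):
    """Runner-minutes saved when stage 1 is built once and shared by num_jobs jobs."""
    # Today: every job builds its own stage 1 compiler.
    without_fanout = num_jobs * stage1_min
    # Fanout: one build and one upload, then one download per job.
    with_fanout = stage1_min + upload_min + num_jobs * download_min
    return without_fanout - with_fanout


def added_latency_minutes(upload_min, download_min):
    """Extra wall-clock time a downstream job waits compared to building locally."""
    return upload_min + download_min


if __name__ == "__main__":
    # Hypothetical inputs: 10 jobs, 20 min stage 1 build, 2 min upload, 1 min download.
    print(runner_minutes_saved(10, 20, 2, 1))
    print(added_latency_minutes(2, 1))
```

Under this model fanout reduces total runner time but lengthens each job's wall-clock time by the transfer overhead, so both numbers need to be measured before deciding.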

Implementation

  • Understand what the stage 1 artifacts are: what do I need to cache, and when?
    • Maybe they are in build/$host/stage$stage (see the compile.rs file of bootstrap). Host should be the target_triple.
  • Understand where I need to download these artifacts.
  • Understand when the first CI job should stop
  • Understand when the first CI job should start
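
The cache-and-restore step could be prototyped roughly like this. This is only a sketch: it assumes the guess above holds and that build/$host/stage$stage contains everything a downstream job needs, which is still an open question. The helper names (`stage_dir`, `pack_stage`, `unpack_stage`) are hypothetical and are not part of bootstrap.

```python
import tarfile
from pathlib import Path


def stage_dir(build_root, host, stage):
    """Return build/$host/stage$stage for the given host triple and stage number."""
    return Path(build_root) / host / f"stage{stage}"


def pack_stage(build_root, host, stage, archive_path):
    """Archive the stage directory so the first job can upload it."""
    src = stage_dir(build_root, host, stage)
    if not src.is_dir():
        raise FileNotFoundError(f"stage directory not found: {src}")
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # Store paths relative to build/$host so the archive can be moved.
        tar.add(str(src), arcname=src.name)
    return Path(archive_path)


def unpack_stage(archive_path, build_root, host):
    """Restore the archived stage directory under build/$host in a later job."""
    dest = Path(build_root) / host
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive_path), "r:gz") as tar:
        tar.extractall(str(dest))
    return dest
```

The first job would call `pack_stage` once stage 1 is built, and each fanned-out job would call `unpack_stage` before running its own steps. Where the archive is stored in between (S3, GitHub artifacts, or something else) is deliberately left out.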

Ideas 💡

  • Right now we have no way to tell how long each stage takes. We could upload per-stage CI metrics to Datadog or S3 and analyze them. It would be nice to also include CPU utilization, to understand whether a bigger machine helps a certain step or whether it's worth parallelizing some operations (if possible).
    • Identify "checkpoints" at which to send metrics.
    • Check whether metrics.json is present (https://ci-artifacts.rust-lang.org/rustc-builds-alt/...). If not, ask Jakub.
  • Is it possible to avoid building LLVM on Windows? Discussion in "CI: unset NO_DOWNLOAD_CI_LLVM for 2 windows jobs" (rust#132781).
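
The checkpoint idea for metrics could look something like the sketch below. It records wall-clock time and a rough CPU utilization figure per named step and serializes the result as JSON. The `checkpoint` helper and the record field names are hypothetical, and this is not the format of bootstrap's metrics.json. Note that `os.times()` only counts child processes that have already been waited for.

```python
import json
import os
import time
from contextlib import contextmanager


@contextmanager
def checkpoint(name, records):
    """Time a named CI step and append one metrics record to `records`."""
    wall_start = time.perf_counter()
    cpu_start = os.times()
    try:
        yield
    finally:
        wall = time.perf_counter() - wall_start
        cpu_end = os.times()
        # CPU seconds used by this process and its waited-for children.
        cpu = sum(cpu_end[i] - cpu_start[i] for i in range(4))
        cores = os.cpu_count() or 1
        records.append({
            "name": name,
            "wall_seconds": wall,
            "cpu_seconds": cpu,
            # Fraction of the machine's total CPU capacity that was used.
            "cpu_utilization": cpu / (wall * cores) if wall > 0 else 0.0,
        })


if __name__ == "__main__":
    records = []
    with checkpoint("example-step", records):
        sum(i * i for i in range(1_000_000))
    # The JSON could then be uploaded to S3 or Datadog by a separate step.
    print(json.dumps(records, indent=2))
```

A step with low `cpu_utilization` on a large runner would be a candidate for a smaller machine, while one close to 1.0 might benefit from more cores.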

OT optimizations

Random optimizations I found while studying this that might be worth looking at later.

  • We could have one GitHub workflow with a single job that builds the stage 1 and stage 2 compilers, followed by several test jobs that depend on it (via needs). That is, we build the stage 2 compiler once and then split the tests across separate jobs.

Questions

  • Do we always run all CI jobs, or just the CI jobs for the affected components? I.e. can we skip tests if the code change doesn't affect them? E.g. if the change only touches comments, do we need to run the tests across every OS and target?
    • answer: compiling the compiler takes the most time, so this is not worth working on.
  • We build the docker image every time. How long does it take? Is caching working? https://github.com/MarcoIeni/rust/blob/a9d17627d241645a54c1134a20f1596127fedb60/src/ci/docker/run.sh#L93
    • answer: by enabling timestamps in github actions I found it doesn't take long to build the docker container (2 min)

Docs 📚

@MarcoIeni MarcoIeni self-assigned this Oct 25, 2024