Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use and pass jobserver instead of limiting cores directly #1338

Merged
merged 7 commits into from
Oct 26, 2023

Conversation

passcod
Copy link
Contributor

@passcod passcod commented Oct 14, 2023

Currently cargo-pgrx uses both rayon and some manual "max(1, cores / 3)" logic to use multiple cores without overwhelming the system.

This PR ditches these two mechanisms in favour of the jobserver concept, implemented by the jobslot crate. That is an implementation of the make jobserver protocol.

Basically a jobserver controls access to a limited pool of job slots: to do some compute, you grab a slot, blocking until one becomes available if necessary. A process in the jobserver context passes down the jobserver handle to subprocesses, who, if they're aware (like make is), all thus cooperate together.

In this PR, cargo-pgrx:

  • passes the jobserver handle to all calls to make
  • grabs a slot for all compute-bound steps (ie the untar and configure steps)
  • doesn't grab a slot for network-bound steps (ie the download step)
  • starts a thread per pg_config resolution task

This naturally and granularly distributes tasks across all available cores, without restricting each make to a fixed amount of jobs. Notably, if eg building pg12 and pg13, and pg12 finishes completely, the make for pg13 may scale up to use the freed-up cores.


Before (4 cores, notice how it doesn't even start downloading/untarring pg16 until (off-screen) one of the compiles is fully done):
Screenshot from 2023-10-14 23-14-51

After (4 cores, notice how there's first 5 downloads and then it restricts itself to 4 compute tasks at a time, interleaving as necessary):
Screenshot from 2023-10-14 23-17-09

Copy link
Member

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only scanned this and am not very familiar with jobserver, but this looks like a good improvement.

@eeeebbbbrrrr
Copy link
Contributor

I know it’s WIP but looks like CI wants a “cargo fmt” and I’d want the readme updated before we merge.

This is neat.

@passcod passcod marked this pull request as ready for review October 20, 2023 22:52
@workingjubilee
Copy link
Member

Conflicted with #1348

@passcod
Copy link
Contributor Author

passcod commented Oct 24, 2023

I'll fix up the conflicts on this and my other PR later today

@workingjubilee
Copy link
Member

No rush!

This one's fun: because we now have the jobserver controlling concurrency,
rayon is actually hindering us, because it's spawning num-cores threads,
and goes from there, when what we want to do is spawn one thread per
pgconfig, and let them fight (block) for the job token pool.
@passcod
Copy link
Contributor Author

passcod commented Oct 25, 2023

Rebased

@workingjubilee workingjubilee merged commit dc0d88d into pgcentralfoundation:develop Oct 26, 2023
9 checks passed
@workingjubilee
Copy link
Member

Thanks!

@passcod passcod deleted the pgrx-jobserver branch October 26, 2023 12:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants