-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for make jobservers #2
Conversation
rayon-core/src/jobserver.rs
Outdated
|
||
lazy_static! { | ||
// We can only call `from_env` once per process | ||
static ref GLOBAL_CLIENT: Option<Client> = unsafe { Client::from_env() }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexcrichton This should probably be moved into the jobserver
crate, so we'll only get one client from the environment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense to me! I'd be fine adding something like Client::global_from_env() -> Option<&'static Client>
rayon-core/src/lib.rs
Outdated
@@ -363,6 +370,16 @@ impl ThreadPoolBuilder { | |||
self.breadth_first | |||
} | |||
|
|||
/// Use a jobserver if one is available | |||
pub fn jobserver(mut self) -> Self { | |||
self.breadth_first = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/breadth_first/jobserver/
rayon-core/src/registry.rs
Outdated
static WORKER_THREAD_STATE: Cell<*const WorkerThread> = | ||
Cell::new(0 as *const WorkerThread) | ||
} | ||
#[thread_local] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW This can be used for profiling, but we can't use this in rustc because it doesn't work on Windows (and possibly some other platforms, I forget)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do use that for thread_local!
on Windows right? It can't be too bugged. Also when are we getting a thread_local!
which accepts constant initializers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No on Windows we use the system thread-local APIs. We tried using #[thread_local]
in the past and it was buggy for a reason I don't remember
rayon-core/src/worker_local.rs
Outdated
panic!("WorkerLocal can only be used on the thread pool it was created on") | ||
} | ||
|
||
// Must have inline(never) so the use of WORKER_THREAD_STATE doesn't escape the crate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this all about? something to do with #[thread_local]
attribute?
rayon-core/src/jobserver.rs
Outdated
|
||
let proxy = Arc::new(Proxy { | ||
thread: GLOBAL_CLIENT.clone().map(|c| { | ||
Mutex::new(c.into_helper_thread(move |token| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit: I find stuff like data2
pretty distracting. I prefer to do:
{
let data = data.clone();
move |token| {
... /* use data in here */ ...
}
}
``
I've changed this to now use callbacks for when threads start and stop doing work. |
I like the callbacks, but I think this should be a PR against https://github.com/rust-lang-nursery/rustc-rayon |
Add support for using a jobserver with Rayon The Rayon changes are here: Zoxc/rayon#2 cc @alexcrichton r? @nikomatsakis
These adaptors consume may many elements before deferring to a base folder's fullness checks, and so they need to be performed manually. For the `filter`s, there's no way to do it manually (rayon-rs#632), so the specialisations just have to be removed. For `fold` and `find_any` this can be done with a `take_while`. This extends the octillion tests to confirm this behaviour. This makes a program like the following slightly slower compared to the direct `consume_iter` without a check, but it's still faster than the non-specialized form. ``` extern crate test; extern crate rayon; use rayon::prelude::*; fn main() { let count = (0..std::u32::MAX) .into_par_iter() .map(test::black_box) .find_any(|_| test::black_box(false)); println!("{:?}", count); } ``` ``` $ hyperfine ./find-original ./find-no-check ./find-check Benchmark #1: ./find-original Time (mean ± σ): 627.6 ms ± 25.7 ms [User: 7.130 s, System: 0.014 s] Range (min … max): 588.4 ms … 656.4 ms 10 runs Benchmark #2: ./find-no-check Time (mean ± σ): 481.5 ms ± 10.8 ms [User: 5.415 s, System: 0.013 s] Range (min … max): 468.9 ms … 498.2 ms 10 runs Benchmark rayon-rs#3: ./find-check Time (mean ± σ): 562.3 ms ± 11.8 ms [User: 6.363 s, System: 0.013 s] Range (min … max): 542.5 ms … 578.2 ms 10 runs ``` (find-original = without specialization, find-no-check = custom `consume_iter` without `take_while`, find-check = this commit)
We have the `done` flag to indicate when the bridged iterator is out of items, but there may still be items left in the parallel deque as well! Threads should check both, in that order, otherwise they may drop out of the bridge party while there's still work to do.
We create a mutable slice to the uninitialized tail when collecting into a `Vec<T>`, but the Rust Reference says it's UB to produce "a reference or `Box<T>` that is dangling, unaligned, or points to an invalid value." [UCG#77] hasn't resolved the validity of references to uninitialized data yet, but we can avoid it by using `MaybeUninit<T>`. [UCG#77]: rust-lang/unsafe-code-guidelines#77
849: Make sure `ParBridge` threads consume all items after `done` r=nikomatsakis a=cuviper We have the `done` flag to indicate when the bridged iterator is out of items, but there may still be items left in the parallel deque as well! Threads should check both, in that order, otherwise they may drop out of the bridge party while there's still work to do. Fixes rayon-rs#848. Co-authored-by: Josh Stone <[email protected]>
852: Use `&mut [MaybeUninit<T>]` in `iter::collect` r=nikomatsakis a=cuviper We create a mutable slice to the uninitialized tail when collecting into a `Vec<T>`, but the Rust Reference says it's UB to produce "a reference or `Box<T>` that is dangling, unaligned, or points to an invalid value." [UCG#77] hasn't resolved the validity of references to uninitialized data yet, but we can avoid it by using `MaybeUninit<T>`. [UCG#77]: rust-lang/unsafe-code-guidelines#77 Fixes rayon-rs#851. Co-authored-by: Josh Stone <[email protected]>
… same thread as the creator of the scope
844: Implement `in_place_scope` r=cuviper a=rocallahan As discussed in rayon-rs#562. Co-authored-by: Robert O'Callahan <[email protected]>
This matches the recent `in_place_scope`, but using `ScopeFifo`. The scope constructors have also been refactored to avoid duplication.
855: Add `in_place_scope_fifo` r=nikomatsakis a=cuviper This matches the recent `in_place_scope`, but using `ScopeFifo`. The scope constructors have also been refactored to avoid duplication. Co-authored-by: Josh Stone <[email protected]>
859: Update demo dependencies r=cuviper a=cuviper Co-authored-by: Josh Stone <[email protected]>
These are like `Iterator::take_while` and `skip_while`, but the `any` names emphasize that they're not a straightforward parallelization, as they does not honor the original iterator order.
1024: Add `ParallelIterator::take_any_while` and `skip_any_while` r=cuviper a=cuviper These are like `Iterator::take_while` and `skip_while`, but the `any` names emphasize that they're not a straightforward parallelization, as they does not honor the original iterator order. Co-authored-by: Josh Stone <[email protected]>
1025: Stop deriving traits on JobRef r=cuviper a=cuviper I don't see any difference in benchmarks from this, and only a slight reduction in code size. If nothing else I think it's more principled that `JobRef` should not implement `Copy` in particular. `Registry::inject` taking a slice of jobs was an over-abstraction that we never used. Co-authored-by: Josh Stone <[email protected]>
This enables more efficient code -- stabilized in rust-lang/rust#91355.
1027: Use const-TLS for the `WorkerThread` pointer (MSRV 1.59) r=cuviper a=cuviper This enables more efficient code -- stabilized in rust-lang/rust#91355. Co-authored-by: Josh Stone <[email protected]>
After rayon-rs#1025, clippy is denying [`fn_address_comparisons`][1], and this is correct that we can't actually guarantee that the pointer we got from `as_job_ref` will be the same that we produce in a later `PartialEq`. It's less worrying that different functions could merge to the same address, because our `StackJob` will still have different data pointers. We can keep `Copy + Eq` off of `JobRef` with an opaque `id() -> impl Eq` instead, so we'll be comparing the exact same captured `fn` pointer. [1]: https://rust-lang.github.io/rust-clippy/master/index.html#fn_address_comparisons
1028: Be careful comparing `job_ref.execute_fn` r=cuviper a=cuviper After rayon-rs#1025, clippy is denying [`fn_address_comparisons`][1], and this is correct that we can't actually guarantee that the pointer we got from `as_job_ref` will be the same that we produce in a later `PartialEq`. It's less worrying that different functions could merge to the same address, because our `StackJob` will still have different data pointers. We can keep `Copy + Eq` off of `JobRef` with an opaque `id() -> impl Eq` instead, so we'll be comparing the exact same captured `fn` pointer. [1]: https://rust-lang.github.io/rust-clippy/master/index.html#fn_address_comparisons Co-authored-by: Josh Stone <[email protected]>
This makes a greater effort to ensure that in all cases where either `DrainProducer` or `SliceDrain` will drop data, there are no references left pointing to that memory. It's not actually clear that this would be a problem anyway, as long as there's nothing accessing that zombie data, but it's not too much trouble to avoid it. Fixes rayon-rs#1029.
1030: Be more cautious about drain drops r=cuviper a=cuviper This makes a greater effort to ensure that in all cases where either `DrainProducer` or `SliceDrain` will drop data, there are no references left pointing to that memory. It's not actually clear that this would be a problem anyway, as long as there's nothing accessing that zombie data, but it's not too much trouble to avoid it. Fixes rayon-rs#1029. Co-authored-by: Josh Stone <[email protected]>
1026: Add `yield_now` and `yield_local` r=cuviper a=cuviper * `yield_now` looks for work anywhere in the thread pool and executes it. * `yield_local` only looks in the thread-local deque (including broadcasts). In either case, they return: * `Some(Yield::Executed)` if work was found and executed. * `Some(Yield::Idle)` if no work was found in their pool/thread. * `None` if the current thread is not part of a thread pool. These do not call `std::thread::yield_now()`, but the docs suggest polling loops might want to add that as a fallback. Co-authored-by: Josh Stone <[email protected]>
1031: Release rayon 1.7.0 and rayon-core 1.11.0 r=cuviper a=cuviper Co-authored-by: Josh Stone <[email protected]>
1032: Use phantomdata in CopyOnDrop r=cuviper a=Manishearth There is a struct with a destructor borrowing from a local, it would be good to have some lifetimes protecting it from doing the wrong thing. mergesort.rs also has this type but it's constructing it from raw pointers and I didn't want to infect everything with lifetimes. It's less of a problem there since it's *not* borrowing from locals. Co-authored-by: Manish Goregaokar <[email protected]>
…er thread Co-authored-by: Ammar Askar <[email protected]>
… stock rayon-core
No description provided.