Introduce IndexedParallelIterator::mapfold_collect_into_vec #652
Conversation
`map_collect_fold_reduce_into_vec` is now `map_collect_fold_reduce_into_vec_with`, and it takes an initial value, a mapfold function and a reduce function.
I changed a bunch of things; the method is now:

```rust
let absolutely_positioned_fragments = child_boxes
    .par_iter()
    .map_collect_fold_reduce_into_vec_with(
        &mut child_fragments,
        vec![],
        |mut abspos_fragments, block_level_box| {
            let (fragment, mut abspos) = block_level_box.layout();
            abspos_fragments.append(&mut abspos);
            (abspos_fragments, fragment)
        },
        |mut left, mut right| {
            left.append(&mut right);
            left
        },
    );
```
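For intuition, here is a hedged sequential sketch of what that call computes, written with plain std Rust rather than rayon (the function name `mapfold_collect` and the item types are stand-ins, not part of the PR; in the real method the mapfold and reduce steps run over parallel batches):

```rust
// Sequential sketch: push each mapped item into `target` while threading a
// fold accumulator through the iteration, then return the accumulator.
// In the parallel version, each batch gets a clone of `init` and the
// per-batch accumulators are combined by the reduce function.
fn mapfold_collect<T, U, A>(
    items: Vec<T>,
    init: A,
    mut mapfold: impl FnMut(A, T) -> (A, U),
    target: &mut Vec<U>,
) -> A {
    let mut acc = init;
    for item in items {
        let (next, mapped) = mapfold(acc, item);
        target.push(mapped);
        acc = next;
    }
    acc
}

fn main() {
    let mut collected = Vec::new();
    // Map each number to its double, while folding the odd ones aside.
    let odds = mapfold_collect(
        vec![1, 2, 3, 4],
        Vec::new(),
        |mut odds, n| {
            if n % 2 == 1 {
                odds.push(n);
            }
            (odds, n * 2)
        },
        &mut collected,
    );
    assert_eq!(collected, vec![2, 4, 6, 8]);
    assert_eq!(odds, vec![1, 3]);
}
```

The key property is that the collection and the fold happen in a single pass, which is what the fragment code above relies on.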
I feel like there's probably a generic piece we can extract from this, something like:

```rust
trait ParallelIterator {
    fn collect_first<A, B, C>(self, collection: &mut C) -> CollectFirst<Self, C>
    where
        Self: ParallelIterator<Item = (A, B)>,
        A: Send,
        B: Send,
        C: ParallelExtend<A>,
    {
        // extend the collection with A, while producing a new iterator over B
        // (tricky, but see how unzip works -- might even share a lot of that)
    }
}

trait IndexedParallelIterator {
    fn collect_first_into_vec<A, B>(self, vec: &mut Vec<A>) -> CollectFirstIntoVec<Self>
    where
        Self: ParallelIterator<Item = (A, B)>,
        A: Send,
        B: Send,
    {
        // extend the vec with A, while producing a new iterator over B
        // (not sure we need this separately -- opt_len() trickery might
        // be enough for indexed collect)
    }
}
```

So your fragment code would look something like:

```rust
let absolutely_positioned_fragments = child_boxes
    .par_iter()
    .map(|block_level_box| block_level_box.layout())
    .collect_first_into_vec(&mut child_fragments)
    .reduce(Vec::new, |mut left, mut right| {
        left.append(&mut right);
        left
    });
```
I thought about it and I think it would be way nicer than what I suggested, but I'm not sure we can do that: what happens if a later parallel iterator adapter doesn't actually consume everything? Wouldn't the collected-to vector be incomplete then?
It's already exposed, because it wouldn't require any internal access to implement.
I'll try to do that then!
I think we can put in specific checks for this scenario though. When driven as a pure consumer, it's a push model -- the consumer can return If we implement
Bike-shedding
Yeah I'm not sure what's up with the current set of APIs, like
Wait so I'm confused, does
It should use the optimized path, yes. That's the effect of So yeah, your fragment code could be rewritten to
Yeah sorry I meant something like If I understand things correctly,
Maybe another interesting way to look at the problem would be to consider some sort of iterator adaptor zipping the input iterator with the uninitialised destinations, and then provide inherent methods on that iterator adaptor such as
No, it fills them in an interleaved fashion, otherwise we would have to buffer all those second items somehow. These tuples could be the result of a map or something, so we don't have random access to them. And FWIW the indexing optimization is that we can write directly to the destination buffer, rather than collecting intermediates of unknown length.
So given
Yes.
The ability to do other parallel things with the second piece -- but I'm not sure how useful this really is. Now that I think of it, your doc example with filtered negatives could use a newtype:

```rust
struct OnlyNegative<C>(C);

impl<C: ParallelExtend<i32>> ParallelExtend<i32> for OnlyNegative<C> {
    fn par_extend<I>(&mut self, par_iter: I)
    where
        I: IntoParallelIterator<Item = i32>,
    {
        let iter = par_iter.into_par_iter().filter(|&i| i < 0);
        self.0.par_extend(iter);
    }
}
```

With
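To see the newtype trick in isolation, here is a sequential analogue using std's `Extend` trait instead of rayon's `ParallelExtend` (same shape, no parallelism; a sketch, not the PR's code):

```rust
// A collection wrapper that filters out non-negative values before
// forwarding to the inner collection -- the sequential twin of the
// OnlyNegative / ParallelExtend example above.
struct OnlyNegative<C>(C);

impl<C: Extend<i32>> Extend<i32> for OnlyNegative<C> {
    fn extend<I: IntoIterator<Item = i32>>(&mut self, iter: I) {
        // Keep only the negative values, then delegate to the wrapped
        // collection's own extend.
        self.0.extend(iter.into_iter().filter(|&i| i < 0));
    }
}

fn main() {
    let mut negatives = OnlyNegative(Vec::new());
    negatives.extend(vec![3, -1, 4, -5]);
    assert_eq!(negatives.0, vec![-1, -5]);
}
```

The point of the composition is that any `collect_first`-style adapter parameterised over `ParallelExtend` would accept such wrappers for free.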
There are all kinds of crazy compositions possible -- it's hard to judge what's worth adding sugar for.
But the other things on the second part can always be expressed by iterator shenanigans from the
You can hack it with
Thanks for the info! I'll reopen a new PR with something like what we discussed whenever I need it. For now unzip is ok.
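For reference, the unzip workaround looks roughly like this with plain std iterators (rayon's `unzip` has the same shape in parallel; `layout` and the integer types here are stand-ins for the real fragment types):

```rust
// Each "box" yields a fragment plus a (usually empty) list of absolutely
// positioned fragments. `unzip` separates the two halves, and `flatten`
// concatenates the per-item lists afterwards.
fn layout(n: i32) -> (i32, Vec<i32>) {
    // Pretend negative boxes each produce one absolutely positioned fragment.
    if n < 0 { (n * 10, vec![n]) } else { (n * 10, vec![]) }
}

fn main() {
    let child_boxes = vec![1, -2, 3];
    let (child_fragments, abspos): (Vec<i32>, Vec<Vec<i32>>) =
        child_boxes.iter().map(|&b| layout(b)).unzip();
    let absolutely_positioned_fragments: Vec<i32> =
        abspos.into_iter().flatten().collect();
    assert_eq!(child_fragments, vec![10, -20, 30]);
    assert_eq!(absolutely_positioned_fragments, vec![-2]);
}
```

Unlike the proposed method, this buffers an intermediate `Vec<Vec<_>>` for the second halves, which is the allocation cost the PR is trying to avoid.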
So I was thinking about this again: our two samples are not equivalent, but my example doesn't show it properly. What I actually want is:

```rust
let absolutely_positioned_fragments = child_boxes
    .par_iter()
    .map_collect_fold_reduce_into_vec_with(
        &mut child_fragments,
        vec![],
        |mut abspos_fragments, block_level_box| {
            let fragment = block_level_box.layout(&mut abspos_fragments);
            (abspos_fragments, fragment)
        },
        |mut left, mut right| {
            left.append(&mut right);
            left
        },
    );
```

Where an absolutely positioned box would push itself to
I'll tell you what I want, what I really really want (so tell me what you want, what you really really want): I want a combination of
Given this:

```rust
pub trait ParallelIterator: Sized + Send {
    type Item: Send;

    fn map_with<F, T, R>(self, init: T, map_op: F) -> MapWith<Self, T, F>
    where
        F: Fn(&mut T, Self::Item) -> R + Sync + Send,
        T: Send + Clone,
        R: Send;
}

pub trait ParallelExtend<T>
where
    T: Send,
{
    fn par_extend<I>(&mut self, par_iter: I)
    where
        I: IntoParallelIterator<Item = T>;
}
```

I want:

```rust
pub trait ParallelIterator: Sized + Send {
    type Item: Send;

    fn extend_mapfold_into<T, U, F, C>(
        self,
        init: T,
        mapfold_op: F,
        target: &mut C,
    ) -> ExtendMapfoldInto<Self, T, F, C>
    where
        T: Send + Clone,
        F: Fn(&mut T, Self::Item) -> U + Send + Sync,
        C: ParallelExtend<U>;
}

impl<I, T, F, C> ParallelIterator for ExtendMapfoldInto<I, T, F, C> {
    type Item = T;
}
```

Does that make sense to you, @cuviper? AFAICT this cannot be emulated with your
Nit: do bikeshed the name, I hate it anyway.
This method introduces a neat way to collect a parallel iterator into a vec in an allocation-efficient way, while still being able to do some operations on the data being collected. It uses a new trait, `MapFolder<Item>`, which is a generalised closure for the classic mapfold operation. Given that the very raison d'être of parallel iterators is to be able to do work in batches, the results of the various `MapFolder::complete` calls are passed to a `Reducer` and returned from `mapfold_collect_into_vec`.
Why
Because, as usual, I need this as part of some Servo-related stuff. :) In Victor, we build in parallel a vector of fragments from a vector of boxes:
https://github.com/SimonSapin/victor/blob/6ddce7345030ae4d25f846ca757d6b50f3f8aeac/victor/src/layout/mod.rs#L56-L59
Some CSS features (among which absolute positioning, in case you were curious) require us to make some of those fragments go up the fragment tree to where they actually belong. To do this, we would like to be able to map the boxes into fragments as usual, but also collect the absolutely positioned ones into a separate vec that we can then append to the parent's own list of absolutely positioned fragments, until we reach the final one. There are way fewer absolutely positioned boxes than normal ones, so it's not a problem to concatenate them as we traverse the tree. This method allows us to achieve that this way:
Unresolved questions
Should this method really be relying on plumbing?
Probably not. I guess we should have a variety of different methods like `map_init`, `map_with`, `fold_with` etc., but as a proof of concept I didn't want to lose too much time on how the functionality should be exposed before I made the PR. Those methods will also require names, and I can't find any which doesn't annoy me.
Should it be named that way?
Probably not.
Is there a way to avoid the need for a new trait `MapFolder`?
I don't think so, but I am certainly not sure of that.