
Buffering #24

Open · wants to merge 4 commits into master from buffering
Conversation

@treeowl (Contributor) commented Jun 2, 2018:

  • Add `buffering` to buffer compositionally.

  • Manually deforest `evalBuffer` and `parBuffer`.

  • Add more rules for `evalBuffer` and `parBuffer`.

This PR is layered on another; I could disentangle them if necessary.

Closes #23

@treeowl treeowl force-pushed the buffering branch 2 times, most recently from a15f6de to 0394965 Compare June 3, 2018 01:55
-- lazily, with runEval between steps. runEval is really just
-- unsafeDupablePerformIO. If we used an array-based queue, and
-- a thunk representing the tail of the result gets duplicated, then
-- we could scramble things quite badly. A pure queue can't run into
@bgamari (Contributor) commented Jun 3, 2018:

Instead of "we could scramble things quite badly" perhaps say:

We could end up pushing a value more than once, potentially scrambling the queue.

@bgamari (Contributor) left a review comment:

Quite nice!

-- it sparks computations to evaluate list elements at least to weak
-- head normal form, disregarding a strategy argument 'r0'.
--
-- > parBuffer n strat = parBuffer n (rseq `dot` strat)
(Contributor) commented:

Much better!

@treeowl treeowl force-pushed the buffering branch 3 times, most recently from eaf6598 to a70ff96 Compare June 3, 2018 03:03
@treeowl (Author) commented Jun 3, 2018:

I'd actually prefer to do things a little differently with all the buffering functions. In particular, I think that buffering n xs should fill the buffer with n elements immediately, rather than doing nothing until the result is forced to WHNF. But consistency among these functions seems pretty important.
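The distinction treeowl draws can be sketched with the ret/start shape the WHNF buffering code uses (a simplified, `seq`-based model; the names and details below are illustrative, not the patch itself). The lazy style defers all work until the result list is demanded, because everything hides inside `return`; the eager variant forces the first n elements as soon as the strategy itself runs:

```haskell
import Control.Parallel.Strategies (Strategy, withStrategy)

-- Walk the first n elements, forcing each to WHNF.
fill :: Int -> [a] -> [a]
fill 0 ys       = ys
fill _ []       = []
fill n (y : ys) = y `seq` fill (n - 1) ys

-- Emit the result list, keeping the forcing frontier n elements ahead.
ahead :: [a] -> [a] -> [a]
ahead (x : xs) (y : ys) = y `seq` (x : ahead xs ys)
ahead xs       _        = xs

-- Current style: nothing happens until the result list is forced,
-- since the buffering work is tucked inside 'return'.
lazyBuffer :: Int -> Strategy [a]
lazyBuffer n xs = return (ahead xs (fill n xs))

-- treeowl's preference: fill the buffer when the strategy is applied
-- (i.e. at withStrategy/runEval time), not on first demand.
eagerBuffer :: Int -> Strategy [a]
eagerBuffer n xs = let ys = fill n xs in ys `seq` return (ahead xs ys)
```

Both variants return the input list unchanged; they differ only in when the first n elements get forced.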

@treeowl (Author) commented Jun 3, 2018:

@simonmar, for the sake of consistency with the rest of the module, I don't think the poorly named buffering function should actually be added. Rather, I think it should replace evalBuffer. We could use rewrite rules internally to optimize evalBuffer (rseq `dot` strat), but I'm not sure that's really worth the trouble. On the other hand, that would be a breaking change.

@simonmar (Member) left a review comment:

For this kind of change I think it's pretty important to get some benchmarks to demonstrate that at the least it doesn't cause any regressions. We have nofib/parallel, and you'll need a machine with at least 8 cores.

I'm OK with adding things to the API, but we should be careful about breaking changes because this API is used in my book.

@@ -340,6 +342,7 @@ type SeqStrategy a = Control.Seq.Strategy a
--
r0 :: Strategy a
r0 x = return x
{-# INLINE [1] r0 #-}
@simonmar (Member) commented:
Why? Please add a comment.

@@ -394,7 +397,7 @@ rpar x = Eval $ IO $ \s -> spark# x s
#else
rpar x = case (par# x) of { _ -> Done x }
#endif
{-# INLINE rpar #-}
{-# INLINE [1] rpar #-}
@simonmar (Member) commented:
Same here

-- Apply a strategy to each list element, tying the strategy to the
-- cons cell: each cell of the result is a runEval thunk, so forcing
-- cell i runs the strategy on element i before exposing the cons.
tieConses :: (a -> Eval b) -> [a] -> [b]
tieConses strat = foldr go []
  where
    go x r = runEval ((: r) <$> strat x)
@simonmar (Member) commented:
This makes my brain hurt.

{-# RULES
"evalBuffer/rseq" forall n . evalBuffer n rseq = evalBufferWHNF n
"parBuffer/rseq" forall n . parBuffer n rseq = parBufferWHNF n
@simonmar (Member) commented:
I think it was pretty important to optimise this case, I'm worried this might be a regression.

@treeowl (Author) commented Jun 4, 2018:

Ideally, users wouldn't write parBuffer n rseq, because that's just the same as parBuffer n r0. I think the right way to catch this (and other such) is probably to write rules for the underlying operations:

runEval (r0 x) ==> x
runEval (rseq x) ==> x
runEval (rpar x) ==> x

Making the higher-order combinators INLINABLE should then expose things sufficiently to these low-level rules.
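Written out in GHC RULES syntax, that sketch would read roughly as follows. This is a sketch only: the rpar rule is sound only where the result is demanded immediately (so the dropped spark would have fizzled anyway), and rules over imported identifiers only fire while those functions are not yet inlined, hence the `INLINE [1]` pragmas elsewhere in this PR:

```haskell
module BufferingRules where

import Control.Parallel.Strategies (r0, rpar, rseq, runEval)

{-# RULES
"runEval/r0"   forall x . runEval (r0   x) = x
"runEval/rseq" forall x . runEval (rseq x) = x
"runEval/rpar" forall x . runEval (rpar x) = x
  #-}
```

Each rule's two sides agree denotationally: forcing `runEval (r0 x)` or `runEval (rseq x)` yields exactly the WHNF of `x`, and `runEval (rpar x)` differs from `x` only by a spark.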

@treeowl (Author) commented:

Note if you're following by email: I edited the above comment substantially.

@treeowl (Author) commented:

I added a comment explaining what tieConses does. I don't think it's actually complicated. I wrote it with foldr so we can consider making the whole thing INLINABLE and get list fusion on one side. Perhaps it would be easier to see what's going on if it were written with explicit recursion?
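For comparison, the explicit-recursion spelling would look like this (the name `tieConses'` is hypothetical; it is intended to behave identically to the foldr version, at the cost of the list-fusion opportunity):

```haskell
import Control.Parallel.Strategies (Eval, runEval)

-- Each cons cell of the result is a runEval thunk: forcing cell i runs
-- the strategy on element i, then exposes the cons.
tieConses' :: (a -> Eval b) -> [a] -> [b]
tieConses' _     []       = []
tieConses' strat (x : xs) = runEval ((: tieConses' strat xs) <$> strat x)
```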

@simonmar (Member) commented:
Could you explain why parBuffer n rseq should be the same as parBuffer n r0? That seems counter-intuitive to me.

@treeowl (Author) commented:
It's really because parBuffer isn't compositional. A better way is to use the runEval (rpar x) = x rule so the strategy is simplified before being passed to buffering. I can change the PR. I'm wondering if you would be okay with redefining evalBuffer as buffering; it's a breaking change, but it makes things much more uniform across the module.
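A hedged sketch of the compositional shape under discussion: `buffering` and the definitions below are my reconstruction from the PR description, not the patch itself, and `bufferWHNF` is a local stand-in for the library's internal WHNF buffer:

```haskell
import Control.Parallel.Strategies (Strategy, rparWith, withStrategy)

-- Local stand-in: keep the traversal frontier n elements ahead of the
-- consumer, forcing each buffered element to WHNF only.
bufferWHNF :: Int -> [a] -> [a]
bufferWHNF n xs0 = ahead xs0 (fill n xs0)
  where
    fill 0 ys       = ys
    fill _ []       = []
    fill k (y : ys) = y `seq` fill (k - 1) ys
    ahead (x : xs) (y : ys) = y `seq` (x : ahead xs ys)
    ahead xs       _        = xs

-- Compositional buffering: apply the element strategy as each element
-- enters the buffer, whatever that strategy does.
buffering :: Int -> Strategy a -> Strategy [a]
buffering n strat = pure . bufferWHNF n . map (withStrategy strat)

-- parBuffer redefined in terms of buffering, as the commits describe:
-- each sparked computation evaluates only as far as 'strat' says.
parBuffer' :: Int -> Strategy a -> Strategy [a]
parBuffer' n strat = buffering n (rparWith strat)
```

The point of the factoring is that `buffering` never adds forcing of its own; the degree of evaluation is entirely the element strategy's business.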

@treeowl (Author) commented:
@simonmar, that was actually more fallout from my misunderstanding of rparWith. Ugh. parBuffer n rseq is not the same as parBuffer n r0.

@treeowl (Author) commented:

@simonmar, um, sorry, what I just said was a bit confused. The current behavior of parBuffer actually equates parBuffer n r0 to parBuffer n rseq. My implementation in terms of buffering and rparWith does not, once rparWith is fixed.

Borgvall and others added 3 commits June 17, 2018 21:39
It does not pass with the current implementation of `rparWith`.
* Lift the result to avoid always reducing to WHNF.

* Rewrite the documentation of `rparWith`.

Fixes haskell#35
* Add `buffering` to buffer compositionally.

* Redefine `parBuffer` in terms of `buffering`.

* Manually deforest `evalBuffer`.

* Add more rules for `evalBuffer`.
Don't try to change lots of `RULES` and inlining in this PR;
keep it more confined.
@treeowl (Author) commented Jun 18, 2018:

@simonmar, I've made the changes much more conservative, which I'm hoping will make it easier to merge. In particular, I have ideas about reworking the way we deal with RULES and inlining, but they don't really belong here. I can likely borrow an 8-core machine to run the benchmarks; I'll have to try that later in the week. I continue to wonder what you think about the proposed semantic changes:

  1. Make evalBuffer only evaluate as far as the passed strategy says, rather than to WHNF. (This is not incorporated into the PR).
  2. Make parBuffer spark computations that only evaluate as far as the passed strategy says. (This is incorporated into the PR).
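Proposal (1) in base-only terms: today's evalBuffer forces every element to WHNF as the spine is consumed, like `whnfSpine` below (a local model, not library code); under the proposal, evalBuffer n r0 would instead behave like `id` and never touch the elements:

```haskell
-- Forces each element to WHNF as the spine is walked: a model of the
-- current evalBuffer's element-forcing behavior.
whnfSpine :: [a] -> [a]
whnfSpine = foldr (\x r -> x `seq` (x : r)) []
```

So `length (whnfSpine [undefined])` throws, while plain `length [undefined]` is 1; the proposal would make evalBuffer n r0 act like the latter.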

@simonmar (Member) commented:

Hi @treeowl - I'm inclined to be very conservative with merging further changes to this package, given that we've all become confused at one time or another about how things are supposed to work. So let me propose that before merging any further PRs, especially unforced ones, we should

  • make the test suite work with Travis
  • do some benchmarking for every PR to ensure that it doesn't regress parallel performance (I'm quite concerned that we haven't done any benchmarking so far, and the whole reason this package exists is for performance)
