Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define a divide-and-conquer parallelism API? #14

Open
tkf opened this issue Mar 5, 2020 · 4 comments
Open

Define a divide-and-conquer parallelism API? #14

tkf opened this issue Mar 5, 2020 · 4 comments

Comments

@tkf
Copy link

tkf commented Mar 5, 2020

Hi, I just registered a small package SplittablesBase.jl which is used from Transducers.jl. Essentially, you can support parallel reduce and everything that can be derived from it (e.g., map) by defining a single function halve(collection). I wonder if you are interested in defining it.

@andyferris
Copy link
Owner

@tkf thanks - with the ordering changes in #13 I'm thinking we can do things like ensure tokens have a total ordering and that you can find the first and last token as well as a way of finding the (approximate) midpoint and iterating through sub-intervals. Therefore I imagine implementing this in a generic way should be possible.

I haven't really thought through what order such dependencies should take. So far I've been keeping dependencies out because, well, IMO Julia 2.0 deserves an ordered dictionary in Base as well as enhancements to the AbstractDict interface, and I intened this to be the playground for that.

@tkf
Copy link
Author

tkf commented Jun 8, 2020

Actually, I bit the bullet and defined halve for Base.Dict using some implementation details of iterate(::Dict). So, I think a similar trick can be used even for "old" HashDictionary too. I also added halve spec for unordered collection: https://juliafolds.github.io/SplittablesBase.jl/dev/#SplittablesBase.halve Of course, it'd be great if #13 helps implementing this.

I also want to move SplittablesBase.jl to Base too. I think we need something like it for composable data-parallel algorithms. So let's hope both of them get into Base so that we don't have to think about dependency issue 😆

Meanwhile, it's possible to implement halve outside of Dictionaries.jl. But from my experience from implementing iterate(::Dict), it requires to touch some internals of the dictionary. So, I think it's better to do it inside Dictionaries.jl. One mid-ground option may be to implement unexported Dictionaries.halve in Dictionaries.jl so that I can define SplittablesBase.halve(::AbstractDictionary) at import-time via @require.

@andyferris
Copy link
Owner

Yes. I will need approximate token bisection for searchsortedxxx functions when they land, so no worries.

To be clear, yes we can define halve for any concrete type, I was proposing to have it implemented for all tokenizable AbstractDictionary. I'm aiming for all the free stuff you get with AbstractArray by defining axes, IndexStyle, getindex, isassigned, setindex! and similar.

@tkf
Copy link
Author

tkf commented Jun 9, 2020

Oh, I see. Yes, it'd be great if there is a way to define halve on abstract types. It was a bit daunting to think that "complex" collections like dictionaries might need halve implementation for each concrete type. Token-based API sounds like a good "middleware" for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants