
Meta Issue: Support for parallelized/blocked algorithms #89

Closed
8 tasks done
kernelmachine opened this issue Feb 26, 2016 · 22 comments

@kernelmachine

kernelmachine commented Feb 26, 2016

What are your thoughts on implementing something similar to http://dask.pydata.org/en/latest/ on top of ndarrays? I suspect parallelized computations on submatrices should be pretty natural to do in the Rust framework, and it seems you've already created sub-array view functions. Do you agree?

(Community Edits below)


Actionable sub-issues:

@kernelmachine kernelmachine changed the title Support for threaded/blocked algorithms Support for parallelized/blocked algorithms Feb 26, 2016
@bluss
Member

bluss commented Feb 26, 2016

The goal is absolutely to be able to support a project like that. Iterators already provide chunking in inner_iter, outer_iter, axis_iter, axis_chunks_iter and their mut counterparts. We also want to add just a few more split_at-like interfaces to support easy chunking like this.
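As a minimal sketch of that kind of chunked iteration (assuming a recent ndarray; the chunk size of 4 rows and the per-block sum are arbitrary illustrations, not something from this comment):

use ndarray::{Array2, Axis};

fn main() {
    let a = Array2::<f64>::ones((10, 10));
    // Visit blocks of up to 4 rows and reduce each block independently;
    // the mut counterparts give the same chunking over mutable views.
    let block_sums: Vec<f64> = a
        .axis_chunks_iter(Axis(0), 4)
        .map(|block| block.sum())
        .collect();
    println!("{:?}", block_sums);
}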

@bluss
Member

bluss commented Feb 26, 2016

Integrating with https://github.com/nikomatsakis/rayon would be pretty exciting too.

@kernelmachine
Author

Yup, that's exactly my thought! Would love to work on this, if you're interested in collaborating.


@kernelmachine
Author

Also on the subject of integrations, I've been writing a crate that wraps Lapack/BLAS with high level, easy to use functions, inspired by the hmatrix library in Haskell. Focus is on compile-time, descriptive error checking, enumerated matrix types, and an easy interface. I wrote my own (simple) matrix representation for the project, but it actually seems way better to build the crate on top of ndarray.

How actively are you working on the BLAS integration I see in the docs? Would love to exchange notes.

@bluss
Member

bluss commented Feb 26, 2016

Not very actively, but it's the thing I must solve now. Not sure if ndarray wants to continue with rblas or use more raw BLAS bindings.

One problem is specialization, i.e. how to dispatch to BLAS for element types f32 and f64 while still supporting other array element types. Rust will gain specialization down the line, but as things look now, we can do some dispatch using Any instead. Which is fine; it just adds that Any bound.

@bluss
Member

bluss commented Feb 26, 2016

Note that Any allows static (compile time) dispatch on the element type.
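A minimal sketch of that Any-based dispatch pattern (illustrative names and operation, not ndarray's actual code): a generic routine that routes f64 data to an optimized path, a BLAS call in ndarray's case, and everything else to a generic fallback.

use std::any::{Any, TypeId};
use std::ops::Mul;

// Generic fallback that works for any element type.
fn scale_fallback<A: Copy + Mul<Output = A>>(data: &mut [A], factor: A) {
    for x in data.iter_mut() {
        *x = *x * factor;
    }
}

// Stand-in for an optimized path (a BLAS scal call, say).
fn scale_f64_optimized(data: &mut [f64], factor: f64) {
    for x in data.iter_mut() {
        *x *= factor;
    }
}

// The TypeId comparison is a constant for each concrete A, so the branch
// is resolved after monomorphization; the Any bound (which implies
// 'static) is what makes TypeId::of::<A>() available.
pub fn scale<A: Copy + Mul<Output = A> + Any>(data: &mut [A], factor: A) {
    if TypeId::of::<A>() == TypeId::of::<f64>() {
        // We just checked that A is f64, so this reinterpretation is sound.
        let data_f64 = unsafe {
            std::slice::from_raw_parts_mut(data.as_mut_ptr() as *mut f64, data.len())
        };
        let factor_f64 = *(&factor as &dyn Any).downcast_ref::<f64>().unwrap();
        scale_f64_optimized(data_f64, factor_f64);
    } else {
        scale_fallback(data, factor);
    }
}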

@bluss
Member

bluss commented Feb 26, 2016

As a high-level library, ndarray has the strain that comes from supporting a much more general data layout than BLAS does. So we must always have both the optimized code and the fallback code present for everything.

@bluss
Member

bluss commented Feb 28, 2016

More splitting coming up #94

@kernelmachine
Author

Awesome. I'll look into rayon integration via these split_at functions.

Yeah, regarding the BLAS float issue, the Any bound was my solution as well. When initializing the matrix I just tried to cast any ints to floats, and returned an error otherwise.

@bluss
Member

bluss commented Mar 16, 2016

Can you make this issue more concrete? Ndarray will not aim to develop or host a project that is similar to Dask, but we can make sure it can be built with ndarray.

More low-level methods have been exposed since this issue was reported (see the 0.4.2 release).

Maybe more concrete issues can be filed for missing functionality.

@kernelmachine
Author

Sure. I think this issue comes down to an integration between ndarray and rayon. We should be able to apply basic parallelized computations on an array of subviews, and aggregate/reduce. This interface could be generic, or we could focus on a few specialized computations, like elementwise operations or selections.

@bluss
Member

bluss commented Mar 20, 2016

Yeah.

Here's a very basic experiment with that (only elementwise ops)

https://github.com/bluss/rust-ndarray-experimental/blob/master/src/lib.rs

  1. One important thing is of course to split along whichever axis has the greatest stride.
  2. There was a significant discovery here related to the just-merged unstable specialization feature. You can seamlessly special-case the thread-safe vs. non-thread-safe situation and use rayon only when the operation is thread safe (Sync/Send as appropriate)!
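For reference, a minimal sketch of that split_at plus rayon::join recursion, reduced to elementwise ops on an f64 view. It always splits along Axis(0) for brevity (rather than the axis with the greatest stride), and the name par_apply_inplace and the 1024-element threshold are illustrative only:

use ndarray::{ArrayViewMut2, Axis};

fn par_apply_inplace<F>(mut view: ArrayViewMut2<'_, f64>, f: &F)
where
    F: Fn(&mut f64) + Sync,
{
    if view.len() <= 1024 || view.len_of(Axis(0)) < 2 {
        // Small enough (or no longer splittable along this axis): run sequentially.
        view.map_inplace(f);
    } else {
        // Halve the view and process both halves on the rayon thread pool.
        let mid = view.len_of(Axis(0)) / 2;
        let (left, right) = view.split_at(Axis(0), mid);
        rayon::join(|| par_apply_inplace(left, f), || par_apply_inplace(right, f));
    }
}

fn main() {
    let mut a = ndarray::Array2::<f64>::ones((2048, 64));
    par_apply_inplace(a.view_mut(), &|x| *x = x.exp());
    assert!((a[[0, 0]] - 1f64.exp()).abs() < 1e-12);
}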

@bluss
Member

bluss commented Dec 14, 2016

We need to break this down into specific sub-issues so that we can get each piece done in turn.

I'm editing the first comment of this issue. This is a good thing: it means that both you @pegasos1 and I can edit the same task list.

@bluss bluss changed the title Support for parallelized/blocked algorithms Meta Issue: Support for parallelized/blocked algorithms Dec 14, 2016
@kernelmachine
Author

We just need to implement the parallel iterator trait, right? Beyond tests and stuff, what else is there?

@bluss
Member

bluss commented Dec 23, 2016

Parallel map is a bit tricky (the Array::map(f) -> Array case), but I have a work in progress for that.

@bluss
Member

bluss commented Dec 23, 2016

There's also the question of interface. You have championed the parallel wrapper for array types before, I think.

With parallel wrappers it could be something like:

use ndarray::parallel::par;
par(&mut array).map_inplace(|x| *x = x.exp());

or parallel array view types

array.par_view_mut().map_inplace(|x| *x = x.exp());

We could use wrapper/builder types for the closure instead:

use ndarray::parallel::par;
array.map_inplace(par(|x| *x = x.exp()));

or separate methods:

array.par_map_inplace(|x| *x = x.exp());

What is possible with specialization is to transparently parallelize regular Array::map_inplace calls, but that is too magical; I don't think we want that.
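For what it's worth, the separate-methods flavor is what recent ndarray versions expose behind the rayon crate feature (par_map_inplace and friends), so a usage sketch of that option looks like this (the exp closure is just an illustration):

use ndarray::Array2;

fn main() {
    let mut array = Array2::<f64>::ones((1024, 1024));
    // Parallel elementwise update over the whole array.
    array.par_map_inplace(|x| *x = x.exp());
}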

@iduartgomez

On a more general note, are there any plans to eventually provide opt-in GPU computation? Maybe using https://github.com/arrayfire/arrayfire-rust ?

@bluss
Copy link
Member

bluss commented Mar 5, 2017

There is no explicit plan one way or the other.

Ndarray's design (explicit views, direct access to data) dictates that it's an in-memory data structure, so it could only integrate with GPU computation by allowing conversion to a more restricted format (like ArrayFire), or by implementing operations using such a conversion before and after.

@frjnn

frjnn commented May 13, 2021

Parallel Iter for AxisChunksIter

and

Parallel support for Array::map -> Array

should be checked off. @bluss @jturner314

@bluss
Member

bluss commented May 13, 2021

I guess everything here is done as of current master. Zip::par_map_collect could be sufficient to satisfy the Array::map item, do you agree @frjnn?
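A minimal sketch of how Zip::par_map_collect covers the parallel Array::map -> Array case (ndarray with the rayon feature enabled; the doubling closure is just an illustration):

use ndarray::{Array2, Zip};

fn main() {
    let a = Array2::<f64>::ones((512, 512));
    // Build a new array by mapping every element in parallel.
    let doubled: Array2<f64> = Zip::from(&a).par_map_collect(|&x| x * 2.0);
    assert_eq!(doubled[[0, 0]], 2.0);
}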

@frjnn

frjnn commented May 13, 2021

I agree

@bluss
Member

bluss commented May 13, 2021

All the actionable points have been completed, so we can celebrate by closing. However, I think there is a lot more to do if we are to begin approaching the original appeal of the issue text, and a new issue is welcome for that. 🙂

@bluss bluss closed this as completed May 13, 2021