Dynamic scheduling for parallel_for and parallel_reduce #106
Comments
See issue 100: I'm not sure how to mark issues as duplicates of another issue, or I would do that now :-) |
Dynamic scheduling options for both RangePolicy and TeamPolicy |
Adding optional template parameter to policies to specify dynamic scheduling. Policy templates should be switched to variadic arguments. |
How about Kokkos::ScheduleDynamic or Kokkos::Schedule < Kokkos::Dynamic >? |
Prefer Kokkos::Schedule < Kokkos::Dynamic > |
OK, this is the syntax I am implementing right now (starting with the RangePolicy): ExecutionPolicy< class ... Traits >, where Traits can be any or none of the following: ExecutionSpace (some Kokkos execution space); deprecated: INT_TYPE (i.e. some raw integer type as a direct template argument). So someone could really customize an execution policy like this: RangePolicy<InitialIntegrate, OpenMP, Schedule, ChunkSize<32>, IterationType >. Right now the order of arguments doesn't matter, but only one of each type is accepted (it static asserts with a helpful message if you, for example, try RangePolicy<OpenMP,Serial>). I am pretty much done with RangePolicy accepting these arguments, but Schedule doesn't do anything yet. The other ones do the same thing as their equivalents did before. |
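To illustrate the order-independent argument handling described above, here is a minimal sketch in standard C++17. It is not the Kokkos implementation: every name in it (RangePolicySketch, pick, count_v, and the empty Serial/OpenMP/Schedule/IterationType stand-ins) is hypothetical and only mimics the idea of classifying each template argument by kind, rejecting duplicates with a static_assert, and falling back to a default when a kind is absent.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for illustration; these are NOT the Kokkos types.
struct Serial {};
struct OpenMP {};
struct Static {};
struct Dynamic {};
template <class Kind> struct Schedule { using type = Kind; };
template <class Int> struct IterationType { using type = Int; };

// Predicates that classify a template argument by kind.
template <class T> struct is_space : std::false_type {};
template <> struct is_space<Serial> : std::true_type {};
template <> struct is_space<OpenMP> : std::true_type {};

template <class T> struct is_schedule : std::false_type {};
template <class K> struct is_schedule<Schedule<K>> : std::true_type {};

template <class T> struct is_iteration_type : std::false_type {};
template <class I> struct is_iteration_type<IterationType<I>> : std::true_type {};

// Number of arguments in Ts that satisfy Pred (C++17 fold expression).
template <template <class> class Pred, class... Ts>
constexpr int count_v = (0 + ... + (Pred<Ts>::value ? 1 : 0));

// First argument in Ts that satisfies Pred, or Default if there is none.
template <template <class> class Pred, class Default, class... Ts>
struct pick { using type = Default; };

template <template <class> class Pred, class Default, class T, class... Ts>
struct pick<Pred, Default, T, Ts...> {
  using type = std::conditional_t<Pred<T>::value, T,
                                  typename pick<Pred, Default, Ts...>::type>;
};

template <class... Traits>
struct RangePolicySketch {
  // Only one argument of each kind is accepted.
  static_assert((count_v<is_space, Traits...>) <= 1,
                "RangePolicySketch: more than one execution space given");
  static_assert((count_v<is_schedule, Traits...>) <= 1,
                "RangePolicySketch: more than one Schedule given");
  static_assert((count_v<is_iteration_type, Traits...>) <= 1,
                "RangePolicySketch: more than one IterationType given");

  // Defaults chosen arbitrarily for this sketch.
  using execution_space = typename pick<is_space, Serial, Traits...>::type;
  using schedule_type =
      typename pick<is_schedule, Schedule<Static>, Traits...>::type;
  using index_type =
      typename pick<is_iteration_type, IterationType<std::int64_t>,
                    Traits...>::type::type;
};
```

With this sketch, RangePolicySketch<OpenMP, Schedule<Dynamic>> and RangePolicySketch<Schedule<Dynamic>, OpenMP> resolve to the same member types, while RangePolicySketch<OpenMP, Serial> fails to compile with the first message.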
One more comment: I will strongly advocate for Schedule and ChunkSize to be hints only. Otherwise people can do all kinds of crazy things. For example, if you are guaranteed a ChunkSize, I think it is possible to write a legal Kokkos::parallel_for which is not vectorizable (effectively you know that successive Chunk items will not collide). While that has its advantages, I think the loss in freedom to map iterations to hardware is far worse. So I would rather say those things are hints, and force people to write portable code. Static, for example, has an issue because on GPUs there is no true static execution, since the hardware has a dynamic scheduler built in. |
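For readers unfamiliar with what a dynamic schedule with a chunk size does, here is a rough sketch using only the C++ standard library. It is not how Kokkos or OpenMP implement it; dynamic_for is a hypothetical helper, and the thread count and chunk size are plain function arguments assumed for illustration. Threads repeatedly claim the next chunk of indices from a shared atomic counter, so which thread runs which chunk is decided at run time. In the spirit of the hint argument above, the loop body should stay correct for any chunk size and any assignment of chunks to threads.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper, not a Kokkos or OpenMP function.
// Calls body(i) exactly once for every i in [begin, end).
template <class Body>
void dynamic_for(std::size_t begin, std::size_t end, std::size_t chunk,
                 unsigned num_threads, Body body) {
  if (chunk == 0) chunk = 1;              // treat a zero chunk as "no hint"
  if (num_threads == 0) num_threads = 1;

  // Shared cursor: the first index that no thread has claimed yet.
  std::atomic<std::size_t> next{begin};

  auto worker = [&] {
    for (;;) {
      // Atomically claim the next chunk of indices.
      const std::size_t lo = next.fetch_add(chunk, std::memory_order_relaxed);
      if (lo >= end) break;
      const std::size_t hi = std::min(lo + chunk, end);
      for (std::size_t i = lo; i < hi; ++i) body(i);
    }
  };

  std::vector<std::thread> pool;
  pool.reserve(num_threads);
  for (unsigned t = 0; t < num_threads; ++t) pool.emplace_back(worker);
  for (auto& th : pool) th.join();
}
```

A call such as dynamic_for(0, n, 32, 4, [&](std::size_t i) { out[i] = f(i); }) is safe because each index writes only its own element; a body that relied on indices i and i+1 always running on the same thread would be depending on the chunk size as a guarantee rather than a hint.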
Use the term "properties" for this kind of template parameter: |
Just noticed I had already done that in the actual implementation on experimental.
Kokkos::RangePolicy<Kokkos::Schedule<Kokkos::Dynamic> >(0,N, Kokkos::ChunkSize(32) )
// This uses all options including TagType in order to call operator(MyCustomTag, int)
Kokkos::RangePolicy<Kokkos::Schedule<Kokkos::Dynamic>,
                    Kokkos::IterationType<int>,
                    Kokkos::OpenMP,
                    MyCustomTag >(0,N, Kokkos::ChunkSize(32) ) |
Would IndexType maybe be a better word than IterationType? It is the integer type the policy will internally use to provide indices to the functor. This includes things like team_rank, league_rank, league_size, team_size, etc. when we extend the PolicyTraits to be used for the TeamPolicy. |
I prefer IndexType, seems to me to be more logical. |
I actually think the same. |
This is now in master using setter functions for runtime parameters. |
We've found that some of our loops are faster using dynamic scheduling in OpenMP.