DArray: MPI interface #405

Closed
wants to merge 57 commits into from

Conversation

fda-tome
Collaborator

Implements reductions using MPI collective calls and adapts the DArray interface to an MPI style instead of a PGAS (partitioned global address space) style.
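
To illustrate the general shape of a collective-based reduction, here is a minimal sketch in plain Julia. It is not code from this PR: the `chunks` layout and the `simulated_allreduce` helper are hypothetical stand-ins, and the "ranks" are simulated in a single process so the example runs without an MPI installation. In a real program, step 2 would be a call such as `MPI.Allreduce` from MPI.jl.

```julia
# Hypothetical stand-in for a distributed array: one local chunk per "rank".
# In a real MPI program each rank would hold only its own chunk.
chunks = [collect(1.0:4.0), collect(5.0:8.0), collect(9.0:12.0)]

# Step 1: every rank reduces its own chunk locally, with no communication.
partials = map(sum, chunks)

# Step 2: a collective combines the per-rank partial results and hands the
# combined value back to every rank. This helper only models that behavior.
simulated_allreduce(op, partials) = fill(reduce(op, partials), length(partials))

totals = simulated_allreduce(+, partials)
```

The point of the pattern is that no rank reads another rank's chunk directly; only the small per-rank partial results take part in the collective.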

jpsamaroo and others added 30 commits September 26, 2022 06:58
…-topmost-bug

Add missing check to walk_data
…-no-capture

signature: Don't capture input arguments
Implement WeakChunk like WeakThunk
Swap Chunk for WeakChunk in eager thunk submission
chunks: Allow weak Chunk references in Thunk args
…ux-1.0

DaggerWebDash: Add Mux 1.x to compat
…ache-bug

Fix incorrect assertion in schedule!
…t-invokelatest

checkpoint: Use at-invokelatest
…broadcast

at-spawn: Add support for broadcasting
…z-docs

Update scheduler visualization docs

Adds ProcessorTypeScope(T), which matches processors that are a subtype
of T. In the process, also expands the scope system to support
lazily-evaluated scoping behavior, such as doing a subtype check or
checking for `default_enabled`, via "taints".
Also changes behavior such that proclist and single override scope when
set, to prevent issues with mixing proclist/single with scope.
…-type-scope

Add ProcessorTypeScope, deprecate proclist and single
jpsamaroo and others added 24 commits April 22, 2023 10:35
The worker scheduler would previously assume that it was fine to
schedule infinite amounts of work onto the same processor at once, which
is only efficient when tasks do lots of `yield`ing. Because most tasks
do not actually exhibit low occupancy, we want to teach at least the
worker scheduler to limit its eagerness when executing high-occupancy
tasks.

This commit teaches `@spawn` and the worker scheduler about a new
`occupancy` task option, which (on the user side) is a value between 0
and 1 which approximates how fully the task occupies the processor. If
the occupancy is 0.2, then 5 such tasks can execute concurrently and
fully occupy the processor.
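
The arithmetic above can be sketched as a simple admission check. This is an illustration only, not the scheduler's actual logic: the `admit` function, the `capacity` keyword, and the small floating-point tolerance are all assumptions made for the example.

```julia
# Hypothetical admission check: start queued tasks in order for as long as
# the summed occupancy stays within the processor's capacity (1.0 = full).
function admit(occupancies; capacity=1.0)
    used = 0.0
    running = Int[]
    for (i, occ) in enumerate(occupancies)
        # The small tolerance guards against floating-point rounding.
        if used + occ <= capacity + 1e-9
            push!(running, i)
            used += occ
        end
    end
    return running
end

# Seven queued tasks, each with occupancy 0.2: only five fit at once.
admitted = admit(fill(0.2, 7))

# A fully occupying task leaves no room for a second one.
admitted_full = admit([1.0, 0.5])
```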

Processors now operate primarily from a single controlling task per
processor, and work is executed in a lowest-occupancy-first manner to
attempt to maximize throughput.

With processors using occupancy estimates to limit oversubscription,
it's now quite easy for tasks to become starved for work. This commit
also adds work-stealing logic to each processor, allowing a starved
processor to steal scope-compatible tasks from other busy processors.
Processors will be able to steal so long as they are not fully occupied.

APIs like `delayed` and `spawn` assumed that passed kwargs were to be
treated as options to the scheduler, which is both somewhat confusing
for users, and precludes passing kwargs to user functions.

This commit changes those APIs, as well as `@spawn`, to instead pass
kwargs directly to the user's function. Options are now passed in an
`Options` struct to `delayed` and `spawn` as the second argument (the
first being the function), while `@spawn` still keeps them before the
call (which is generally more convenient).

Internally, `Thunk`'s `inputs` field is now a
`Vector{Pair{Union{Symbol,Nothing},Any}}`. The second element of each
pair is the argument, and the first element says how it is passed:
`nothing` marks a positional argument, and a `Symbol` names a kwarg.
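
As a rough sketch of how such a vector of pairs can be turned back into a function call, consider the following plain-Julia example. It only uses the element type quoted above; the `weighted_sum` function and the splitting code are hypothetical and are not taken from Dagger's implementation.

```julia
# Same element type as the one quoted above for `Thunk`'s `inputs` field.
inputs = Pair{Union{Symbol,Nothing},Any}[
    nothing => 1,    # positional argument
    nothing => 2,    # positional argument
    :scale  => 10,   # keyword argument named `scale`
]

# Split the pairs back into positional arguments and keyword arguments.
args   = [arg for (pos, arg) in inputs if pos === nothing]
kwargs = [pos => arg for (pos, arg) in inputs if pos !== nothing]

# Hypothetical user function, used only to show the call shape.
weighted_sum(a, b; scale=1) = scale * (a + b)

result = weighted_sum(args...; kwargs...)
```

Keeping positional and keyword arguments in one ordered vector means a single field can describe the whole call.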
…bute function, revising testing for the darray
…bute function, revising testing for the darray
@fda-tome fda-tome marked this pull request as ready for review July 19, 2023 18:21
@fda-tome fda-tome changed the base branch from master to jps/dagger-mpi July 19, 2023 18:25
@jpsamaroo
Member

Superseded by #422

@jpsamaroo jpsamaroo closed this Aug 16, 2023
5 participants