
[WIP] Refactor communication out of transition functions #4343

Closed · wants to merge 12 commits

Conversation

jakirkham (Member):

Currently transition functions alternate between working on the task graph a bit and communicating out to workers and clients. As written, these two concerns are tightly interwoven, sometimes switching back and forth between the two modes of processing multiple times in the same function. This makes it tricky to optimize the transition functions further, as the communication is handled in high-level Python and doesn't improve much through Cythonization. It also makes it tricky to pull the transition functions and task graph state apart from the Scheduler, as they are a bit too reliant on these communication steps.

To remedy this situation, this PR tries to move all of the communication out of the transition functions. It does this by returning all messages to be communicated and handling them in `transition` after each transition function completes. While transition functions and communication still effectively interleave at a higher level, this should make it easier to separate out the transition functions, along with the relevant state, for further optimization.
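A minimal sketch of the approach described above, with all names and message shapes purely illustrative (the real Scheduler code is far more involved): the transition function returns the messages it would have sent, and the caller performs all communication afterwards.

```python
def transition_released_waiting(tasks, key):
    # Pure state-machine step: mutate task state and *collect* messages
    # instead of sending them (hypothetical message format).
    ts = tasks[key]
    ts["state"] = "waiting"
    worker_msgs = {}  # worker address -> list of messages
    worker_msgs["tcp://worker-1"] = [{"op": "compute-task", "key": key}]
    return "waiting", worker_msgs

def transition(tasks, key, sent):
    # Run the transition function first...
    state, worker_msgs = transition_released_waiting(tasks, key)
    # ...then do all communication in one place afterwards.
    for addr, msgs in worker_msgs.items():
        for msg in msgs:
            sent.append((addr, msg))
    return state
```

The point is the separation of concerns: everything above the return in the transition function can be compiled or optimized independently of how messages are delivered.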

Review thread on distributed/scheduler.py (outdated):
@jakirkham (Member Author):

This still has some warts. That said, it would be good to get a sense of whether this change seems reasonable and what issues we might expect going with this approach (which may in turn inform how the warts are addressed 😉).


```python
def transition_released_waiting(self, key):
    try:
        ts: TaskState = self.tasks[key]
        dts: TaskState
        worker_msgs: dict = {}
        report_msg: dict = {}
```
Member:
Should these be None for the seemingly common case of no messages to report?

jakirkham (Member Author):

Well, since we have for-loops back in `transition` that iterate over these, we would need extra logic to check for `None`. At the moment returning empty dicts seems pretty straightforward, even in the cases where one of them carries nothing.
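A small illustration of the trade-off being discussed (variable names are made up for the example): iterating over an empty dict is a harmless no-op, whereas a `None` sentinel would force a guard at every consumer.

```python
# Empty dict: the consuming loop simply does nothing.
worker_msgs = {}
delivered = []
for addr, msgs in worker_msgs.items():
    delivered.extend(msgs)

# None sentinel: every consumer needs an explicit guard first.
worker_msgs = None
if worker_msgs is not None:
    for addr, msgs in worker_msgs.items():
        delivered.extend(msgs)
```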

jakirkham (Member Author):

Should add that ultimately my hope is this becomes a message object, which we pass between the transition functions and the communication side of the Scheduler. For now it is a somewhat arbitrary collection of things we have found need to be passed around. Now that we have identified what those things are, this should be easier to handle.
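One possible shape for the message object mentioned above, purely hypothetical (the PR itself still passes plain dicts around): a small container that bundles the per-worker and per-client message dicts and knows how to merge results from successive transitions.

```python
from dataclasses import dataclass, field

@dataclass
class TransitionMessages:
    # worker address -> list of messages for that worker
    worker_msgs: dict = field(default_factory=dict)
    # client id -> list of messages for that client
    client_msgs: dict = field(default_factory=dict)

    def merge(self, other):
        # Fold another transition's messages into this one,
        # preserving per-destination ordering.
        for addr, msgs in other.worker_msgs.items():
            self.worker_msgs.setdefault(addr, []).extend(msgs)
        for client, msgs in other.client_msgs.items():
            self.client_msgs.setdefault(client, []).extend(msgs)
```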

@mrocklin (Member):

At first glance this seems sensible to me. I would prefer that we not merge it until we get further along down this path. It seems like a decent path to go down though. (I'm thinking that we merge once we have a Python class that manages the scheduler state machine).

@jakirkham (Member Author):

Thanks Matt! That makes sense. Wanted to share my thought process before going too far down this path.

At the moment I am seeing some hangs in the test suite; not sure why yet. I'm curious: does it matter what state the Scheduler's task graph is in before it communicates with workers? Also, should we be concerned about message ordering at all?

@jakirkham (Member Author):

> Atm am seeing some hangs in the test suite. Not sure why that is yet. Am curious does it matter what state the Scheduler's task graph is in before it communicates with workers? Also should we be concerned about message ordering at all?

Figured it out. I was overlooking another call to `_add_to_memory`, so the report calls needed there were not being handled. Fixing that fixed the issue.

@jakirkham (Member Author):

Here's the call graph from the Scheduler that I see with these changes. This is best compared with the recent benchmark results here ( quasiben/dask-scheduler-performance#51 ).

From this we see that transition_memory_released also drops out (it only appeared in the call graph at this point because of communication). The only remaining relevant transitions are transition_processing_memory and transition_waiting_processing. The latter also takes significantly less time, since part of its work was communication as well (though not all of it).

Additionally, we now see communication stemming from transition, which is expected. It also takes basically the same amount of time, which is again expected: we've only shifted where the communication happens, not how much of it occurs.

(attached call graph image: prof_68726 pstat dot)

@jakirkham force-pushed the ref_trans_comm branch 3 times, most recently from 58d1175 to 5ea450f, on December 11, 2020 19:32
@jakirkham (Member Author) commented Dec 11, 2020:

Have refactored things away from passing TaskState objects to things like report_on_key after they (and the Scheduler graph as a whole) have been modified. Instead, we now construct messages for each Client the same way we do for each worker, as part of the transition. As a result, transitions now return dicts mapping addresses to their messages: one for the clients and one for the workers.
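A hedged sketch of the change described above, with illustrative names throughout (`build_client_msgs` and `who_wants` are stand-ins, not the PR's actual helpers): rather than handing a TaskState to a post-hoc reporting step, the transition itself fans one report message out into a per-client dict.

```python
def build_client_msgs(key, state, who_wants):
    # Build one report message for this key's new state, then fan it
    # out to every interested client, yielding the dict-of-addresses
    # shape that transitions now return (message ops are hypothetical).
    if state == "memory":
        report_msg = {"op": "key-in-memory", "key": key}
    else:
        report_msg = {"op": "task-state", "key": key, "state": state}
    return {client: [report_msg] for client in who_wants}
```

With this shape, the delivery side only needs an address and a message list; it never has to consult the TaskState or ClientState objects the message came from.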

@jakirkham (Member Author):

Is there anything special we should be doing with the "fire-and-forget" Client? Does it receive messages? IIUC this Client really lives only on the Scheduler (and may not even be a full Client in the same sense as a user-created one). Though I want to make sure there's nothing I'm missing here.

Review thread on distributed/scheduler.py (outdated, resolved).
Commit messages:

- Provides a way for callers to simply construct the message if they are not wanting to send it yet.
- This provides us a way to effectively call `client_releases_keys` from other transitions without starting a new transition of its own.
- Separates out the code needed to build a message for `report` based on the `TaskState` in question from the actual call to `self.report`.
- This converts a `TaskState` into a `dict` of messages with the keys being the Clients to notify and the message being the report message.
- Allows us to think of messages simply in terms of the message and where it needs to be delivered, without needing to know anything about the `TaskState` it came from or the `ClientState`s involved.
- Instead of collecting a message to pass to `report` and letting the relevant Clients be collected from the `TaskState` information later, go ahead and collect that immediately while handling that `TaskState`. These Clients then form the keys of `client_msgs`, where the message contains what was in `report_msg`. This allows us to keep all the `TaskState` work contained to where it is relevant and can be handled efficiently. Then the messaging out to Clients only needs to be concerned with the messages and where they go, without worrying about what they pertain to.
@jakirkham (Member Author):

Closing and continuing in PR ( #4365 ).

@jakirkham jakirkham closed this Dec 15, 2020
@jakirkham jakirkham deleted the ref_trans_comm branch December 15, 2020 15:35
2 participants