-
-
Notifications
You must be signed in to change notification settings - Fork 719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
transition_processing_memory
optimizations, etc.
#4487
Conversation
3a7151d
to
7df87de
Compare
transition_processing_memory
optimizations
fd0991e
to
087e0f0
Compare
transition_processing_memory
optimizationstransition_processing_memory
optimizations
Planning on merging tomorrow if no comments |
transition_processing_memory
optimizationstransition_processing_memory
optimizations, etc.
bf67cd1
to
63d109e
Compare
b7a2894
to
bc3f76a
Compare
Seeing an unrelated test failure. Am able to reproduce without these changes. Filed as issue ( #4493 ) Edit: This should be resolved now that PR ( dask/dask#7194 ) is in |
c70475b
to
df6d8a4
Compare
Looks like CI has a different issue unrelated to this PR. Fixing with PR ( #4499 ) Edit: Restarting CI now that that fix is in Edit 2: Seems like GH Actions doesn't do a merge commit when testing. So was still seeing the old failure. Went ahead and rebased. Should clear out that issue |
Avoids checking for the value and then retrieving it by simply trying to get the associated value. If it is not found, it is `None`, which is fast to check after and handle.
Allows Cython to optimize operations on this object.
This should allow faster access (particularly in Cython).
This is a holdover from before the `SchedulerState` refactor. Now this is properly typed and can be used efficiently in Cython anyways.
Avoids duplicate retrieval. Also handles the coercion to `double` once.
Avoid collecting an intermediate list of client keys and instead use them to produce messages directly.
This makes it easier to type this value and thus for Cython to optimize this value throughout.
Should knock out type checking when the function is called, which should simplify the work needed in later steps.
This allows us to only create the `set` `s` when we know we need it. Otherwise we just get the `set` previously contained in the `dict`.
Instead of creating an empty `set` when `s` is not defined, just check that `s` is non-trivial (either `None` or empty). This should be a bit faster and avoid the unnecessary creation step.
As `defaultdict`s are difficult for Cython to optimize and usually a `dict` will suffice, change `_unknown_duractions` to a `dict` to allow Cython to use Python C API specific to `dict`s.
This makes it easier to annotate these more accurately and benefit from Cython's optimizations around these `dict`-typed variables. Requires a very small amount of checking for keys and setting their values if not present. Though this remains efficient in both Python & Cython as well as compact.
These are annotated local variables, which are holdovers from before the `SchedulerState` refactor. We can now drop these and use `SchedulerState` to access these directly.
55e4b75
to
eb3e6c3
Compare
Cython will turn this into a very efficient `switch...case` statement. So this cuts down on the overhead of comparisons and checks (paying them once when computing the length). Then focuses on just checking this C typed integral value for which `case` to run.
Planning on merging tomorrow if no comments |
black distributed
/flake8 distributed
Goes through
transition_processing_memory
and optimizes it more carefully. In particular simplifies the code around starts tops, durations, and occupancies. Also types variables to get more performant Cython code. Makes use of intermediate variables to group work and allow easier reuse in later expressions. Tries to simplify data generated and collected to cutdown on general overhead associated. Additionally really cuts down on the work needed to produce messages for clients and workers.Also includes various small optimizations like using
.get(...)
, typing variables, dropping unneeded intermediate variables, constructing values once, using typed attributes, etc.