Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure shuffle split operations are blacklisted from work stealing #4964

Merged
merged 1 commit into from
Jun 29, 2021

Conversation

fjetter
Copy link
Member

@fjetter fjetter commented Jun 23, 2021

If shuffle split tasks are not blacklisted from work stealing, this can have
catastrophic effects on performance.
See also #4962

If shuffle split tasks are not blacklisted from work stealing, this can have
catastrophic effects on performance.
See also dask#4962
@mrocklin
Copy link
Member

Hrm, I wonder if we should just blacklist any task that runs very quickly 🤔

😄

@jrbourbeau
Copy link
Member

cc @madsbk

@fjetter
Copy link
Member Author

fjetter commented Jun 23, 2021

Hrm, I wonder if we should just blacklist any task that runs very quickly 🤔

How quickly should it be? I've seen splits take longer than 5ms. I feel it is incredibly difficult to put a sane number on this but maybe you are right and something like LATENCY * 2 would do the trick? idk I can check. but this PR at least restores the behaviour we originally intended

@mrocklin
Copy link
Member

How quickly should it be? I've seen splits take longer than 5ms. I feel it is incredibly difficult to put a sane number on this but maybe you are right and something like LATENCY * 2 would do the trick? idk I can check. but this PR at least restores the behaviour we originally intended

I was joking. You removed this behavior in the work stealing PR, right?

@mrocklin
Copy link
Member

Does this fix #4962 ?

@mrocklin
Copy link
Member

Should this be merged?

@mrocklin
Copy link
Member

Does this fix #4962 ?

Ah, no. We also need to handle decide_worker

@fjetter
Copy link
Member Author

fjetter commented Jun 29, 2021

I was joking. You removed this behavior in the work stealing PR, right?

yes but that's only a draft and I want to reiterate what's going on after the sizes are properly measured.

Does this fix #4962 ?

Not yet,

We'll need

Should this be merged?

I believe it will improve performance and stability. I would like to (eventually) get the system into a state where we don't need this but I believe we're not there, yet.

@fjetter
Copy link
Member Author

fjetter commented Jun 29, 2021

See also dask/dask#7844 for summary and cause of the regression

@fjetter
Copy link
Member Author

fjetter commented Jun 29, 2021

All tests green? That's a sign! (I swear, I didn't cheat)

@mrocklin mrocklin merged commit d419e41 into dask:main Jun 29, 2021
@fjetter fjetter deleted the work_stealing_shuffle_split branch June 29, 2021 14:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants