Make stealing more robust #8788

Merged: 3 commits into dask:main, Jul 23, 2024

hendrikmakait (Member)

This PR improves a few details in the stealing code that could potentially cause the scheduler to deadlock. I don't have any reproducer that could confirm this, but I've identified these weak points while investigating #8787. Fixing these should be strictly beneficial.

  • Tests added / passed
  • Passes pre-commit run --all-files

@hendrikmakait hendrikmakait requested a review from fjetter as a code owner July 22, 2024 18:09
@@ -569,7 +569,9 @@ def __hash__(self) -> int:
         return self._hash

     def __eq__(self, other: object) -> bool:
-        return isinstance(other, WorkerState) and other.server_id == self.server_id
+        return self is other or (
Member Author

This should speed up equality checks, since we should mostly (always?) be using the same instance anyway.
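A minimal sketch of the identity short-circuit described in this comment. `Worker` is a hypothetical stand-in, not the real `WorkerState`. The diff above is cut off after `self is other or (`, so the fallback comparison here reuses the `server_id` check from the removed line; that pairing is an assumption.

```python
class Worker:
    """Hypothetical stand-in for WorkerState; only server_id matters here."""

    def __init__(self, server_id: str) -> None:
        self.server_id = server_id
        self._hash = hash(server_id)

    def __hash__(self) -> int:
        return self._hash

    def __eq__(self, other: object) -> bool:
        # Fast path: the same instance is trivially equal, so the
        # isinstance check and attribute lookups are skipped.
        return self is other or (
            isinstance(other, Worker) and other.server_id == self.server_id
        )


a = Worker("worker-1")
b = Worker("worker-1")
c = Worker("worker-2")

same_instance = a == a  # identity fast path
same_id = a == b        # falls through to the server_id comparison
different = a == c      # both checks fail
```

If nearly all comparisons are between the same instance, as the comment expects, most calls would return from the `is` check without evaluating the fallback.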

Member

It should always be the same. I'd even be fine with removing __eq__ entirely, in which case == is the same as is.
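For reference, a short sketch of what dropping `__eq__` would mean in plain Python. `Plain` is a hypothetical class, not part of the codebase: a class that defines neither `__eq__` nor `__hash__` inherits the defaults from `object`, which compare and hash by identity.

```python
class Plain:
    """Hypothetical class that defines neither __eq__ nor __hash__."""

    def __init__(self, server_id: str) -> None:
        self.server_id = server_id


x = Plain("worker-1")
y = Plain("worker-1")

equal_to_itself = x == x   # same object, so equal
equal_to_twin = x == y     # distinct objects with equal fields are not equal
distinct_in_set = len({x, y})  # default __hash__ is identity-based as well
```

The trade-off is that two instances built for the same `server_id` would no longer compare equal, which only matters if such duplicates can exist.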

Member Author

> I'd even be fine with removing __eq__ entirely, in which case == is the same as is

I like the idea, but I'd leave it for a separate PR.

Contributor

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

    29 files  ±0      29 suites  ±0   11h 59m 25s ⏱️ -58s
 4 091 tests ±0   3 974 ✅ +1    112 💤 ±0   5 ❌  - 1 
55 339 runs  ±0  52 883 ✅ +1  2 438 💤 ±0  18 ❌  - 1 

For more details on these failures, see this check.

Results for commit 9981803. ± Comparison against base commit 4adf564.

             ts = self.scheduler.tasks[key]
             self.put_key_in_stealable(ts)
-        elif start == "processing":
+        if start == "processing":
Member

Please add an in-code comment here; otherwise this will get lost in a refactoring.
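A sketch of why `if` and `elif` differ here, showing control flow only. The function names and the recorded labels are hypothetical, and the real branch bodies are not fully visible in this diff. It assumes a transition can have "processing" as both its start and its finish state; whether that happens in practice is not confirmed by this thread.

```python
def branches_with_elif(start: str, finish: str) -> list[str]:
    """Old shape: the second check is skipped when the first one matches."""
    taken = []
    if finish == "processing":
        taken.append("finish-branch")
    elif start == "processing":
        taken.append("start-branch")
    return taken


def branches_with_if(start: str, finish: str) -> list[str]:
    """New shape: both checks are evaluated independently."""
    taken = []
    if finish == "processing":
        taken.append("finish-branch")
    # Deliberately `if`, not `elif`: a transition that both starts and ends
    # in "processing" must run this branch as well.
    if start == "processing":
        taken.append("start-branch")
    return taken


old = branches_with_elif("processing", "processing")
new = branches_with_if("processing", "processing")
```

For any transition where only one of the two states is "processing", both shapes take the same branch; they diverge only in the case above, which is the kind of subtlety an in-code comment would preserve.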

@hendrikmakait hendrikmakait merged commit 222e047 into dask:main Jul 23, 2024
9 of 18 checks passed
@hendrikmakait hendrikmakait deleted the improve-stealing-robustness branch July 23, 2024 10:05