-
-
Notifications
You must be signed in to change notification settings - Fork 719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
scheduler.py / worker.py code cleanup #4626
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @crusaderky -- I see this is marked as a draft PR so just let me know when you'd like a review
@jrbourbeau ready |
cc'ing @jakirkham , just because this seems like the kind of thing that might interest him |
distributed/scheduler.py
Outdated
if address not in parent._workers: | ||
try: | ||
ws: WorkerState = parent._workers[address] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would actually suggest using .get
and checking for None
instead. This has worked well in other places we have done this and avoids some overhead from Cython code generated around exception handling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran a benchmark and I can't see the issue?
Python 3.8, Cython 0.29.21, Linux x64
N = 10000
def bench_get_miss():
d = {}
for i in range(N):
x = d.get(i)
if x is None:
x = {}
d[i] = x
def bench_get_hit():
d = {1: {}}
for _ in range(N):
x = d.get(1)
if x is None:
x = {}
d[1] = x
def bench_setdefault_miss():
d = {}
for i in range(N):
x = d.setdefault(i, {})
def bench_setdefault_hit():
d = {1: {}}
for _ in range(N):
x = d.setdefault(1, {})
Pure python perfomance:
%timeit bench_get_miss()
766 µs ± 1.89 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit bench_get_hit()
384 µs ± 174 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit bench_setdefault_miss()
534 µs ± 269 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit bench_setdefault_hit()
431 µs ± 9.68 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Cythonized:
%timeit bench_get_miss()
524 µs ± 4.38 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit bench_get_hit()
139 µs ± 791 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit bench_setdefault_miss()
366 µs ± 1.03 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit bench_setdefault_hit()
193 µs ± 1.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakirkham any feedback on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those benchmarks don't seem to be comparing to try...except
, which is what is used here and is what is being suggested to be changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For more context, please see the following
In [1]: %load_ext Cython
In [2]: %%cython -3
...:
...: from cython import ccall
...:
...:
...: @ccall
...: def get_safe1(d: dict, k):
...: return d.get(k)
...:
...:
...: @ccall
...: def get_safe2(d: dict, k):
...: try:
...: return d[k]
...: except KeyError:
...: return None
...:
In [3]: d = {"a": 1}
In [4]: %time get_safe1(d, "a")
CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 6.2 µs
Out[4]: 1
In [5]: %time get_safe2(d, "a")
CPU times: user 3 µs, sys: 0 ns, total: 3 µs
Wall time: 6.2 µs
Out[5]: 1
In [6]: %time get_safe1(d, "b")
CPU times: user 4 µs, sys: 0 ns, total: 4 µs
Wall time: 6.2 µs
In [7]: %time get_safe2(d, "b")
CPU times: user 6 µs, sys: 0 ns, total: 6 µs
Wall time: 9.06 µs
IOW it is the try...except...
itself that adds overhead particularly when the key is missed. This can be avoided by just using .get
instead.
Additionally the is None
check that would follow is a simple identity check in Python and a pointer comparison in Cython. So is very fast
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I misread your comment I thought you were talking about line 3629 et al.
New benchmark shows that effectively try is slower than in and get on a miss, but it's not a cython-specific issue:
N = 100000
def bench_in_miss():
d = {}
x = 1
for _ in range(N):
if x not in d:
pass
else:
y = d[x]
def bench_in_hit():
d = {1: 2}
x = 1
for _ in range(N):
if x not in d:
pass
else:
y = d[x]
def bench_try_miss():
d = {}
x = 1
for _ in range(N):
try:
y = d[x]
except KeyError:
pass
def bench_try_hit():
d = {1: 2}
x = 1
for _ in range(N):
try:
y = d[x]
except KeyError:
pass
def bench_get_miss():
d = {}
x = 1
for _ in range(N):
y = d.get(x)
if y is None:
pass
def bench_get_hit():
d = {1: 2}
x = 1
for _ in range(N):
y = d.get(x)
if y is None:
pass
Pure Python
In [3]: %time bench_in_miss()
CPU times: user 4.37 ms, sys: 14 µs, total: 4.38 ms
Wall time: 4.38 ms
In [4]: %time bench_in_hit()
CPU times: user 6.94 ms, sys: 0 ns, total: 6.94 ms
Wall time: 6.92 ms
In [5]: %time bench_try_miss()
CPU times: user 18.1 ms, sys: 0 ns, total: 18.1 ms
Wall time: 18.1 ms
In [6]: %time bench_try_hit()
CPU times: user 5.24 ms, sys: 0 ns, total: 5.24 ms
Wall time: 5.21 ms
In [7]: %time bench_get_miss()
CPU times: user 8.11 ms, sys: 0 ns, total: 8.11 ms
Wall time: 8.09 ms
In [8]: %time bench_get_hit()
CPU times: user 7.79 ms, sys: 0 ns, total: 7.79 ms
Wall time: 7.77 ms
Cython
In [2]: %time bench_in_miss()
CPU times: user 4.22 ms, sys: 22 µs, total: 4.24 ms
Wall time: 4.25 ms
In [3]: %time bench_in_hit()
CPU times: user 4.23 ms, sys: 0 ns, total: 4.23 ms
Wall time: 4.23 ms
In [4]: %time bench_try_miss()
CPU times: user 9.27 ms, sys: 0 ns, total: 9.27 ms
Wall time: 9.33 ms
In [5]: %time bench_try_hit()
CPU times: user 5.77 ms, sys: 0 ns, total: 5.77 ms
Wall time: 5.83 ms
In [6]: %time bench_get_miss()
CPU times: user 3.31 ms, sys: 0 ns, total: 3.31 ms
Wall time: 3.31 ms
In [7]: %time bench_get_hit()
CPU times: user 4.4 ms, sys: 25 µs, total: 4.42 ms
Wall time: 4.43 ms
I'm reverting to "in" as it is the fastest all around.
No deep thoughts here. One suggestion above |
@jrbourbeau ready for merge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @crusaderky! I left a couple of comments -- apologies for not asking them earlier
|
||
worker_keys = {ws._address: ws.identity() for ws in workers} | ||
if close_workers and worker_keys: | ||
if close_workers: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we dropped worker_keys
here -- is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if worker_keys is empty, you'll end up with await asyncio.gather(*[])
which is a noop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which btw is exactly what already happened on the immediately following block if remove:...
@@ -5687,7 +5676,7 @@ async def retire_workers( | |||
workers: list (optional) | |||
List of worker addresses to retire. | |||
If not provided we call ``workers_to_close`` which finds a good set | |||
workers_names: list (optional) | |||
names: list (optional) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch
*, | ||
address, | ||
resolve_address: bool = True, | ||
now: float = None, | ||
resources: dict = None, | ||
host_info: dict = None, | ||
metrics: dict, | ||
executing: dict = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I find that address
and metrics
dropping default values here to be more confusing than what we had before. What does it mean for these parameters to not have a default but also be after the *
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It means that they are mandatory keyword arguments.
Nothing in the code caters of address=None - if it arrives, it's not going to be dealt with correctly.
There was an explicit assert metrics
which prevented you from omitting the parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, thanks for clarifying -- this was a bit jarring to see for the first time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @crusaderky, will merge as soon as CI finishes
General-purpose minor code cleanup for the benefit of readability + some very minor performance tweaks. There are no functional changes.