Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CancelledError with task function holding a reference to a Future #7746

Open
gjoseph92 opened this issue Apr 4, 2023 · 0 comments
Open

CancelledError with task function holding a reference to a Future #7746

gjoseph92 opened this issue Apr 4, 2023 · 0 comments
Labels
bug Something is broken regression

Comments

@gjoseph92
Copy link
Collaborator

On 2023.3.2, when a task function has a closure over a Future object, getting the Future's result intermittently raises a CancelledError when the task runs on a worker. On 2023.3.1 and earlier, this works fine. git bisect points to #7580 as the change at which this stopped working.

Minimal reproducer:

import time
import distributed


if __name__ == "__main__":
    client = distributed.Client(n_workers=1, processes=True, threads_per_worker=2)

    future_one = client.submit(lambda: 1, key="one")

    def add_one(y):
        time.sleep(0.01)
        return future_one.result() + y

    fs = client.map(add_one, list(range(5)))
    print(client.gather(fs, errors="skip"))
    fs[0].result()

Prior to #7580:

[1, 2, 3, 4, 5]

After:

2023-04-04 17:29:51,136 - distributed.worker - WARNING - Compute Failed
Key:       add_one-df166ab4e839d113ebae52aeb6483e40
Function:  add_one
args:      (0)
kwargs:    {}
Exception: "CancelledError('one')"

2023-04-04 17:29:51,139 - distributed.worker - WARNING - Compute Failed
Key:       add_one-c158b0793ccb125702dba6eeecedfe53
Function:  add_one
args:      (1)
kwargs:    {}
Exception: "CancelledError('one')"

2023-04-04 17:29:51,154 - distributed.worker - WARNING - Compute Failed
Key:       add_one-e6b80fa13e4911d5c14b080a00646fc7
Function:  add_one
args:      (2)
kwargs:    {}
Exception: "CancelledError('one')"

2023-04-04 17:29:51,158 - distributed.worker - WARNING - Compute Failed
Key:       add_one-ad8696ffcb9f00999a4c42e507937e32
Function:  add_one
args:      (3)
kwargs:    {}
Exception: "CancelledError('one')"

2023-04-04 17:29:51,172 - distributed.worker - WARNING - Compute Failed
Key:       add_one-b9408ea1fb0e19cd85955a7c706556c1
Function:  add_one
args:      (4)
kwargs:    {}
Exception: "CancelledError('one')"

[]
Traceback (most recent call last):
  File "/Users/gabe/dev/dask-playground/test-wrapper/minimal.py", line 17, in <module>
    fs[0].result()
  File "/Users/gabe/dev/distributed/distributed/client.py", line 314, in result
    raise exc.with_traceback(tb)
  File "/Users/gabe/dev/dask-playground/test-wrapper/minimal.py", line 12, in add_one
    return future_one.result() + y
  File "/Users/gabe/dev/distributed/distributed/client.py", line 311, in result
    result = self.client.sync(self._result, callback_timeout=timeout, raiseit=False)
  File "/Users/gabe/dev/distributed/distributed/utils.py", line 349, in sync
    return sync(
  File "/Users/gabe/dev/distributed/distributed/utils.py", line 416, in sync
    raise exc.with_traceback(tb)
  File "/Users/gabe/dev/distributed/distributed/utils.py", line 389, in f
    result = yield future
  File "/Users/gabe/dev/dask-playground/env/lib/python3.9/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/Users/gabe/dev/distributed/distributed/client.py", line 336, in _result
    result = await self.client._gather([self])
  File "/Users/gabe/dev/distributed/distributed/client.py", line 2209, in _gather
    raise exc
concurrent.futures._base.CancelledError: one

Interestingly, with threads_per_worker=1, the problem doesn't happen. Clearly, it has something to do with the tasks running in parallel—either the deserialization of the task function, or the execution.

The traceback points to here:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is broken regression
Projects
None yet
Development

No branches or pull requests

1 participant