Give some helpful info on a suspected serialization error #1380

Merged 1 commit on Dec 12, 2023

chris-janidlo
Contributor

Description

Motivated by some requests in #help and by the fact that diagnosing and fixing serialization-related errors is currently a rather arcane task.

[sc-28690]

Type of change

  • New feature (non-breaking change that adds functionality)

@chris-janidlo added the labels no-news-is-good-news ("This change does not require a news file") and quick-review ("Review of this should be quick and easy") on Dec 7, 2023

@chris-janidlo force-pushed the serde-error-message-help-link branch from 5501c5b to e949b0c on December 7, 2023 17:29
@chris-janidlo
Contributor Author

With this change, a suspected serialization error looks like this (output from a Jupyter notebook):

---------------------------------------------------------------------------
TaskExecutionFailed                       Traceback (most recent call last)
/Users/chris/ws/compute/scratch/playground.ipynb Cell 5 line 2
      1 with Executor(ep_uuid, funcx_client=gcc) as gcx:
----> 2     res = gcx.submit(hello_other_file).result()
      4 res

File ~/opt/anaconda3/envs/compute/lib/python3.10/concurrent/futures/_base.py:458, in Future.result(self, timeout)
    456     raise CancelledError()
    457 elif self._state == FINISHED:
--> 458     return self.__get_result()
    459 else:
    460     raise TimeoutError()

File ~/opt/anaconda3/envs/compute/lib/python3.10/concurrent/futures/_base.py:403, in Future.__get_result(self)
    401 if self._exception:
    402     try:
--> 403         raise self._exception
    404     finally:
    405         # Break a reference cycle with the exception in self._exception
    406         self = None

TaskExecutionFailed: 
 Traceback (most recent call last):
   File "/Users/chris/ws/compute/funcx/compute_sdk/globus_compute_sdk/serialize/facade.py", line 122, in unpack_and_deserialize
     deserialized = self.deserialize(current)
   File "/Users/chris/ws/compute/funcx/compute_sdk/globus_compute_sdk/serialize/facade.py", line 79, in deserialize
     return strategy.deserialize(payload)
   File "/Users/chris/ws/compute/funcx/compute_sdk/globus_compute_sdk/serialize/concretes.py", line 140, in deserialize
     function = dill.loads(codecs.decode(chomped.encode(), "base64"))
   File "/Users/chris/opt/anaconda3/envs/compute/lib/python3.10/site-packages/dill/_dill.py", line 387, in loads
     return load(file, ignore, **kwds)
   File "/Users/chris/opt/anaconda3/envs/compute/lib/python3.10/site-packages/dill/_dill.py", line 373, in load
     return Unpickler(file, ignore=ignore, **kwds).load()
   File "/Users/chris/opt/anaconda3/envs/compute/lib/python3.10/site-packages/dill/_dill.py", line 646, in load
     obj = StockUnpickler.load(self)
   File "/Users/chris/opt/anaconda3/envs/compute/lib/python3.10/site-packages/dill/_dill.py", line 636, in find_class
     return StockUnpickler.find_class(self, module, name)
 ModuleNotFoundError: No module named 'script'


This appears to be an error with serialization. If it is, using a different
serialization strategy from globus_compute_sdk.serialize might resolve the issue. For
an example, to use globus_compute_sdk.serialize.CombinedCode:

    from globus_compute_sdk import Client, Executor
    from globus_compute_sdk.serialize import CombinedCode

    gcc = Client(code_serialization_strategy=CombinedCode())
    with Executor('<your-endpoint-id>', client=gcc) as gcx:
        # do something with gcx

For more information, see:
    https://globus-compute.readthedocs.io/en/latest/sdk.html#specifying-a-serialization-strategy

(The error above occurs when the submitted function is imported from another file, here a local module named script that the endpoint cannot import.)
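
For anyone wondering why an imported function triggers this, here is a minimal sketch that reproduces the same kind of ModuleNotFoundError locally. It uses the standard library's pickle rather than dill or the SDK, on the assumption that the mechanism is the same as in the traceback (the failure there surfaces in the stock unpickler's find_class). The module name other_file_demo and the helper pickle_then_load_without_module are hypothetical, made up for this illustration.

```python
import importlib
import os
import pickle
import sys
import tempfile


def pickle_then_load_without_module() -> str:
    """Pickle a function imported from a throwaway module, then try to load it
    once that module is no longer importable. Returns the resulting error text."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for the "other file" the function lives in.
        with open(os.path.join(tmp, "other_file_demo.py"), "w") as f:
            f.write("def hello_other_file():\n    return 'hello'\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module("other_file_demo")
            # pickle stores an imported function by reference: only the module
            # name and the function name go into the payload, not the code.
            payload = pickle.dumps(module.hello_other_file)
        finally:
            # Make the module unimportable again, like on a remote worker.
            sys.path.remove(tmp)
            sys.modules.pop("other_file_demo", None)

    importlib.invalidate_caches()
    try:
        # Loading has to re-import the module by name, which now fails.
        pickle.loads(payload)
    except ModuleNotFoundError as exc:
        return str(exc)
    return "loaded without error"


if __name__ == "__main__":
    print(pickle_then_load_without_module())
```

The payload only names the module, so whichever process deserializes it must be able to import that module. A strategy that ships the function's code instead of a module reference would not have that requirement, which is presumably why the new message suggests trying a different serialization strategy.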

@chris-janidlo force-pushed the serde-error-message-help-link branch from e949b0c to bf7b9ec on December 7, 2023 17:50
@chris-janidlo force-pushed the serde-error-message-help-link branch 3 times, most recently from cd15b8b to 3b35a3b, on December 11, 2023 19:41
@khk-globus force-pushed the serde-error-message-help-link branch from 3b35a3b to ef3659a on December 12, 2023 16:43
@khk-globus
Contributor

Rebased on main per Slack interaction.

@chris-janidlo merged commit 63b009c into main on Dec 12, 2023
32 checks passed
@chris-janidlo deleted the serde-error-message-help-link branch on December 12, 2023 17:11