Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ray fails to serialize xarray object. #748

Closed
robertnishihara opened this issue Jul 19, 2017 · 2 comments
Closed

Ray fails to serialize xarray object. #748

robertnishihara opened this issue Jul 19, 2017 · 2 comments
Labels
bug Something that is supposed to be working; but isn't

Comments

@robertnishihara
Copy link
Collaborator

The following fails (Python 3.6, Anaconda 4.3, MacOS). This requires pip install xarray.

import numpy as np
import xarray as xr

import ray

ray.init()

data = xr.DataArray(np.random.randn(2, 3), coords={'x': ['a', 'b']}, dims=('x', 'y'))
ray.put(data)

The error is the following.

---------------------------------------------------------------------------
RaySerializationException                 Traceback (most recent call last)
~/Workspace/ray/python/ray/worker.py in store_and_register(self, object_id, value, depth)
    303                 ray.numbuf.store_list(object_id.id(), self.plasma_client.conn,
--> 304                                       [value])
    305                 break

~/Workspace/ray/python/ray/serialization.py in serialize(obj)
    127                                         .format(type(obj)),
--> 128                                         obj)
    129     class_id = type_to_class_id[type(obj)]

RaySerializationException: Ray does not know how to serialize objects of type <class 'xarray.core.dataarray.DataArray'>.

During handling of the above exception, another exception occurred:

RecursionError                            Traceback (most recent call last)
<ipython-input-1-002206058974> in <module>()
      7 
      8 data = xr.DataArray(np.random.randn(2, 3), coords={'x': ['a', 'b']}, dims=('x', 'y'))
----> 9 ray.put(data)

~/Workspace/ray/python/ray/worker.py in put(value, worker)
   1687         object_id = worker.local_scheduler_client.compute_put_id(
   1688             worker.current_task_id, worker.put_index)
-> 1689         worker.put_object(object_id, value)
   1690         worker.put_index += 1
   1691         return object_id

~/Workspace/ray/python/ray/worker.py in put_object(self, object_id, value)
    348         # Serialize and put the object in the object store.
    349         try:
--> 350             self.store_and_register(object_id, value)
    351         except ray.numbuf.numbuf_plasma_object_exists_error as e:
    352             # The object already exists in the object store, so there is no

~/Workspace/ray/python/ray/worker.py in store_and_register(self, object_id, value, depth)
    306             except serialization.RaySerializationException as e:
    307                 try:
--> 308                     _register_class(type(e.example_object))
    309                     warning_message = ("WARNING: Serializing objects of type "
    310                                        "{} by expanding them as dictionaries "

~/Workspace/ray/python/ray/worker.py in _register_class(cls, pickle, worker)
   1544     if not pickle:
   1545         # Raise an exception if cls cannot be serialized efficiently by Ray.
-> 1546         serialization.check_serializable(cls)
   1547         worker.run_function_on_all_workers(register_class_for_serialization)
   1548     else:

~/Workspace/ray/python/ray/serialization.py in check_serializable(cls)
     60                                            "Ray cannot serialize it "
     61                                            "efficiently.".format(cls))
---> 62     if hasattr(obj, "__slots__"):
     63         raise RayNotDictionarySerializable("The class {} uses '__slots__', so "
     64                                            "Ray may not be able to serialize "

~/anaconda3/lib/python3.6/site-packages/xarray/core/common.py in __getattr__(self, name)
    162             # this avoids an infinite loop when pickle looks for the
    163             # __setstate__ attribute before the xarray object is initialized
--> 164             for source in self._attr_sources:
    165                 with suppress(KeyError):
    166                     return source[name]

~/anaconda3/lib/python3.6/site-packages/xarray/core/dataarray.py in _attr_sources(self)
    485     def _attr_sources(self):
    486         """List of places to look-up items for attribute-style access"""
--> 487         return [self.coords, LevelCoordinatesSource(self), self.attrs]
    488 
    489     def __contains__(self, key):

~/anaconda3/lib/python3.6/site-packages/xarray/core/dataarray.py in attrs(self)
    499     def attrs(self):
    500         """Dictionary storing arbitrary metadata with this array."""
--> 501         return self.variable.attrs
    502 
    503     @attrs.setter

~/anaconda3/lib/python3.6/site-packages/xarray/core/dataarray.py in variable(self)
    364     @property
    365     def variable(self):
--> 366         return self._variable
    367 
    368     @property

... last 4 frames repeated, from the frame below ...

~/anaconda3/lib/python3.6/site-packages/xarray/core/common.py in __getattr__(self, name)
    162             # this avoids an infinite loop when pickle looks for the
    163             # __setstate__ attribute before the xarray object is initialized
--> 164             for source in self._attr_sources:
    165                 with suppress(KeyError):
    166                     return source[name]

RecursionError: maximum recursion depth exceeded in comparison
@robertnishihara robertnishihara added the bug Something that is supposed to be working; but isn't label Jul 19, 2017
@mitar
Copy link
Member

mitar commented Oct 11, 2017

Please make sure that also xarray.Dataset can be serialized by Ray.

Ideally "A multi-dimensional, in memory, array database." would stay in memory and be shared between Ray processes through memory.

@robertnishihara
Copy link
Collaborator Author

This is fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something that is supposed to be working; but isn't
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants