[Datasets] LazyBlocklist split fails to split heterogeneous list #32950

Closed
jianoaix opened this issue Mar 1, 2023 · 0 comments · Fixed by #32951

jianoaix commented Mar 1, 2023

On Ray master, the following will fail:

import ray

inputs = ["example://iris.csv"] * 100
ds = ray.data.read_csv(inputs, parallelism=10)
ds.schema()
ds._plan._in_blocks.split(2)

Error message:

Traceback (most recent call last):
  File "test_split.py", line 6, in <module>
    ds._plan._in_blocks.split(2)
  File "/home/ubuntu/ray/python/ray/data/_internal/lazy_block_list.py", line 165, in split
    cached_metadata = np.array_split(self._cached_metadata, num_splits)
  File "<__array_function__ internals>", line 200, in array_split
  File "/home/ubuntu/.local/lib/python3.8/site-packages/numpy/lib/shape_base.py", line 786, in array_split
    sary = _nx.swapaxes(ary, axis, 0)
  File "<__array_function__ internals>", line 200, in swapaxes
  File "/home/ubuntu/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 594, in swapaxes
    return _wrapfunc(a, 'swapaxes', axis1, axis2)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 54, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 43, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (10,) + inhomogeneous part.
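
The traceback shows `np.array_split` failing while coercing `self._cached_metadata` into an array. One plausible reading is that the cached metadata is a list whose entries are themselves lists of different lengths, which NumPy cannot turn into a regular array. Below is a minimal sketch of a split that works on plain Python lists and never builds an ndarray. `split_list` is a hypothetical helper written for illustration, not a Ray API and not necessarily the change made in #32951; the chunk sizes follow the `np.array_split` convention (the first `len % n` chunks get one extra item).

```python
from typing import Any, List


def split_list(items: List[Any], num_splits: int) -> List[List[Any]]:
    """Split `items` into `num_splits` contiguous chunks.

    Hypothetical helper: chunk sizes follow the np.array_split convention,
    but the elements are never coerced into a NumPy array, so ragged
    (heterogeneous) entries are fine.
    """
    if num_splits <= 0:
        raise ValueError("num_splits must be positive")
    base, extra = divmod(len(items), num_splits)
    chunks = []
    start = 0
    for i in range(num_splits):
        # The first `extra` chunks take one additional element.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


# Ragged input standing in for per-read-task metadata lists (assumed shape).
ragged = [[1], [2, 3], [4, 5, 6], [7]]
halves = split_list(ragged, 2)
```

With the ragged input above, each half keeps its inner lists untouched, which is the behavior `split(2)` would need here.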

jianoaix added the bug label (Something that is supposed to be working; but isn't) on Mar 1, 2023
jianoaix self-assigned this on Mar 1, 2023