bug distributed array from scalar #492

Closed
mtar opened this issue Feb 21, 2020 · 2 comments · Fixed by #507
Labels: bug (Something isn't working), redistribution (Related to distributed tensors)

Comments

@mtar
Collaborator

mtar commented Feb 21, 2020

Description
factories.array does not support 0-dim tensors if split is not None.

To Reproduce
Steps to reproduce the behavior:

  1. Which module/class/function is affected?
    factories.array / stride_tricks.sanitize_axis
  2. What are the circumstances under which the bug appears?
    Creating a split Heat array from a scalar
  3. What is the exact error-message/erroneous behaviour?
Traceback (most recent call last):
  File "pt.py", line 4, in <module>
    a = ht.array(1, split=0)
  File "heat/core/factories.py", line 323, in array
    split = sanitize_axis(obj.shape, split)
  File "heat/core/stride_tricks.py", line 109, in sanitize_axis
    raise ValueError("axis {} is out of bounds for shape {}".format(axis, shape))
ValueError: axis 0 is out of bounds for shape torch.Size([])
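
The traceback can be read as a bounds check that has no valid answer for an empty shape. The following is a minimal sketch of such a check, not Heat's actual `sanitize_axis`; the name `sanitize_axis_sketch` and the negative-axis wrapping are assumptions made for illustration.

```python
def sanitize_axis_sketch(shape, axis):
    """Hypothetical stand-in for an axis bounds check; not Heat's real code."""
    if axis is None:
        return None
    ndim = len(shape)
    # Wrap negative axes the way NumPy-style APIs usually do (assumed here).
    wrapped = axis + ndim if axis < 0 else axis
    if not 0 <= wrapped < ndim:
        raise ValueError("axis {} is out of bounds for shape {}".format(axis, shape))
    return wrapped


# A 1-D shape has an axis 0, so the check passes.
print(sanitize_axis_sketch((5,), 0))

# A 0-dim (scalar) shape has no axes, so every integer axis is rejected.
try:
    sanitize_axis_sketch((), 0)
except ValueError as err:
    print(err)
```

Under this reading, any fix has to special-case 0-dim input before the bounds check, since no integer axis can be valid for it.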

Expected behaviour
A 0-dim Heat tensor is created.

Illustrative

import heat as ht
import torch

a = ht.array(1, split=0)
b = ht.array(torch.tensor(0), split=0)

Version Info
v0.3.0

Additional comments
This bug affects issue #490

@mtar mtar added bug Something isn't working redistribution Related to distributed tensors labels Feb 21, 2020
@ClaudiaComito
Contributor

@mtar do you think this is causing #490? I don't see why the tests wouldn't fail with 2 processes already if this were the problem.

(why distribute a scalar would be my next question)

@mtar
Collaborator Author

mtar commented Mar 4, 2020

The bug happens more specifically in indexing.nonzero() lines 74-75 when lcl_nonzero has only one element:

if a.numdims == 1:
    lcl_nonzero = lcl_nonzero.squeeze()
return factories.array(lcl_nonzero, is_split=is_split, device=a.device, comm=a.comm)

In this particular case, squeeze returns a 0-dim tensor, which the subsequent factories.array call cannot handle.
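
To see why only the single-element case is affected, here is a shape-level sketch in plain Python. It assumes `lcl_nonzero` has shape `(n, 1)` before the squeeze, which is what `torch.nonzero` returns for a 1-D input with `n` nonzero entries; `squeeze_shape` is a hypothetical helper that models `squeeze()` called without a `dim` argument.

```python
def squeeze_shape(shape):
    """Shape left after dropping every dimension of size 1."""
    return tuple(n for n in shape if n != 1)


# Hypothetical numbers of local nonzero entries on different processes.
for count in (3, 2, 1):
    local_shape = (count, 1)  # assumed layout: one row per nonzero entry
    print(count, squeeze_shape(local_shape))
```

With three or two local hits the result stays 1-D, but with exactly one hit both dimensions have size 1 and the shape collapses to `()`. That would explain why the failure depends on how the data happens to be distributed across processes.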

On seven processes, you run into it (line 346).

@mtar mtar mentioned this issue Mar 16, 2020
4 tasks
@mtar mtar self-assigned this Mar 16, 2020
@ClaudiaComito ClaudiaComito mentioned this issue Apr 1, 2020
4 tasks
@mtar mtar changed the title from "bug splitted array from scalar" to "bug distributed array from scalar" Apr 2, 2020