
Task outputting Torch.Tensor now errors (reports having no out attribute) #767

Open
wilke0818 opened this issue Jan 31, 2025 · 5 comments
Labels
bug Something isn't working

Comments

@wilke0818

What version of Pydra are you using?
0.25.0
Of importance (we believe): this issue comes from upgrading torch to 2.6.0

What were you trying to do?
Use Pydra with torch.Tensors in tasks (splitting list of tensors to functions that input and output individual tensors).

What did you expect will happen?
Test to pass.

What actually happened?
Test no longer passes and errors with:
AttributeError: Task 'test_task_task' has no output attribute 'out', available: '_is_param', 'all_'

Example code:

"""Tests Pydra Helping functions."""

import pydra
import torch


@pydra.mark.task
def pydra_task(test_input: torch.Tensor) -> torch.Tensor:
    """Task function for Pydra workflow to run."""
    return test_input + 2


def test_pydra() -> None:
    """Test simple tensor pydra workflow."""
    wf = pydra.Workflow(name="wf_test", input_spec=["x"])
    wf.split("x", x=[torch.tensor([[3, 4], [5, 6]]), torch.tensor([[0, 1], [1, 2]])])

    wf.add(pydra_task(name="test_task_task", test_input=wf.lzin.x))
    wf.set_output([("wf_out", wf.test_task_task.lzout.out)])

    with pydra.Submitter(plugin="serial", n_procs=1) as sub:
        sub(wf)

    results = wf.result()

    assert results[0].output.wf_out.equal(torch.tensor([[5, 6], [7, 8]]))
    assert results[1].output.wf_out.equal(torch.tensor([[2, 3], [3, 4]]))

Expected: the test passes.
Actual: the AttributeError shown above.

Note: based on #761 this code might be needed for the test to pass:

import numpy as np
from typing import Iterator

from pydra.utils.hash import Cache, bytes_repr_sequence_contents, register_serializer


@register_serializer(torch.Tensor)
def bytes_repr_arraylike(obj: torch.Tensor, cache: Cache) -> Iterator[bytes]:
    """Register a serializer for torch tensors so that Pydra can hash them."""
    yield f"{obj.__class__.__module__}{obj.__class__.__name__}:".encode()
    array = np.asanyarray(obj)
    yield f"{array.size}:".encode()
    if array.dtype == "object":
        yield from bytes_repr_sequence_contents(iter(array.ravel()), cache)
    else:
        yield array.tobytes(order="C")

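The serializer above follows Pydra's byte-stream hashing pattern: yield a type tag, then the element count, then the raw buffer bytes. For readers without torch installed, here is a torch-free sketch of the same idea using Python's stdlib `array` type in place of a tensor (the function names here are illustrative, not Pydra API):

```python
import hashlib
from array import array
from typing import Iterator


def bytes_repr_arraylike(obj: array) -> Iterator[bytes]:
    """Sketch of the hashing pattern above: type tag, element count, raw bytes."""
    yield f"{obj.__class__.__module__}{obj.__class__.__name__}:".encode()
    yield f"{len(obj)}:".encode()
    yield obj.tobytes()


def stable_hash(obj: array) -> str:
    """Fold the byte chunks into a fixed-size digest, as a content hash."""
    h = hashlib.blake2b(digest_size=16)
    for chunk in bytes_repr_arraylike(obj):
        h.update(chunk)
    return h.hexdigest()


a = array("i", [3, 4, 5, 6])
b = array("i", [3, 4, 5, 6])
print(stable_hash(a) == stable_hash(b))  # equal contents hash identically
```

Because the hash is computed from the buffer contents rather than object identity, two tensors with equal values cache to the same result, which is what Pydra needs for its task result caching.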
@wilke0818 wilke0818 added the bug Something isn't working label Jan 31, 2025
@satra
Contributor

satra commented Feb 1, 2025

@wilke0818 - the following change seems sufficient for this test to pass (together with the register_serializer).

@pydra.mark.task
@pydra.mark.annotate({"return": {"out": torch.Tensor}})
def pydra_task(test_input: torch.Tensor) -> torch.Tensor:
    """Task function for Pydra workflow to run."""
    return test_input + 2

Some more info, which seems to indicate that using torch.Tensor interacts in a specific way with output_spec generation. Perhaps @tclose or @djarecka can spell out a quick fix.

without annotate:

In [41]: foo = pydra_task()

In [42]: foo.output_spec
Out[42]: SpecInfo(name='Tensor', fields=[('_is_param', <class 'bool'>)], bases=(<class 'pydra.engine.specs.BaseSpec'>,))

with annotate:

In [46]: foo = pydra_task()

In [47]: foo.output_spec
Out[47]: SpecInfo(name='Output', fields=[('out', <class 'torch.Tensor'>)], bases=(<class 'pydra.engine.specs.BaseSpec'>,))

In comparison, using np.array seems to be fine.

In [48]: import pydra
    ...: import numpy as np
    ...: 
    ...: 
    ...: @pydra.mark.task
    ...: def pydra_task(test_input: np.array) -> np.array:
    ...:     """Task function for Pydra workflow to run."""
    ...:     return test_input + 2
    ...: 

In [49]: foo = pydra_task()

In [50]: foo.output_spec
Out[50]: SpecInfo(name='Output', fields=[('out', <built-in function array>)], bases=(<class 'pydra.engine.specs.BaseSpec'>,))

@satra
Contributor

satra commented Feb 2, 2025

This code chunk is the reason why it behaves differently for numpy and torch:

The torch.Tensor object has class-level annotations ({'_is_param': bool}) while numpy.ndarray does not (@wilke0818 - this was probably the change between torch 2.5 and 2.6). Hence the code block gets executed and we don't get the out field.
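To make that failure mode concrete, here is a minimal, self-contained sketch (not Pydra's actual code) of a spec builder that inspects `__annotations__` on the declared return type. A class carrying class-level annotations, as torch.Tensor does in 2.6, shadows the default `out` field; the stand-in classes below are hypothetical:

```python
def build_output_fields(return_type):
    """Sketch of a naive spec builder: if the declared return type carries
    class-level annotations, treat them as named output fields; otherwise
    fall back to a single 'out' field of that type."""
    annotations = getattr(return_type, "__annotations__", {})
    if annotations:
        # torch.Tensor in torch>=2.6 exposes {'_is_param': bool}, so the
        # fields become [('_is_param', bool)] and 'out' is never created
        return list(annotations.items())
    return [("out", return_type)]


class PlainArray:
    """Stands in for numpy.ndarray: no class-level annotations."""


class AnnotatedTensor:
    """Stands in for torch.Tensor in torch>=2.6."""
    _is_param: bool


print(build_output_fields(PlainArray))
print(build_output_fields(AnnotatedTensor))
```

The second call loses the `out` field entirely, which matches the reported AttributeError: the only attributes left to look up are the ones torch happened to annotate.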

@tclose - why do we assume that any object with annotations provides a meaningful signature for outputs? was there something you came across that provided this?

@tclose
Contributor

tclose commented Feb 3, 2025

Hi @satra, I don't believe I have touched that code (unless git says otherwise). However, #766 completely rewrites/replaces it. It is pretty much ready to go; I'm just working through the unittests and updating them to the new syntax.

@satra
Contributor

satra commented Feb 3, 2025

@tclose - this was the original commit: 7265a37

If you don't remember why, I can try a hack before the new syntax is merged. One of our packages ran into this issue, which is why @wilke0818 posted it. There is a workaround for the moment, but we're looking forward to the new syntax; hopefully it can be merged soon.

@tclose
Contributor

tclose commented Feb 4, 2025

No, sorry, I can't remember my thinking behind that one. It looks like I was just re-enabling something I thought would work after the refactor to use FileFormats. I am working through the unittests for my PR now and hope to have a prototype ready to check out soon, so if you have a short-term fix, that sounds like a good idea.

3 participants