Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: op.function doesn't work with slices. #809

Merged
merged 1 commit into from
May 30, 2024

Conversation

saltball
Copy link

This PR addresses the issues encountered when defining an OP using OP.function and utilizing the slices feature, although it has not been thoroughly tested yet.

dflow version: 1.8.67.

I have noticed that functions decorated with the @OP.function decorator raising an error with slices, such as "TypeError: the type of <redacted> must be a list; got pathlib.PosixPath instead."

I have traced the issue to (notice Line515&518)

script += " input_sign = %s.get_input_sign()\n" % class_name
script += " output_sign = %s.get_output_sign()\n" % class_name
if self.slices is not None and self.slices.pool_size is not None:
script += " from dflow.python.utils import try_to_execute\n"
script += " from functools import partial\n"
script += " try_to_execute = partial(try_to_execute, "\
"op_obj=op_obj, output_sign=output_sign, cwd=os.getcwd())\n"
script += " from typing import List\n"
script += " from pathlib import Path\n"
for name in self.slices.input_artifact:
if isinstance(input_sign[name], Artifact):
if input_sign[name].type == str:
script += " input_sign['%s'].type = List[str]\n" % \
name
elif input_sign[name].type == Path:
script += " input_sign['%s'].type = List[Path]\n" %\
name
. Specifically, the code in this section has no impact on the get_input_sign of class method defined using a Python class, as it returns an actual dictionary returned by the class method, which is created when calling it. However, the behavior is different within OP.function due to some interesting things of Python functions. Methods defined within library class methods in python_op_template.py and those created within class methods defined in the __main__ scope do not behave consistently, leading to different behavior when executed within a container. If the OP is defined by OP.function and attempts to use the slices feature, the Python interpreter alters the input type of the OP, where input_sign will be changed because it is returned by get_input_sign instead of a copy.

I propose to protect it using deepcopy() in this PR, but I am not certain if there are any other implications.


P.S. for quick check the behavior in render_script:

from pathlib import Path

from dflow.python import (
    OP,
    OPIO,
    Artifact,
    OPIOSign,
)
from typing import List
from copy import deepcopy


@OP.function
def someOP(
    input_file: Artifact(Path),
) -> {"output_file": Artifact(Path)}:
    input_file = input_file['input_file']
    output_file = Path(input_file.stem + ".out")
    ...
    return OPIO({
        "output_file": output_file,
    })


input_sign = someOP.get_input_sign()
# input_sign = deepcopy(someOP.get_input_sign())
print(someOP.get_input_sign())
input_sign['input_file'].type = List[Path]
print(someOP.get_input_sign())


class someOP(OP):
    @classmethod
    def get_input_sign(cls):
        return OPIOSign({
            "input_file": Artifact(Path),
        })

    @classmethod
    def get_output_sign(cls):
        return OPIOSign({
            "output_file": Artifact(Path),
        })

    @OP.exec_sign_check
    def execute(
            self,
            ip: OPIO,
    ) -> OPIO:
        input_file = ip['input_file']
        output_file = Path(input_file.stem + ".out")
        ...
        return OPIO({
            "output_file": output_file,
        })


print('=====')
input_sign = someOP.get_input_sign()
print(someOP.get_input_sign())
input_sign['input_file'].type = List[Path]
print(someOP.get_input_sign())

expect:

{'input_file': Artifact(type=pathlib.Path, optional=False, sub_path=True)}
{'input_file': Artifact(type=pathlib.Path, optional=False, sub_path=True)}
=====
{'input_file': Artifact(type=pathlib.Path, optional=False, sub_path=True)}
{'input_file': Artifact(type=pathlib.Path, optional=False, sub_path=True)}

but:

{'input_file': Artifact(type=pathlib.Path, optional=False, sub_path=True)}
{'input_file': Artifact(type=typing.List, optional=False, sub_path=True)}
=====
{'input_file': Artifact(type=pathlib.Path, optional=False, sub_path=True)}
{'input_file': Artifact(type=pathlib.Path, optional=False, sub_path=True)}

@zjgemi
Copy link
Collaborator

zjgemi commented May 28, 2024

It's very nice to point out this bug. That's because for function OP, the OP class is defined in the closure whose get_input_sign/get_output_sign methods returns local variables directly. Modification of the returned dicts leads to change of local variables. However, I recommend to deepcopy inside the local class, so that the behavior of function OP is aligned with that of class OP.

src/dflow/python/op.py:

        class subclass(cls):
            task_kwargs = {}

            @classmethod
            def get_input_sign(cls):
                return deepcopy(input_sign)

            @classmethod
            def get_output_sign(cls):
                return deepcopy(output_sign)

@saltball
Copy link
Author

@zjgemi I agree, changes made in src/dflow/python/op.py.

@zjgemi zjgemi merged commit eab86e6 into deepmodeling:master May 30, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants