Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(bindings): Task arguments default value binding #2401

Conversation

MortalHappiness
Copy link
Member

@MortalHappiness MortalHappiness commented May 9, 2024

Tracking issue

Resolves: flyteorg/flyte#5321

Why are the changes needed?

See the issue description for details.

What changes were proposed in this pull request?

if the key is not in kwargs but in interface.inputs_with_defaults, add the value in interface.inputs_with_defaults to kwargs.

How was this patch tested?

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

@MortalHappiness MortalHappiness marked this pull request as draft May 9, 2024 18:04
@MortalHappiness MortalHappiness force-pushed the feature/#5321-task-default-value-binding branch from b434ddf to cc13415 Compare May 9, 2024 18:45
@MortalHappiness MortalHappiness marked this pull request as ready for review May 9, 2024 18:46
@MortalHappiness MortalHappiness marked this pull request as draft May 9, 2024 18:50
@MortalHappiness MortalHappiness marked this pull request as ready for review May 10, 2024 04:51
Copy link
Member

@pingsutw pingsutw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should we remove these lines? since if statement will always be false.

tests/flytekit/unit/core/test_promise.py Outdated Show resolved Hide resolved
@MortalHappiness MortalHappiness force-pushed the feature/#5321-task-default-value-binding branch from cc13415 to 277d88b Compare May 15, 2024 12:49
@MortalHappiness
Copy link
Member Author

Also, should we remove these lines? since if statement will always be false.

Done

@MortalHappiness MortalHappiness force-pushed the feature/#5321-task-default-value-binding branch 2 times, most recently from 51e15e5 to 6446ad1 Compare May 16, 2024 15:13
@wild-endeavor
Copy link
Contributor

could you add more tests please? take a look at serialization tests, it should be pretty simple to test the compiled workflow.

do we do anything special for [] and {}? You're not supposed to use those as function defaults anyways but some people might.

@MortalHappiness MortalHappiness force-pushed the feature/#5321-task-default-value-binding branch from 6446ad1 to 6c808ae Compare May 17, 2024 13:24
@MortalHappiness
Copy link
Member Author

could you add more tests please? take a look at serialization tests, it should be pretty simple to test the compiled workflow.

do we do anything special for [] and {}? You're not supposed to use those as function defaults anyways but some people might.

Done

@MortalHappiness MortalHappiness force-pushed the feature/#5321-task-default-value-binding branch 7 times, most recently from 545adbc to d9a7345 Compare May 26, 2024 09:33
Copy link

codecov bot commented May 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.56%. Comparing base (57ee143) to head (2de588f).
Report is 1 commits behind head on master.

Current head 2de588f differs from pull request most recent head b92519a

Please upload reports for the commit b92519a to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2401      +/-   ##
==========================================
+ Coverage   72.28%   78.56%   +6.27%     
==========================================
  Files         182      185       +3     
  Lines       18491    18603     +112     
  Branches     3832     3604     -228     
==========================================
+ Hits        13367    14616    +1249     
+ Misses       4488     3348    -1140     
- Partials      636      639       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@wild-endeavor wild-endeavor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! just minor comments. can you take a look at the failing unit test also? for some performance reason, we split out tests that rely on pandas into a separate gh action. Take a look at

if "pandas" not in sys.modules:

@@ -84,6 +122,287 @@ def raw_container_wf(val1: int, val2: int) -> int:
assert sum_spec.template.container.image == "alpine"


@pytest.mark.parametrize(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know some people promote these, but once you get past a few args, I find these parametrized tests super hard to follow. Could you maybe split up the test below into multiple tests? That would reduce the need to have so many args - both help make it more readable, and separate testing concerns better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

flytekit/core/promise.py Show resolved Hide resolved
flytekit/core/promise.py Outdated Show resolved Hide resolved
flytekit/core/promise.py Show resolved Hide resolved
tests/flytekit/unit/core/test_dynamic.py Show resolved Hide resolved
flytekit/core/promise.py Outdated Show resolved Hide resolved
@MortalHappiness MortalHappiness marked this pull request as draft May 29, 2024 05:37
@MortalHappiness MortalHappiness force-pushed the feature/#5321-task-default-value-binding branch from d9a7345 to c90ce87 Compare May 29, 2024 11:37
@MortalHappiness MortalHappiness marked this pull request as ready for review May 29, 2024 11:38
@MortalHappiness MortalHappiness force-pushed the feature/#5321-task-default-value-binding branch from c90ce87 to 2de588f Compare May 29, 2024 12:33
wild-endeavor
wild-endeavor previously approved these changes May 29, 2024
Copy link
Contributor

@wild-endeavor wild-endeavor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still need to look at monodocs. i think we can do that later today.

@wild-endeavor
Copy link
Contributor

eduardo fixed the build separately https://github.com/flyteorg/flyte/pull/5425/files can merge this after passing

wf_no_input_spec = get_serializable(OrderedDict(), serialization_settings, wf_no_input)
wf_with_input_spec = get_serializable(OrderedDict(), serialization_settings, wf_with_input)

assert wf_no_input_spec.template.nodes[0].inputs[0].binding.value == Scalar(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually sorry, i think we need to think about this case a bit more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean I need to bind this to Union instead of Primitive?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exactly

Copy link
Member Author

@MortalHappiness MortalHappiness Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. See the latest commit for details. Here I added a bool argument is_union_type_variants_expanded. And if the default argument is used, then is_union_type_variants_expanded will be set to False, which will bypass the for loop expansion for union types here

elif t_value is not None and is_union_type_variants_expanded and expected_literal_type.union_type is not None:
for i in range(len(expected_literal_type.union_type.variants)):
try:
lt_type = expected_literal_type.union_type.variants[i]
python_type = get_args(t_value_type)[i] if t_value_type else None
return binding_data_from_python_std(ctx, lt_type, t_value, python_type, nodes)
except Exception:
logger.debug(
f"failed to bind data {t_value} with literal type {expected_literal_type.union_type.variants[i]}."
)
raise AssertionError(
f"Failed to bind data {t_value} with literal type {expected_literal_type.union_type.variants}."
)

Also, I added two tests for testing Optional[List[int]].

@wild-endeavor
Copy link
Contributor

sorry, missed something, let me get back to you later.

@@ -684,6 +690,7 @@ def binding_data_from_python_std(
t_value: Any,
t_value_type: type,
nodes: List[Node],
is_union_type_variants_expanded: bool = True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we remove this now with the other change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@MortalHappiness MortalHappiness force-pushed the feature/#5321-task-default-value-binding branch from c9cb9df to b92519a Compare June 6, 2024 04:17
@wild-endeavor wild-endeavor merged commit 47f2a29 into flyteorg:master Jun 6, 2024
46 checks passed
@eapolinario
Copy link
Collaborator

This is pretty awesome! Thank you for all the hard work here, @MortalHappiness and @wild-endeavor !

bgedik pushed a commit to bgedik/flytekit that referenced this pull request Jul 3, 2024
flyteorg/flyte#5321

if the key is not in `kwargs` but in `interface.inputs_with_defaults`, add the value in `interface.inputs_with_defaults` to `kwargs`.

Signed-off-by: Chi-Sheng Liu <[email protected]>
Signed-off-by: bugra.gedik <[email protected]>
fiedlerNr9 pushed a commit that referenced this pull request Jul 25, 2024
flyteorg/flyte#5321

if the key is not in `kwargs` but in `interface.inputs_with_defaults`, add the value in `interface.inputs_with_defaults` to `kwargs`.

Signed-off-by: Chi-Sheng Liu <[email protected]>
Signed-off-by: Jan Fiedler <[email protected]>
Comment on lines +1091 to +1092
if not isinstance(default_val, Hashable):
raise _user_exceptions.FlyteAssertion("Cannot use non-hashable object as default argument")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this extra-check? Any value that can't be handled by the existing type transformers at registration time is going to show pretty prominently, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used to prevent the confusing behavior of mutable default arguments.

https://docs.python-guide.org/writing/gotchas/#mutable-default-arguments

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relying on Hashable prevents dataclasses from being used as default arguments. This is too restrictive, since even sophisticated linters like ruff are not able to catch this example:

from dataclasses import dataclass
from mashumaro import DataClassDictMixin

@dataclass
class DC(DataClassDictMixin):
    a: int
    b: str

def f(dc: DC = DC(42, "43"), xs: list[int] = [1,2], d: dict[str, int] = {"abc": 42}, s: set(int) = set([1,2,3])):
    ...

if __name__ == "__main__":
    f()

Running ruff with https://docs.astral.sh/ruff/rules/mutable-argument-default/ turned on returns:

❯ ruff check example
example/dc.py:9:46: B006 Do not use mutable data structures for argument defaults
   |
 7 |     b: str
 8 |
 9 | def f(dc: DC = DC(42, "43"), xs: list[int] = [1,2], d: dict[str, int] = {"abc": 42}, s: set(int) = set([1,2,3])):
   |                                              ^^^^^ B006
10 |     ...
   |
   = help: Replace with `None`; initialize within function

example/dc.py:9:73: B006 Do not use mutable data structures for argument defaults
   |
 7 |     b: str
 8 |
 9 | def f(dc: DC = DC(42, "43"), xs: list[int] = [1,2], d: dict[str, int] = {"abc": 42}, s: set(int) = set([1,2,3])):
   |                                                                         ^^^^^^^^^^^ B006
10 |     ...
   |
   = help: Replace with `None`; initialize within function

example/dc.py:9:100: B006 Do not use mutable data structures for argument defaults
   |
 7 |     b: str
 8 |
 9 | def f(dc: DC = DC(42, "43"), xs: list[int] = [1,2], d: dict[str, int] = {"abc": 42}, s: set(int) = set([1,2,3])):
   |                                                                                                    ^^^^^^^^^^^^ B006
10 |     ...
   |
   = help: Replace with `None`; initialize within function

Found 3 errors.
No fixes available (3 hidden fixes can be enabled with the `--unsafe-fixes` option).

I propose we handle these 3 common errors explicitly instead. This is also going to simplify the error message. I'll get a PR out soon.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: #2651

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eapolinario I saw your PR. Quite make senses for me. Those 3 container types are indeed the most commonly used non-hashable types.

Copy link
Member Author

@MortalHappiness MortalHappiness Aug 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about collections.deque and collections.ChainMap?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Core feature] Flytekit create automatic bindings for Task default values
4 participants