Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pull recursively #59

Merged
merged 35 commits into from
Nov 2, 2023
Merged

Pull recursively #59

merged 35 commits into from
Nov 2, 2023

Conversation

liamhuber
Copy link
Member

@liamhuber liamhuber commented Oct 31, 2023

Allow pull to recursively break out of its own context and cause parent nodes to also pull their input data trees.

pull, execute, and __call__ are all just shortcuts for run with a convenient set of flags pre-set. The run command has a bunch of boolean flags to control the type of execution. To summarize these I'll just copy-paste the new docstring section:

Args:
    run_data_tree (bool): Whether to first run all upstream nodes in the data
        graph. (Default is False.)
    run_parent_trees_too (bool): Whether to recursively run the data tree in
        parent nodes (if any). (Default is False.)
    fetch_input (bool): Whether to first update inputs with the
        highest-priority connections holding data. (Default is True.)
    check_readiness (bool): Whether to raise an exception if the node is not
        `ready` to run after fetching new input. (Default is True.)
    force_local_execution (bool): Whether to ignore any executor settings and
        force the computation to run locally. (Default is False.)
    emit_ran_signal (bool): Whether to fire off all the output `ran` signal
        afterwards. (Default is True.)
    **kwargs: Keyword arguments matching input channel labels; used to update
        the input channel values before running anything.

Here __call__ is now the most aggressive thing imaginable, which invokes runs in both its own upstream data dependencies and (recursively!) parent node data dependencies. (It doesn't trigger downstream runs by emitting its ran signal though).

Note that since Workflow instances are a parent-most object, their run method cannot run upstream data trees or push downstream execution signals, so in this case pull and run operate the same (well, run has a couple of other flags you can change, but I mean with default values).

Closes #54.

TODO:

  • Extend tests
    • Pulling from inside a macro with and without the recursive call
    • Can we move tests from integration to unit? Yes
    • Test recovery (connections and labels) after failure at various points
    • Figure out how recursive input pulling needs to interact with executors and make sure it is. EDIT: For now, I opted to fail hard if you try to pull when any node in the upstream tree uses an executor. The presence of the executors indicates to me that you might be sitting around waiting for the result longer than expected; IMO it's healthier to push users towards actually putting things in a workflow and then invoking the workflow when they get to the point that they want to be using executors.
  • Update example notebook
  • Update docstring examples

Copy link

Binder 👈 Launch a binder notebook on branch pyiron/pyiron_workflow/pull_recursively

@liamhuber
Copy link
Member Author

Coveralls failure:

...
HTTP error:
---
Error: Bad Gateway (502)
...

I need to make more commits here anyway, so let's delay investigating this in case it disappears on its own.

Copy link

github-actions bot commented Nov 1, 2023

Pull Request Test Coverage Report for Build 6722182340

  • 64 of 64 (100.0%) changed or added relevant lines in 3 files are covered.
  • 25 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.003%) to 87.777%

Files with Coverage Reduction New Missed Lines %
workflow.py 3 86.05%
function.py 5 90.7%
node.py 17 88.57%
Totals Coverage Status
Change from base Build 6711058205: 0.003%
Covered Lines: 3253
Relevant Lines: 3706

💛 - Coveralls

Copy link

codacy-production bot commented Nov 1, 2023

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation Diff coverage
-0.04% (target: -1.00%) 90.16%
Coverage variation details
Coverable lines Covered lines Coverage
Common ancestor commit (5bba7fa) 1843 1514 82.15%
Head commit (ba3e970) 1872 (+29) 1537 (+23) 82.11% (-0.04%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
Coverable lines Covered lines Diff coverage
Pull request (#59) 61 55 90.16%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

See your quality gate settings    Change summary preferences

Should have been separate commits for cyclic data in parent context and when an upstream node fails, but we'll live.
For pulling when all the nodes are parentless
They are duplicated in integration tests. In particular, we check pulling from inside a macro in the macro tests
Since it can't have one and its parent is always none. This prevents an error when a workflow sits at the top of a recursive data running call.
The presence of the executors indicates to me that you might be sitting around waiting for the result longer than expected; IMO it's healthier to push users towards actually putting things in a workflow and then invoking the workflow when they get to the point that they want to be using executors.
@liamhuber liamhuber added the format_black trigger the Black formatting bot label Nov 1, 2023
@liamhuber liamhuber removed the format_black trigger the Black formatting bot label Nov 1, 2023
@liamhuber liamhuber marked this pull request as ready for review November 1, 2023 22:13
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@liamhuber liamhuber merged commit e6cd644 into main Nov 2, 2023
14 of 15 checks passed
@liamhuber liamhuber deleted the pull_recursively branch November 2, 2023 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pull mode should be allowed to (always?) pull parent macro input first
2 participants