JNmpi lammps nodes patch version ~0.7.0~ 0.7.1 (#315)

* Make node_function and type hints available at the _class level_

And fix one error type on the storage tests.

* Pull the same trick with Macro

But use an actual intermediate class instead of a function, so that we retain some of the `Composite` class methods and the IDE doesn't complain that we're capitalizing the name

* Make the Function misdirection a class too

* Refactor: Pull out a HasCreator interface

So that `Composite` and the not-actually-a-`type[Node]` `Macro` can avoid duplicating code

* Update docstrings

* Format black

* Codacy: don't import unused wraps

* Make many AbstractFunction methods classmethods

* Make output channel info available as a Function class attribute

So channel labels and type hints (and later any ontological hints) can be viewed without ever instantiating a node. That means `output_labels` is no longer available at the class instance level, but only when defining a new class!

* Make input channel info available as a Function class attribute

This was much easier because we never need to inspect the source code manually (as we do sometimes to check return values).

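A rough illustration of what this enables, spelled with the decorator and preview names the API settles on later in this PR, so treat the exact method names as approximate:

```python
from pyiron_workflow import Workflow

@Workflow.wrap.as_function_node("sum")
def AddTogether(x: int, y: int = 2):
    return x + y

# Channel labels, type hints, and defaults are class-level information now --
# no instance is ever created:
print(AddTogether.preview_inputs())   # e.g. {'x': (int, NOT_DATA), 'y': (int, 2)}
print(AddTogether.preview_outputs())  # e.g. {'sum': int}
```
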
* Add/extend public method docstrings

* Refactor: slide

* Remove unused private attribute

* Fix docstrings

* Format black

* Remove unused imports

Just some housekeeping in the files we're touching anyhow

* Fix docstring

* Make macro output labels a class attribute

* Format black

* Bump typeguard from 4.1.5 to 4.2.1 (#262)

* Bump typeguard from 4.1.5 to 4.2.1

Bumps [typeguard](https://github.com/agronholm/typeguard) from 4.1.5 to 4.2.1.
- [Release notes](https://github.com/agronholm/typeguard/releases)
- [Changelog](https://github.com/agronholm/typeguard/blob/master/docs/versionhistory.rst)
- [Commits](agronholm/typeguard@4.1.5...4.2.1)

---
updated-dependencies:
- dependency-name: typeguard
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* [dependabot skip] Update environment

* [dependabot skip] Update env file

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: pyiron-runner <[email protected]>

* Revert warning and test for the graphviz call+windows combination (#275)

Whatever was broken in the upstream dependency or github image seems to be fixed.

* [patch] Update dependencies (#274)

* Update dependencies

* [dependabot skip] Update env file

* Fix browser version typo

* [dependabot skip] Update env file

---------

Co-authored-by: pyiron-runner <[email protected]>

* Revert save overload (#234)

* Prepend instead of appending new connections

Fetch looks for the first available data in connection lists, so new connections are prepended at formation time, giving them higher priority.

* Add RandomFloat to the standard library

* Use the standard library for the for- and while-loop docs examples

* Fix type hint in while loop

Now that `Function` is not a node and we automate the functionality of `SingleValueNode`

* Don't use maps in the README example

* Disallow IO maps on `AbstractMacro`

I tried to break this down a reasonable amount, but I also want the tests passing at each commit. In principle, I could duplicate code instead of simply moving it to break this apart a bit further, but in practice I'm not expecting anyone to actually review this PR so that seems like needless effort. Here, I:

* Move `inputs/outputs_map` from `Composite` down to `Workflow`
* Disable automatic IO generation on `AbstractMacro` -- you _need_ to define it by labels and the graph creator signature/returns
* Add IO previewing and class-level stuff analogous to `AbstractFunction` onto `AbstractMacro` now that its IO can't be remapped
* Update tests and examples accordingly
* Finally, re-write the while- and for-loop meta nodes, which depended heavily on the presence of maps

Note that although the for- and while-loops are totally different under the hood, the only user-facing changes are (a) you can only wrap nodes that are importable, and (b) the IO maps for the while-loop are mandatory instead of optional.

* Fix atomistics macros

We duplicate the task node signature for the macros now, but I am not currently stressed about this; there are ways around it, I just don't see the abstraction being worthwhile for three nodes.

* Make the pyiron_atomistics Bulk node importable

* Provide values for all for-loop input

We don't pass the loop body defaults and we explicitly create input channels now, so we (sadly) need to provide all input. I played around with scraping the defaults and hints from the body node, but the string representation vs code representation is just too much of a headache. Ultimately I think we will need to get away from this executed string somehow.

* Format black

* Prune macro IO nodes that don't get forked

For the sake of computational efficiency, have macro IO value-link directly to child node IO wherever possible.

* Ignore the notebook testing script

I just created this so that I can check that the notebooks pass locally before pushing to github (just from my IDE without actually running the notebooks). For the CI we have a specific job that runs the notebooks with a different environment, so we don't want the remote CI discovering these tests.

* Give the `Function` creator class a function-like name

* Make it an actual function!

* Rename `AbstractFunction` back to the simpler `Function`

Now that that name is free again

* Un-parent `Macro` from `HasCreator`

It was easy enough to directly reference `AbstractMacro` in all the places in tests and documentation where we were previously exploiting this parentage.

* Give the `Macro` creator class a function-like name

* Make it a function!

* Remove unused import

* Rename `AbstractMacro` back to `Macro`

* Update some text in the deepdive

* Rename `wrap_as` to `wrap`

In the process of getting nicer phrasing

* Refactor: rename

* Refactor: rename

* Refactor: rename

* Correct some mis-names that slipped into strings

* Refactor: rename

* Add back README stuff

A chunk got lost in the mix, but thankfully I noticed it while reviewing the diff

* Tweak docstrings

* Remove line break from example output

* Don't give `Function` special treatment for `self`

* Add a test for protected names

* Make the graph creator an abstract(static) method

This was just an oversight. No impact on behaviour, but makes things clearer to devs.

* Let Macro nodes scrape their output labels from their return statement

Just like Function nodes do, but additionally left-stripping `macro.` (or `self.` or `wf.` or whatever the `graph_creator`'s first self-like argument is called)

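A hedged sketch of the scraping behaviour described above, using the final decorator spellings; the child nodes and labels are purely illustrative:

```python
from pyiron_workflow import Workflow

@Workflow.wrap.as_function_node()
def AddOne(x):
    y = x + 1
    return y

@Workflow.wrap.as_macro_node()   # no explicit output labels given...
def AddTwo(self, x):
    self.first = AddOne(x=x)
    self.second = AddOne(x=self.first)
    return self.second           # ...so "second" is scraped here, with "self." stripped

print(AddTwo.preview_outputs())  # e.g. {'second': None}
```
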
* Disallow degenerate output labels

* Refactor: extract shared behaviour for scraping IO from a class method

* Refactor: rename the dev-facing attribute for manually setting output labels

* Refactor: rename classmethod getters

* Refactor: simplify if

* Refactor: more efficient if

* Only build channel previews once

* Refactor validation and apply it to both Macro and Function

* Add examples to the deepdive of turning off output validation

* Improve error messages

* Refactor: move the preview stuff to its own file

* Refactor: rename public facing preview methods

* Introduce an interface class

To separate the ideas of having IO information available at the class level, and scraping such data from a class method.

* Refactor: slide public method up

* Change class method to simple attribute

* Introduce new abstract classes and reparent Macro and Function accordingly

* Replace Macro and Function decorators

With calls to a centralized decorator factory

* Refactor: slide creators below the decorators they use

* Extend docstrings

* Move degeneracy validation inside the try block

As it will trigger an OSError off the same un-inspectable case

* Introduce a custom warning type

So that we can test whether or not we're catching _our_ warning

* Test new IO classes explicitly and remove duplication downstream

* Update notebooks

* Format black

* Add a helper to see both previews at once

* Format black

* Use `self` canonically as the first macro argument in the source code

* Remove unused imports

* Update readme

* Update quickstart

* Update deepdive

* Run all starters when pulling

* Refactor: break out earlier when the node in question is the starter

* Refactor: immediately convert to list

* Run parented pulls by running the parent

By using these specially constructed execution signals

* Track post-facto provenance

* Remove recursion limit adjustment

The stack always goes directly up to the parent in macros now, instead of through all previous calls, so we don't hit this limit anymore.

* Format black

* Fix bad type hint

* Remove abstract method from Composite

In Workflow the method just returns the first argument, so a method is entirely unnecessary; in Macro we legitimately need it, but now that this is the only place where it's implemented/used, we can give it an even more specific name.

* Pull IO creation up to parent class

* Remove unused import

* Formatting

* Format black

* Use lru_cache

* Move for- and while-loops to their own submodule

* Give label a default in Node.__init__

* Don't pass type_hint directly to a node

* Make HasIO.set_input_values accept args

* Fix notebooks

* Make all nodes parse kwargs as input in __post__

* Stop using __post__

Replace it with a system where `Node.__init__` calls `_setup_node` and then `_after_node_setup`, so that children of `Node` can use a combination of (a) doing stuff before calling `super().__init__` and (b) overriding `_setup_node` in order to be sure they accomplish what is needed prior to the post-setup stuff happening. This is a bit more cognitive overhead for node class writers, but simplifies the actual technology.

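A minimal sketch of the new pattern, assuming nothing beyond the method names mentioned above; the bodies here are simplified stand-ins, not the real implementation:

```python
class Node:
    def __init__(self, *args, label=None, **kwargs):
        self.label = label
        self._setup_node()                       # subclass-specific construction
        self._after_node_setup(*args, **kwargs)  # shared post-setup, e.g. parsing input

    def _setup_node(self):
        pass

    def _after_node_setup(self, *args, **kwargs):
        self.input_values = kwargs               # stand-in for real input handling


class MyNode(Node):
    def __init__(self, *args, special=None, **kwargs):
        self._special = special                  # (a) work done before super().__init__
        super().__init__(*args, **kwargs)

    def _setup_node(self):                       # (b) override to run before post-setup
        self.channels = ["x", "y"]


node = MyNode(label="demo", special=42, x=1)
```
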
* Format black

* Allow run and run callbacks to take (optional!) arguments

* Let run and related methods on Node take args as well as kwargs

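Presumably this means positional values get matched to input channels in order, roughly like the following (illustrative only):

```python
from pyiron_workflow import Workflow

@Workflow.wrap.as_function_node()
def Add(x, y):
    total = x + y
    return total

node = Add()
node.set_input_values(1, y=2)  # args fill channels in order, kwargs by label
node(3, y=4)                   # __call__/run now accept the same mix
```
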
* Update pyiron-verse dependencies (#302)

* Update pyiron-verse dependencies

Synchronized to avoid headaches with patch number upper limits, so we can get it all done in a single pull

* [dependabot skip] Update env file

* Format black

* Bypass failure point

Everything still looks totally fine locally. Let's skip the failing doctest line and see if 3.10 has any problems at the unit test stage.

* Add it back with some debug

* Reduce debug

`instance.result` exists and is the expected `DynamicFoo` object

---------

Co-authored-by: pyiron-runner <[email protected]>

* [minor] Factories instead of meta nodes, and use them for transformer nodes (#293)

* Replace evaluated code to-/from-list nodes with actual classes

Leveraging custom constructors

* Develop meta abstraction and make a new home for transforming meta nodes

* Accept node args too

* Introduce a decorator for building the class-IO

* Change paradigm to whether or not the node uses __reduced__ and a constructor

Instead of "Meta" nodes

* Allow direct use of Constructed children

* Move and update constructed stuff

* Add new singleton behaviour so factory-produced classes can pass is-tests

* Apply constructed and class registration updates to the transformers

* Remove (now unused) meta module

* PEP8 newline

* Remove unnecessary __getstate__

The object isn't holding instance level state and older versions of python bork here.

* Add constructed __*state__ compatibility for older versions

* 🐛 add missing `return`

* Format black

* Introduce a new factory pattern

At the cost of requiring factory functions to forgo kwargs, we get object matching for factories and classes, and picklability for instances. Still need to check some edge cases around the use of stuff with a non-trivial qualname.

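A toy illustration of the idea only -- this is not the actual `snippets.factory` API, just the two properties the text describes: factories keyed purely on positional arguments, and repeated calls handing back the very same class object:

```python
_class_cache = {}

def toy_class_factory(n: int):
    """Return a class parameterized by n; the same n always yields the same class."""
    key = ("Scaled", n)
    if key not in _class_cache:
        _class_cache[key] = type(
            f"Scaled{n}",
            (),
            {"factor": n, "scale": lambda self, x: self.factor * x},
        )
    return _class_cache[key]

assert toy_class_factory(3) is toy_class_factory(3)  # "object matching"
assert toy_class_factory(3)().scale(2) == 6
```
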
* Test factories as methods

* Test local scoping

* Add docstrings and hide a function

* Simplify super

* Give the demo class in tests a more generic __init__

* Test and document multiple inheritance

* Stop storing __init__ args

By leveraging `__new__` and `__getnewargs_ex__`

* Add clearing methods

In case you want to change some constructor behaviour and clear the access cache. Not sure what should be cached at the end of the day, but for now just give users a shortcut for clearing it.

* Use snippets.factory.classfactory and typing.ClassVar for transformers

* Revert singleton

* Format black

* Remove constructed

It's superseded by the snippets.factory stuff

* Gently rename things for when the factory comes from a decorator

* Allow factory made classes to also come from decorators

* Format black

---------

Co-authored-by: pyiron-runner <[email protected]>

* [minor] Use factories in existing nodes (#303)

* Let the factory clear method take specific names

* Don't override __module__ to the factory function

If it was explicitly set downstream, leave that. But if the user left it empty, still default it back to the factory function's module

* Clean up storage if job tests fail

* Make tinybase the default storage backend

* Switch Function and Macro over to using classfactory

With this, everything is pickleable (unless you slap something unpickleable on top, or define it in a place that can't be reached by pickle like inside a local function scope). The big downside is that `h5io` storage is now basically useless, since all our nodes come from custom reconstructors. Similarly, for the node job `DataContainer` can no longer store the input node. The `tinybase` backend is still working ok, so I made it the default, and I got the node job working again by forcing it to cloudpickle the input node on saving. These are some ugly hacks, but since storage is an alpha feature right now anyhow, I'd prefer to push ahead with pickleability.

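So, assuming the node class below lives in an importable module (i.e. not in `__main__`), a plain pickle round-trip should now work; the attribute access at the end is illustrative:

```python
import pickle

from pyiron_workflow import Workflow

@Workflow.wrap.as_function_node()
def AddOne(x):
    y = x + 1
    return y

node = AddOne(x=41)
node()
reloaded = pickle.loads(pickle.dumps(node))
print(reloaded.outputs.y.value)  # 42
```
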
* Remove unused decorator

And reformat tests in the vein of usage in Function and Macro

* Format black

---------

Co-authored-by: pyiron-runner <[email protected]>

* [minor] Executor compliance (#304)

* Expose concurrent.futures executors on the creator

* Only expose the base Executor from pympipool

Doesn't hurt us now and prepares for the version bump

* Extend `Runnable` to use a non-static method

This is significant. `on_run` is no longer a property returning a staticmethod that will be shipped off, but we directly ship off `self.on_run` so `self` goes with it to remote processes. Similarly, `run_args` gets extended to be `tuple[tuple, dict]` so positional arguments can be sent too.

Stacked on top of pickleability, this means we can now use standard `concurrent.futures.ProcessPoolExecutor` -- as long as the nodes are all defined somewhere importable, i.e. not in `__main__`. Since working in notebooks is pretty common, the more flexible `pympipool.Executor` is left as the default `Workflow.create.Executor`.

This simplifies some stuff under the hood too, e.g. `Function` and `Composite` now just directly do their thing in `on_run` instead of needing the misdirection of returning their own static methods.

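A hypothetical, stripped-down sketch of the mechanism -- not the actual `Runnable` class:

```python
from concurrent.futures import ProcessPoolExecutor

class Runnable:
    def on_run(self, *args, **kwargs):   # a bound method now, so `self` ships with it
        raise NotImplementedError

    @property
    def run_args(self) -> tuple[tuple, dict]:
        return (), {}

    def run(self, executor: ProcessPoolExecutor | None = None):
        args, kwargs = self.run_args
        if executor is None:
            return self.on_run(*args, **kwargs)
        # Submitting the bound method pickles `self` and sends it to the worker,
        # which is why instance pickleability matters here
        return executor.submit(self.on_run, *args, **kwargs)
```
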
* Format black

---------

Co-authored-by: pyiron-runner <[email protected]>

* [patch] Make factory made objects cloudpickleable (#305)

* Compute qualname if not provided

* Fail early if there is a <locals> function in the factory made hierarchy

* Skip the factory fanciness if you see <locals>

This enables _FactoryMade objects to be cloudpickled, even when they can't be pickled, while still not letting the mere fact that they are dynamic classes stand in the way of pickling.

Nicely lifts our constraint on the node job interaction with pyiron base, which was leveraging cloudpickle

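In other words, roughly the following (names here are illustrative, and the commented-out failure is the case being worked around):

```python
import cloudpickle
from pyiron_workflow import Workflow

def build_node():
    @Workflow.wrap.as_function_node()  # defined in a function -> "<locals>" in its qualname
    def AddOne(x):
        y = x + 1
        return y

    return AddOne(x=1)

node = build_node()
# pickle.dumps(node) would fail here, since the dynamic class can't be re-imported,
# but the cloudpickle fallback still handles it:
reloaded = cloudpickle.loads(cloudpickle.dumps(node))
```
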
* Format black

* Test ClassFactory this way too

---------

Co-authored-by: pyiron-runner <[email protected]>

* [patch] More transformers (#306)

* Test existing list nodes

* Rename length base class

* Refactor transformers to use on_run and run_args more directly

* Introduce an inputs-to-dict transformer

* Preview IO as a separate step

To guarantee IO construction happens as early as possible in case it fails

* Add dataframe transformer

* Remove prints 🤦

* Add dataframe transformer tests

* Add transformers to the create menu

* Format black

---------

Co-authored-by: pyiron-runner <[email protected]>

* [patch] 🧹 be more consistent in caching/shortcuts (#307)

* 🧹 be more consistent in caching/shortcuts

Instead of always defining private holders by hand

* Format black

---------

Co-authored-by: pyiron-runner <[email protected]>

* [patch] Dataclass transformer (#308)

* Introduce a dataclass node

* Give the dataclass node a simpler name

Since we can inject attribute access, I don't anticipate ever needing the reverse dataclass-to-outputs node, so let's simplify the naming here.

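Something along these lines, presumably; the decorator spelling here is a guess based on the naming discussion above, not confirmed by this commit:

```python
from dataclasses import field

from pyiron_workflow import Workflow

@Workflow.wrap.as_dataclass_node  # hypothetical spelling of the simplified name
class Settings:
    temperature: float = 300.0
    steps: int = 1000
    elements: list = field(default_factory=lambda: ["Al"])

node = Settings(temperature=500.0)
settings = node()                 # single output, hinted as the dataclass itself
print(settings.temperature, settings.steps)
```
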
* Remove unused import

* Set the output type hint automatically

* Add docs

* Add tests

* Format black

* PEP8 newline

* Format black

---------

Co-authored-by: pyiron-runner <[email protected]>

* [minor] Introduce for-loop (#309)

* Introduce for-loop

* Refactor: break _build_body into smaller functions

* Resolve dataframe column name conflicts

When a body node has the same labels for looped input as for output

* Update docstrings

* Refactor: rename file

* Don't break when one of iter or zipped is empty

* 🐛 pass body hint, not hint and default, to row collector input hint

* Silence disconnection warning

Since(?) disconnection is reciprocal, it was firing left, right, and centre

* Update deepdive

* Remove old for loop

* Remove unused import

* Add tests

* Remove unused attributes

* Add a shortcut for assigning an executor to all body nodes

* Format black

* Format black

---------

Co-authored-by: pyiron-runner <[email protected]>

* Bump pyiron ecosystem (#313)

* Bump pyiron ecosystem

* [dependabot skip] Update env file

* Bump pympipool in conda env

* Be more gracious in parallel for-loop speed test

* [dependabot skip] Update env file

* Bump base in the conda env

* [dependabot skip] Update env file

* Downgrade pympipool

pylammpsmpi, and thus pyiron_atomistics, is still a patch behind, and to my chagrin patches remain hard-pinned in pyiron_atomistics

* Update deepdive notebook to use changed pympipool signature

* [dependabot skip] Update env file

---------

Co-authored-by: pyiron-runner <[email protected]>

* Move pandas to main env

* Add matgl to notebooks env

* [dependabot skip] Update env file

* Introduce new access points to zip and iter loop dataframes right on the node

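Presumably usable roughly like this (method names taken from the commit title; the exact signature is assumed):

```python
from pyiron_workflow import Workflow

@Workflow.wrap.as_function_node()
def AddTogether(x, y):
    total = x + y
    return total

node = AddTogether()
df_iter = node.iter(x=[1, 2, 3], y=[10, 20])    # nested (product) loop -> dataframe
df_zip = node.zip(x=[1, 2, 3], y=[10, 20, 30])  # element-wise pairing -> dataframe
```
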
* Remove Joerg's iter

* Update decorator calls

* 🔥 🐛 Don't pull on input nodes (#318)

* 🔥 🐛 Don't pull on input nodes

A simple call invokes a pull, including pulling the parent tree -- this blows up when the loop is inside a parent scope instead of alone. Since we _know_ these are just user input nodes with direct value connections to the loop macro's scope, we never need to run the pull, or indeed even fetch the value.

* Format black

---------

Co-authored-by: pyiron-runner <[email protected]>

* Make databases a module

* Expose output maps on for-loop convenience access points

* Get the notebooks running

* Rewiring some stuff
* Using the new iter method
* Formatting nodes with PascalCase
* Whatever else was needed to get the notebooks to execute

* Format black

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: pyiron-runner <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
3 people authored May 9, 2024
1 parent b69ca0b commit 0426c4c
Showing 77 changed files with 12,649 additions and 10,225 deletions.
20 changes: 11 additions & 9 deletions .binder/environment.yml
@@ -7,18 +7,20 @@ dependencies:
- cloudpickle =3.0.0
- graphviz =9.0.0
- h5io =0.2.2
- h5io_browser =0.0.9
- matplotlib =3.8.3
- pyiron_base =0.7.9
- pyiron_contrib =0.1.15
- pympipool =0.7.13
- h5io_browser =0.0.12
- matplotlib =3.8.4
- pandas =2.2.0
- pyiron_base =0.8.3
- pyiron_contrib =0.1.16
- pympipool =0.8.0
- python-graphviz =0.20.3
- toposort =1.10
- typeguard =4.1.5
- typeguard =4.2.1
- ase =3.22.1
- atomistics =0.1.23
- atomistics =0.1.27
- lammps
- phonopy =2.21.2
- pyiron_atomistics =0.4.17
- matgl = 0.9.2
- phonopy =2.22.1
- pyiron_atomistics =0.5.4
- pyiron-data =0.0.29
- numpy =1.26.4
7 changes: 4 additions & 3 deletions .ci_support/environment-notebooks.yml
@@ -2,9 +2,10 @@ channels:
- conda-forge
dependencies:
- ase =3.22.1
- atomistics =0.1.23
- atomistics =0.1.27
- lammps
- phonopy =2.21.2
- pyiron_atomistics =0.4.17
- matgl = 0.9.2
- phonopy =2.22.1
- pyiron_atomistics =0.5.4
- pyiron-data =0.0.29
- numpy =1.26.4
13 changes: 7 additions & 6 deletions .ci_support/environment.yml
@@ -7,11 +7,12 @@ dependencies:
- cloudpickle =3.0.0
- graphviz =9.0.0
- h5io =0.2.2
- h5io_browser =0.0.9
- matplotlib =3.8.3
- pyiron_base =0.7.9
- pyiron_contrib =0.1.15
- pympipool =0.7.13
- h5io_browser =0.0.12
- matplotlib =3.8.4
- pandas =2.2.0
- pyiron_base =0.8.3
- pyiron_contrib =0.1.16
- pympipool =0.8.0
- python-graphviz =0.20.3
- toposort =1.10
- typeguard =4.1.5
- typeguard =4.2.1
1 change: 1 addition & 0 deletions .gitignore
@@ -9,4 +9,5 @@ _build/
apidoc/
.ipynb_checkpoints/
test_times.dat
tests/integration/test_notebooks.py
.aider*
31 changes: 13 additions & 18 deletions docs/README.md
@@ -26,7 +26,7 @@ Individual node computations can be shipped off to parallel processes for scalab

Once you're happy with a workflow, it can be easily turned it into a macro for use in other workflows. This allows the clean construction of increasingly complex computation graphs by composing simpler graphs.

Nodes (including macros) can be stored in plain text, and registered by future workflows for easy access. This encourages and supports an ecosystem of useful nodes, so you don't need to re-invent the wheel. (This is a beta-feature, with full support of [FAIR](https://en.wikipedia.org/wiki/FAIR_data) principles for node packages planned.)
Nodes (including macros) can be stored in plain text as python code, and registered by future workflows for easy access. This encourages and supports an ecosystem of useful nodes, so you don't need to re-invent the wheel. (This is a beta-feature, with full support of [FAIR](https://en.wikipedia.org/wiki/FAIR_data) principles for node packages planned.)

Executed or partially-executed graphs can be stored to file, either by explicit call or automatically after running. When creating a new node(/macro/workflow), the working directory is automatically inspected for a save-file and the node will try to reload itself if one is found. (This is an alpha-feature, so it is currently only possible to save entire graphs at once and not individual nodes within a graph, all the child nodes in a saved graph must have been instantiated by `Workflow.create` (or equivalent, i.e. their code lives in a `.py` file that has been registered), and there are no safety rails to protect you from changing the node source code between saving and loading (which may cause errors/inconsistencies depending on the nature of the changes).)

@@ -39,7 +39,7 @@ Nodes can be used by themselves and -- other than being "delayed" in that their
```python
>>> from pyiron_workflow import Workflow
>>>
>>> @Workflow.wrap_as.function_node()
>>> @Workflow.wrap.as_function_node()
... def add_one(x):
... return x + 1
>>>
@@ -54,33 +54,28 @@ But the intent is to collect them together into a workflow and leverage existing
>>> from pyiron_workflow import Workflow
>>> Workflow.register("pyiron_workflow.node_library.plotting", "plotting")
>>>
>>> @Workflow.wrap_as.function_node()
>>> @Workflow.wrap.as_function_node()
... def Arange(n: int):
... import numpy as np
... return np.arange(n)
>>>
>>> @Workflow.wrap_as.macro_node("fig")
... def PlotShiftedSquare(macro, shift: int = 0):
... macro.arange = Arange()
... macro.plot = macro.create.plotting.Scatter(
... x=macro.arange + shift,
... y=macro.arange**2
>>> @Workflow.wrap.as_macro_node("fig")
... def PlotShiftedSquare(self, n: int, shift: int = 0):
... self.arange = Arange(n)
... self.plot = self.create.plotting.Scatter(
... x=self.arange + shift,
... y=self.arange**2
... )
... macro.inputs_map = {"arange__n": "n"} # Expose arange input
... return macro.plot
... return self.plot
>>>
>>> wf = Workflow("plot_with_and_without_shift")
>>> wf.n = wf.create.standard.UserInput()
>>> wf.no_shift = PlotShiftedSquare(shift=0, n=10)
>>> wf.shift = PlotShiftedSquare(shift=2, n=10)
>>> wf.inputs_map = {
... "n__user_input": "n",
... "shift__shift": "shift"
... }
>>> wf.no_shift = PlotShiftedSquare(shift=0, n=wf.n)
>>> wf.shift = PlotShiftedSquare(shift=2, n=wf.n)
>>>
>>> diagram = wf.draw()
>>>
>>> out = wf(shift=3, n=10)
>>> out = wf(shift__shift=3, n__user_input=10)

```

13 changes: 7 additions & 6 deletions docs/environment.yml
@@ -12,11 +12,12 @@ dependencies:
- cloudpickle =3.0.0
- graphviz =9.0.0
- h5io =0.2.2
- h5io_browser =0.0.9
- matplotlib =3.8.3
- pyiron_base =0.7.9
- pyiron_contrib =0.1.15
- pympipool =0.7.13
- h5io_browser =0.0.12
- matplotlib =3.8.4
- pandas =2.2.0
- pyiron_base =0.8.3
- pyiron_contrib =0.1.16
- pympipool =0.8.0
- python-graphviz =0.20.3
- toposort =1.10
- typeguard =4.1.5
- typeguard =4.2.1
30 changes: 23 additions & 7 deletions notebooks/atomistics_nodes.ipynb

Large diffs are not rendered by default.
