💡 For-loop syntax? #72
Comments
First, to Sam's comment, I agree that it is necessary to have some explicit instruction that the input should be looped over, and for exactly the reason he supplies: otherwise input that is itself legitimately iterable could easily be misinterpreted. And anyhow, "explicit is better than implicit". Then I want to look at the main suggestion: wf.ev = strained_energy(name='Al', STRAIN=np.linspace(-0.05, 0.05, 5)) Pros:
The cons then depend on what we're doing under the hood to actually make this behave the way it so intuitively looks like it should. I'm afraid I can't find I see two major routes: 1) Empower
|
Thanks for raising this point, @samwaseda. I think the example shows the danger of having input parameters without a specified type. Rather than trying to infer the behavior of the function from the data type, the solution I would like to follow is to provide separate names for the various cases (strain_scalar or strain_hydrostatic, and strain_tensor). The example would then look like:
Note that I have replaced the vector |
It looks like our replies have been submitted almost at the same time, @liamhuber. I therefore addressed only @samwaseda's comment. Please find below the code snippet for the exec_pool loop I mentioned in one of the previous replies:
I used it in connection with @jan-janssen's pympipool library. It was meant as a proof of concept and I did not intend it to be perfect code. I like your new syntax for the for loop, i.e.:
Nevertheless, I have a few points where I am not perfectly happy:
|
With this implementation I have two concerns (or rather: one concern expressed in two different ways)
|
Thanks for the rapid iteration of this discussion, both!
Super, let's use it as a starting point and improve from here then! (When we actually get to it that is, as it's lower priority than storage IMO)
This is a very valid concern. I would say there's an interesting asymmetry in the redundancy here: given In principle we can then drop the wf.ev = strained_energy(name='Al', STRAIN=np.linspace(-0.05, 0.05, 5)) And I still have concerns about how implicit that is. The big difference to the numpy example is that Maybe it's just OK to do this and I'm making a mountain out of a molehill, but my gut still says this is a messy idea. As an aside, one way to implement such behaviour may be to modify I made a super simple example of this concept:

    class Foo:
        def __new__(cls, *args, **kwargs):
            return super().__new__(cls)

        def __init__(self, x, bar):
            self.x = x
            self.y = x - 1
            self.bar = bar  # Our "body node"


    class Bar:
        def __new__(cls, x):
            if x < 0:
                # Returning a non-Bar instance means Bar.__init__ is skipped
                instance = Foo(x, cls)  # Our "for macro"
            else:
                instance = super().__new__(cls)
            return instance

        def __init__(self, x):
            self.x = x
            self.y = x + 1


    bar = Bar(5)
    print(bar.x, bar.y, type(bar))
    # >>> 5 6 <class '__main__.Bar'>

    foobar = Bar(-5)
    print(foobar.x, foobar.y, type(foobar), foobar.bar)
    # >>> -5 -6 <class '__main__.Foo'> <class '__main__.Bar'>

Anyhow, like I said, I don't think this is a good idea, but at least it is possible.
In some ways then I feel that explicitly having the
This is a nice point. In this particular example I don't think it makes a huge difference since there are only five elements, so a human can assess start, stop, and steps trivially easily. But for more complex data, or even just for linspaces that are too big to easily count In general I guess this is a sort of "best practice" -- like how
Indeed, I think I can answer both at once. In this case, the value being iterated over is indicated by the fact that the kwarg matches with As an implementation detail, right now the IO channels are not actually accessible until after a node gets instantiated. So Note that in the examples here we have been iterating over exactly one input field, but in general I would expect to be able to iterate over multiple, perhaps assuming that they should all have the same length and get zipped together (as is the case in |
@liamhuber, I fully agree that this topic has lower priority than issues such as storage. Nevertheless, I think we all feel that this is a crucial point that will affect the majority of workflows. What I realize from our discussions is that we talk mainly about one specific scenario, in which the iterations of the loop body can be executed fully independently. With respect to pyiron this corresponds to the concept of the parallel master. In numpy, it would correspond to a map approach where the numpy function is applied to each element of the input vector, array, etc. This is the case where, in my opinion, we do not need an explicit statement for the for-loop. I also do not see the need for a macro - a simple construct such as my exec_pool example (which should of course be renamed) could be part of the node.py module. This is what I did with the exec_pool implementation. The nice thing about this is that it also works readily for macro nodes, without having to add a single line of extra code. Things get trickier when implementing workflows with a for-loop whose elements cannot be run individually. An example would be a convergence test, where output from the previous node is needed as input to the next node. In this case, we probably need the full power of a loop-node, although we may still find a more intuitive syntax. This scenario is what our SerialMaster aimed to do, but we were never happy with the syntax. Again, low priority for the moment but it does not hurt to continue the discussion. |
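To make the two scenarios above concrete, here is a minimal sketch using only the Python standard library. It is not the exec_pool snippet and does not use pyiron or pympipool; the names `map_independent`, `converge_serial`, and the toy `strained_energy` function are hypothetical stand-ins chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def strained_energy(strain: float) -> float:
    """Hypothetical stand-in for a node body: a toy quadratic energy."""
    return 100.0 * strain**2


def map_independent(func, values, max_workers=2):
    """Independent iterations: each call only sees its own input,
    so the calls can be handed to an executor in any order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, values))  # results keep input order


def converge_serial(step, start, tol=1e-6, max_steps=50):
    """Dependent iterations: each step consumes the previous output,
    so the loop cannot be split into independent tasks."""
    current = start
    for n in range(1, max_steps + 1):
        new = step(current)
        if abs(new - current) < tol:
            return new, n
        current = new
    raise RuntimeError("did not converge")


strains = [-0.05, -0.025, 0.0, 0.025, 0.05]
energies = map_independent(strained_energy, strains)

# Toy fixed-point iteration x -> (x + 2/x) / 2, which converges to sqrt(2)
root, n_steps = converge_serial(lambda x: 0.5 * (x + 2.0 / x), start=1.0)
```

The first helper is the "parallel master" style map, where no explicit loop statement is strictly needed; the second is the "serial master" style, where the data dependency between iterations forces a real loop construct.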
Super, we're on the same page here. I just wanted to make sure that the very active conversation here didn't give the impression I was also actively coding this!
Yes, I see what you're saying. This is in exactly the direction as my comment above under "1) Empower Node to "batch" input". I'm not totally closed to such a solution, but I do still have concerns about (a) the added complexity to the I'm definitely willing to give it a try, I'm just not enthusiastic about it yet.
Yes, for sure. As soon as there is cyclicity in the data graph things just get super complicated. We could do this with a cyclic I do think we want to keep in mind how our solution will handle multiple looped inputs -- zipping or nested loops? (i.e. for N looped variables of length L each, L executions when zipped versus L^N when nested.) This can probably be handled just fine by extending the |
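The zipping-versus-nesting question can be illustrated with a small standard-library sketch. The helpers `run_zipped` and `run_nested` and the `body` function are hypothetical names for illustration only; they are not part of any existing node API.

```python
from itertools import product


def run_zipped(func, **looped):
    """Pair up the i-th element of every looped input (lengths must match)."""
    names = list(looped)
    if len({len(values) for values in looped.values()}) > 1:
        raise ValueError("zipped inputs must all have the same length")
    return [func(**dict(zip(names, combo))) for combo in zip(*looped.values())]


def run_nested(func, **looped):
    """Run every combination of the looped inputs (Cartesian product)."""
    names = list(looped)
    return [func(**dict(zip(names, combo))) for combo in product(*looped.values())]


def body(strain, n_atoms):
    """Hypothetical loop body that just echoes its inputs."""
    return (strain, n_atoms)


strains = [-0.05, 0.0, 0.05]
sizes = [1, 2, 3]

zipped = run_zipped(body, strain=strains, n_atoms=sizes)  # 3 executions
nested = run_nested(body, strain=strains, n_atoms=sizes)  # 3 * 3 = 9 executions
```

With two looped inputs of length three, zipping runs the body three times while nesting runs it nine times, which is why the choice matters as more inputs are looped over.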
For-loops are now implemented to my satisfaction in #309. This still needs to be extended to be a method(s) right on Anyhow, I will leave this issue open mostly because I really like the GUI sketch at the top, even if the core of having a decent for-loop syntax is already achieved. |
Shortcut-access on |
I want to migrate our discussion on for-loop syntax from the meeting discussion to its own issue, as it's getting a bit in-depth.
I had posted an example calculation that used the current (and not at all good) for-loop syntax:
@JNmpi replied:
(Note that the registration aside is in its own issue, #71)
The @samwaseda replied: