Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature requests and bugs based on a practical workflow example #102

Closed
JNmpi opened this issue Nov 6, 2022 · 7 comments
Closed

Feature requests and bugs based on a practical workflow example #102

JNmpi opened this issue Nov 6, 2022 · 7 comments
Assignees
Labels
bug Something isn't working feature request request partner of "enhancement"

Comments

@JNmpi
Copy link
Contributor

JNmpi commented Nov 6, 2022

Thanks @liamhuber. Many of the new features are really great. I particularly like the show button and the color flash when a cell is active. I worked a bit with the latest version and tried to build up a simple workflow to build a pyiron table, use this data to make a plot (including using the information from the column names to set the axis labels) and filter relevant projects. It was nice to see that the new capability to add new nodes within the notebook is quit powerful. I have added the example in the example_table notebook and json file in the repository.

When working with pyironflow a missed a few features and observed a few bugs listed below:

  • deleting a script does not work (clicking on the icon button nothing happens)
  • it would be nice if pyironflow would remember the last saved json-file and script and open it directly
  • when loading the json file the input is initialised to the default value, not the one in the saved script (this severely limits the user experience since the same buttons have to be clicked again and again making debugging hard)
  • it would be nice to get the command to open and run a node in a separate cell of the jupyter notebook. I did it manually via gui.script.nodes[index_of_node] but this is rather cumbersome since the node index has to be obtained by trial and error. For example, the command (or a button) could be placed in the box where the input of the node is controlled.
  • since scrolling appears to be not working in the output window it would be nice to open the output content in another notebook cell
  • certain node feature will be commonly requested, e.g. input as combobox with the possibility to add/delete them (e.g. for the pyironTable to add new columns)
  • I was expecting that gui.load(json_file) would work but got an error message
@JNmpi JNmpi added bug Something isn't working feature request request partner of "enhancement" labels Nov 6, 2022
@liamhuber
Copy link
Member

liamhuber commented Nov 7, 2022

Thanks for the feedback and the new demo!

  • deleting a script does not work (clicking on the icon button nothing happens)

I'm not able to reproduce this. I added a safety feature that requires confirming script deletion -- there's a little "pop-up window" right below the main toolbar after clicking the delete button. Maybe you just missed this?

Unfortunately, real "pop-up" windows are not supported in ipwidgets. I could probably do something with colour to make this pseudo-pop-up pop a bit more though?

  • it would be nice if pyironflow would remember the last saved json-file and script and open it directly

It does! 😄 When you instantiate the gui, use the name of the json-file you want to load (minus the '.json' extension) and it will automatically search for the json.

  • when loading the json file the input is initialised to the default value, not the one in the saved script (this severely limits the user experience since the same buttons have to be clicked again and again making debugging hard)

Yikes, yep! I suspect this was an unintended side-effect of my patch to get node defaults to actually be used. I don't think it will be too hard to fix, but it's a big enough deal that I'll make a separate issue for it so I don't lose track.

EDIT: Patched in #106

  • it would be nice to get the command to open and run a node in a separate cell of the jupyter notebook. I did it manually via gui.script.nodes[index_of_node] but this is rather cumbersome since the node index has to be obtained by trial and error. For example, the command (or a button) could be placed in the box where the input of the node is controlled.

This sounds like a reasonable idea to me, but I'll need to think/read a bit about implementation. Definitely adding some sort of "(force) update" button to the nodes is super easy. I currently have no idea how hard/easy it is to use that to trigger action in a new cell. An alternative might be to reroute the stdout to a convenient window inside the gui itself. I'll probably make a separate issue for this and play around with it later as it's (potentially quite) a bit deeper.

  • since scrolling appears to be not working in the output window it would be nice to open the output content in another notebook cell

I'm not sure what you mean here. When I use it, the output window generated by the "SHOW" button automatically extends the gui.draw() cell as far as needed to show all the output requested, and scrolling the notebook as usually lets me see it. Is your output somehow cut off? Are you using notebook or lab? (I'm using lab and would prefer to be lazy and not support notebook...)

I played around a little a couple weeks ago with a personal project using voila, and I can see that if we want to move in that direction in the future, then the entire gui.draw() output "app" will need to have some fixed maximum size. At that point we'll need to limit the height of the output window and have the scrolling occur inside that window. I anticipate zero problems making that happen, but I'd rather cross that bridge when we get to it -- right now I'm not even convinced this is where the output window will live indefinitely.

*certain node feature will be commonly requested, e.g. input as combobox with the possibility to add/delete them (e.g. for the pyironTable to add new columns)

This is a really nice idea! Right now we only offer access to the base pyiron Node class when people import from ironflow.custom_nodes, but having a bunch of deeper templates in there to draw from would be stellar. I'll start collecting ideas like this for what to add.

  • I was expecting that gui.load(json_file) would work but got an error message

I can't reproduce this error. I have no problem gui.load('example.json') from either a clean session or a session that's already loaded other stuff.

I do have trouble with the actual gui not updating -- to actually see the effect of the load I need to trigger a refresh from within the GUI, e.g. by creating a new script or whatever. I have an idea for how to get it to update automatically though and will try that.

@JNmpi
Copy link
Contributor Author

JNmpi commented Nov 8, 2022

Thanks for the quick reply, @liamhuber.

Thanks for the feedback and the new demo!

  • deleting a script does not work (clicking on the icon button nothing happens)

I'm not able to reproduce this. I added a safety feature that requires confirming script deletion -- there's a little "pop-up window" right below the main toolbar after clicking the delete button. Maybe you just missed this?

Unfortunately, real "pop-up" windows are not supported in ipwidgets. I could probably do something with colour to make this pseudo-pop-up pop a bit more though?
Thanks for the explanation. I indeed simply overlooked the confirmation line. It may thus be indeed helpful to highlight it more prominently.

  • it would be nice if pyironflow would remember the last saved json-file and script and open it directly

It does! 😄 When you instantiate the gui, use the name of the json-file you want to load (minus the '.json' extension) and it will automatically search for the json.
Thanks, this works beautifully. I tried to to get directly to a specific Tab via the instantiation (e.g. script_title='Table') but this did not work. I guess presently script_title only gives a name for a newly instantiated script. It may be helpful to include this additional functionality.

  • when loading the json file the input is initialised to the default value, not the one in the saved script (this severely limits the user experience since the same buttons have to be clicked again and again making debugging hard)

Yikes, yep! I suspect this was an unintended side-effect of my patch to get node defaults to actually be used. I don't think it will be too hard to fix, but it's a big enough deal that I'll make a separate issue for it so I don't lose track.

EDIT: Patched in #106
Thanks.

  • it would be nice to get the command to open and run a node in a separate cell of the jupyter notebook. I did it manually via gui.script.nodes[index_of_node] but this is rather cumbersome since the node index has to be obtained by trial and error. For example, the command (or a button) could be placed in the box where the input of the node is controlled.

This sounds like a reasonable idea to me, but I'll need to think/read a bit about implementation. Definitely adding some sort of "(force) update" button to the nodes is super easy. I currently have no idea how hard/easy it is to use that to trigger action in a new cell. An alternative might be to reroute the stdout to a convenient window inside the gui itself. I'll probably make a separate issue for this and play around with it later as it's (potentially quite) a bit deeper.

Having it inside pyironflow would be great. If opening a new cell turns out to be difficult generating a line (string) that can be copied into a new cell would already help a lot.

  • since scrolling appears to be not working in the output window it would be nice to open the output content in another notebook cell

I'm not sure what you mean here. When I use it, the output window generated by the "SHOW" button automatically extends the gui.draw() cell as far as needed to show all the output requested, and scrolling the notebook as usually lets me see it. Is your output somehow cut off? Are you using notebook or lab? (I'm using lab and would prefer to be lazy and not support notebook...)

I indeed had scroll bars in the output window in mind. While scrolling the notebook (I use jupyterlab) vertically works it is often necessary to have also a horizontal scroll bar (e.g. for pandas dataframes). It looks like this is known bug in the ipywidgets module and fixed in version 8.0. However, it also looks like nglview requires an ipywidgets version below 8.0. We therefore probably have to live with this issue for a while. As a simple fix I had the idea to have something like gui.output or gui.show_output_window command which I can run in a separate notebook cell.

I played around a little a couple weeks ago with a personal project using voila, and I can see that if we want to move in that direction in the future, then the entire gui.draw() output "app" will need to have some fixed maximum size. At that point we'll need to limit the height of the output window and have the scrolling occur inside that window. I anticipate zero problems making that happen, but I'd rather cross that bridge when we get to it -- right now I'm not even convinced this is where the output window will live indefinitely.

*certain node feature will be commonly requested, e.g. input as combobox with the possibility to add/delete them (e.g. for the pyironTable to add new columns)

This is a really nice idea! Right now we only offer access to the base pyiron Node class when people import from ironflow.custom_nodes, but having a bunch of deeper templates in there to draw from would be stellar. I'll start collecting ideas like this for what to add.

  • I was expecting that gui.load(json_file) would work but got an error message

I can't reproduce this error. I have no problem gui.load('example.json') from either a clean session or a session that's already loaded other stuff.

I do have trouble with the actual gui not updating -- to actually see the effect of the load I need to trigger a refresh from within the GUI, e.g. by creating a new script or whatever. I have an idea for how to get it to update automatically though and will try that.

I was not able to get this working. When instantiating the gui and running load I do not see the loaded scripts (even if I try to make a refresh).

@JNmpi
Copy link
Contributor Author

JNmpi commented Nov 8, 2022

A few more items/ideas regarding features. I guess it is ok to stick to this issue and do not open a new one.

  • Programming the nodes with constructs like self.inputs[1] is inconvenient and error prone. It would be nicer to allow constructs with the actual input name, i.e. self.input.alat of self.input['alat']. Being able to provide actual names makes the code much easier to read, avoids errors when introducing new inputs/outputs since it does not require reindexing.

  • The following point is realted to pyiron, but required to get nice solutions for ironflow:
    - It would be good to have for some pyiron constructs an alternative formulation. An example is

        job.calc_md(temperature=300)  # previous
    
        job.calc = calculator.md(temperature=300)   # alternative
    
    • Such a formulation would allow us to easily create generic modules, e.g. for MD that can be used as input for VASP, LAMMPS, etc.
    • The same applies to job.server etc.

@liamhuber
Copy link
Member

Having [node update stdout] inside pyironflow would be great. If opening a new cell turns out to be difficult generating a line (string) that can be copied into a new cell would already help a lot.

Yeah, that's a good idea for an intermediate step. I added this part of the conversation to #60 so I don't lose track of the idea.

I indeed had scroll bars in the output window in mind. While scrolling the notebook (I use jupyterlab) vertically works it is often necessary to have also a horizontal scroll bar (e.g. for pandas dataframes). It looks like this is known bug in the ipywidgets module and fixed in version 8.0. However, it also looks like nglview requires an ipywidgets version below 8.0. We therefore probably have to live with this issue for a while.

Yeah, it sounds like we're on the same page for what would be nice to have, but indeed the ipywidgets limit is constraining.

As a simple fix I had the idea to have something like gui.output or gui.show_output_window command which I can run in a separate notebook cell.

There is such a simple fix! gui.node_presenter.box will return this object. Even in a new cell, it will update itself automatically when you click SHOW on different nodes too.

I was not able to get this working. When instantiating the gui and running load I do not see the loaded scripts (even if I try to make a refresh).

Hmm, that is tricky then. I think we'll have to look at it in real-time whenever we Zoom again.

A few more items/ideas regarding features. I guess it is ok to stick to this issue and do not open a new one.

Super. Yep. I'll break them up and move them around as needed later, and close this one when everything is either resolved or delegated to another issue.

Programming the nodes with constructs like self.inputs[1] is inconvenient and error prone. It would be nicer to allow constructs with the actual input name, i.e. self.input.alat of self.input['alat']. Being able to provide actual names makes the code much easier to read, avoids errors when introducing new inputs/outputs since it does not require reindexing.

I'm 100% on board with this. Right now the inputs and outputs are just lists, but I should be able to override this with a list-like class that has a modified __getattr__ so it first checks if any of its elements has a label corresponding to the requested key.

In the medium-term this would be better to implement right in ryvencore instead of ironflow, but I'll do it as a proof-of-concept in our overridden node class first. My interactions with Leon Thomm have all been good and productive so far, but I'd rather bring him a working idea. There will also be some delay on this front as there is some disconnect between his latest version of ryvencore and what's available on Conda. After I get experience putting ironflow on conda-forge I'll see if I can help him streamline his release process so we can get his latest ideas in our dependency stack.

The following point is realted to pyiron, but required to get nice solutions for ironflow:
It would be good to have for some pyiron constructs an alternative formulation. An example is

    job.calc_md(temperature=300)  # previous
    job.calc = calculator.md(temperature=300)   # alternative

Such a formulation would allow us to easily create generic modules, e.g. for MD that can be used as input for VASP, LAMMPS, etc.
The same applies to job.server etc.

Yep. I will probably hack-and-slash something to this effect after I get the book-keeping (CI, pypi release, conda-forge release) all sorted out. My current vision is that things like the Lammps node can only ever run a static calculation (or no running of these nodes??) and that they otherwise need to be passed as input to MD, minimimization, NEB, VC-SGC, etc. calculators.

This is thus a little bit different from what you suggest here, something more like:

md = pr.atomistics.calculator.MD('my_md_run')
md.temperature = 300
md.n_steps = 1000

lammps = pr.atomistics.representation.Lammps('lammps_reference')
lammps.structure = pr.atomistics.structure.bulk('Fe').repeat(4)
md.representation = lammps

md.run()

In a perfect world I imagine that the calculator would then see if the representation has native support for this type of calculation, e.g. lammps has calc_md so we would basically just be using the calculator as a wrapper interface for that -- but then if the representation doesn't allow it that the calculator either falls back on a python-coded version of the algorithm, or throws an error (probably when the representation is assigned, so we fail as early as possible).

This is definitely best implemented right in the heart of pyiron, but I think it will be productive to play around with the idea here in ironflow where I'm just making it look like that's what the underlying model is. That will give us some freedom to experiment with different architectures/interfaces without breaking anything anyone else cares about.

@JNmpi
Copy link
Contributor Author

JNmpi commented Nov 12, 2022

Thanks for your quick and very constructive reply @liamhuber. Just a few thoughts regarding the calculator.MD idea. At a first glance it looked to me like your suggested version makes things more complex than what we have now. In the example given by you a single line

 job.calc_md(temperature=300)

would be replaced by 4 lines. In a jupyter notebook version (or general in a python script) I would not like to have it (or at least I would prefer to keep our present single line approach as one possible option). However, for a visual node based approach your pseudo code would work very well. We reduce the number of inputs to the LAMMPS node and make the MD module generic. The workflow would then look like:

structure -> LAMMPS -> MD

Based on this concept we could/should also convert other inputs into the LAMMPS job into nodes. Specifically, we could have the following flow:

  structure -> LAMMPS -> POTENTIAL -> MD -> SERVER

The advantage of this flow would be that the potential module has all the critical information (i.e. LAMMPS and structure). This construct is much cleaner than the original one, since we do not have to test that one input (structure) is there to generate another input (potential). The same applies to SERVER, where the info regarding the queue, number of cores etc. are provided. Our present LAMMPS job could be regarded as a macro node of this workflow, which collects all the input parameters of the inputs of the various nodes in the macro object, i.e., structure, potential etc.

These thoughts also show that for supporting the two very different approaches for workflows (visual node based one vs command like python based one) different code constructs are needed.

@JNmpi
Copy link
Contributor Author

JNmpi commented Nov 13, 2022

One more issue I forgot to mention in the last comment:

  • We should print not only the input labels in the graphical representation of the node but also the output ones. Actually, it appears the output is printed already but placed right to the dot representing the output. One can see this if one brings another node to the right side of the node, then the printout becomes visible. Thus, the fix should be simply to shift the end of the text to the left side of the output node.

@liamhuber
Copy link
Member

One more issue I forgot to mention in the last comment:

  • We should print not only the input labels in the graphical representation of the node but also the output ones. Actually, it appears the output is printed already but placed right to the dot representing the output. One can see this if one brings another node to the right side of the node, then the printout becomes visible. Thus, the fix should be simply to shift the end of the text to the left side of the output node.

Absolutely! There is just a bit of finicky front-end stuff that has led me to procrastinate on this: i.e. making sure the text is right-justified, handling truncation when the labels (also input) are too long, and experimenting with new node widths to see which compromise narrowness (for overall node compactness) and width (for fitting long labels) "feels" best.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working feature request request partner of "enhancement"
Projects
None yet
Development

No branches or pull requests

2 participants