Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PRE REVIEW]: strucscan: A lightweight Python-based framework for high-throughput material simulation #4519

Closed
editorialbot opened this issue Jun 27, 2022 · 51 comments
Assignees
Labels
pre-review Python Shell TeX Track: 2 (BCM) Biomedical Engineering, Biosciences, Chemistry, and Materials

Comments

@editorialbot
Copy link
Collaborator

editorialbot commented Jun 27, 2022

Submitting author: @thohamm (Thomas Hammerschmidt)
Repository: https://github.com/ICAMS/strucscan
Branch with paper.md (empty if default branch):
Version: 1.0
Editor: @ppxasjsm
Reviewers: @mturiansky, @wcwitt
Managing EiC: Kevin M. Moerman

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/cf152ba42d55db682d1ac29f951bcfe1"><img src="https://joss.theoj.org/papers/cf152ba42d55db682d1ac29f951bcfe1/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/cf152ba42d55db682d1ac29f951bcfe1/status.svg)](https://joss.theoj.org/papers/cf152ba42d55db682d1ac29f951bcfe1)

Author instructions

Thanks for submitting your paper to JOSS @thohamm. Currently, there isn't an JOSS editor assigned to your paper.

@thohamm if you have any suggestions for potential reviewers then please mention them here in this thread (without tagging them with an @). In addition, this list of people have already agreed to review for JOSS and may be suitable for this submission (please start at the bottom of the list).

Editor instructions

The JOSS submission bot @editorialbot is here to help you find and assign reviewers and start the main review. To find out what @editorialbot can do for you type:

@editorialbot commands
@editorialbot
Copy link
Collaborator Author

Hello human, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

@editorialbot
Copy link
Collaborator Author

Software report:

github.com/AlDanial/cloc v 1.88  T=0.08 s (967.0 files/s, 123169.0 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
Python                           25            902           1981           3338
Markdown                         13             95              0            403
TeX                               1             13              0            137
Jupyter Notebook                  4              0           1763            121
reStructuredText                  8             51             81             80
YAML                              8             20            135             68
Bourne Again Shell               14             25            137             57
make                              1              4              6              9
--------------------------------------------------------------------------------
SUM:                             74           1110           4103           4213
--------------------------------------------------------------------------------


gitinspector failed to run statistical information for the repository

@editorialbot
Copy link
Collaborator Author

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1016/0927-0256(96)00008-0 is OK
- 10.1103/PhysRevB.59.1758 is OK
- 10.1103/physrevb.54.11169 is OK
- 10.1016/j.cpc.2021.108171 is OK
- 10.1016/j.commatsci.2021.110731 is OK
- 10.1016/j.commatsci.2018.07.043 is OK
- 10.1016/j.commatsci.2017.07.030 is OK
- 10.1002/cpe.3505 is OK
- 10.1109/CCGRID.2001.923173 is OK
- 10.1088/1361-648x/aa680e is OK
- 10.1088/1367-2630/15/11/115016 is OK
- 10.1038/s41597-020-00638-4 is OK

MISSING DOIs

- 10.1007/10968987_3 may be a valid DOI for title: SLURM: Simple Linux Utility for Resource Management

INVALID DOIs

- doi.org/10.1016/j.commatsci.2012.10.028 is INVALID because of 'doi.org/' prefix

@editorialbot
Copy link
Collaborator Author

Wordcount for paper.md is 1447

@editorialbot
Copy link
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@thohamm
Copy link

thohamm commented Jun 27, 2022

Missing and invalid DOI fixed.

@thohamm
Copy link

thohamm commented Jun 27, 2022

@editorialbot generate pdf

@editorialbot
Copy link
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@Kevin-Mattheus-Moerman
Copy link
Member

@editorialbot check references

@editorialbot
Copy link
Collaborator Author

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1016/0927-0256(96)00008-0 is OK
- 10.1103/PhysRevB.59.1758 is OK
- 10.1103/physrevb.54.11169 is OK
- 10.1016/j.commatsci.2021.110731 is OK
- 10.1016/j.commatsci.2018.07.043 is OK
- 10.1016/j.commatsci.2017.07.030 is OK
- 10.1002/cpe.3505 is OK
- 10.1109/CCGRID.2001.923173 is OK
- 10.1007/10968987_3 is OK
- 10.1088/1361-648x/aa680e is OK
- 10.1088/1367-2630/15/11/115016 is OK
- 10.1038/s41597-020-00638-4 is OK
- 10.1016/j.commatsci.2012.10.028 is OK

MISSING DOIs

- None

INVALID DOIs

- None

@Kevin-Mattheus-Moerman
Copy link
Member

@thohamm I have had a quick look at your paper, can you check the following:

  • Please add city and country to your affiliation, do not use acronyms for countries.

@Kevin-Mattheus-Moerman
Copy link
Member

@editorialbot invite @jedbrown as editor

@editorialbot
Copy link
Collaborator Author

Invitation to edit this submission sent!

@thohamm
Copy link

thohamm commented Jun 29, 2022

Affiliation fixed.

@thohamm
Copy link

thohamm commented Jun 29, 2022

@editorialbot generate pdf

@editorialbot
Copy link
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@jedbrown
Copy link
Member

👋 Hi @thohamm. A few notes from my initial scan

  1. I see there are some "TODO" marks in the document.
  2. Can you please share with us a little about the provenance of this project? I see it was a new project importing source from elsewhere just prior to submission.
  3. It looks like there is intended to be Sphinx documentation, but I don't see it linked. Does that exist?
  4. Those docs say that pytest is used, but it looks like the code doesn't use pytest.
  5. Are there unit tests and can they be made to run with continuous integration (such as GitHub Actions)?
  6. If the output is tabular, why not put it into a Pandas dataframe for easy plotting and statistics?
  7. How does this package related to generic workflow management tools with support for batch computing (e.g., Snakemake, Parsl, Swift, libensemble)?
  8. To what extent does this package provide machine- or human-auditable provenance?
  9. To what extent does this package convey research, versus serve as a client to research software (VASP, etc.)? Would it be in scope for strucscan to compute summary statistics, make common figures, and/or do quality control to make the research more efficient or reliable?

@danielskatz
Copy link

👋 @thohamm - we do need a response from you - if we don't hear back in another week, we'll mark this submission as withdrawn.

@thohamm
Copy link

thohamm commented Jul 10, 2022

Thank you for your comments. We are working on them.

@thohamm
Copy link

thohamm commented Jul 12, 2022

@danielskatz, @jedbrown - Thank you for your detailed comments.

We reply to them point by point below. We would be willing to include according remarks in the paper if you find this helpful.

  1. Thank you for your comment, we fixed them.
  2. The development of this python code started about 6 months ago on the basis of 10-year old collection of shell scripts.
    After several iterations of restructuring program flow and data handling we now arrived at a converged version that is also
    already in practical use in our group. We didn't see much benefit in sharing older versions and therefore decided to
    set up a new git repo for this project.
  3. The Sphinx documentation has now been prepared and is available on https://strucscan.readthedocs.io/.
  4. We prepared tests using pytest but are still struggling with executing them due to write permission problems with GitHub
    Actions. For the moment we removed the corresponding remark from the documentation in the git repo and will include it
    again as soon as the tests are up and running.
  5. Some of the examples are planned to be used as unit tests as soon as the issues with github actions are resolved. (cf. 4)
  6. The data structure and connectivity of the output depends on the particular set of calculations and is therefore not very well
    suited for a pandas dataframe with an a-priori layout. We are instead storing dictionaries from which the user can collect
    data selectively for constructing dedicated dataframes. This is more convenient and flexible in our experience.
  7. In contrast to generic tools for batch computing, strucscan is dedicated to high-throughput simulations and provides a more
    sophisticated framework than a generic workflow manager. In particular, it offers generalized interfaces to different
    simulation software packages, interfaces to schedulers including monitoring tools, a transparent storage structure for
    convenient post-processing, and measures for data connectivity.
  8. The provenance of the strucscan code itself is given by the github versioning that starts at the first converged version of the
    software (cf. 2). The provenance of individual calculations with strucscan is realized by storing all input and output files in
    a folder tree. This storage is in addition to the collection of central results in dictionaries. This approach of storing the full
    data set as provenance of individual calculations provides the user full flexibility of adapting to external meta-data schema,
    e.g. of third-party databases.
  9. The goal of strucscan is to support the user to convey high-throughput research. This means particularly to cope with the
    challenges of starting, monitoring, collecting and analyzing very large numbers (thousands, ten-thousands) of calculations.
    The goal of strucscan is to provide the basis for exactly the mentioned purposes, i.e., summary statistics, common figures
    and quality control. This means a framework for generating large sets of quality-checked data stored in a transparent data
    structure. On this ground the user can then use the basic post-processing of strucscan or further tools of data visualization
    and analytics.

@thohamm
Copy link

thohamm commented Jul 12, 2022

@editorialbot generate pdf

@editorialbot
Copy link
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@arfon
Copy link
Member

arfon commented Jul 18, 2022

@jedbrown – do these comments satisfy your concerns, and if so, would you be able to edit this submission for JOSS?

@Kevin-Mattheus-Moerman
Copy link
Member

@jedbrown 👋 ☝️

@Kevin-Mattheus-Moerman
Copy link
Member

@jedbrown Can you check those comments?

@jedbrown
Copy link
Member

I'm sorry to be slow catching up after our family got covid (via 2yo's childcare). @thohamm, thanks for your updates and comprehensive response. I think we still need a scope check with other editors in light of it being nominally general purpose (thus user is responsible for app-specific reliability checks/statistics/plotting), but the examples and known uses thus far are for VASP. The main question is to what extent people outside your research group would use this tool for their research or would disseminate research via this tool (say, as a platform for CS research).

Regarding packaging, it looks like you're missing pyyaml. This in a fresh virtualenv:

$ pip install .
Processing /home/jed/joss/strucscan
  Preparing metadata (setup.py) ... done
[...]
Successfully installed strucscan-0.post0.dev68
$  strucscan --help
Traceback (most recent call last):
  File "/home/jed/joss/strucscan/VENV/bin/strucscan", line 5, in <module>
    from strucscan.cli import main
  File "/home/jed/joss/strucscan/VENV/lib/python3.10/site-packages/strucscan/__init__.py", line 1, in <module>
    from strucscan.__init__ import *
  File "/home/jed/joss/strucscan/VENV/lib/python3.10/site-packages/strucscan/__init__.py", line 7, in <module>
    from strucscan.utils import *
  File "/home/jed/joss/strucscan/VENV/lib/python3.10/site-packages/strucscan/utils.py", line 2, in <module>
    import yaml
ModuleNotFoundError: No module named 'yaml'

Installing pyyaml fixes this, but strucscan --help still doesn't work. (I know it doesn't claim to, but this is a normal expectation.)

@arfon arfon added query-scope Submissions of uncertain scope for JOSS and removed query-scope Submissions of uncertain scope for JOSS labels Aug 13, 2022
@ppxasjsm
Copy link

@editorialbot assign me as editor

@editorialbot
Copy link
Collaborator Author

Assigned! @ppxasjsm is now the editor

@thohamm
Copy link

thohamm commented Aug 16, 2022

@jedbrown Thank you for your comments!

  • We included pyyaml in the installation setup for pip and conda-forge.
  • We added a brief print out page when calling strucscan with '--help'.
  • Regarding the scope we would like to emphasize that we developed strucscan as tool for conducting, monitoring and collecting data from high-throughput atomistic simulation calculations. We included an interface to a widely used DFT code (VASP) as example and provide a simple interface for extension to other codes. We know from our own experience and from discussions with other groups that there is a need for transparent and easily-extendable high-throughput tools for atomistic simulations. We are therefore convinced that strucscan will be useful beyond our group and will support users to adapt it for their needs.

@jedbrown
Copy link
Member

Thanks @thohamm. Also, welcome @ppxasjsm 👋 and thanks for agreeing to edit this submission.

@ppxasjsm
Copy link

@thohamm, I'll be acting as your editor for this submission. I am new to this role, so bear with me.
However, to get things going, could you please suggest 5-6 reviewers from this list that you consider appropriate to vet this paper?

@thohamm
Copy link

thohamm commented Aug 18, 2022

@ppxasjsm Thank you for acting as editor for our submission. I would like to suggest bocklund, mturiansky, utf, raghurama123, wcwitt, lucydot as reviewers.

@ppxasjsm
Copy link

Hi @mturiansky, would you be willing to review this submission?

@ppxasjsm
Copy link

Hi @utf, would you be willing to review this submission?

@ppxasjsm
Copy link

@thohamm thank you for your suggestions. Let's see what people say.

@mturiansky
Copy link

Hi @mturiansky, would you be willing to review this submission?

Hello! I should be able to review this, but I will need a few weeks if that's okay, since I am a bit busy at the moment.

@ppxasjsm
Copy link

Hello! I should be able to review this, but I will need a few weeks if that's okay, since I am a bit busy at the moment.

Yes that should be fine! Thank you for agreeing to do the review.

@ppxasjsm
Copy link

@editorialbot add @mturiansky as reviewer

@editorialbot
Copy link
Collaborator Author

@mturiansky added to the reviewers list!

@utf
Copy link

utf commented Aug 23, 2022

Hi @ppxasjsm, I would love to review this, It looks right up my alley but unfortunately I won't have time over the next couple of months.

@ppxasjsm
Copy link

Hi @ppxasjsm, I would love to review this, It looks right up my alley but unfortunately I won't have time over the next couple of months.

No problem! Thank you for letting me know!

@ppxasjsm
Copy link

@lucydot, would you be willing to review this submission?

@ppxasjsm
Copy link

@wcwitt, would you be willing to review this submission?

@wcwitt
Copy link

wcwitt commented Aug 30, 2022

Yes, no problem.

@ppxasjsm
Copy link

Brilliant thank you 👍

@ppxasjsm
Copy link

@editorialbot add @wcwitt as reviewer

@editorialbot
Copy link
Collaborator Author

@wcwitt added to the reviewers list!

@ppxasjsm
Copy link

@editorialbot start review

@editorialbot
Copy link
Collaborator Author

OK, I've started the review over in #4719.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pre-review Python Shell TeX Track: 2 (BCM) Biomedical Engineering, Biosciences, Chemistry, and Materials
Projects
None yet
Development

No branches or pull requests

10 participants