
[REVIEW]: GridTest: testing and metrics collection for Python #2284

Closed
38 tasks done
whedon opened this issue Jun 1, 2020 · 70 comments

@whedon

whedon commented Jun 1, 2020

Submitting author: @vsoch (Vanessa Sochat)
Repository: https://github.com/vsoch/gridtest
Version: 0.0.15
Editor: @usethedata
Reviewer: @cbrueffer, @khinsen
Archive: 10.5281/zenodo.3930366

⚠️ JOSS reduced service mode ⚠️

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Status


Status badge code:

HTML: <a href="https://joss.theoj.org/papers/ff67c32a99f2132190e2794cf94768a7"><img src="https://joss.theoj.org/papers/ff67c32a99f2132190e2794cf94768a7/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/ff67c32a99f2132190e2794cf94768a7/status.svg)](https://joss.theoj.org/papers/ff67c32a99f2132190e2794cf94768a7)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@cbrueffer & @khinsen, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. If you have any questions or concerns, please let @usethedata know.

Please try to complete your review in the next six weeks.

Review checklist for @cbrueffer

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

  • I confirm that I read and will adhere to the JOSS code of conduct.

General checks

  • Repository: Is the source code for this software available at the repository URL?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@vsoch) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Review checklist for @khinsen

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

  • I confirm that I read and will adhere to the JOSS code of conduct.

General checks

  • Repository: Is the source code for this software available at the repository URL?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@vsoch) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?
@whedon

whedon commented Jun 1, 2020

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @cbrueffer, @khinsen, it looks like you're currently assigned to review this paper 🎉.

⚠️ JOSS reduced service mode ⚠️

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

⭐ Important ⭐

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository, which means that with GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

[screenshot: setting the repository to 'Not watching']

  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

[screenshot: notification settings]

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf

@whedon

whedon commented Jun 1, 2020

Reference check summary:

OK DOIs

- None

MISSING DOIs

- None

INVALID DOIs

- None

@whedon

whedon commented Jun 1, 2020

👉 Check article proof 📄 👈

@usethedata

@cbrueffer and @khinsen Thanks for agreeing to review. Documenting here that, per earlier discussion, @cbrueffer won't be able to start this review until mid-June, so the timeline on this one can be a bit longer. Please reach out to me, either through a comment or through the email in my profile, if you have any questions or needs.

@khinsen

khinsen commented Jun 8, 2020

@usethedata I am stuck with this review, and I'd appreciate some guidance on how to proceed.

The big problem that I see with this software is a lack of clarity in the documentation. There are many specific points that I could list (here or as issues on the software's repository), but that would be a lot of work (because there are so many points) and therefore I won't be able to do this rapidly. More importantly, there is no clear high-level overview in the documentation. I find it hard to answer questions such as "Have the functional claims of the software been confirmed?" in the review checklist, because there aren't any clear claims of functionality; there are just many examples. There is no clear "statement of need" either.

Taking another step back in the assessment of this submission, I am not convinced that the functionality that gridtest seems to implement (from my cursory inspection of the code) is a good candidate for packaging as a reusable unit. My impression is that, for the use cases I can imagine, individual potential users would be better off implementing just the parts of the functionality they need in their own small scripts. The cost of an additional dependency in future maintenance, plus the effort required for understanding if gridtest is suitable for their needs and figuring out how to use it, seem too high compared to the expected return. There are clearly good ideas and concepts behind gridtest, but I think they would be better conveyed by a tutorial article accompanied by small example scripts that interested users could easily adapt to their specific needs (this is Donald Knuth's concept of "re-editable software", introduced in this interview and explored in more detail in this blog post by Ted Unangst and in this article by myself).

@vsoch

vsoch commented Jun 8, 2020

@khinsen I'm sorry that you don't appreciate my library. It's a general tool for reproducible generation of grids, and I know that there is quite a bit of documentation; I'd be happy to address that list of points that you mention. I see this as a pretty good high level overview:

https://vsoch.github.io/gridtest/

And this too:

https://vsoch.github.io/gridtest/getting-started/

and high level along with use cases here:

https://github.com/vsoch/gridtest/blob/master/paper/paper.md

And there are many tutorials too, as you mention would be good to have.

https://vsoch.github.io/gridtest/tutorials/

I don't have suggestions other than that, if you don't feel fit to review, I think we still have time to find someone else.

update: sorry, I copy-pasted some of the links wrong, embarrassing!

@usethedata

@khinsen Thanks for your comments. It is interesting to me that we've both read the documentation and code and come to some very different conclusions. I confess that I struggle to understand the perspectives of "no clear statement of need" and "no high level overview". That's part of what's taken me a while to respond to your questions, since my read of the paper and documentation gave me a clear understanding of the problem(s) the author is working to solve and an overview of that solution. There is also, to me, a very clear statement of functionality, which is the generation of grids, for which a primary purpose is testing software.

I realize that trying to explain things in this context (purely written) can be challenging, since all of the non-verbal cues that express context are missing. But I do need to ask for more information about what you see as missing from the documentation noted above.

I understand the points you raise regarding reusable code versus re-editable code. My thought is that, as you note, there is not a one size fits all answer to that balance. A user is free to use this code either as a module (reusable, as presently developed) or from a copy and paste (re-editable).

This particular format (written only, lacking all of the non-verbals) can make it challenging to convey complex subjects. I would be happy to get on a video call with you (Zoom, GoToMeeting, Teams, WebEx, ...) to discuss, if you feel that would be useful. I'm in US Eastern Time (UTC-04:00) and can adjust my schedule to find a time at your convenience.

@cbrueffer

Chiming in a bit late; since @khinsen's comment above, some changes have been made to the repository README. This, at least to me, now gives a clearer picture of the software. I think the functionality is quite well described in both the documentation and the paper.

To me, what is currently missing is a description of the state of the art, how GridTest fits in, and what it adds over existing tools; I've opened PR vsoch/gridtest#34 to discuss this. Addressing this may also alleviate @khinsen's comments about unclear "statement of need".

In terms of code maintenance, I agree that it's sometimes difficult to find a sweet spot for functionality. I agree with @usethedata that this software seems easy to use both as a package, and as a source for copying code. In this regard, I don't see this as an argument against paper acceptance.

Not sure how these reviews usually go, but I've opened the following PRs and issues while going through the checklist, and @vsoch has been very responsive in addressing them: vsoch/gridtest#26, vsoch/gridtest#32, vsoch/gridtest#33, vsoch/gridtest#31

@vsoch

vsoch commented Jun 20, 2020

Thank you @cbrueffer! It's always really a wonderful surprise when a reviewer takes the initiative to help out with changes, and I really appreciate that. The last issue (for vsoch/gridtest#34) now has a PR open to address it: vsoch/gridtest#35.

@khinsen

khinsen commented Jun 22, 2020

Back to this review after a break for dealing with real-life issues, sorry for the delay.

With all the changes to the documentation, the functionality of this package has become a lot clearer! The one point I'd like to see improved is the definition of what a grid actually is in the specific context of this package. Coming from a background of good old number crunching, my intuitive definition is pretty close to this, an essential aspect being regularity. From the examples I see that gridtest takes a more general point of view which even includes random sampling, which from my point of view is the exact opposite of a grid (see e.g. discussion of Monte-Carlo integration vs. grid-based numerical integration).

This is in fact part of my initial struggle to understand the point of this package. My very first reaction was "what does this do better than numpy.arange(a, b, c)?" Going through the examples quickly shows that there is more, but I still have no clear idea of what exactly there is.
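
A minimal sketch of the contrast I mean (the values here are purely illustrative, not taken from gridtest):

```python
# A regular grid is the Cartesian product of evenly spaced axis values;
# random sampling of the same parameter space is irregular by construction.
import itertools
import random

import numpy as np

xs = np.arange(0.0, 1.5, 0.5)  # the numpy.arange(a, b, c) case: [0.0, 0.5, 1.0]
ys = [1, 2, 3]

# Regular 2-D grid: nine (x, y) points with fixed spacing along each axis.
regular_grid = list(itertools.product(xs, ys))

# Random sampling of the same space: no regularity at all.
random.seed(0)
random_points = [(random.uniform(0.0, 1.0), random.choice(ys)) for _ in range(9)]
```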

@usethedata Thanks for your feedback, and sorry for the late reply. I suspect that our different initial reactions are mainly due to coming from different backgrounds, which is why it's always good to have several reviewers! Thanks also for your offer of a video call. Unfortunately, these days I do reviews in the breaks between video calls, which are already far too numerous on my agenda!

As for reusable/re-editable, in the Open Source world that's always a user choice from a purely technical/legal point of view. However, the way this package is written and presented clearly says "install and reuse me". A re-editable version would be just a collection of commented examples, with each of them containing no more code than strictly necessary for one specific use case.
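
A hypothetical sketch of what I mean (the function and grid values are made up): a self-contained script with no more code than one use case needs, meant to be copied and edited rather than installed:

```python
# Hypothetical "re-editable" script: one specific use case (run a function
# over a small parameter grid and time each call), nothing more.
import itertools
import time

def add(x, y):
    # Placeholder for the function a user actually cares about.
    return x + y

for x, y in itertools.product([1, 2], [3, 4]):
    start = time.perf_counter()
    result = add(x, y)
    elapsed = time.perf_counter() - start
    print(f"add({x}, {y}) = {result} [{elapsed:.2e}s]")
```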

@vsoch

vsoch commented Jun 22, 2020

Thank you @khinsen for your feedback! I'd be happy to add more detail about the definition of a grid, and let's discuss 1) where it would be best, and 2) what additional detail you are looking for. The first spot we mention grids is on the first page you encounter, under the "parameterization" section:

https://vsoch.github.io/gridtest/#grids

[screenshot of the "grids" section of the documentation]

The second place is under the getting started guide, there is a "concepts" section from this page https://vsoch.github.io/gridtest/getting-started/index.html#concepts:

[screenshot of the "concepts" section of the getting started guide]

In both I make a very general reference to a grid of parameters, but you are correct that I don't make reference to any kind of grid search (or anything related to machine learning). What would you look for in either/both of those sections to make the distinction? You might also be interested to see the background section that I added per @cbrueffer's feedback, which does bring in some of these more ML oriented libraries: https://github.com/vsoch/gridtest/pull/35/files#diff-e2778e7674f45b806282a9611faa7220R37 (please feel free to give feedback there too!)

Anyhoo, let's discuss what could be made clearer, linked to, etc., and how you would like to improve the documentation to clear this up.

@khinsen

khinsen commented Jun 22, 2020

I'd put the explanation in the "parametrization" section, and then put a link on the word "grid" in the getting started guide, and also in the README.

As for the contents of that explanation... I don't know, that's why I am asking! I doubt it requires any reference to machine learning libraries, parameter grids have been used for decades in other applications. What I'd like to see is a description of the types of grids that are supported. If there is any functionality designed for a specific non-obvious use case, a comment would be nice as well.

BTW, the new "background" section is very useful; it provides the statement of need for people coming from machine learning (which is not me!)

@vsoch

vsoch commented Jun 22, 2020

Sounds good! Here is a PR with the suggested changes vsoch/gridtest#36.

@khinsen

khinsen commented Jun 29, 2020

@whedon generate pdf

@whedon

whedon commented Jun 29, 2020

👉 Check article proof 📄 👈

@khinsen

khinsen commented Jun 29, 2020

@usethedata @vsoch I am overall happy with the software and its documentation now. Re-reading the paper, I note it suffers from the same problems as the initial documentation, and deserves an overhaul along the same lines: summarize what GridTest does and which development situations it addresses.

The section "use cases" starts by stating that there are use cases beyond testing - fine, but it would be great to explain the testing use cases first! The first sub-section, "Grids as first-class citizens", doesn't even describe a use case.

The "state of the field" aspect also needs some improvements. It should outline the differences compared to the well-known Python testing libraries and compared to grid-generating functionality in numerical libraries (such as scikit-learn).

@vsoch

vsoch commented Jun 29, 2020

@khinsen I'll get on this! I was slow to respond to vsoch/gridtest#35 (I didn't see the comment!), so once that is merged, I'll also work on the use cases section.

@vsoch

vsoch commented Jun 30, 2020

@whedon generate pdf

@whedon

whedon commented Jun 30, 2020

👉 Check article proof 📄 👈

@vsoch

vsoch commented Jun 30, 2020

@whedon generate pdf

@whedon

whedon commented Jun 30, 2020

👉 Check article proof 📄 👈

@vsoch

vsoch commented Jun 30, 2020

@khinsen please take a look! I've done the following changes:

  • extended the background section to lead into the idea of "grids as first class citizens"
  • moved the concepts section to appear before use cases
  • extended the use cases to include the (somewhat modified) content that we added to the readme, etc. about the different use cases for grids.

Please take a look and let me know your feedback!

@khinsen

khinsen commented Jul 2, 2020

@vsoch Thanks, that looks good! I like the many links to further details on the Web site.

@whedon

whedon commented Jul 5, 2020

@usethedata

@whedon accept

whedon added the recommend-accept label on Jul 5, 2020
@whedon

whedon commented Jul 5, 2020

Attempting dry run of processing paper acceptance...

@whedon

whedon commented Jul 5, 2020

Reference check summary:

OK DOIs

- 10.1107/S2059798320003198 is OK

MISSING DOIs

- https://doi.org/10.1163/1574-9347_dnp_e612900 may be missing for title: Keras

INVALID DOIs

- None

@whedon

whedon commented Jul 5, 2020

👋 @openjournals/joss-eics, this paper is ready to be accepted and published.

Check final proof 👉 openjournals/joss-papers#1541

If the paper PDF and Crossref deposit XML look good in openjournals/joss-papers#1541, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.

@whedon accept deposit=true

@danielskatz

@vsoch - I'm the AEiC on duty currently, and this mostly looks good to go. One exception is the use of second person, which I would prefer to see as third person for a scientific article, as opposed to a tutorial or user documentation, where second person would be fine. Would you be willing to change this?

@vsoch

vsoch commented Jul 5, 2020

Sure! How would you like me to change it - refer to "the user"?

@danielskatz
Copy link

that's what I was thinking

@vsoch

vsoch commented Jul 5, 2020

okay, all set!

vsoch/gridtest@b6e2eed

@whedon generate pdf

@danielskatz

@whedon generate pdf

@whedon

whedon commented Jul 5, 2020

👉 Check article proof 📄 👈

@danielskatz

One more fix - in line 153 of the source, "it's" should be "its"

@vsoch

vsoch commented Jul 5, 2020

Fixed! vsoch/gridtest@ce4d3d1

That particular word has always been trouble for me - thanks for the catch!

@danielskatz

@whedon accept

@whedon

whedon commented Jul 5, 2020

Attempting dry run of processing paper acceptance...

@whedon

whedon commented Jul 5, 2020

Reference check summary:

OK DOIs

- 10.1107/S2059798320003198 is OK

MISSING DOIs

- https://doi.org/10.1163/1574-9347_dnp_e612900 may be missing for title: Keras

INVALID DOIs

- None

@whedon

whedon commented Jul 5, 2020

👋 @openjournals/joss-eics, this paper is ready to be accepted and published.

Check final proof 👉 openjournals/joss-papers#1542

If the paper PDF and Crossref deposit XML look good in openjournals/joss-papers#1542, then you can now move forward with accepting the submission by compiling again with the flag deposit=true e.g.

@whedon accept deposit=true

@danielskatz

@whedon accept deposit=true

@whedon

whedon commented Jul 5, 2020

Doing it live! Attempting automated processing of paper acceptance...

whedon added the accepted and published labels on Jul 5, 2020
@whedon

whedon commented Jul 5, 2020

🐦🐦🐦 👉 Tweet for this paper 👈 🐦🐦🐦

@whedon

whedon commented Jul 5, 2020

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

  1. Check final PDF and Crossref metadata that was deposited 👉 Creating pull request for 10.21105.joss.02284 joss-papers#1543
  2. Wait a couple of minutes to verify that the paper DOI resolves https://doi.org/10.21105/joss.02284
  3. If everything looks good, then close this review issue.
  4. Party like you just published a paper! 🎉🌈🦄💃👻🤘

Any issues? Notify your editorial technical team...

@vsoch

vsoch commented Jul 5, 2020

Woohoo!! Thank you @usethedata @khinsen @cbrueffer @whedon and @danielskatz - this was a lot of fun!

@danielskatz

Thanks to @cbrueffer & @khinsen for reviewing, and @usethedata for editing!

And congratulations to @vsoch (Vanessa Sochat)!!

@whedon

whedon commented Jul 5, 2020

🎉🎉🎉 Congratulations on your paper acceptance! 🎉🎉🎉

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](https://joss.theoj.org/papers/10.21105/joss.02284/status.svg)](https://doi.org/10.21105/joss.02284)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.02284">
  <img src="https://joss.theoj.org/papers/10.21105/joss.02284/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: https://joss.theoj.org/papers/10.21105/joss.02284/status.svg
   :target: https://doi.org/10.21105/joss.02284

This is how it will look in your documentation:

[DOI badge]

We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us, please consider doing either one (or both) of the following:
