-
-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW]: GridTest: testing and metrics collection for Python #2284
Comments
Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @cbrueffer, @khinsen it looks like you're currently assigned to review this paper 🎉. Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post. ⭐ Important ⭐ If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿 To fix this do the following two things:
For a list of things I can do to help you, just type:
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
|
|
@cbrueffer and @khinsen Thanks for agreeing to review. Documenting here that, per earlier discussion, @cbrueffer won't be able to start this review until mid-June, so the timeline on this one can be a bit longer. Please reach out to me, either through a comment or through the email in my profile, if you have any questions or needs. |
@usedata I am stuck with this review, and I'd appreciate some guidance of how to proceed. The big problem that I see with this software is lack of clarity in the documentation. There are many specific points that I could list (here or as issues on the software's repository), but that would be a lot of work (because there are so many points) and therefore I won't be able to do this rapidly. More importantly, there is no clear high-level overview in the documentation. I find it hard to answer questions such as "Have the functional claims of the software been confirmed?" in the review checklist, because there aren't any clear claims of functionality, there are just many examples. There is no clear "statement of need" either. Taking another step back in the assessment of this submission, I am not convinced that the functionality that gridtest seems to implement (from my cursory inspection of the code) is a good candidate for packaging as a reusable unit. My impression is that, for the use cases I can imagine, individual potential users would be better off implementing just the parts of the functionality they need in their own small scripts. The cost of an additional dependency in future maintenance, plus the effort required for understanding if gridtest is suitable for their needs and figuring out how to use it, seem too high compared to the expected return. There are clearly good ideas and concepts behind gridtest, but I think they would be better conveyed by a tutorial article accompanied by small example scripts that interested users could easily adapt to their specific needs (this is Donald Knuth's concept of "re-editable software", introduced in this interview and explored in more detail in this blog post by Ted Unangst and in this article by myself). |
@khinsen Im sorry that you don’t appreciate my library, it’s a general tool for reproducible generation of grids and I know that there is quite a bit of documentation, I’d be happy to address that list of points that you mention. I see this as a pretty good high level overview: https://vsoch.github.io/gridtest/ And this too: https://vsoch.github.io/gridtest/getting-started/ and high level along with use cases here: https://github.com/vsoch/gridtest/blob/master/paper/paper.md And there are many tutorials too, as you mention would be good to have. https://vsoch.github.io/gridtest/tutorials/ I don’t have suggestions other than perhaps suggesting that if you don’t feel fit to review, I think we still have time to find someone else. update sorry copy pasted some of the links wrong, embarrassing! |
@khinsen Thanks for your comments. It is interesting to me that we've both read the documentation and code and come to some very different conclusions. I confess that I struggle to understand the perspectives of "no clear statement need" and "no high level overview". That's part of what's taken me a while to respond to your questions, since my read of the paper and documentation gave me a clear understanding of the problem(s) the author is working to solve and the overview of that solution. There is also, to me, a very clear statement of functionality, which is the generation of grids, for which a primary purpose is testing software. I realize that trying to explain things in this context (purely written) can be challenging, since all of the non-verbal clues to express context are missing. But I do need to ask for more information about what you see is missing from the documentation noted above. I understand the points you raise regarding reusable code versus re-editable code. My thought is that, as you note, there is not a one size fits all answer to that balance. A user is free to use this code either as a module (reusable, as presently developed) or from a copy and paste (re-editable). This particular format (written only, lacking all of the non-verbals) can be challenging to convey complex subjects. I would be happy to get on a video call with you (Zoom, GoToMeeting, Teams, WebEx, ...) to discuss, if you feel that would be useful. I'm in US Eastern Time (UTC-04:00) and can adjust my schedule to find a time at your convenience. |
Chiming in a bit late, since @khinsen's comment above some changes have been made in the repository README. This, at least to me, now gives a clearer picture of the software. I think the functionality is quite well described in both the documentation and the paper. To me, what is currently missing is a description of the state of the art, how GridTest fits in, and what it adds over existing tools; I've opened PR vsoch/gridtest#34 to discuss this. Addressing this may also alleviate @khinsen's comments about unclear "statement of need". In terms of code maintenance, I agree that it's sometimes difficult to find a sweet spot for functionality. I agree with @usethedata that this software seems easy to use both as a package, and as a source for copying code. In this regard, I don't see this as an argument against paper acceptance. Not sure how these reviews usually go, but I've opened the following PRs and issues while going through the checklist and @vsoch has been very responsive in addressing them: vsoch/gridtest#26, vsoch/gridtest#32, vsoch/gridtest#33, vsoch/gridtest#31 |
Thank you @cbrueffer! It's always really a wonderful surprise when a reviewer takes the initiative to help out with changes, and I really appreciate that. The last issue (for vsoch/gridtest#34) now has a PR open to address it: vsoch/gridtest#35. |
Back to this review after a break for dealing with real-life issues, sorry for the delay. With all the changes to the documentation, the functionality of this package has become a lot clearer! The one point I'd like to see improved is the definition of what a grid actually is in the specific context of this package. Coming from a background of good old number crunching, my intuitive definition is pretty close to this, an essential aspect being regularity. From the examples I see that gridtest takes a more general point of view which even includes random sampling, which from my point of view is the exact opposite of a grid (see e.g. discussion of Monte-Carlo integration vs. grid-based numerical integration). This is in fact part of my initial struggle to understand the point of this package. My very first reaction was "what does this do better than @usedata Thanks for your feedback, and sorry for the late reply. I suspect that our different initial reaction is mainly due to coming from different backgrounds, which is why it's always good to have several reviewers! Thanks also for your offer of a video call. Unfortunately, these days I do reviews in the break between video calls, which are already far too numerous on my agenda! As for reusable/re-editable, in the Open Source world that's always a user choice from a purely technical/legal point of view. However, the way this package is written and presented clearly says "install and reuse me". A re-editable version would be just a collection of commented examples, with each of them containing no more code than strictly necessary for one specific use case. |
Thank you @khinsen for your feedback! I'd be happy to add more detail about the definition for a grid, and let's discuss 1) where it would be best, and 2) what additional detail you are looking for. The first spot we mention grids is on the first page you encounter, under the "parameterization" section: https://vsoch.github.io/gridtest/#grids The second place is under the getting started guide, there is a "concepts" section from this page https://vsoch.github.io/gridtest/getting-started/index.html#concepts: In both I make a very general reference to a grid of parameters, but you are correct that I don't make reference to any kind of grid search (or anything related to machine learning). What would you look for in either/both of those sections to make the distinction? You might also be interested to see the background section that I added per @cbrueffer feedback, which does bring in some of these more ML oriented libraries https://github.com/vsoch/gridtest/pull/35/files#diff-e2778e7674f45b806282a9611faa7220R37 (please feel free to give feedback there too!) Anyhoo, let's discuss what could be made clearer, linked to, etc., how you would like to improve the documentation to clear this up. |
I'd put the explanation in the "parametrization" section, and then put a link on the word "grid" in the getting started guide, and also in the README. As for the contents of that explanation... I don't know, that's why I am asking! I doubt it requires any reference to machine learning libraries, parameter grids have been used for decades in other applications. What I'd like to see is a description of the types of grids that are supported. If there is any functionality designed for a specific non-obvious use case, a comment would be nice as well. BTW, the new "background" section is very useful, it provides the statement of need for people coming from machine learning (which is not me!) |
Sounds good! Here is a PR with the suggested changes vsoch/gridtest#36. |
@whedon generate pdf |
@usethedata @vsoch I am overall happy with the software and its documentation now. Re-reading the paper, I note it suffers from the same problems as the initial documentation, and deserves an overhaul along the same lines: summarize what GridTest does and which development situations it addresses. The section "use cases" starts by stating that there are use cases beyond testing - fine, but it would be great to explain the testing use cases first! The first sub-section, "Grids as first-class citizens" doesn't even describe a use case. The "state of the field" aspect also needs some improvements. It should outline the differences compared to the well-known Python testing libraries and compared to grid-generating functionality in numerical libraries (such as scikit-learn). |
@khinsen I'll get on this! I was slow to respond to vsoch/gridtest#35 (I didn't see the comment!) so once that is merged, I'll also work on the use cases section. |
@whedon generate pdf |
@whedon generate pdf |
@khinsen please take a look! I've done the following changes:
Please take a look and let me know your feedback! |
@vsoch Thanks, that looks good! I like the many links to further details on the Web site. |
@whedon accept |
|
|
👋 @openjournals/joss-eics, this paper is ready to be accepted and published. Check final proof 👉 openjournals/joss-papers#1541 If the paper PDF and Crossref deposit XML look good in openjournals/joss-papers#1541, then you can now move forward with accepting the submission by compiling again with the flag
|
@vsoch - I'm the AEiC on duty currently, and this mostly looks good to go. One exception is the use of second person, which I would prefer to see as third person for a scientific article, as opposed to a tutorial or user documentation, where second person would be fine. Would you be willing to change this? |
Sure! How would you like me to change it - refer to "the user?" |
that's what I was thinking |
okay, all set! @whedon generate pdf |
@whedon generate pdf |
One more fix - in line 153 of the source, "it's" should be "its" |
Fixed! vsoch/gridtest@ce4d3d1 That particular word has always been trouble for me - thanks for the catch! |
@whedon accept |
|
|
👋 @openjournals/joss-eics, this paper is ready to be accepted and published. Check final proof 👉 openjournals/joss-papers#1542 If the paper PDF and Crossref deposit XML look good in openjournals/joss-papers#1542, then you can now move forward with accepting the submission by compiling again with the flag
|
@whedon accept deposit=true |
|
🐦🐦🐦 👉 Tweet for this paper 👈 🐦🐦🐦 |
🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨 Here's what you must now do:
Any issues? Notify your editorial technical team... |
Woohoo!! Thank you @usethedata @khinsen @cbrueffer @whedon and @danielskatz - this was a lot of fun! |
Thanks to @cbrueffer & @khinsen for reviewing, and @usethedata for editing! And congratulations to @vsoch (Vanessa Sochat)!! |
🎉🎉🎉 Congratulations on your paper acceptance! 🎉🎉🎉 If you would like to include a link to your paper from your README use the following code snippets:
This is how it will look in your documentation: We need your help! Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the the following:
|
Submitting author: @vsoch (Vanessa Sochat)
Repository: https://github.com/vsoch/gridtest
Version: 0.0.15
Editor: @usethedata
Reviewer: @cbrueffer, @khinsen
Archive: 10.5281/zenodo.3930366
Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@cbrueffer & @khinsen, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @usethedata know.
✨ Please try and complete your review in the next six weeks ✨
Review checklist for @cbrueffer
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @khinsen
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
The text was updated successfully, but these errors were encountered: