Rclean: A Tool for Writing Cleaner, More Transparent Code #300

Closed
1 of 8 tasks
MKLau opened this issue May 14, 2019 · 7 comments

Comments

@MKLau

MKLau commented May 14, 2019

Submitting Author: Matthew K. Lau (@MKLau)
Repository: https://github.com/provtools/rclean


  • Paste the full DESCRIPTION file inside a code block below:
Type: Package
Package: Rclean
Title: A Tool for Writing Cleaner, More Transparent Code
Version: 1.1.0
Date: 2019-04-24
Author: Matthew K. Lau
Maintainer: Matthew K. Lau <[email protected]>
Description: To create clearer, more concise code, this toolbox
	     helps coders isolate the essential parts of a script that
	     produce a chosen result, such as an object, tables and figures
	     written to disk, and even warnings and errors.
URL: https://github.com/ProvTools/Rclean
BugReports: https://github.com/ProvTools/Rclean/issues
License: GPL-3 | file LICENSE
Imports: igraph, jsonlite, formatR, CodeDepends
Suggests: roxygen2, testthat
RoxygenNote: 6.0.1

Scope

  • Please indicate which category or categories from our package fit policies this package falls under (please check an appropriate box below):

    • data retrieval
    • data extraction
    • database access
    • data munging
    • data deposition
    • reproducibility
    • geospatial data
    • text analysis
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:

In writing analytical scripts, software best practices are often a
lower priority than producing inferential results, leading to large,
complicated code bases that often need refactoring. The "code
cleaning" capabilities of the Rclean package provide a means to
rigorously identify the minimal code required to produce a given
result (e.g. object, table, plot, etc.), reducing the effort required
to create simpler, more transparent code that is easier to reproduce.

  • Who is the target audience and what are scientific applications of
    this package?

The target audience is domain scientists who have little to no formal
training in software engineering. Multiple studies on scientific
reproducibility have pointed to data and software availability as
limiting factors. Rclean gives these users an easy-to-use tool for
writing cleaner analytical code.

There are other packages that analyze the syntax and structure of
code, such as lintr, formatR and cleanr. Rclean, as far as we are
aware, is the only package written for R that uses a data provenance
approach to map the interdependencies of objects and functions
and then uses graph analytics to rigorously identify the pathways,
and therefore the minimal code base, needed to generate a
result.
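
To illustrate the idea described above, here is a minimal Python sketch. It is not Rclean's implementation, which is written in R and imports CodeDepends and igraph. It assumes a straight-line script made of plain assignments, and the names `statement_deps`, `minimal_code` and the toy `script` are hypothetical. It reads each statement's defined and used variables without running the code, then walks the dependencies backward from a chosen result and keeps only the statements that result needs.

```python
import ast


def statement_deps(source):
    """Return (text, defined_names, used_names) for each top-level statement.

    The script is only parsed, never executed.
    """
    deps = []
    for node in ast.parse(source).body:
        defined, used = set(), set()
        for sub in ast.walk(node):
            if isinstance(sub, ast.Name):
                if isinstance(sub.ctx, ast.Store):
                    defined.add(sub.id)
                else:
                    used.add(sub.id)
        deps.append((ast.get_source_segment(source, node), defined, used))
    return deps


def minimal_code(source, target):
    """Keep only the statements that the variable `target` depends on.

    Assumption: straight-line code with plain assignments (no loops,
    branches, augmented assignments, or side effects).
    """
    stmts = statement_deps(source)
    needed = {target}
    keep = []
    # Walk backward: a statement is kept if it defines something still needed.
    for i in range(len(stmts) - 1, -1, -1):
        _, defined, used = stmts[i]
        if defined & needed:
            keep.append(i)
            # Its outputs are now satisfied; its inputs become needed.
            needed = (needed - defined) | used
    return [stmts[i][0] for i in sorted(keep)]


# Hypothetical toy script with two independent chains of results.
script = """\
a = 1
b = a + 1
c = 10
d = b * 2
e = c + 5
"""

for line in minimal_code(script, "d"):
    print(line)
```

Asking for `d` keeps the `a`, `b` and `d` statements and drops the unrelated `c` and `e` chain. Real scripts need more care (control flow, functions with side effects, files written to disk), which is where a full dependency graph becomes useful.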

  • Any other questions or issues we should be aware of?:

Not that I can think of at the moment.

@noamross
Contributor

Thank you for this inquiry, @MKLau! I have a question: Can the package work without provR, as described in the README? If the two are in a tightly linked workflow, it may make more sense to have both reviewed or do them together.

@MKLau
Author

MKLau commented May 17, 2019

Thanks @noamross! No, but it does require some kind of provenance input. I have updated the package to use prospective provenance, which doesn't require code execution to conduct analyses. I would say that this is now the primary workhorse for the package. Sorry, I'm still working on the docs and haven't pushed the feature into a release yet, so it's not yet described. It depends heavily on the CodeDepends package by Duncan Temple Lang. It's on CRAN, but I'm not sure if it's been reviewed by rOpenSci.

@noamross
Contributor

Hello @MKLau. After some discussion, we've decided that this package is in-scope under the reproducibility category for converting analysis scripts to reproducible workflows. I note that your package will need a vignette and a report of its test coverage before being sent for review.

@MKLau
Author

MKLau commented May 22, 2019

Hi @noamross, thanks, great to hear, and issues noted.

@MKLau
Author

MKLau commented Jul 10, 2019

@noamross

Hi Noam, just finished revisions to the package (including the addition of testing coverage and a vignette). Not sure what the next step is given your pre-review and my paused submission to JOSS.

Could you point me in the best direction?

@noamross
Contributor

My apologies for the delayed reply, @MKLau. I see you opened another issue as a pre-submission inquiry, but you can go ahead and do a full submission for review. @annakrystalli is acting editor-in-chief while I'm traveling and will assign it out.

@MKLau
Author

MKLau commented Jul 24, 2019

Thanks @noamross, and sorry for the confusion. Safe travels.
