Submission: cRegulome #149
Comments
👋 @MahShaaban |
That's alright @karthik. Safe travels. |
Thanks for the submission! Looking for reviewers now. I also have two comments about one vignette and one test below. Editor checks:
Here are some suggestions from ── GP cRegulome ───────────── It is good practice to ✖ write unit tests for all functions, and all package code in
✖ not import packages as a whole, as this can cause name clashes between the imported packages. Instead, import only the specific functions you need. Minor comments:
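As a rough illustration of the selective-import advice above, the sketch below uses the roxygen2 `@importFrom` tag to import a single function rather than a whole package. The function `center_by_median` is hypothetical and is not part of cRegulome; `stats::median` stands in for whichever function a package actually needs.

```r
#' Median-centre a numeric vector (hypothetical helper, for illustration only)
#'
#' @param x A numeric vector.
#' @return `x` with its median subtracted.
#' @importFrom stats median
#' @export
center_by_median <- function(x) {
  # The explicit `stats::` prefix works even without the import directive
  # and makes the origin of the function obvious to readers.
  x - stats::median(x, na.rm = TRUE)
}

center_by_median(c(1, 2, 10))
```

With roxygen2, the `@importFrom` line generates an `importFrom(stats, median)` entry in NAMESPACE, so only that one name is brought into the package namespace and clashes between packages become less likely.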
|
Thanks @karthik. I am looking forward.
|
For a replacement to
No! As reasonable an amount of coverage as possible. We realize that 100% coverage is hard to achieve for most packages. I'm currently seeking reviewers and have sent out requests. I will update the thread when I'm ready to assign reviewers. |
Assigning @PeteHaitch as reviewer 1. (Note: Pete is away till the 21st and his review will be due 3 weeks from then which is Oct 12th) |
Assigning @mmulvahill as reviewer 2. (Matt is also away next week and will return and begin 9/25). Review due by 10/16 @MahShaaban Please stay tuned till the reviews show up. |
@MahShaaban If you wish to add a badge to your README, you can do so with the following code
|
Great. I am looking forward to hearing from them. Thanks @karthik for your efforts. |
Hi @MahShaaban, I've just started to look at cRegulome. I first tried running
upset(ob_tf, study = 'ACC')
#> Error in mutate_impl(.data, dots) :
#> Evaluation error: object 'x' not found.
Stepping through this function, the error arises in the
Are you able to reproduce this error?
Cheers,
Output of devtools::check() on my machine
> devtools::check()
Updating cRegulome documentation
Loading cRegulome
Setting env vars --------------------------------------------------------------------------------------
CFLAGS : -Wall -pedantic
CXXFLAGS: -Wall -pedantic
Building cRegulome ------------------------------------------------------------------------------------
'/Library/Frameworks/R.framework/Resources/bin/R' --no-site-file --no-environ --no-save --no-restore \
--quiet CMD build '/Users/Peter/GitHub/cRegulome' --no-resave-data --no-manual
* checking for file ‘/Users/Peter/GitHub/cRegulome/DESCRIPTION’ ... OK
* preparing ‘cRegulome’:
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... ERROR
Loading required package: R.oo
Loading required package: R.methodsS3
R.methodsS3 v1.7.1 (2016-02-15) successfully loaded. See ?R.methodsS3 for help.
R.oo v1.21.0 (2016-10-30) successfully loaded. See ?R.oo for help.
Attaching package: 'R.oo'
The following objects are masked from 'package:methods':
getClasses, getMethods
The following objects are masked from 'package:base':
attach, detach, gc, load, save
R.utils v2.5.0 (2016-11-07) successfully loaded. See ?R.utils for help.
Attaching package: 'R.utils'
The following object is masked from 'package:utils':
timestamp
The following objects are masked from 'package:base':
cat, commandArgs, getOption, inherits, isOpen, parse, warnings
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
Attaching package: 'dbplyr'
The following objects are masked from 'package:dplyr':
ident, sql
Attaching package: 'tidyr'
The following object is masked from 'package:R.utils':
extract
Attaching package: 'cRegulome'
The following objects are masked from 'package:graphics':
hist, plot
The following object is masked from 'package:base':
print
trying URL 'https://www.dropbox.com/s/t8ga5j8o81jkcuv/test.db.gz?raw=1'
Content type 'application/octet-stream' length 875072 bytes (854 KB)
==================================================
downloaded 854 KB
Joining, by = c("mirna_base", "feature")
Joining, by = c("mirna_base", "feature")
Picking joint bandwidth of 0.0665
Quitting from lines 260-262 (using_cRegulome.Rmd)
Error: processing vignette 'using_cRegulome.Rmd' failed with diagnostics:
Evaluation error: object 'x' not found.
Execution halted
Error: Command failed (1)
Session info
devtools::session_info()
#> Session info ------------------------------------------------------------------------------------------
#> setting value
#> version R version 3.4.1 (2017-06-30)
#> system x86_64, darwin15.6.0
#> ui RStudio (1.1.331)
#> language (EN)
#> collate en_AU.UTF-8
#> tz America/New_York
#> date 2017-09-20
#> Packages ----------------------------------------------------------------------------------------------
#> package * version date source
#> assertthat 0.2.0 2017-04-11 CRAN (R 3.4.0)
#> base * 3.4.1 2017-07-07 local
#> bindr 0.1 2016-11-13 CRAN (R 3.4.0)
#> bindrcpp 0.2 2017-06-17 CRAN (R 3.4.0)
#> BiocGenerics 0.23.1 2017-09-05 Bioconductor
#> bit 1.1-12 2014-04-09 CRAN (R 3.4.0)
#> bit64 0.9-7 2017-05-08 CRAN (R 3.4.0)
#> bitops 1.0-6 2013-08-17 CRAN (R 3.4.0)
#> blob 1.1.0 2017-06-17 CRAN (R 3.4.0)
#> codetools 0.2-15 2016-10-05 CRAN (R 3.4.1)
#> colorout * 1.1-2 2017-07-18 Github (jalvesaq/colorout@020a14d)
#> colorspace 1.3-2 2016-12-14 CRAN (R 3.4.0)
#> commonmark 1.4 2017-09-01 CRAN (R 3.4.1)
#> compiler 3.4.1 2017-07-07 local
#> crayon 1.3.4 2017-09-16 CRAN (R 3.4.1)
#> cRegulome * 0.99.0 <NA> Bioconductor
#> datasets * 3.4.1 2017-07-07 local
#> DBI 0.7 2017-06-18 CRAN (R 3.4.0)
#> dbplyr 1.1.0 2017-06-27 CRAN (R 3.4.0)
#> devtools * 1.13.3 2017-08-02 CRAN (R 3.4.1)
#> digest 0.6.12 2017-01-27 CRAN (R 3.4.0)
#> dplyr 0.7.3 2017-09-09 CRAN (R 3.4.1)
#> fortunes 1.5-4 2016-12-29 CRAN (R 3.4.0)
#> futile.logger 1.4.3 2016-07-10 CRAN (R 3.4.0)
#> futile.options 1.0.0 2010-04-06 CRAN (R 3.4.0)
#> GenomeInfoDb 1.13.4 2017-06-06 Bioconductor
#> GenomeInfoDbData 0.99.1 2017-06-07 Bioconductor
#> GenomicRanges 1.29.14 2017-09-15 Bioconductor
#> ggjoy 0.4.0 2017-09-15 CRAN (R 3.4.1)
#> ggplot2 2.2.1 2016-12-30 CRAN (R 3.4.0)
#> ggridges 0.4.1 2017-09-15 CRAN (R 3.4.1)
#> glue 1.1.1 2017-06-21 CRAN (R 3.4.0)
#> graphics * 3.4.1 2017-07-07 local
#> grDevices * 3.4.1 2017-07-07 local
#> grid 3.4.1 2017-07-07 local
#> gridExtra 2.3 2017-09-09 CRAN (R 3.4.1)
#> gtable 0.2.0 2016-02-26 CRAN (R 3.4.0)
#> httr 1.3.1 2017-08-20 CRAN (R 3.4.1)
#> IRanges 2.11.16 2017-09-15 Bioconductor
#> lambda.r 1.2 2017-09-16 CRAN (R 3.4.1)
#> lazyeval 0.2.0 2016-06-12 CRAN (R 3.4.0)
#> magrittr 1.5 2014-11-22 CRAN (R 3.4.0)
#> memoise 1.1.0 2017-05-26 Github (hadley/memoise@e372cde)
#> methods * 3.4.1 2017-07-07 local
#> munsell 0.4.3 2016-02-13 CRAN (R 3.4.0)
#> parallel 3.4.1 2017-07-07 local
#> pkgconfig 2.0.1 2017-03-21 CRAN (R 3.4.0)
#> plyr 1.8.4 2016-06-08 CRAN (R 3.4.0)
#> pryr 0.1.2 2015-06-20 CRAN (R 3.4.0)
#> purrr 0.2.3 2017-08-02 CRAN (R 3.4.1)
#> R.methodsS3 1.7.1 2016-02-16 CRAN (R 3.4.0)
#> R.oo 1.21.0 2016-11-01 CRAN (R 3.4.0)
#> R.utils 2.5.0 2016-11-07 CRAN (R 3.4.0)
#> R6 2.2.2 2017-06-17 CRAN (R 3.4.0)
#> Rcpp 0.12.12 2017-07-15 CRAN (R 3.4.1)
#> RCurl 1.95-4.8 2016-03-01 CRAN (R 3.4.0)
#> repete * 0.0.0.9009 2017-08-18 Github (PeteHaitch/repete@f82233c)
#> reshape2 1.4.2 2016-10-22 CRAN (R 3.4.0)
#> rlang 0.1.2.9000 2017-09-13 Github (tidyverse/rlang@ff02f2a)
#> roxygen2 6.0.1 2017-02-06 CRAN (R 3.4.0)
#> RSQLite 2.0 2017-06-19 CRAN (R 3.4.0)
#> rstudioapi 0.7.0-9000 2017-09-13 Github (rstudio/rstudioapi@8e8bfb0)
#> S4Vectors 0.15.8 2017-09-14 Bioconductor
#> scales 0.5.0 2017-08-24 CRAN (R 3.4.1)
#> stats * 3.4.1 2017-07-07 local
#> stats4 3.4.1 2017-07-07 local
#> stringi 1.1.5 2017-04-07 CRAN (R 3.4.0)
#> stringr 1.2.0 2017-02-18 CRAN (R 3.4.0)
#> testthat 1.0.2 2016-04-23 CRAN (R 3.4.0)
#> tibble 1.3.4 2017-08-22 CRAN (R 3.4.1)
#> tidyr 0.7.1 2017-09-01 CRAN (R 3.4.1)
#> tools 3.4.1 2017-07-07 local
#> UpSetR 1.3.3 2017-03-21 CRAN (R 3.4.0)
#> utils * 3.4.1 2017-07-07 local
#> VennDiagram 1.6.17 2016-04-18 CRAN (R 3.4.0)
#> withr 2.0.0 2017-07-28 CRAN (R 3.4.1)
#> xml2 1.1.1 2017-01-24 CRAN (R 3.4.0)
#> XVector 0.17.1 2017-08-19 Bioconductor
#> yaml 2.1.14 2016-11-12 CRAN (R 3.4.0)
#> zlibbioc 1.23.0 2017-04-27 Bioconductor |
Hi @PeteHaitch, thanks a lot for taking the time to check the package. Thanks again Mahmoud |
Starting fresh and running I'll take a closer look tomorrow. In the meantime, can you please post the output of |
I ran
Session info
Session info -------------------------------------------------------------------------------
setting value
version R version 3.4.1 (2017-06-30)
system x86_64, darwin15.6.0
ui RStudio (1.0.143)
language (EN)
collate en_US.UTF-8
tz Asia/Seoul
date 2017-09-21 |
After downgrading some GitHub-version packages to their CRAN/BioC release versions the error has gone away (I suspect it was changes in rlang not being compatible with release versions of other tidyverse packages). In any case, this was my problem, not that of cRegulome. I can now proceed with my review. Thanks for your patience! |
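For readers hitting a similar mismatch between development and release versions, one way to spot candidates is to list installed packages whose DESCRIPTION records a remote source. This is only a sketch: the helper name `find_remote_installs` is hypothetical, and it assumes the install tool (for example devtools or remotes) wrote a `RemoteType` field, which some tools also write for CRAN installs.

```r
# List installed packages that record a remote source in their DESCRIPTION.
# Assumption: GitHub installs via devtools/remotes add a `RemoteType` field.
find_remote_installs <- function() {
  pkgs <- utils::installed.packages(fields = c("RemoteType", "RemoteSha"))
  rownames(pkgs) <- NULL  # avoid duplicate row names across libraries
  pkgs <- as.data.frame(pkgs, stringsAsFactors = FALSE)
  # Keep only packages where the field is present
  remote <- pkgs[!is.na(pkgs$RemoteType),
                 c("Package", "Version", "RemoteType", "RemoteSha")]
  rownames(remote) <- NULL
  remote
}

remote_pkgs <- find_remote_installs()
print(remote_pkgs)
```

Packages listed with a `RemoteType` of `github` could then be reinstalled from CRAN or Bioconductor one at a time to see whether the error disappears.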
@MahShaaban, @karthik here is my review
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
Documentation
The package includes all the following forms of documentation:
Package not being co-submitted to JOSS
Functionality
Final approval (post-review)
Estimated hours spent reviewing: 5 hours
Review Comments
cRegulome provides access to a database of pre-computed correlations of transcription factor-gene pairs and microRNA-gene pairs in cancer. It also provides several options for plotting these data. Unfortunately, what is lacking right now is a clear statement of need and example(s) illustrating the potential use cases of cRegulome. The documentation is insufficient for a new user to know when they might make use of cRegulome and how it would be useful. The
How are you planning to distribute the package, @MahShaaban? You didn't tick the box saying you plan to upload to CRAN; is that correct? Are you instead planning a Bioconductor submission? If submitting to Bioconductor then I anticipate that you will require much more integration with existing infrastructure and packages. I'm happy to discuss this further, as I have quite a bit of experience with BioC. If not CRAN or BioC, then how do you plan to distribute the package?
The package claims to support all cancer types studied by the TCGA (a consortium whose data are widely used in bioinformatics/computational biology). However, in testing the package I was only able to successfully query 2/34 cancer types (the 2 types used in the examples were the only ones I could get to work). Here's a reproducible example (I'm not 100% sure this isn't user error on my part):
library(cRegulome)
#>
#> Attaching package: 'cRegulome'
#> The following objects are masked from 'package:graphics':
#>
#> hist, plot
#> The following object is masked from 'package:base':
#>
#> print
get_db(test = FALSE)
R.utils::gunzip('cRegulome.db.gz', overwrite = TRUE)
conn <- DBI::dbConnect(RSQLite::SQLite(), "cRegulome.db")
# All study IDs listed at
# https://tcga-data.nci.nih.gov/docs/publications/tcga/
dat <- get_mir(conn,
mir = c('hsa-let-7b', 'hsa-mir-134'),
study = c("LAML", "ACC", "BLCA", "LGG", "BRCA", "CESC", "CHOL",
"COAD", "ESCA", "FPPP", "GBM", "HNSC", "KICH", "KIRC",
"KIRP", "LIHC", "LUAD", "LUSC", "DLBC", "MESO", "OV",
"PAAD", "PCPG", "PRAD", "READ", "SARC", "SKCM", "STAD",
"TGCT", "THYM", "THCA", "UCS", "UCEC", "UVM"),
min_cor = .5,
max_num = 100,
targets_only = TRUE)
#> Error: Strings must match column names. Unknown columns: LAML, LGG, BRCA, CESC, CHOL, COAD, ESCA, FPPP, GBM, HNSC, KICH, KIRC, KIRP, LIHC, LUAD, LUSC, DLBC, MESO, OV, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, TGCT, THYM, THCA, UCS, UCEC, UVM
The SQL databases downloaded by
Some further questions on the contents of the database:
The
Can users add additional databases? If so, then some guidance should be provided (e.g., format of database).
You should import the pipe operator into the package namespace and not define it inline (e.g. as in https://github.com/MahShaaban/cRegulome/blob/d5aae84a7a6fc9d8daac3ee2653fa932e30804fb/R/methods.R#L119)
The author of ggjoy, Claus Wilke, has deprecated the package and requested that users switch to ggridges; he wrote a blog post explaining his decision.
Additional questions and comments on specific issues from the rOpenSci review template are below. Hopefully these comments are helpful. Let me know if you have any questions.
A statement of need
Installation instructions
Vignette(s)
Function Documentation
Community guidelines
Functionality
Automated tests
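The "Unknown columns" error in the reproducible example above suggests that study IDs are stored as column names, so a query fails outright when any requested study is absent. A minimal sketch of checking the requested IDs before querying is given here. It uses an in-memory SQLite database as a stand-in; the table name `cor_mir`, its layout, and the helper `check_studies` are assumptions for illustration and are not taken from cRegulome.

```r
# In-memory stand-in for the package database. The table name `cor_mir`
# and the one-column-per-study layout are assumptions for illustration.
conn <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(conn, "cor_mir", data.frame(
  mirna_base = c("hsa-let-7b", "hsa-mir-134"),
  feature    = c("GENE1", "GENE2"),
  ACC        = c(60L, -55L),
  BLCA       = c(52L, 70L),
  stringsAsFactors = FALSE
))

# Split requested study IDs into those present as columns and those missing.
check_studies <- function(conn, table, studies) {
  available <- DBI::dbListFields(conn, table)
  list(found   = intersect(studies, available),
       missing = setdiff(studies, available))
}

res <- check_studies(conn, "cor_mir", c("LAML", "ACC", "BLCA", "LGG"))
print(res$found)
print(res$missing)

DBI::dbDisconnect(conn)
```

A query function could run such a check first and either warn about the missing IDs or stop with a message naming them, which would be easier for a new user to interpret than the raw column-matching error.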
|
@PeteHaitch, I'd like to thank you first for your time and effort.
A statement of need Installation instructions Vignette(s) Function Documentation
Community guidelines Functionality Automated tests Please, let me know if I missed anything or if you have further concerns |
Hi @MahShaaban Thank you for your response to the review and the changes to the package. There are still a few to sort through, and I'll address those point-by-point, below. But first, my main concern, and where I need guidance from @karthik or someone else from rOpenSci, is that rOpenSci and Bioconductor (where you plan to submit your package) have different review criteria, expectations, and requirements (https://www.bioconductor.org/developers/package-submission/, https://www.bioconductor.org/developers/package-guidelines/). I don't speak on behalf of Bioconductor, however, I know that they aim to have packages that interoperate well with existing Bioconductor packages and data structures. Currently, I don't think cRegulome does this, which may be a difficulty when submitting to Bioconductor for review. From rOpenSci's perspective this (quite reasonably) may not be a problem, but your package will be reviewed upon submission to Bioconductor and, ultimately, it will have to satisfy Bioconductor's review criteria and not rOpenSci's in order to be accepted as a Bioconductor package. So right now, I see 3 options:
To emphasise, I think cRegulome may be a useful bioinformatics package, but I'm unsure how to reconcile the expectations and requirements of an rOpenSci and a Bioconductor package, especially when the package is yet to be reviewed by Bioconductor.
Point-by-point comments
Update to
|
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
Documentation
The package includes all the following forms of documentation:
Functionality
Final approval (post-review)
Estimated hours spent reviewing:
Review Comments
Forgive any inaccurate use of terminology. I've developed database APIs for microRNA data, but my work as a statistician and analyst is in a different area of biomedical research.
A statement of need
The package is dual purpose -- 1. serving data for microRNA annotations and transcription factor correlations and 2. providing a set of descriptive methods and visualizations for these data types. This is clear enough from the README and vignette.
My preference is to keep the README simple -- installation instructions and a brief statement of purpose -- and provide detailed instructions and scientific context in the vignettes. Along these lines, I would shorten the README and add to the vignette.
Bioconductor has specific classes and package templates for packages that serve annotation data. (I just went through this for our
Also, I would consider separating it into two packages, especially if the miRNA/TF correlation statistics will have other applications to other databases and if the cRegulome.db (or Cistrome and miRCancer) database is not yet on Bioconductor (I didn't see the component DBs in an initial, cursory look).
Installation instructions: for the development version of package and any non-standard dependencies in README
Installed fine for me. A few notes from
From
Either of the From
From
Vignette(s) demonstrating major functionality that runs successfully locally
Runs successfully and demonstrates functionality. Suggestions:
Function Documentation: for all exported functions in R help
Function naming
Summary
Overall the biggest thing to consider is Bioconductor integration and whether the database and its summary functions are independently novel enough to be better suited to two packages. On the latter point, I can see an argument for both approaches. |
Hi @PeteHaitch, I think at this point I would go with the first option, submit to rOpenSci and CRAN. Integrating the package into Bioconductor seems like it would need substantial changes that wouldn't be natural based on the choices I already made.
Looking forward to hearing from you. |
Thanks @PeteHaitch for spotting these issues. I removed the |
Adding BiocInstaller to |
Thanks @PeteHaitch. That solves the note. |
I'll take a look over it again later this week |
Same here -- though I'm traveling most of this week. If I don't get to it this week, I will early next. |
Thanks for all the thoughtful and detailed review comments @mmulvahill and @PeteHaitch! |
A gentle ping for @PeteHaitch and @mmulvahill |
I haven't been able to get the package to install and build vignettes on a clean install (hence my delay responding). This appears to be due to a quirk in automating the bioconductor installs that I can't quite figure out. In the 2nd update I link to below, I added some code for loading and updating with biocLite. Despite this, I still get an error with
Separately, I would add update #1 (to using_cRegulome.Rmd) Anyone have suggestions? If not, at this point I'm okay with moving forward. |
Hi @MahShaaban
The vignette still needs more careful proofreading. In my opinion, the vignettes should be the most useful parts of the package but they can be hard to follow in their current state. For example in the 'Case Study' vignette, "In the previous section, we showed the code for obtaining the TF/microRNA-gene expression correlation in stomach and esophogeal cancer using Cistrome and miRCancerdb for comparison purpose." But it's not clear to me that this is indeed what was done in the previous section. There are also many typos; try running
It should be made clear in the vignettes that some of what the user is viewing is based on the 'test' database. Otherwise they may be surprised that the output doesn't match their own when they run the code themselves using the full database. Even better may be to use the mocking strategy I suggested in #149 (comment) This shouldn't be too hard to set up.
As I noted in my initial review (#149 (comment)):
I still think this is a really good idea, but I'm willing to move on at this point. Some minor points:
Cheers, |
Thanks @mmulvahill & @PeteHaitch for these comments. |
Here are a few changes I made to address the issues with the installation and the vignette. Code changes
Vignettes
I hope these changes solve the issues raised by @mmulvahill and @PeteHaitch |
I submitted a pull request fixing the vignette metadata to ensure it's built and installed as part of
I tried running through the "Using cRegulome" vignette as a new user might, but it unfortunately does not work. Specifically, if you build the vignette and try running through the steps at https://github.com/MahShaaban/cRegulome/blob/master/vignettes/using_cRegulome.Rmd#getting-started, you'll get an error:
> # load required libraries
> library(cRegulome)
> library(RSQLite)
> library(ggplot2)
> # download the db file when using it for the first time
> if(!file.exists('cRegulome.db')) {
+ get_db(test = TRUE)
+ }
>
> # connect to the db file
> conn <- dbConnect(SQLite(), 'cRegulome.db')
> # enter a custom query with different arguments
> dat <- get_mir(conn,
+ mir = 'hsa-let-7g',
+ study = 'STES',
+ min_abs_cor = .3,
+ max_num = 5)
Error: Unknown column `STES`
This is because the test database you connect to as a vignette user (in the
You can try building and installing the package with vignettes (
Incidentally, you may wish to update the installation instructions in the |
Thanks @PeteHaitch for the fixes, very helpful as always. |
Hi @MahShaaban, Thanks for the changes. I was able to work through the vignette like a new user. I have gone through the rOpenSci package review checklist and I am happy to recommend approving cRegulome 🎉 Below are a few cosmetic suggestions/changes (some with PRs) that you might consider:
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
Documentation
The package includes all the following forms of documentation:
Package not being co-submitted to JOSS
Functionality
Final approval (post-review)
|
Thanks @PeteHaitch for these pull requests and the comments.
I really appreciate the time and effort you put in reviewing the package. The comments and the changes you made helped a lot. Thanks again Mahmoud |
I second @PeteHaitch's recommendation of approving
Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
Documentation
The package includes all the following forms of documentation:
Functionality
Final approval (post-review)
|
Thanks a ton @PeteHaitch and @mmulvahill! We are very grateful for your time, expertise, and attention to detail. 🙏👏 |
@MahShaaban Your badge should update soon to peer-reviewed. Congrats on your package being accepted! 🎉 Here are your next steps: Please also add a footer to the bottom of your README
Once moved, please re-run all checks in preparation for submission to CRAN. I can help with this if you run into any issues. Welcome aboard! We'd also love a blog post about your package, either a short-form intro to it (https://ropensci.org/tech-notes/) or a long-form post with more narrative about its development (https://ropensci.org/blog/). If you are interested, @stefaniebutland will be in touch about content and timing. |
Thanks again @PeteHaitch, @mmulvahill and @karthik. |
Hello @MahShaaban and congratulations on your package being accepted! I'm rOpenSci's Community Manager. As Karthik said, we'd love to have a blog post about |
Thanks @stefaniebutland. I am certainly considering the blog post. I will look into it as soon as I can. |
@MahShaaban I see that you've now transferred the repo, fantastic! Welcome! 🎉
|
Thanks @maelle. After I transferred the repo I updated the CI links, including that for AppVeyor. It seems to be working; it builds for the commits and the badge links to the correct page. Do I need to update the badge again with the link you provided in the previous comment? |
Oh if it points to the correct builds then no. ☺ |
But check, e.g. after your next commit, just to be sure. |
Summary
What does this package do? (explain in 50 words or less):
Obtains a database of pre-calculated microRNA and transcription factor-gene correlations in cancer. In addition, the package defines methods for handling and visualizing the data.
Paste the full DESCRIPTION file inside a code block below:
URL for the package (the development repository, not a stylized html page):
https://github.com/MahShaaban/cRegulome
Please indicate which category or categories from our [package fit policies]
Data retrieval, because the package obtains a local database of Cistrome Cancer and miRCancerdb databases
Who is the target audience?
Biologists and computational biologists with little knowledge of R who are interested in investigating gene expression regulation in cancer.
Are there other R packages that accomplish the same thing? If so, how does
yours differ or meet our criteria for best-in-category?
Requirements
Confirm each of the following by checking the box. This package:
Publication options
paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
Detail
Does R CMD check (or devtools::check()) succeed? Paste and describe any errors or warnings:
Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:
If this is a resubmission following rejection, please explain the change in circumstances:
If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names: