Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pin R devtools, all of the packages it depends on, and biocLite #752

Merged
merged 6 commits into from
Oct 25, 2018

Conversation

kurtwheeler
Copy link
Contributor

Issue Number

#742

Purpose/Implementation Notes

We got bit by having unpinned R dependencies again today. devtools went up to version 2.0.0 on 18/10/19, causing it to depend on additional packages which we didn't have. Version 1.13.6 seems to have been working fine for us, and includes less dependencies we have to pin, so we aren't upgrading to it.

However the new version being available meant that we were installing it because we weren't specifying which version we were installing. It caused the following error:

Error in vapply(remotes, install_remote, ..., FUN.VALUE = character(1)) : 
  values must be type 'character',
 but FUN(X[[1]]) result is type 'logical'
Calls: install_with_url ... update -> update.package_deps -> install_remotes -> vapply
Execution halted
The command '/bin/sh -c Rscript install_affy_only.R' returned a non-zero code: 1

in this circle build among others because every R dependency install script we have uses devtools.

So what I did was create a script which installs a specific version of devtools: 1.13.6. This script is now run on all docker images before any other R scripts are run. The script also pins all of devtools' dependencies, because we don't want them to break either. Finally, I've downloaded biocLite.R and included it in our repo so it cannot change under us.

Types of changes

  • Bugfix (non-breaking change which fixes an issue)

Functional tests

The unit tests cover this well.

Checklist

  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

@kurtwheeler kurtwheeler requested a review from Miserlou October 24, 2018 14:33
options(Ncpus=parallel::detectCores())


install_package_version <- function(package_name, version) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about using the URL format https://cran.r-project.org/package=httr&version=1.3.0 combined with an R library to find the redirect location?

What happens in the current implementation if an invalid package version is specified?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

combined with an R library to find the redirect location

This is a chicken and the egg problem. How can we install a pinned version of that library to find the redirect location? You can see that httr is one of the packages being installed in this file.

If an invalid package version is specified the package doesn't fails to install and the script will break because whatever package depends on it won't be installed.

However it would be nice if passing an invalid URL to install.packages would throw an error and interrupt the script rather than it just moving on. Do either you or @jaclyn-taroni maybe know a way to force it to do so?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume option(warn=2) would do it but doesn't seem to be the case locally

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't necessarily like this, but could do without tryCatch block like so:

  curl_result <- system(paste0("curl --head ", package_url), intern=TRUE)
  if (grepl("404", curl_result[1])) {
    package_url <- paste0("https://cran.r-project.org/src/contrib/Archive/", package_name, "/", package_tarball)
    curl_result <- system(paste0("curl --head ", package_url), intern=TRUE)
    if (grepl("404", curl_result[1])) {
      stop(paste("Package", package_name, "version", version, "does not exist!"))
    }
  }

@Miserlou
Copy link
Contributor

Has biocLite.R been modified at all from the source, or is it just downloaded and run locally? I'm not sure that either of those are really optimal, as there are other files on the website that the script may want to fetch, or there is dead code that we might want to remove.

@Miserlou
Copy link
Contributor

Or, neither of that. Nothing about this is very pretty. Have we tried asking them to host versioned/ permanently archived versions of this file?

@kurtwheeler
Copy link
Contributor Author

Have we tried asking them to host versioned/ permanently archived versions of this file?

We haven't, but looking at https://www.bioconductor.org/install/#why-biocLite it seems like this is their solution to versioning woes. However there's a chance we may just not need it because like a lot of R stuff it seems to have been optimized for ease of use for individual hackers.

These days, the main purpose of source("https://bioconductor.org/biocLite.R") is to install and attach the ‘BiocInstaller’ package.

In a new installation, the script installs the most recent version of the BiocInstaller package relevant to the version of R in use, regardless of the relative times of R and Bioconductor release cycles. The BiocInstaller package serves as the primary way to identify the version of Bioconductor in use

I think our three options here are:

  • Download and include all the biocLite related files and modify them to work locally, optionally removing irrelevant code that's dealing with versions we don't care about.
  • Asking the bioconductor team to host versioned/ permanently archived versions of this file, however this still runs into the same issue of it referencing other files they host, unless they version all of their files and modify them to use reference the versions of all the scripts. I'd rate the probability of them doing this as low, so if we want to try this we should start on plan B in the meantime.
  • Figure out all the packages and their versions that biocLite installs and use install.packages() to install those. This might not be too hard because we can probably figure out which packages end up getting installed by looking at the source code, and then just check the versions of those packages by running biocLite ourselves and pinning whatever it installs.

I think I'd lean towards the last option. @jaclyn-taroni do you have any input on other potential solutions or which seems the most likely to work/stay stable in the R ecosystem?

@jaclyn-taroni
Copy link
Member

We'll always be pinned to/know the version of R we'll be using correct? If that's the case, my understanding is that particular versions of Bioconductor are tied to particular version of R -- so, it could be a matter of identify the correct URL to install.packages(bioc_url, type = "source", repos = NULL) ?

@kurtwheeler
Copy link
Contributor Author

Yep, that's what I was thinking with the third option. Sounds like that's the way to go.

@leipzig
Copy link

leipzig commented Oct 25, 2018

I think the Bioconductor people are moving from BiocInstaller to BiocManager::install

Copy link
Contributor

@Miserlou Miserlou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are passing so this looks good.

Let's make sure we keep an archived copy of https://bioconductor.org/packages/3.6/bioc/src/contrib/BiocInstaller_1.28.0.tar.gz since I don't trust them.

@kurtwheeler kurtwheeler merged commit 6036138 into dev Oct 25, 2018
@kurtwheeler kurtwheeler deleted the kurtwheeler/fix-r-deps branch October 25, 2018 20:29
kurtwheeler added a commit that referenced this pull request Jan 10, 2019
Pin R devtools, all of the packages it depends on, and biocLite
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants