Use fixed GRIDSS container for VIRUSBreakend #12

Merged: scwatts merged 7 commits into dev from nf-core-gridss-docker-image on Apr 3, 2024

@scwatts (Collaborator) commented Mar 26, 2024

Background

  • VIRUSBreakend is a tool that exists alongside GRIDSS and is available through PapenfussLab/gridss
  • The VIRUSBreakend execution script, unlike the GRIDSS one, makes heavy use of grep, but the corresponding BioContainers Docker image currently used in oncoanalyser only provides BusyBox grep
  • BusyBox grep is much slower than GNU grep and increases VIRUSBreakend runtime by ~1 hour
  • I've attempted to replace BusyBox grep with GNU grep in the Bioconda recipe, but the Bioconda CI infrastructure doesn't have sufficient disk space to build and test the relevant artifacts
    • I've previously had successful builds on Bioconda, but they now consistently fail because of insufficient disk space
    • See: Update GRIDSS 2.13.2 bioconda/bioconda-recipes#46160
    • The main cause is the RepeatMasker database size, which the Bioconda community is actively trying to resolve
  • Fixing this problem through Bioconda is not currently possible as far as I can tell
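
To confirm which grep a given image ships, a check along the following lines could be run inside the container. This is a minimal sketch: `classify_grep` is a hypothetical helper (not part of GRIDSS or oncoanalyser), and it assumes that BusyBox prints its usual "BusyBox ... multi-call binary" banner when asked for `--version`, while GNU grep identifies itself as "GNU grep".

```shell
#!/bin/sh
# Classify a grep implementation from the text it prints for --version.
# classify_grep is a hypothetical helper used only for illustration.
classify_grep() {
    case "$1" in
        *BusyBox*)    echo "busybox" ;;
        *"GNU grep"*) echo "gnu" ;;
        *)            echo "unknown" ;;
    esac
}

# Capture stdout and stderr together, since the BusyBox banner is assumed
# to go to stderr; '|| true' tolerates a non-zero exit status.
banner="$(grep --version 2>&1 || true)"
echo "grep flavour: $(classify_grep "$banner")"
```

Running this in both the BioContainers image and the replacement image should make the difference visible before any timing comparison is attempted.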

Changes

As a temporary fix, I have done the following:

  • Created a Dockerfile specifically to restore VIRUSBreakend performance by using GNU grep
  • Uploaded the containers:
    • Docker image: docker.io/scwatts/gridss:2.13.2--1
    • Singularity image: https://pub-29f2e5b2b7384811bdbbcba44f8b5083.r2.dev/singularity/gridss:2.13.2--1
  • Adjusted the container directive of the VIRUSBreakend Nextflow process accordingly

The new VIRUSBreakend Docker and Singularity images have been successfully tested using the COLO829 mini dataset.
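
The choice between the two images depends on the container engine in use. The actual selection happens in the Nextflow process and is not shown here; the shell sketch below only illustrates the idea, using the two image locations listed above. `select_gridss_image` is a hypothetical helper, and treating every engine other than Singularity as Docker-compatible is an assumption.

```shell
#!/bin/sh
# Image locations as listed in the PR description.
DOCKER_IMAGE="docker.io/scwatts/gridss:2.13.2--1"
SINGULARITY_IMAGE="https://pub-29f2e5b2b7384811bdbbcba44f8b5083.r2.dev/singularity/gridss:2.13.2--1"

# select_gridss_image is a hypothetical helper: print the image reference
# that matches the given container engine name.
select_gridss_image() {
    case "$1" in
        singularity) echo "$SINGULARITY_IMAGE" ;;
        *)           echo "$DOCKER_IMAGE" ;;   # assumption: fall back to Docker
    esac
}

select_gridss_image docker
select_gridss_image singularity
```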

Additional requirements

The Docker image would need to be pushed to the nf-core Quay account:

docker pull docker.io/scwatts/gridss:2.13.2--1
docker tag docker.io/scwatts/gridss:2.13.2--1 quay.io/nf-core/gridss:2.13.2--1
docker push quay.io/nf-core/gridss:2.13.2--1
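
If the same retagging is needed again for a later image version, the target reference can be derived from the source instead of being typed by hand. The sketch below assumes a hypothetical helper, `retarget_image`, that keeps the final `name:tag` component and swaps the registry and namespace prefix. It prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# retarget_image is a hypothetical helper: replace everything before the
# final "name:tag" component of an image reference with a new prefix.
retarget_image() {
    src_ref="$1"
    dest_prefix="$2"
    # ${src_ref##*/} strips the longest prefix ending in '/', leaving name:tag.
    echo "${dest_prefix}/${src_ref##*/}"
}

src="docker.io/scwatts/gridss:2.13.2--1"
dest="$(retarget_image "$src" "quay.io/nf-core")"

# Print the commands instead of executing them.
echo "docker pull $src"
echo "docker tag $src $dest"
echo "docker push $dest"
```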

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/oncoanalyser branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nf-test test main.nf.test -profile test,docker).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

scwatts added 2 commits March 26, 2024 10:19
The BioContainers Docker image only provides BusyBox grep, which has
poor performance compared to GNU grep and increases runtime by up to an
hour. The current Bioconda CI infrastructure is unable to build a
patched GRIDSS image at the moment, so I've made a temporary fix here
instead to restore expected performance.
@scwatts scwatts added this to the Release 1.0.0 milestone Mar 26, 2024
@scwatts scwatts requested a review from maxulysse March 26, 2024 01:46
github-actions bot commented Mar 26, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 7432c55

✅ 155 tests passed
❔ 6 tests were ignored
❗ 40 tests had warnings

❗ Test warnings:

  • files_exist - File not found: assets/multiqc_config.yml
  • nextflow_config - Config manifest.version should end in dev: 0.3.1
  • readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_todos - TODO string in usage.md: Add documentation about anything specific to running your pipeline. For general topics, please point to (and add to) the main nf-core website.
  • pipeline_todos - TODO string in output.md: Write this documentation describing your workflow's output
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
  • pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
  • system_exit - System.exit in main.nf: System.exit(1) [line 44]
  • system_exit - System.exit in main.nf: System.exit(1) [line 46]
  • system_exit - System.exit in WorkflowMain.groovy: System.exit(1) [line 114]
  • system_exit - System.exit in WorkflowMain.groovy: System.exit(1) [line 121]
  • system_exit - System.exit in WorkflowMain.groovy: System.exit(1) [line 128]
  • system_exit - System.exit in WorkflowMain.groovy: System.exit(1) [line 142]
  • system_exit - System.exit in WorkflowMain.groovy: System.exit(1) [line 152]
  • system_exit - System.exit in WorkflowMain.groovy: System.exit(1) [line 157]
  • system_exit - System.exit in WorkflowMain.groovy: System.exit(1) [line 170]
  • system_exit - System.exit in WorkflowMain.groovy: System.exit(1) [line 186]
  • system_exit - System.exit in WorkflowMain.groovy: System.exit(1) [line 195]
  • system_exit - System.exit in WorkflowOncoanalyser.groovy: System.exit(1) [line 62]
  • system_exit - System.exit in Processes.groovy: System.exit(1) [line 33]
  • system_exit - System.exit in Processes.groovy: System.exit(1) [line 49]
  • system_exit - System.exit in Utils.groovy: System.exit(1) [line 29]
  • system_exit - System.exit in Utils.groovy: System.exit(1) [line 40]
  • system_exit - System.exit in Utils.groovy: System.exit(1) [line 48]
  • system_exit - System.exit in Utils.groovy: System.exit(1) [line 56]
  • system_exit - System.exit in Utils.groovy: System.exit(1) [line 64]
  • system_exit - System.exit in Utils.groovy: System.exit(1) [line 69]
  • system_exit - System.exit in Utils.groovy: System.exit(1) [line 117]
  • system_exit - System.exit in Utils.groovy: System.exit(1) [line 193]
  • system_exit - System.exit in Utils.groovy: System.exit(1) [line 205]
  • system_exit - System.exit in Utils.groovy: System.exit(1) [line 213]
  • system_exit - System.exit in Utils.groovy: System.exit(1) [line 220]
  • system_exit - System.exit in Utils.groovy: System.exit(1) [line 228]
  • system_exit - System.exit in Utils.groovy: System.exit(1) [line 243]
  • system_exit - System.exit in Utils.groovy: System.exit(1) [line 337]

Run details

  • nf-core/tools version 2.13.1
  • Run at 2024-04-03 12:35:41

scwatts added 2 commits March 26, 2024 19:37
Having distroless as the base Docker image for the VIRUSBreakend
container was causing RepeatMasker to crash without an error/traceback
when executing a system process. Switching to
`quay.io/bioconda/base-glibc-busybox-bash:2.1.0` (the standard
BioContainers base image) resolves the problem.
Co-authored-by: Maxime U Garcia
@scwatts scwatts merged commit 48d34aa into dev Apr 3, 2024
4 checks passed
@scwatts scwatts deleted the nf-core-gridss-docker-image branch April 3, 2024 12:42