Move long read preprocessing into a subworkflow #674

Merged Oct 11, 2024 (37 commits)
Diff below shows changes from 36 commits

Commits
08445cf
Added modules porechop/abi and chopper for QC of long reads
Sep 20, 2024
5aa3049
Move long read preprocessing into subworkflow, and swapping porechop …
Sep 20, 2024
451612a
Remove module import from main workflow, and add PORECHOP_ABI in conf…
Sep 20, 2024
59ef702
Exchange local filtlong module to nf-core/filtlong
muabnezor Sep 24, 2024
cd5ce2f
Add filtlong and porechop logs to multiqc
muabnezor Sep 24, 2024
5820fb4
Added --longread_preprocessing_tools parameters, to let user specify …
muabnezor Sep 24, 2024
5335fc5
Merge branch 'dev' into move_lr_qc
muabnezor Sep 25, 2024
1496ce9
Update modules/nf-core/filtlong/main.nf
muabnezor Sep 25, 2024
c11601a
make subworkflow name more verbose
muabnezor Sep 30, 2024
31deb5c
make --longread_adaptertrimming_tool as enum porechop or porechop_abi
muabnezor Sep 30, 2024
b2da50b
Make prefix for porechop/porechop-abi more verbose
muabnezor Sep 30, 2024
8955e31
lint
muabnezor Sep 30, 2024
644e2b1
Change default search pattern for filtlong.log files for the filtlong…
muabnezor Oct 1, 2024
3e15502
Update CHANGELOG.md
muabnezor Oct 3, 2024
ec8538c
Update CHANGELOG.md
muabnezor Oct 3, 2024
c6fb9b3
Update nextflow_schema.json
muabnezor Oct 3, 2024
b24f52f
Update conf/modules.config
muabnezor Oct 3, 2024
fcf509c
Update conf/modules.config
muabnezor Oct 3, 2024
8f9cac6
Update conf/modules.config
muabnezor Oct 3, 2024
10ecdc5
Update nextflow_schema.json
muabnezor Oct 3, 2024
ea6f9a4
Update nextflow.config
muabnezor Oct 3, 2024
b43b9f6
Update subworkflows/local/longread_preprocessing.nf
muabnezor Oct 3, 2024
25981a1
Update subworkflows/local/longread_preprocessing.nf
muabnezor Oct 3, 2024
edf8dab
Update subworkflows/local/longread_preprocessing.nf
muabnezor Oct 3, 2024
0a06944
Merge branch 'dev' into move_lr_qc
muabnezor Oct 3, 2024
669a854
Linting fix
muabnezor Oct 3, 2024
a1098f3
make porechop-abi default long read adapter trimming tool
muabnezor Oct 3, 2024
17ba45c
Fix changelog
muabnezor Oct 4, 2024
f54c967
Fix porechop_abi pattern in modules.config
muabnezor Oct 4, 2024
110b5dd
Merge branch 'dev' into move_lr_qc
muabnezor Oct 4, 2024
3472c87
remove chopper citation for now
muabnezor Oct 4, 2024
ab5f482
retrigger checks
muabnezor Oct 4, 2024
2cfd5e5
Merge branch 'dev' into move_lr_qc
muabnezor Oct 4, 2024
7b31394
Apply suggestions from code review
jfy133 Oct 11, 2024
78edf25
Apply suggestions from code review
jfy133 Oct 11, 2024
3c3b46a
Add previously undocumented output files to docs
jfy133 Oct 11, 2024
7c7e954
[automated] Fix code linting
nf-core-bot Oct 11, 2024
13 changes: 13 additions & 0 deletions CHANGELOG.md
@@ -7,12 +7,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### `Added`

- [#674](https://github.com/nf-core/mag/pull/674) - Added `--longread_adaptertrimming_tool`, where the user can choose between porechop_abi (default) and porechop (added by @muabnezor)

### `Changed`

- [#674](https://github.com/nf-core/mag/pull/674) - Changed to porechop-abi as default adapter trimming tool for long reads. Users can still use porechop if preferred (added by @muabnezor)

### `Fixed`

- [#674](https://github.com/nf-core/mag/pull/674) - Make longread preprocessing a subworkflow (added by @muabnezor)
- [#674](https://github.com/nf-core/mag/pull/674) - Add porechop and filtlong logs to multiqc (added by @muabnezor)
- [#674](https://github.com/nf-core/mag/pull/674) - Change local filtlong module to the official nf-core/filtlong module (added by @muabnezor)

### `Dependencies`

| Tool | Previous version | New version |
| ------------ | ---------------- | ----------- |
| Porechop_ABI | | 0.5.0 |
| Filtlong | 0.2.0 | 0.2.1 |

### `Deprecated`

## 3.1.0 [2024-10-04]
2 changes: 2 additions & 0 deletions CITATIONS.md
@@ -116,6 +116,8 @@

- [Porechop](https://github.com/rrwick/Porechop)

- [Porechop-abi](https://github.com/bonsai-team/Porechop_ABI)

Suggested change (review comment from a Member), adding the citation under the link:

- [Porechop-abi](https://github.com/bonsai-team/Porechop_ABI)

> Bonenfant, Q., Noé, L., & Touzet, H. (2022). Porechop_ABI: discovering unknown adapters in ONT sequencing reads for downstream trimming. In BioRxiv. doi: 10.1101/2022.07.07.499093

- [Prodigal](https://pubmed.ncbi.nlm.nih.gov/20211023/)

> Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010 Mar 8;11:119. doi: 10.1186/1471-2105-11-119. PMID: 20211023; PMCID: PMC2848648.
1 change: 1 addition & 0 deletions README.md
@@ -90,6 +90,7 @@ Other code contributors include:
- [Jim Downie](https://github.com/prototaxites)
- [Phil Palmer](https://github.com/PhilPalmer)
- [@willros](https://github.com/willros)
- [Adam Rosenbaum](https://github.com/muabnezor)

Long read processing was inspired by [caspargross/HybridAssembly](https://github.com/caspargross/HybridAssembly) written by Caspar Gross [@caspargross](https://github.com/caspargross)

6 changes: 6 additions & 0 deletions assets/multiqc_config.yml
@@ -25,6 +25,8 @@ run_modules:
- quast
- kraken
- prokka
- porechop
- filtlong

## Module order
top_modules:
@@ -35,6 +37,7 @@ top_modules:
- "fastp"
- "adapterRemoval"
- "porechop"
- "filtlong"
- "fastqc":
name: "FastQC: after preprocessing"
info: "After trimming and, if requested, contamination removal."
@@ -109,6 +112,9 @@ sp:
fn_re: ".*[kraken2|centrifuge].*report.txt"
quast:
fn_re: "report.*.tsv"
filtlong:
num_lines: 20
fn_re: ".*_filtlong.log"
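
The `fn_re` value added for filtlong is a regular expression, so it can be tried against candidate file names before relying on it. The sketch below is only illustrative: it assumes the pattern is applied with Python's `re.match` (anchored at the start of the file name only), which is an assumption about MultiQC and not something this PR states, and the file names are hypothetical examples that reuse the suffixes from this PR.

```python
import re

# Pattern copied from the `sp: filtlong: fn_re` entry in this diff.
FILTLONG_FN_RE = re.compile(r".*_filtlong.log")


def is_filtlong_log(filename: str) -> bool:
    """Return True if the file name matches the pattern.

    Assumption: matching is anchored at the start only (re.match),
    which is how this sketch models the file search.
    """
    return FILTLONG_FN_RE.match(filename) is not None


# Hypothetical file names; only the suffixes come from the PR.
for name in [
    "sample1_run1_filtlong.log",       # log file the pattern targets
    "sample1_run1_filtlong.fastq.gz",  # filtered reads, not a log
    "sample1_run1_filtlongXlog",       # shows the effect of the unescaped "."
]:
    print(f"{name}: {is_filtlong_log(name)}")
```

Because the dot before `log` is unescaped, it matches any character. A stricter pattern such as `.*_filtlong\.log$` would be one alternative if stray matches ever became a problem.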

## File name cleaning
extra_fn_clean_exts:
32 changes: 24 additions & 8 deletions conf/modules.config
@@ -171,20 +171,36 @@ process {
  publishDir = [
  path: { "${params.outdir}/QC_longreads/porechop" },
  mode: params.publish_dir_mode,
- pattern: "*_trimmed.fastq",
+ pattern: "*_porechop_trimmed.fastq.gz",
  enabled: params.save_porechop_reads
  ]
- ext.prefix = { "${meta.id}_run${meta.run}_trimmed" }
+ ext.prefix = { "${meta.id}_run${meta.run}_porechop_trimmed" }
  }

+ withName: PORECHOP_ABI {
+ publishDir = [
+ path: { "${params.outdir}/QC_longreads/porechop" },
+ mode: params.publish_dir_mode,
+ pattern: "*_porechop-abi_trimmed.fastq.gz",
+ enabled: params.save_porechop_reads
+ ]
+ ext.prefix = { "${meta.id}_run${meta.run}_porechop-abi_trimmed" }
+ }
+
  withName: FILTLONG {
+ ext.args = [
+ "--min_length ${params.longreads_min_length}",
+ "--keep_percent ${params.longreads_keep_percent}",
+ "--trim",
+ "--length_weight ${params.longreads_length_weight}"
+ ].join(' ').trim()
  publishDir = [
- path: { "${params.outdir}/QC_longreads/Filtlong" },
- mode: params.publish_dir_mode,
- pattern: "*_lr_filtlong.fastq.gz",
- enabled: params.save_filtlong_reads
- ]
- ext.prefix = { "${meta.id}_run${meta.run}_lengthfiltered" }
+ path: { "${params.outdir}/QC_longreads/Filtlong" },
+ mode: params.publish_dir_mode,
+ pattern: "*_filtlong.fastq.gz",
+ enabled: params.save_filtlong_reads
+ ]
+ ext.prefix = { "${meta.id}_run${meta.run}_filtlong" }
  }

withName: NANOLYSE {
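
Each `publishDir` pattern in this hunk has to agree with the `ext.prefix` of the same process, otherwise no reads are published. A minimal sketch of that check follows. It assumes the modules name their read output `<prefix>.fastq.gz` (the module code is not shown in this diff) and approximates Nextflow's glob matching with Python's `fnmatch`. The sample ID and run number are made up.

```python
from fnmatch import fnmatchcase

# (prefix suffix, publishDir pattern) pairs copied from the new config lines.
NEW_CONFIG = {
    "porechop": ("porechop_trimmed", "*_porechop_trimmed.fastq.gz"),
    "porechop_abi": ("porechop-abi_trimmed", "*_porechop-abi_trimmed.fastq.gz"),
    "filtlong": ("filtlong", "*_filtlong.fastq.gz"),
}


def reads_filename(meta_id: str, run: int, suffix: str) -> str:
    """Build the file name implied by ext.prefix.

    Assumption: the module writes its reads to "<prefix>.fastq.gz".
    """
    return f"{meta_id}_run{run}_{suffix}.fastq.gz"


def is_published(tool: str, filename: str) -> bool:
    """Check a file name against the publishDir pattern of one tool."""
    _, pattern = NEW_CONFIG[tool]
    return fnmatchcase(filename, pattern)


for tool, (suffix, _) in NEW_CONFIG.items():
    name = reads_filename("sample1", 1, suffix)  # hypothetical sample and run
    print(tool, name, is_published(tool, name))
```

The two Porechop patterns are disjoint, so a file trimmed by one tool is not picked up by the other tool's pattern even though both publish to `QC_longreads/porechop`.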
14 changes: 14 additions & 0 deletions docs/output.md
@@ -113,6 +113,20 @@ The pipeline uses Nanolyse to map the reads against the Lambda phage and removes

The pipeline uses filtlong and porechop to perform quality control of the long reads that are eventually provided with the TSV input file.


<details markdown="1">
<summary>Output files</summary>

- `QC_longreads/porechop/`
- `[sample]_[run]_porechop_trimmed.fastq.gz`: If `--longread_adaptertrimming_tool 'porechop'`, the adapter trimmed FASTQ files from porechop
- `[sample]_[run]_porechop-abi_trimmed.fastq.gz`: If `--longread_adaptertrimming_tool 'porechop_abi'`, the adapter trimmed FASTQ files from porechop_ABI
- `QC_longreads/filtlong/`
- `[sample]_[run]_filtlong.fastq.gz`: The length and quality filtered reads in FASTQ from Filtlong

</details>

Trimmed and filtered FASTQ output directories and files will only exist if `--save_porechop_reads` and/or `--save_filtlong_reads` (respectively) are provided to the run command.

No direct host read removal is performed for long reads.
However, since within this pipeline filtlong uses a read quality based on k-mer matches to the already filtered short reads, reads not overlapping those short reads might be discarded.
The lower the parameter `--longreads_length_weight`, the higher the impact of the read qualities for filtering.
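
Which of the documented files appear depends on the chosen trimming tool and on the two save flags. The following sketch restates that logic. The function name and its arguments are hypothetical, and the paths simply follow the output file list in this section rather than being read from the pipeline.

```python
def expected_longread_outputs(
    sample: str,
    run: str,
    adaptertrimming_tool: str = "porechop_abi",  # default per the CHANGELOG entry
    save_porechop_reads: bool = False,
    save_filtlong_reads: bool = False,
) -> list[str]:
    """List the long-read QC files the docs say should be published."""
    # Map the parameter value to the label used in the published file name.
    labels = {"porechop": "porechop", "porechop_abi": "porechop-abi"}
    if adaptertrimming_tool not in labels:
        raise ValueError(f"unknown trimming tool: {adaptertrimming_tool}")

    outputs = []
    if save_porechop_reads:
        label = labels[adaptertrimming_tool]
        outputs.append(
            f"QC_longreads/porechop/{sample}_{run}_{label}_trimmed.fastq.gz"
        )
    if save_filtlong_reads:
        outputs.append(f"QC_longreads/filtlong/{sample}_{run}_filtlong.fastq.gz")
    return outputs


# Hypothetical sample and run labels.
print(expected_longread_outputs("sample1", "run1", save_porechop_reads=True))
```

With neither flag set the function returns an empty list, mirroring the note that the directories only exist when the save flags are given.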
10 changes: 10 additions & 0 deletions modules.json
@@ -107,6 +107,11 @@
"git_sha": "285a50500f9e02578d90b3ce6382ea3c30216acd",
"installed_by": ["modules"]
},
"filtlong": {
"branch": "master",
"git_sha": "666652151335353eef2fcd58880bcef5bc2928e1",
"installed_by": ["modules"]
},
"freebayes": {
"branch": "master",
"git_sha": "911696ea0b62df80e900ef244d7867d177971f73",
@@ -202,6 +207,11 @@
"git_sha": "3135090b46f308a260fc9d5991d7d2f9c0785309",
"installed_by": ["modules"]
},
"porechop/abi": {
"branch": "master",
"git_sha": "06c8865e36741e05ad32ef70ab3fac127486af48",
"installed_by": ["modules"]
},
"porechop/porechop": {
"branch": "master",
"git_sha": "1d68c7f248d1a480c5959548a9234602b771199e",
33 changes: 0 additions & 33 deletions modules/local/filtlong.nf

This file was deleted.

5 changes: 5 additions & 0 deletions modules/nf-core/filtlong/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

39 changes: 39 additions & 0 deletions modules/nf-core/filtlong/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

65 changes: 65 additions & 0 deletions modules/nf-core/filtlong/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
