Adding mgikit demultiplexing #299

Open: wants to merge 10 commits into base: dev
1 change: 1 addition & 0 deletions README.md
@@ -42,6 +42,7 @@ On release, automated continuous integration tests run the pipeline on a full-si
- [sgdemux](#sgdemux) - demultiplexing bgzipped fastq files produced by Singular Genomics (CONDITIONAL)
- [fqtk](#fqtk) - a toolkit for working with FASTQ files, written in Rust (CONDITIONAL)
- [mkfastq](#mkfastq) - converting bcl files to fastq, and demultiplexing for single-cell sequencing data (CONDITIONAL)
- [mgikit](#mgikit) - Demultiplex fastq files generated by MGI sequencers using [mgikit](https://github.com/sagc-bioinformatics/mgikit) (CONDITIONAL).

3. [checkqc](#checkqc) - (optional) Check quality criteria after demultiplexing (bcl2fastq only)
4. [fastp](#fastp) - Adapter and quality trimming
32 changes: 32 additions & 0 deletions conf/test_mgikit.config
@@ -0,0 +1,32 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running minimal tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines input files and everything required to run a fast and simple pipeline test.

Use as follows:
nextflow run nf-core/demultiplex -profile test,<docker/singularity> --outdir <OUTDIR>

----------------------------------------------------------------------------------------
*/

// Limit resources so that this can run on GitHub Actions
process {
    resourceLimits = [
        cpus: 1,
        memory: '7.GB',
        time: '4.h'
    ]
}

params {
    config_profile_name = 'Test mgikit profile'
    config_profile_description = 'Minimal test dataset to check pipeline function with mgikit'

    // Input data
    input = 'https://raw.githubusercontent.com/nf-core/test-datasets/refs/heads/demultiplex/testdata/mgi/mgikit_input.csv'
    demultiplexer = 'mgikit'
    skip_tools = "checkqc,samshee"
}


5 changes: 5 additions & 0 deletions docs/output.md
@@ -16,6 +16,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
- [sgdemux](#sgdemux) - demultiplexing bgzipped fastq files produced by Singular Genomics (CONDITIONAL)
- [fqtk](#fqtk) - demultiplexing fastq files (CONDITIONAL)
- [mkfastq](#mkfastq) - converting bcl files to fastq, and demultiplexing for single-cell sequencing data (CONDITIONAL)
- [mgikit](#mgikit) - Demultiplex fastq files generated by MGI sequencers using [mgikit](https://github.com/sagc-bioinformatics/mgikit) (CONDITIONAL).
- [checkqc](#checkqc) - (optional) Check quality criteria after demultiplexing (bcl2fastq only)
- [fastp](#fastp) - Adapter and quality trimming
- [Falco](#falco) - Raw read QC
@@ -136,6 +137,10 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d

</details>

### mgikit

[mgikit](https://github.com/sagc-bioinformatics/mgikit) demultiplexes fastq files generated by MGI sequencers (CONDITIONAL).

### fastp

<details markdown="1">
19 changes: 10 additions & 9 deletions docs/usage.md
@@ -22,7 +22,7 @@ When using the demultiplexer fqtk, the _pipeline_ samplesheet must contain an ad
--input '[path to pipeline samplesheet file]'
```

-#### Example: Pipeline samplesheet
+### Example: Pipeline samplesheet

```csv title="samplesheet.csv"
id,samplesheet,lane,flowcell
@@ -32,18 +32,18 @@ DDMMYY_SERIAL_NUMBER_FC2,/path/to/SampleSheet2.csv,1,/path/to/sequencer/output2
DDMMYY_SERIAL_NUMBER_FC3,/path/to/SampleSheet3.csv,3,/path/to/sequencer/output3
```

-| Column | Description |
-| ------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `id` | Flowcell id |
-| `samplesheet` | Full path to the _flowcell_ `SampleSheet.csv` file containing the sample information and indexes |
-| `lane` | Optional lane number. When a lane number is provided, only the given lane will be demultiplexed |
-| `flowcell` | Full path to the Illumina sequencer output directory (often referred as run directory) or a `tar.gz` file containing the contents of said directory |
+| Column | Description |
+| ------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `id` | Flowcell id |
+| `samplesheet` | Full path to the _flowcell_ `SampleSheet.csv` file containing the sample information and indexes |
+| `lane` | Optional lane number. When a lane number is provided, only the given lane will be demultiplexed |
+| `flowcell` | Full path to the Illumina sequencer output directory (often referred as run directory) or a `tar.gz` file containing the contents of said directory. `mgikit` demultiplexing expects a path to a directory here containing the compressed fastq files and `BioInfo.csv` file. |

An [example _pipeline_ samplesheet](https://raw.githubusercontent.com/nf-core/test-datasets/demultiplex/samplesheet/1.3.0/flowcell_input.csv) has been provided with the pipeline.

Note that the run directory in the `flowcell` column must lead to a `tar.gz` for compatibility with the demultiplexers sgdemux and fqtk.

-#### Example: Pipeline samplesheet for fqtk
+### Example: Pipeline samplesheet for fqtk

```csv title="samplesheet.csv"
id,samplesheet,lane,flowcell,per_flowcell_manifest
@@ -70,6 +70,7 @@ Each demultiplexing software uses a distinct _flowcell_ samplesheet format. Belo
| **sgdemux** | [sgdemux SampleSheet.csv](https://github.com/nf-core/test-datasets/blob/demultiplex/testdata/sim-data/out.sample_meta.csv) |
| **fqtk** | [fqtk SampleSheet.csv](https://github.com/fulcrumgenomics/nf-core-test-datasets/raw/fqtk/testdata/sim-data/fqtk_samplesheet.csv) |
| **bcl2fastq and bclconvert** | [bcl2fastq and bclconvert SampleSheet.csv](https://raw.githubusercontent.com/nf-core/test-datasets/demultiplex/samplesheet/1.3.0/b2fq-samplesheet.csv) |
| **mgikit** | [mgikit samplesheet.csv](https://github.com/nf-core/test-datasets/blob/demultiplex/testdata/mgi/fc01_sample_sheet.csv) |
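
The updated `flowcell` description in this file says that mgikit expects a directory holding the compressed fastq files and a `BioInfo.csv` file. A minimal Python sketch of a pre-flight check for that layout, plus a helper that writes one pipeline samplesheet row, might look like the following. The function names, the `.fastq.gz`/`.fq.gz` suffixes, and the non-recursive lookup are assumptions made for illustration; they are not part of the pipeline or of mgikit.

```python
import csv
import io
from pathlib import Path

# Assumption: "compressed fastq files" means files ending in one of these suffixes.
FASTQ_SUFFIXES = (".fastq.gz", ".fq.gz")


def check_mgikit_flowcell_dir(flowcell_dir):
    """Return a list of problems found in a candidate mgikit flowcell directory.

    Hypothetical helper: only the top level of the directory is inspected.
    """
    root = Path(flowcell_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]

    problems = []
    if not (root / "BioInfo.csv").is_file():
        problems.append("BioInfo.csv is missing")
    if not any(p.name.endswith(FASTQ_SUFFIXES) for p in root.iterdir()):
        problems.append("no compressed fastq files found")
    return problems


def samplesheet_row(run_id, samplesheet, flowcell_dir, lane=""):
    """Build a pipeline samplesheet (header plus one row) as CSV text.

    Column names follow the `id,samplesheet,lane,flowcell` header in docs/usage.md;
    `lane` is optional and left empty by default.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["id", "samplesheet", "lane", "flowcell"])
    writer.writerow([run_id, samplesheet, lane, flowcell_dir])
    return buf.getvalue()
```

An empty list from `check_mgikit_flowcell_dir` means both expected inputs were found. The check does not validate the contents of `BioInfo.csv` or the fastq files.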

## Running the pipeline

@@ -198,7 +199,7 @@ If `-profile` is not specified, the pipeline will run locally and expect all sof
- `apptainer`
- A generic configuration profile to be used with [Apptainer](https://apptainer.org/)
- `wave`
-- A generic configuration profile to enable [Wave](https://seqera.io/wave/) containers. Use together with one of the above (requires Nextflow ` 24.03.0-edge` or later).
+- A generic configuration profile to enable [Wave](https://seqera.io/wave/) containers. Use together with one of the above (requires Nextflow `24.03.0-edge` or later).
- `conda`
- A generic configuration profile to be used with [Conda](https://conda.io/docs/). Please only use Conda as a last resort i.e. when it's not possible to run the pipeline with Docker, Singularity, Podman, Shifter, Charliecloud, or Apptainer.

5 changes: 5 additions & 0 deletions modules.json
@@ -60,6 +60,11 @@
"git_sha": "666652151335353eef2fcd58880bcef5bc2928e1",
"installed_by": ["modules"]
},
"mgikit/demultiplex": {
"branch": "master",
"git_sha": "0bf42a3bdf105ddc58f6cc5523c86b4617c4ed04",
"installed_by": ["modules"]
},
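
To confirm which commit of `mgikit/demultiplex` a checkout pins, the entry added above can be looked up programmatically. The sketch below is a hedged illustration: because the hunk shows only a fragment of `modules.json`, it searches the nested JSON for the module key instead of assuming a particular nesting. The `find_module_entries` helper and the outer `"modules"` wrapper in the sample document are hypothetical.

```python
import json


def find_module_entries(node, module_name):
    """Yield every dict stored under the key `module_name` anywhere in nested JSON."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == module_name and isinstance(value, dict):
                yield value
            else:
                yield from find_module_entries(value, module_name)
    elif isinstance(node, list):
        for item in node:
            yield from find_module_entries(item, module_name)


# Hypothetical wrapper around the entry shown in the diff; the real file nests it deeper.
example = json.loads("""
{
  "modules": {
    "mgikit/demultiplex": {
      "branch": "master",
      "git_sha": "0bf42a3bdf105ddc58f6cc5523c86b4617c4ed04",
      "installed_by": ["modules"]
    }
  }
}
""")

entry = next(find_module_entries(example, "mgikit/demultiplex"))
print(entry["branch"], entry["git_sha"])
```

On a real checkout, the same function could be applied to the result of `json.load` on the repository's `modules.json`.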
"multiqc": {
"branch": "master",
"git_sha": "cf17ca47590cc578dfb47db1c2a44ef86f89976d",
5 changes: 5 additions & 0 deletions modules/nf-core/mgikit/demultiplex/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

81 changes: 81 additions & 0 deletions modules/nf-core/mgikit/demultiplex/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

152 changes: 152 additions & 0 deletions modules/nf-core/mgikit/demultiplex/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
