nf-core/configs: KAUST Configuration

The purpose of this custom configuration is to streamline running nf-core pipelines on the KAUST Ibex cluster.

Getting help

We have a wiki page dedicated to the Bioinformatics team at KAUST to help users: Bioinformatics Workflows.

Using the KAUST config profile

Nextflow is required to run the nf-core workflows on Ibex. The recommended way to activate it is through the module system:

# Log in to the desired cluster
ssh <USER>@ilogin.ibex.kaust.edu.sa

# Load the Nextflow module. You can also choose a specific version, e.g. `Nextflow/24.04.4`.
module load nextflow

Launch the pipeline with -profile kaust (one hyphen) to run the workflows using the KAUST profile. This downloads and applies kaust.config, which is pre-configured for the KAUST servers: Nextflow manages the pipeline jobs through the Slurm job scheduler and runs the tasks with Singularity.

With the KAUST profile, Docker images containing the required software are downloaded and, if needed, converted to Singularity images before the pipeline runs. To avoid multiple users downloading the same images, the profile sets a Singularity libraryDir that points to images already downloaded to our central container library. Images missing from the library are downloaded to the user's directory, as defined by cacheDir.
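The lookup order described above (central library first, then the user's cache, otherwise a download into the cache) can be pictured with a small shell sketch. This is an illustration only, not Nextflow's actual code: the function name `find_image`, its arguments, and the image file names are hypothetical, and the real directory values come from kaust.config.

```shell
#!/usr/bin/env bash
# Illustration only: mimics the lookup order described in the text.
# The arguments are hypothetical stand-ins for an image file name and
# the libraryDir / cacheDir values set in kaust.config.

find_image() {
    local image_file="$1" library_dir="$2" cache_dir="$3"

    if [ -f "${library_dir}/${image_file}" ]; then
        # Image already present in the shared, centrally managed library.
        echo "library: ${library_dir}/${image_file}"
    elif [ -f "${cache_dir}/${image_file}" ]; then
        # Image was downloaded earlier into the user's own cache.
        echo "cache: ${cache_dir}/${image_file}"
    else
        # Not found anywhere: it would be downloaded into the user's cache.
        echo "download: ${cache_dir}/${image_file}"
    fi
}
```

Calling `find_image some-tool.img /path/to/library /path/to/cache` prints which of the three cases applies to that file; the paths here are placeholders.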

Additionally, institute-specific pipeline profiles exist for:

  • mag
  • rnaseq

Accessing reference genomes on Ibex

We provide a collection of reference genomes, enabling users to run workflows seamlessly without the need to download the files. To enable access to this resource, add the species name with the --genome parameter.

Run workflows on Ibex

The KAUST profile makes running the nf-core workflows as simple as:

# Load Nextflow and Singularity modules
module purge
module load nextflow
module load singularity

# Launch nf-core pipeline with the kaust profile, e.g. for analyzing human data:
nextflow run nf-core/<PIPELINE> -profile kaust -r <PIPELINE_VERSION> --genome GRCh38.p14 --samplesheet input.csv [...]

Here, input.csv contains information about the samples and the paths to their data files.

Remember to use -bg to launch Nextflow in the background, so that the pipeline doesn't exit when you leave your terminal session. Alternatively, you can run the commands above inside a tmux or screen session. Another good option is to run the pipeline as an independent sbatch job, as explained here.
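As a rough sketch of the sbatch option, the snippet below writes a small job script that wraps the same commands shown earlier. The file name `run_nfcore.sbatch`, the job name, and the time, CPU and memory requests are assumptions chosen for illustration, not values recommended by KAUST; adjust them to your pipeline and to Ibex policies.

```shell
# Write a minimal Slurm job script for the Nextflow head process.
# All #SBATCH values below are illustrative assumptions.
cat > run_nfcore.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=nfcore-head
#SBATCH --time=24:00:00
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G
#SBATCH --output=nfcore-head-%j.log

# Usage: sbatch run_nfcore.sbatch <PIPELINE> <PIPELINE_VERSION>
PIPELINE="${1:?usage: sbatch run_nfcore.sbatch <PIPELINE> <PIPELINE_VERSION>}"
PIPELINE_VERSION="${2:?usage: sbatch run_nfcore.sbatch <PIPELINE> <PIPELINE_VERSION>}"

# Load Nextflow and Singularity modules
module purge
module load nextflow
module load singularity

# Launch the nf-core pipeline with the kaust profile
nextflow run "nf-core/${PIPELINE}" \
    -profile kaust \
    -r "${PIPELINE_VERSION}" \
    --genome GRCh38.p14 \
    --samplesheet input.csv
EOF
```

Submit it with `sbatch run_nfcore.sbatch <PIPELINE> <PIPELINE_VERSION>`. Only the lightweight Nextflow head process runs inside this job; with the kaust profile, the pipeline tasks themselves are still submitted to Slurm as separate jobs.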

Workflow specific profiles

Please let us know if particular processes continuously fail, so that we can modify the defaults in the corresponding pipeline profile.