diff --git a/Program_Licenses.md b/Program_Licenses.md
index 66fdb4c9e..300eb0218 100644
--- a/Program_Licenses.md
+++ b/Program_Licenses.md
@@ -38,6 +38,7 @@ The licenses of the open-source software that is contained in these Docker image
 | datasets-sars-cov-2 | Apache 2.0 | https://github.com/CDCgov/datasets-sars-cov-2/blob/master/LICENSE |
 | diamond | GNU GPLv3 | https://github.com/bbuchfink/diamond/blob/master/LICENSE |
 | dnaapler | MIT | https://github.com/gbouras13/dnaapler/blob/main/LICENSE |
+| dorado | Oxford Nanopore Technologies PLC Public License | [ONT License](https://github.com/nanoporetech/dorado/blob/master/LICENSE) |
 | dragonflye | GNU GPLv3 | https://github.com/rpetit3/dragonflye/blob/main/LICENSE |
 | drprg | MIT | https://github.com/mbhall88/drprg/blob/main/LICENSE |
 | DSK | GNU Affero GPLv3 | https://github.com/GATB/dsk/blob/master/LICENSE |
diff --git a/README.md b/README.md
index 664e4fa52..6261df935 100644
--- a/README.md
+++ b/README.md
@@ -143,6 +143,7 @@ To learn more about the docker pull rate limits and the open source software pro
 | [datasets-sars-cov-2](https://github.com/CDCgov/datasets-sars-cov-2)
[![docker pulls](https://badgen.net/docker/pulls/staphb/datasets-sars-cov-2)](https://hub.docker.com/r/staphb/datasets-sars-cov-2) | | https://github.com/CDCgov/datasets-sars-cov-2 |
 | [diamond](https://github.com/bbuchfink/diamond)
[![docker pulls](https://badgen.net/docker/pulls/staphb/diamond)](https://hub.docker.com/r/staphb/diamond) | | https://github.com/bbuchfink/diamond|
 | [dnaapler](https://hub.docker.com/r/staphb/dnaapler)
[![docker pulls](https://badgen.net/docker/pulls/staphb/dnaapler)](https://hub.docker.com/r/staphb/dnaapler) | | https://github.com/gbouras13/dnaapler |
+| [dorado](https://hub.docker.com/r/staphb/dorado)
[![docker pulls](https://badgen.net/docker/pulls/staphb/dorado)](https://hub.docker.com/r/staphb/dorado) | | [GitHub Repository](https://github.com/nanoporetech/dorado) |
 | [dragonflye](https://hub.docker.com/r/staphb/dragonflye)
[![docker pulls](https://badgen.net/docker/pulls/staphb/dragonflye)](https://hub.docker.com/r/staphb/dragonflye) | | https://github.com/rpetit3/dragonflye |
 | [Dr. PRG ](https://hub.docker.com/r/staphb/drprg)
[![docker pulls](https://badgen.net/docker/pulls/staphb/drprg)](https://hub.docker.com/r/staphb/drprg) | | https://mbh.sh/drprg/ |
 | [DSK](https://hub.docker.com/r/staphb/dsk)
[![docker pulls](https://badgen.net/docker/pulls/staphb/dsk)](https://hub.docker.com/r/staphb/dsk) | | https://gatb.inria.fr/software/dsk/ |
@@ -369,3 +370,6 @@ Each Dockerfile lists the author(s)/maintainer(s) as a metadata `LABEL`, but the
  * [@stephenturner](https://github.com/stephenturner)
  * [@soejun](https://github.com/soejun)
  * [@taylorpaisie](https://github.com/taylorpaisie)
+ * [@fraser-combe](https://github.com/fraser-combe)
+
+
diff --git a/dorado/0.8.0/Dockerfile b/dorado/0.8.0/Dockerfile
index d5664657d..523efa26f 100755
--- a/dorado/0.8.0/Dockerfile
+++ b/dorado/0.8.0/Dockerfile
@@ -19,9 +19,9 @@ LABEL maintainer.email="fraser.combe@theiagen.com"
 WORKDIR /usr/src/app
 
 # Install dependencies
-RUN apt-get update && apt-get install -y \
- build-essential \
- wget
+RUN apt-get update && \
+ apt-get install -y --no-install-recommends wget ca-certificates && \
+ rm -rf /var/lib/apt/lists/* && apt-get autoclean
 
 # Download and extract Dorado package
 RUN wget https://cdn.oxfordnanoportal.com/software/analysis/dorado-${DORADO_VER}-linux-x64.tar.gz \
@@ -36,11 +36,6 @@ RUN mkdir /dorado_models && \
  cd /dorado_models && \
  dorado download --model all
 
-# Download the specific Pod5 test file
-RUN wget -O /usr/src/app/dna_r10.4.1_e8.2_260bps-FLO_PRO114-SQK_NBD114_96_260-4000.pod5 \
- https://github.com/nanoporetech/dorado/raw/release-v0.7/tests/data/pod5/dna_r10.4.1_e8.2_260bps/\
-dna_r10.4.1_e8.2_260bps-FLO_PRO114-SQK_NBD114_96_260-4000.pod5
-
 # Default command
 CMD ["dorado"]
 
@@ -49,6 +44,12 @@ CMD ["dorado"]
 # -----------------------------
 
 FROM app AS test
+
+# Download the specific Pod5 test file
+RUN wget -O /usr/src/app/dna_r10.4.1_e8.2_260bps-FLO_PRO114-SQK_NBD114_96_260-4000.pod5 \
+ https://github.com/nanoporetech/dorado/raw/release-v0.7/tests/data/pod5/dna_r10.4.1_e8.2_260bps/\
+dna_r10.4.1_e8.2_260bps-FLO_PRO114-SQK_NBD114_96_260-4000.pod5
+
 
 # Set working directory
 WORKDIR /usr/src/app
diff --git a/dorado/0.8.0/README.md b/dorado/0.8.0/README.md
index 0fbc039ff..82534b3a3 100644
--- a/dorado/0.8.0/README.md
+++ b/dorado/0.8.0/README.md
@@ -20,8 +20,7 @@ This Docker image includes:
 
 - **Dorado**: Version **0.8.0**, a tool for basecalling Oxford Nanopore sequencing data.
 - **NVIDIA CUDA**: Version **12.2.0**, for GPU acceleration (requires NVIDIA GPU).
-- **Pre-downloaded basecalling models**: All models are downloaded during the build.
-- **Sample Pod5 test file**: Included for testing the basecalling process.
+- **Pre-downloaded basecalling models**: All models are downloaded during the build process for basecalling.
 
 ## Requirements
 
@@ -29,14 +28,6 @@ This Docker image includes:
 - **NVIDIA GPU and Drivers**: Installed and configured.
 - **NVIDIA Container Toolkit**: To enable GPU support in Docker containers.
 
-## Building the Docker Image
-
- **Build the Docker image** using the following command:
-
- ```bash
- docker build -t dorado-image .
- ```
-
 ## Running the Docker Container
 
 To run the Dorado tool within the Docker container, use the following command:
@@ -49,7 +40,12 @@ This command will display the help information for Dorado, confirming that it's
 
 ## Testing the Docker Image
 
-To test that Dorado is working correctly, perform a basecalling operation using the provided sample Pod5 file and basecalling models.
+To test that Dorado is working correctly, you will need to download a sample Pod5 file and perform a basecalling operation using the pre-downloaded basecalling models.
+
+```bash
+wget -O dna_r10.4.1_e8.2_260bps-FLO_PRO114-SQK_NBD114_96_260-4000.pod5 \
+ https://github.com/nanoporetech/dorado/raw/release-v0.7/tests/data/pod5/dna_r10.4.1_e8.2_260bps/dna_r10.4.1_e8.2_260bps-FLO_PRO114-SQK_NBD114_96_260-4000.pod5
+
 
 ### Basecalling Test
 
@@ -74,16 +70,142 @@ docker run --gpus all -v $(pwd):/usr/src/app -it dorado-image bash -c "\
 Check the output file to ensure basecalling was successful:
 
 ```bash
-less basecalled.sam
+samtools view basecalled.sam
 ```
 
 You should see SAM-formatted basecalling results.
 
 ## Additional Notes
 
-- **Basecalling Models**: All models are downloaded to `/dorado_models` during the build process.
 - **Sample Data**: The sample Pod5 file is downloaded to `/usr/src/app` during the build.
 - **Internal Testing**: An internal test stage is included in the Dockerfile to verify installation.
+- **Basecalling Models**: All models are downloaded to `/dorado_models` during the build process.
+ Below is the list of basecalling models included in the Docker image:
+ ```yaml
+
+ modification models:
+ - "dna_r9.4.1_e8_fast@v3.4_5mCG@v0.1"
+ - "dna_r9.4.1_e8_hac@v3.3_5mCG@v0.1"
+ - "dna_r9.4.1_e8_sup@v3.3_5mCG@v0.1"
+ - "dna_r9.4.1_e8_fast@v3.4_5mCG_5hmCG@v0"
+ - "dna_r9.4.1_e8_hac@v3.3_5mCG_5hmCG@v0"
+ - "dna_r9.4.1_e8_sup@v3.3_5mCG_5hmCG@v0"
+ - "dna_r10.4.1_e8.2_260bps_fast@v3.5.2_5mCG@v2"
+ - "dna_r10.4.1_e8.2_260bps_hac@v3.5.2_5mCG@v2"
+ - "dna_r10.4.1_e8.2_260bps_sup@v3.5.2_5mCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_fast@v3.5.2_5mCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_hac@v3.5.2_5mCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_sup@v3.5.2_5mCG@v2"
+ - "dna_r10.4.1_e8.2_260bps_fast@v4.0.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_260bps_hac@v4.0.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_260bps_sup@v4.0.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_fast@v4.0.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_hac@v4.0.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.0.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_260bps_fast@v4.1.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_260bps_hac@v4.1.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_260bps_sup@v4.1.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_fast@v4.1.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_hac@v4.1.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.1.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_fast@v4.2.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_hac@v4.2.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.2.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.2.0_5mCG_5hmCG@v3.1"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.2.0_5mC@v2"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.2.0_6mA@v2"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.2.0_6mA@v3"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.2.0_5mC_5hmC@v1"
+ - "dna_r10.4.1_e8.2_400bps_hac@v4.3.0_5mC_5hmC@v1"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.3.0_5mC_5hmC@v1"
+ - "dna_r10.4.1_e8.2_400bps_hac@v4.3.0_6mA@v1"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.3.0_6mA@v1"
+ - "dna_r10.4.1_e8.2_400bps_hac@v4.3.0_6mA@v2"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.3.0_6mA@v2"
+ - "dna_r10.4.1_e8.2_400bps_hac@v4.3.0_5mCG_5hmCG@v1"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.3.0_5mCG_5hmCG@v1"
+ - "dna_r10.4.1_e8.2_400bps_hac@v5.0.0_4mC_5mC@v1"
+ - "dna_r10.4.1_e8.2_400bps_sup@v5.0.0_4mC_5mC@v1"
+ - "dna_r10.4.1_e8.2_400bps_hac@v5.0.0_4mC_5mC@v2"
+ - "dna_r10.4.1_e8.2_400bps_sup@v5.0.0_4mC_5mC@v2"
+ - "dna_r10.4.1_e8.2_400bps_hac@v5.0.0_5mC_5hmC@v1"
+ - "dna_r10.4.1_e8.2_400bps_sup@v5.0.0_5mC_5hmC@v1"
+ - "dna_r10.4.1_e8.2_400bps_hac@v5.0.0_5mC_5hmC@v2"
+ - "dna_r10.4.1_e8.2_400bps_sup@v5.0.0_5mC_5hmC@v2"
+ - "dna_r10.4.1_e8.2_400bps_hac@v5.0.0_5mCG_5hmCG@v1"
+ - "dna_r10.4.1_e8.2_400bps_sup@v5.0.0_5mCG_5hmCG@v1"
+ - "dna_r10.4.1_e8.2_400bps_hac@v5.0.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_sup@v5.0.0_5mCG_5hmCG@v2"
+ - "dna_r10.4.1_e8.2_400bps_hac@v5.0.0_6mA@v1"
+ - "dna_r10.4.1_e8.2_400bps_sup@v5.0.0_6mA@v1"
+ - "dna_r10.4.1_e8.2_400bps_hac@v5.0.0_6mA@v2"
+ - "dna_r10.4.1_e8.2_400bps_sup@v5.0.0_6mA@v2"
+ - "rna004_130bps_sup@v3.0.1_m6A_DRACH@v1"
+ - "rna004_130bps_hac@v5.0.0_m6A@v1"
+ - "rna004_130bps_sup@v5.0.0_m6A@v1"
+ - "rna004_130bps_hac@v5.0.0_m6A_DRACH@v1"
+ - "rna004_130bps_sup@v5.0.0_m6A_DRACH@v1"
+ - "rna004_130bps_hac@v5.0.0_pseU@v1"
+ - "rna004_130bps_sup@v5.0.0_pseU@v1"
+ - "rna004_130bps_hac@v5.1.0_m5C@v1"
+ - "rna004_130bps_sup@v5.1.0_m5C@v1"
+ - "rna004_130bps_hac@v5.1.0_inosine_m6A@v1"
+ - "rna004_130bps_sup@v5.1.0_inosine_m6A@v1"
+ - "rna004_130bps_hac@v5.1.0_m6A_DRACH@v1"
+ - "rna004_130bps_sup@v5.1.0_m6A_DRACH@v1"
+ - "rna004_130bps_hac@v5.1.0_pseU@v1"
+ - "rna004_130bps_sup@v5.1.0_pseU@v1"
+ stereo models:
+ - "dna_r10.4.1_e8.2_4khz_stereo@v1.1"
+ - "dna_r10.4.1_e8.2_4khz_stereo@v1.1"
+ - "dna_r10.4.1_e8.2_5khz_stereo@v1.1"
+ - "dna_r10.4.1_e8.2_5khz_stereo@v1.2"
+ - "dna_r10.4.1_e8.2_5khz_stereo@v1.3"
+ simplex models:
+ - "dna_r9.4.1_e8_fast@v3.4"
+ - "dna_r9.4.1_e8_hac@v3.3"
+ - "dna_r9.4.1_e8_sup@v3.3"
+ - "dna_r9.4.1_e8_sup@v3.6"
+ - "dna_r10.4.1_e8.2_260bps_fast@v3.5.2"
+ - "dna_r10.4.1_e8.2_260bps_hac@v3.5.2"
+ - "dna_r10.4.1_e8.2_260bps_sup@v3.5.2"
+ - "dna_r10.4.1_e8.2_400bps_fast@v3.5.2"
+ - "dna_r10.4.1_e8.2_400bps_hac@v3.5.2"
+ - "dna_r10.4.1_e8.2_400bps_sup@v3.5.2"
+ - "dna_r10.4.1_e8.2_260bps_fast@v4.0.0"
+ - "dna_r10.4.1_e8.2_260bps_hac@v4.0.0"
+ - "dna_r10.4.1_e8.2_260bps_sup@v4.0.0"
+ - "dna_r10.4.1_e8.2_400bps_fast@v4.0.0"
+ - "dna_r10.4.1_e8.2_400bps_hac@v4.0.0"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.0.0"
+ - "dna_r10.4.1_e8.2_260bps_fast@v4.1.0"
+ - "dna_r10.4.1_e8.2_260bps_hac@v4.1.0"
+ - "dna_r10.4.1_e8.2_260bps_sup@v4.1.0"
+ - "dna_r10.4.1_e8.2_400bps_fast@v4.1.0"
+ - "dna_r10.4.1_e8.2_400bps_hac@v4.1.0"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.1.0"
+ - "dna_r10.4.1_e8.2_400bps_fast@v4.2.0"
+ - "dna_r10.4.1_e8.2_400bps_hac@v4.2.0"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.2.0"
+ - "dna_r10.4.1_e8.2_400bps_fast@v4.3.0"
+ - "dna_r10.4.1_e8.2_400bps_hac@v4.3.0"
+ - "dna_r10.4.1_e8.2_400bps_sup@v4.3.0"
+ - "dna_r10.4.1_e8.2_400bps_fast@v5.0.0"
+ - "dna_r10.4.1_e8.2_400bps_hac@v5.0.0"
+ - "dna_r10.4.1_e8.2_400bps_sup@v5.0.0"
+ - "dna_r10.4.1_e8.2_apk_sup@v5.0.0"
+ - "rna002_70bps_fast@v3"
+ - "rna002_70bps_hac@v3"
+ - "rna004_130bps_fast@v3.0.1"
+ - "rna004_130bps_hac@v3.0.1"
+ - "rna004_130bps_sup@v3.0.1"
+ - "rna004_130bps_fast@v5.0.0"
+ - "rna004_130bps_hac@v5.0.0"
+ - "rna004_130bps_sup@v5.0.0"
+ - "rna004_130bps_fast@v5.1.0"
+ - "rna004_130bps_hac@v5.1.0"
+ - "rna004_130bps_sup@v5.1.0"
+ ```
 
 ## License