This Dockerfile sets up an environment for running Dorado, a tool for basecalling Fast5/Pod5 files from Oxford Nanopore sequencing.
- Introduction
- Requirements
- Building the Docker Image
- Running the Docker Container
- Testing the Docker Image
  - Basecalling Test
  - Verifying the Output
- Additional Notes
- License

## Introduction

This Docker image includes:
- Dorado: Version 0.8.0, a tool for basecalling Oxford Nanopore sequencing data.
- NVIDIA CUDA: Version 12.2.0, for GPU acceleration (requires NVIDIA GPU).
- Pre-downloaded basecalling models: All basecalling models are downloaded during the build process.

## Requirements

- Docker: Installed on your system.
- NVIDIA GPU and Drivers: Installed and configured.
- NVIDIA Container Toolkit: To enable GPU support in Docker containers.
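
Before building or running the image, it can help to confirm that the host-side tools are on `PATH`. The following is a minimal sketch and not part of the Dockerfile: the helper name `have_cmd` is hypothetical, and it assumes `nvidia-smi` is the command-line tool installed with your NVIDIA driver. It does not check the NVIDIA Container Toolkit itself.

```bash
# have_cmd NAME: succeed if NAME is found on PATH (hypothetical helper).
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report the two host-side tools this README relies on.
for tool in docker nvidia-smi; do
    if have_cmd "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

A `missing` line only means the command was not found on `PATH`; the tool may still be installed elsewhere.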

## Running the Docker Container

To run the Dorado tool within the Docker container, use the following command:

```bash
docker run --gpus all -it dorado-image dorado --help
```

This command will display the help information for Dorado, confirming that it's installed correctly.

## Testing the Docker Image

To test that Dorado is working correctly, download a sample Pod5 file and basecall it with one of the pre-downloaded models.

```bash
wget -O dna_r10.4.1_e8.2_260bps-FLO_PRO114-SQK_NBD114_96_260-4000.pod5 \
  https://github.com/nanoporetech/dorado/raw/release-v0.7/tests/data/pod5/dna_r10.4.1_e8.2_260bps/dna_r10.4.1_e8.2_260bps-FLO_PRO114-SQK_NBD114_96_260-4000.pod5
```
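
Because `wget -O` opens the output file before the transfer starts, a failed download can leave an empty file behind. A small check along the following lines can catch that before basecalling. This is a sketch: the function name `check_pod5` is hypothetical, and it only tests that the file exists and is non-empty, not that it is valid Pod5.

```bash
# check_pod5 FILE: succeed only if FILE exists and has a size greater than zero.
check_pod5() {
    if [ -s "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or empty: $1" >&2
        return 1
    fi
}

# Example call with the sample file name used above; prints a message either way.
check_pod5 dna_r10.4.1_e8.2_260bps-FLO_PRO114-SQK_NBD114_96_260-4000.pod5 \
    || echo "download the file again"
```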
### Basecalling Test
Run the following command, replacing `MODEL_NAME` with the name of one of the model directories under `/dorado_models`:

```bash
docker run --gpus all -v $(pwd):/usr/src/app -it dorado-image bash -c "\
  dorado basecaller /dorado_models/MODEL_NAME \
  /usr/src/app/dna_r10.4.1_e8.2_260bps-FLO_PRO114-SQK_NBD114_96_260-4000.pod5 \
  --emit-moves > /usr/src/app/basecalled.sam"
```

Explanation:

- `--gpus all`: Enables GPU support.
- `-v $(pwd):/usr/src/app`: Mounts the current directory to `/usr/src/app` inside the container.
- `bash -c "..."`: Runs the basecalling command inside the container.
- `> /usr/src/app/basecalled.sam`: Redirects the output to `basecalled.sam` in your current directory.

### Verifying the Output

Check the output file to ensure basecalling was successful:

```bash
samtools view basecalled.sam
```

You should see SAM-formatted basecalling results.
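
If `samtools` is not installed on the host, a rough check is still possible with standard tools, because SAM header lines start with `@` and every other non-empty line is one read. The following is a sketch under that assumption; the function name `count_sam_records` is hypothetical.

```bash
# count_sam_records FILE: print the number of non-empty, non-header lines.
# Header lines start with '@'; each remaining line is one basecalled read.
count_sam_records() {
    awk 'NF && !/^@/ { n++ } END { print n + 0 }' "$1"
}

# Example: report the read count if the output file from the test exists.
if [ -f basecalled.sam ]; then
    echo "reads: $(count_sam_records basecalled.sam)"
fi
```

A count of zero suggests that basecalling produced only a header and that the run should be checked.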

## Additional Notes

- Sample Data: The sample Pod5 file is downloaded to `/usr/src/app` during the docker image build.
  - Note: If you are using the pre-built StaPH-B docker image downloaded from dockerhub or quay.io, it will only include the `app` stage. This means that the sample Pod5 file will not be available in the container. You will need to download the sample Pod5 file manually using the `wget` example command shown above.
- Internal Testing: An internal test stage is included in the Dockerfile to verify installation.
- Basecalling Models: All models are downloaded to `/dorado_models` during the build process. The Docker image includes three groups of basecalling models:
  - modification models, including those for 4mC_5mC, 5mC, 5mC_5hmC, 5mCG, 5mCG_5hmCG, 6mA, m5C, m6A, m6A_DRACH, inosine_m6A, and pseU
  - stereo models
  - simplex models, including `rna002_70bps_fast@v3` and `rna002_70bps_hac@v3`

  Run `ls /dorado_models` inside the container to see the exact model names.

## License

Dorado is licensed under Oxford Nanopore Technologies' License.

Note: Please ensure that you have the necessary NVIDIA drivers and the NVIDIA Container Toolkit installed to utilize GPU acceleration.