Merge pull request #773 from Kincekara/checkm
adds checkm
erinyoung authored Oct 31, 2023
2 parents 638fa31 + 3f78628 commit 55d0b32
Showing 4 changed files with 87 additions and 0 deletions.
1 change: 1 addition & 0 deletions Program_Licenses.md
@@ -24,6 +24,7 @@ The licenses of the open-source software that is contained in these Docker image
| centroid | GitHub No License | https://github.com/https://github.com/stjacqrm/centroid |
| CDC-SPN | GitHub No License | https://github.com/BenJamesMetcalf/Spn_Scripts_Reference |
| cfsan-snp-pipeline | non-standard license see --> | https://github.com/CFSAN-Biostatistics/snp-pipeline/blob/master/LICENSE.txt |
| CheckM | GNU GPLv3 | https://github.com/Ecogenomics/CheckM/blob/master/LICENSE |
| Circlator | GNU GPLv3 | https://github.com/sanger-pathogens/circlator/blob/master/LICENSE |
| colorid | MIT | https://github.com/hcdenbakker/colorid/blob/master/LICENSE |
| datasets-sars-cov-2 | Apache 2.0 | https://github.com/CDCgov/datasets-sars-cov-2/blob/master/LICENSE |
1 change: 1 addition & 0 deletions README.md
@@ -127,6 +127,7 @@ To learn more about the docker pull rate limits and the open source software pro
| [centroid](https://hub.docker.com/r/staphb/centroid/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/centroid)](https://hub.docker.com/r/staphb/centroid) | <ul><li>1.0.0</li></ul> | https://github.com/stjacqrm/centroid |
| [CDC-SPN](https://hub.docker.com/r/staphb/cdc-spn/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/cdc-spn)](https://hub.docker.com/r/staphb/cdc-spn) | <ul><li>0.1 (no version)</li></ul> | https://github.com/BenJamesMetcalf/Spn_Scripts_Reference |
| [cfsan-snp-pipeline](https://hub.docker.com/r/staphb/cfsan-snp-pipeline) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/cfsan-snp-pipeline)](https://hub.docker.com/r/staphb/cfsan-snp-pipeline) | <ul><li>2.0.2</li> <li>2.2.1</li> </ul> | https://github.com/CFSAN-Biostatistics/snp-pipeline |
| [CheckM](https://hub.docker.com/r/staphb/checkm) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/checkm)](https://hub.docker.com/r/staphb/checkm) | <ul><li>[1.2.2](./checkm/1.2.2)</li></ul> | https://github.com/Ecogenomics/CheckM |
| [Circlator](https://hub.docker.com/r/staphb/circlator) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/circlator)](https://hub.docker.com/r/staphb/circlator) | <ul><li>1.5.6</li><li>1.5.5</li></ul> | https://github.com/sanger-pathogens/circlator |
| [Clustalo](https://hub.docker.com/r/staphb/clustalo) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/clustalo)](https://hub.docker.com/r/staphb/clustalo) | <ul><li>1.2.4</li></ul> | http://www.clustal.org/omega/ |
| [colorid](https://hub.docker.com/r/staphb/colorid) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/colorid)](https://hub.docker.com/r/staphb/colorid) | <ul><li>0.1.4.3</li></ul> | https://github.com/hcdenbakker/colorid |
49 changes: 49 additions & 0 deletions checkm/1.2.2/Dockerfile
@@ -0,0 +1,49 @@
FROM ubuntu:jammy as app

ARG CHECKM_VER="1.2.2"
ARG PPLACER_VER="v1.1.alpha19"

LABEL base.image="ubuntu:jammy"
LABEL dockerfile.version="1"
LABEL software="CheckM"
LABEL software.version="${CHECKM_VER}"
LABEL description="CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes."
LABEL website="https://github.com/Ecogenomics/CheckM"
LABEL license="https://github.com/Ecogenomics/CheckM/blob/master/LICENSE"
LABEL maintainer="Kutluhan Incekara"
LABEL maintainer.email="[email protected]"

# install system requirements
RUN apt-get update && apt-get install -y --no-install-recommends \
    wget \
    unzip \
    python3-pip \
    hmmer \
    prodigal && \
    apt-get autoclean && rm -rf /var/lib/apt/lists/*

# install checkm and its dependencies
RUN pip install --no-cache-dir numpy matplotlib pysam checkm-genome && \
    wget https://github.com/matsen/pplacer/releases/download/${PPLACER_VER}/pplacer-linux-${PPLACER_VER}.zip && \
    unzip pplacer-linux-${PPLACER_VER}.zip && rm pplacer-linux-${PPLACER_VER}.zip

ENV PATH=$PATH:/pplacer-Linux-${PPLACER_VER} \
    LC_ALL=C

# 'CMD' instructions set a default command when the container is run.
CMD [ "checkm", "-h" ]

# 'WORKDIR' sets working directory
WORKDIR /data

## Test ##
FROM app as test

# download database and inform CheckM of where the files have been placed
RUN wget https://data.ace.uq.edu.au/public/CheckM_databases/checkm_data_2015_01_16.tar.gz && \
    mkdir checkm_db && tar -C checkm_db -xvf checkm_data_2015_01_16.tar.gz && \
    checkm data setRoot checkm_db

# run test with internal test data
RUN checkm taxonomy_wf species "Escherichia coli" ./checkm_db/test_data/ ./checkm_test_results

36 changes: 36 additions & 0 deletions checkm/1.2.2/README.md
@@ -0,0 +1,36 @@
# CheckM container

Main tool: [CheckM](https://github.com/Ecogenomics/CheckM)

Code repository: https://github.com/Ecogenomics/CheckM

Additional tools:
- HMMER: 3.3.2
- prodigal: 2.6.3
- pplacer: 1.1.alpha19-0-g807f6f3

Basic information on how to use this tool:
- executable: checkm
- help: <-h>
- version: <-h>
- description: CheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes.

Additional information:

This container does not include the precalculated data files that CheckM relies on.<br/>
Those files can be downloaded from either:
- https://data.ace.uq.edu.au/public/CheckM_databases
- https://zenodo.org/record/7401545#.Y44ymHbMJD8

The reference data must be decompressed into a directory. Then tell CheckM where the files have been placed with the following command:
```
checkm data setRoot <checkm_database_dir>
```
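
As a minimal sketch of the whole setup, the steps used in the test stage of this container's Dockerfile can be wrapped in a small shell function. The function name `setup_checkm_db` and the default target directory `checkm_db` are hypothetical; the archive URL is the one the Dockerfile downloads, and a different release from the sources listed above may suit your use better.

```bash
# Archive URL copied from the test stage of the Dockerfile; other releases
# are listed at the download locations above.
DB_URL="https://data.ace.uq.edu.au/public/CheckM_databases/checkm_data_2015_01_16.tar.gz"

# setup_checkm_db is a hypothetical helper, not part of CheckM.
# $1: directory to extract the reference data into (defaults to ./checkm_db)
setup_checkm_db() {
  db_dir="${1:-checkm_db}"
  archive="${DB_URL##*/}"   # file name portion of the URL

  # Download, extract, clean up, then point CheckM at the extracted data.
  wget "$DB_URL" &&
    mkdir -p "$db_dir" &&
    tar -C "$db_dir" -xzf "$archive" &&
    rm "$archive" &&
    checkm data setRoot "$db_dir"
}
```

Inside the container, running `setup_checkm_db /data/checkm_db` would download the archive into the current directory and extract it under `/data/checkm_db`. Each step only runs if the previous one succeeded, so a failed download does not leave CheckM pointing at an empty directory.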

Full documentation: https://github.com/Ecogenomics/CheckM/wiki

## Example Usage

```bash
checkm lineage_wf -t 8 -x fasta input_folder output_folder
```
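
The command above runs inside the container. A hedged sketch of calling it from the host with `docker run` follows. It assumes the image is published as `staphb/checkm:1.2.2` (the Docker Hub repository and version listed in the README table), that the reference data has already been decompressed on the host, and that `input_folder` sits in the mounted working directory. The function name `run_checkm_lineage`, the mount point `/checkm_db`, and the `DOCKER_CMD` override are hypothetical.

```bash
# run_checkm_lineage is a hypothetical wrapper, not part of CheckM or the image.
# $1: host directory containing input_folder/ (mounted at /data, the image WORKDIR)
# $2: host directory containing the decompressed CheckM reference data
# DOCKER_CMD can be set to "echo" to print the arguments instead of running them.
run_checkm_lineage() {
  work_dir="$1"
  db_dir="$2"

  # setRoot and lineage_wf run in the same container, because the data
  # location set by setRoot is not kept once a --rm container exits.
  "${DOCKER_CMD:-docker}" run --rm \
    -v "$work_dir":/data \
    -v "$db_dir":/checkm_db \
    staphb/checkm:1.2.2 \
    sh -c 'checkm data setRoot /checkm_db && checkm lineage_wf -t 8 -x fasta input_folder output_folder'
}
```

For example, `run_checkm_lineage "$PWD" "$HOME/checkm_db"` would write results to `output_folder` in the current directory, provided the mounted directories are readable and writable by the container user.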
