updating the emmtyper blast database
erinyoung committed Oct 28, 2024
1 parent 1610fdf commit 3636948
Showing 3 changed files with 123 additions and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -157,7 +157,7 @@ To learn more about the docker pull rate limits and the open source software pro
| [DSK](https://hub.docker.com/r/staphb/dsk) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/dsk)](https://hub.docker.com/r/staphb/dsk) | <ul><li>[0.0.100](./dsk/0.0.100/)</li><li>[2.3.3](./dsk/2.3.3/)</li></ul> | https://gatb.inria.fr/software/dsk/ |
| [el_gato](https://hub.docker.com/r/staphb/elgato) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/elgato)](https://hub.docker.com/r/staphb/elgato) | <ul><li>[1.15.2](./elgato/1.15.2)</li><li>[1.18.2](./elgato/1.18.2)</li><li>[1.19.0](./elgato/1.19.0)</li><li>[1.20.0](./elgato/1.20.0)</li></ul> | https://github.com/appliedbinf/el_gato |
| [emboss](https://hub.docker.com/r/staphb/emboss) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/emboss)](https://hub.docker.com/r/staphb/emboss) | <ul><li>6.6.0 (no version)</li></ul> | http://emboss.sourceforge.net |
| [emmtyper](https://hub.docker.com/r/staphb/emmtyper) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/emmtyper)](https://hub.docker.com/r/staphb/emmtyper) | <ul><li>0.2.0</li></ul> | https://github.com/MDU-PHL/emmtyper |
| [emmtyper](https://hub.docker.com/r/staphb/emmtyper) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/emmtyper)](https://hub.docker.com/r/staphb/emmtyper) | <ul><li>[0.2.0](./emmtyper/0.2.0/)</li><li>[0.2.0-20241028](./emmtyper/0.2.0-20241028/)</li></ul> | https://github.com/MDU-PHL/emmtyper |
| [emm-typing-tool](https://hub.docker.com/r/staphb/emm-typing-tool) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/emm-typing-tool)](https://hub.docker.com/r/staphb/emm-typing-tool) | <ul><li>0.0.1 (no version)</li></ul> | https://github.com/phe-bioinformatics/emm-typing-tool |
| [EToKi](https://hub.docker.com/r/staphb/etoki) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/etoki)](https://hub.docker.com/r/staphb/etoki) | <ul><li>1.2.1</li></ul> | https://github.com/zheminzhou/EToKi |
| [FastANI](https://hub.docker.com/r/staphb/fastani) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/fastani)](https://hub.docker.com/r/staphb/fastani) | <ul><li>1.1</li><li>1.32</li><li>1.33</li><li>1.33 + RGDv2</li><li>[1.34](fastani/1.34)</li><li>[1.34 + RGDv2](fastani/1.34-RGDV2/)</li></ul> | https://github.com/ParBLiSS/FastANI |
79 changes: 79 additions & 0 deletions emmtyper/0.2.0-20241028/Dockerfile
@@ -0,0 +1,79 @@
FROM mambaorg/micromamba:1.5.8 AS app

ARG EMMTYPER_VER="0.2.0"
ARG SCRIPT_HASH="c0d1c26625cfe9ac458306089358dc26edad06f0"

# build and run as root, since the micromamba image sets 'mambauser' as the default $USER
USER root
# set workdir to default for building
WORKDIR /

LABEL base.image="mambaorg/micromamba:1.5.8"
LABEL dockerfile.version="1"
LABEL software="emmtyper"
LABEL software.version=${EMMTYPER_VER}
LABEL description="Conda environment for emmtyper. emmtyper is a command line tool for emm-typing of Streptococcus pyogenes using a de novo or complete assembly."
LABEL website="https://github.com/MDU-PHL/emmtyper"
LABEL license="GNU General Public License v3.0"
LABEL license.url="https://github.com/MDU-PHL/emmtyper/blob/master/LICENSE"
LABEL maintainer="Erin Young"
LABEL maintainer.email="[email protected]"

# install dependencies; cleanup apt garbage
RUN apt-get update && apt-get install -y --no-install-recommends \
    wget \
    ca-certificates \
    procps \
    unzip && \
    apt-get autoclean && rm -rf /var/lib/apt/lists/*

# install emmtyper and dependencies
RUN micromamba create -n emmtyper -c conda-forge -c bioconda -c defaults emmtyper=${EMMTYPER_VER} pip && \
    micromamba clean -a -y -f

# install script for downloading emmtyper database
RUN wget -q https://github.com/Daniel-VM/cdc-utilities/archive/${SCRIPT_HASH}.zip && \
    unzip ${SCRIPT_HASH}.zip && \
    echo '#!/usr/bin/env python3' > /usr/local/bin/emm_download_makedb.py && \
    cat cdc-utilities*/emm_download_makedb.py >> /usr/local/bin/emm_download_makedb.py && \
    rm -rf ${SCRIPT_HASH}.zip cdc-utilities* && \
    chmod +x /usr/local/bin/emm_download_makedb.py

# set the environment, put new conda env in PATH by default
ENV PATH="/opt/conda/envs/emmtyper/bin:/opt/conda/envs/env/bin:${PATH}" \
    LC_ALL=C.UTF-8

RUN pip install --no-cache-dir requests beautifulsoup4

CMD emmtyper --help && emm_download_makedb.py -h

WORKDIR /cdc_emm_database

# get latest emmtyper database
RUN emm_download_makedb.py \
    --ftp_url 'https://ftp.cdc.gov/' \
    --remote_path 'pub/infectious_diseases/biotech/tsemm/' \
    --local_path /cdc_emm_database

# create a blast database without a date in the filename
RUN makeblastdb -in /cdc_emm_database/cdc_emm_database*fasta -dbtype nucl -out /cdc_emm_database/cdc_emm

WORKDIR /data

FROM app AS test

RUN emmtyper --help && emm_download_makedb.py -h

RUN wget -q ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/006/785/GCA_000006785.2_ASM678v2/GCA_000006785.2_ASM678v2_genomic.fna.gz && \
    gunzip GCA_000006785.2_ASM678v2_genomic.fna.gz && \
    mv GCA_000006785.2_ASM678v2_genomic.fna test_data.fasta

RUN emmtyper test_data.fasta && \
    emmtyper -w pcr test_data.fasta -o test_out && \
    head -10 test_out

# testing new database
RUN emmtyper --blast_db /cdc_emm_database/cdc_emm test_data.fasta -o test3 && \
    head -10 test3

RUN emmtyper --version
43 changes: 43 additions & 0 deletions emmtyper/0.2.0-20241028/README.md
@@ -0,0 +1,43 @@
# emmtyper container

Main tool : [emmtyper](https://github.com/MDU-PHL/emmtyper)

Code repository: [emmtyper](https://github.com/MDU-PHL/emmtyper)

Additional tools:
- [Daniel-VM/cdc-utilities](https://github.com/Daniel-VM/cdc-utilities): c0d1c26625cfe9ac458306089358dc26edad06f0

Basic information on how to use this tool:
- executable: emmtyper
- help: --help
- version: --version
- description: |
'emmtyper' is a command line tool for emm-typing of _Streptococcus pyogenes_ using a _de novo_ or complete assembly.

Additional information:

This image also contains `emm_download_makedb.py` from https://github.com/Daniel-VM/cdc-utilities, which downloads the most recent FASTA file for emm typing.
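
The image's Dockerfile follows the download with a `makeblastdb` call that writes the BLAST database under a name without a date. The same step can be repeated on a fresh download. The following is a minimal sketch, assuming the downloaded file matches `cdc_emm_database*fasta` as it does in the Dockerfile; `build_emm_db` is a hypothetical helper and is not part of emmtyper or cdc-utilities.

```bash
# Hypothetical helper: build a BLAST database with a stable name from a
# downloaded emm FASTA file. Assumes the file matches cdc_emm_database*fasta.
build_emm_db() {
  local db_dir="$1"
  # Expand the glob; the first match becomes $1 inside this function
  set -- "$db_dir"/cdc_emm_database*fasta
  if [ ! -f "$1" ]; then
    echo "no cdc_emm_database*fasta file found in $db_dir" >&2
    return 1
  fi
  # Same makeblastdb options as in the Dockerfile for this image
  makeblastdb -in "$1" -dbtype nucl -out "$db_dir/cdc_emm"
}
```

For example, `build_emm_db ./out_fasta` would write the database to `./out_fasta/cdc_emm`, which can then be passed to `emmtyper --blast_db`.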

Full documentation: https://github.com/MDU-PHL/emmtyper

## Example Usage

```bash
# run emmtyper in BLAST (default) mode:
emmtyper sample.fasta -o outfile

# or with output written in verbose format:
emmtyper sample.fasta -o outfile -f verbose

# run emmtyper in PCR mode (useful for troubleshooting, see documentation)
emmtyper -w pcr sample.fasta -o outfile

# download a new FASTA file with the most current emm types
emm_download_makedb.py \
    --ftp_url 'https://ftp.cdc.gov/' \
    --remote_path 'pub/infectious_diseases/biotech/tsemm/' \
    --local_path ./out_fasta

# use the database that was built into the image with emm_download_makedb.py
emmtyper --blast_db /cdc_emm_database/cdc_emm sample.fasta -o outfile
```
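
The commands above run inside the container. To call them from the host, the image can be wrapped in a small shell function. This is only a sketch: the tag `staphb/emmtyper:0.2.0-20241028` is assumed from the directory name, and `run_emmtyper` is a hypothetical helper.

```bash
# Assumed image tag (taken from the directory name, not confirmed on Docker Hub)
IMAGE="staphb/emmtyper:0.2.0-20241028"

# Hypothetical wrapper: run emmtyper against the database baked into the image.
run_emmtyper() {
  # Mount the current directory at /data (the image's WORKDIR) and run as the
  # calling user so output files are not owned by root.
  docker run --rm -u "$(id -u):$(id -g)" -v "$PWD":/data "$IMAGE" \
    emmtyper --blast_db /cdc_emm_database/cdc_emm "$@"
}
```

Calling `run_emmtyper sample.fasta -o outfile` from the directory that holds `sample.fasta` would then write `outfile` next to it.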
