Add srst2 0.2.0 with custom Vibrio cholerae database #618

Merged: 13 commits, Apr 5, 2023
2 changes: 1 addition & 1 deletion README.md
@@ -155,7 +155,7 @@ To learn more about the docker pull rate limits and the open source software pro
| [SNVPhyl-tools](https://hub.docker.com/r/staphb/snvphyl-tools) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/snvphyl-tools)](https://hub.docker.com/r/staphb/snvphyl-tools) | <ul><li>1.8.2</li></ul> | https://github.com/phac-nml/snvphyl-tools |
| [SPAdes](https://hub.docker.com/r/staphb/spades/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/spades)](https://hub.docker.com/r/staphb/spades) | <ul><li>3.8.2</li><li>3.12.0</li><li>3.13.0</li><li>3.14.0</li><li>3.14.1</li><li>3.15.0</li><li>3.15.1</li><li>3.15.2</li><li>3.15.3</li><li>3.15.4</li><li>3.15.5</li></ul> | https://github.com/ablab/spades </br> http://cab.spbu.ru/software/spades/ |
| [SRA-toolkit](https://hub.docker.com/r/staphb/sratoolkit/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/sratoolkit)](https://hub.docker.com/r/staphb/sratoolkit) | <ul><li>2.9.2</li></ul> | https://github.com/ncbi/sra-tools |
| [SRST2](https://hub.docker.com/r/staphb/srst2/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/srst2)](https://hub.docker.com/r/staphb/srst2) | <ul><li>0.2.0</li></ul> | https://github.com/katholt/srst2 |
| [SRST2](https://hub.docker.com/r/staphb/srst2/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/srst2)](https://hub.docker.com/r/staphb/srst2) | <ul><li>0.2.0</li><li>[0.2.0 + custom Vibrio cholerae database](srst2/0.2.0-vibrio-230224/README.md)</li></ul> | https://github.com/katholt/srst2 |
| [Staramr](https://hub.docker.com/r/staphb/staramr/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/staramr)](https://hub.docker.com/r/staphb/staramr) | <ul><li>0.5.1</li><li>0.7.1</li><li>0.8.0</li></ul> | https://github.com/phac-nml/staramr |
| [TBProfiler](https://hub.docker.com/r/staphb/tbprofiler/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/tbprofiler)](https://hub.docker.com/r/staphb/tbprofiler) | <ul><li>4.3.0</li><li>4.4.0</li><li>4.4.2</li></ul> | https://github.com/jodyphelan/TBProfiler |
| [TipToft](https://hub.docker.com/r/staphb/tiptoft/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/tiptoft)](https://hub.docker.com/r/staphb/tiptoft) | <ul><li>1.0.0</li><li>1.0.2</li></ul> | https://github.com/andrewjpage/tiptoft |
95 changes: 95 additions & 0 deletions srst2/0.2.0-vibrio-230224/Dockerfile
@@ -0,0 +1,95 @@
FROM ubuntu:xenial as app

# for easy upgrade later. ARG variables only persist at build time
# Main package version
ARG SRST2_VER=0.2.0

# Dependency versions
ARG BOWTIE2_VER=2.2.6-2
ARG SAMTOOLS_VER=0.1.18

LABEL base.image="ubuntu:xenial"
LABEL dockerfile.version="1"
LABEL software="SRST2"
LABEL software.version="v0.2.0"
LABEL description="Short Read Sequence Typing for Bacterial Pathogens"
LABEL website="https://github.com/katholt/srst2"
LABEL license="https://github.com/katholt/srst2/blob/master/LICENSE.txt"
LABEL maintainer="Holly Halstead"
LABEL maintainer.email="[email protected]"
LABEL maintainer2="Inês Mendes"
LABEL maintainer2.email="[email protected]"

# install dependencies; cleanup apt garbage
RUN apt-get update && apt-get install -y --no-install-recommends \
python2.7 \
python-scipy \
python-biopython \
make \
libc6-dev \
g++ \
zlib1g-dev \
build-essential \
git \
libx11-dev \
xutils-dev \
zlib1g-dev \
bowtie2=${BOWTIE2_VER} \
curl \
libncurses5-dev \
unzip \
wget \
locate \
python-pip \
python-setuptools && \
apt-get autoclean && rm -rf /var/lib/apt/lists/*

# download samtools source code; unzip; compile; put executable in /usr/local/bin
RUN curl -O -L https://sourceforge.net/projects/samtools/files/samtools/${SAMTOOLS_VER}/samtools-${SAMTOOLS_VER}.tar.bz2 && \
tar -xjf samtools-${SAMTOOLS_VER}.tar.bz2 && \
rm samtools-${SAMTOOLS_VER}.tar.bz2 && \
cd samtools-${SAMTOOLS_VER} && \
make && \
cp -v samtools /usr/local/bin

# Install SRST2; make /data
RUN pip install biopython git+https://github.com/katholt/srst2.git@v${SRST2_VER} && \
mkdir /data

# add custom database to /vibrio-cholerae-db directory (permissions are set in the next layer)
ADD vibrio_230224.fasta /vibrio-cholerae-db/

# index custom database in /vibrio-cholerae-db directory; ensure files are readable to all users
RUN bowtie2-build /vibrio-cholerae-db/vibrio_230224.fasta /vibrio-cholerae-db/vibrio_230224.fasta && \
samtools faidx /vibrio-cholerae-db/vibrio_230224.fasta && \
chmod -R 755 /vibrio-cholerae-db

# set final working directory
WORKDIR /data

# test layer
FROM app as test

# check help options
RUN srst2 --version && \
getmlst.py -h && \
slurm_srst2.py -h

# test getmlst.py script as well as usage of srst2 for calling the ST on a Shigella sonnei isolate
# https://www.ebi.ac.uk/ena/browser/view/ERR024070
RUN getmlst.py --species "Escherichia coli#1" && \
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR024/ERR024070/ERR024070_1.fastq.gz && \
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR024/ERR024070/ERR024070_2.fastq.gz && \
srst2 --input_pe ERR024070*.fastq.gz --output shigella1 --log --save_scores --mlst_db Escherichia_coli#1.fasta --mlst_definitions profiles_csv --mlst_delimiter '_' && \
ls shigella1__ERR024070.Escherichia_coli#1.pileup \
shigella1__ERR024070.Escherichia_coli#1.scores \
shigella1__ERR024070.Escherichia_coli#1.sorted.bam \
shigella1__mlst__Escherichia_coli#1__results.txt

# test for vibrio custom DB, print output summary
# https://www.ebi.ac.uk/ena/browser/view/SRR7062495
RUN wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR706/005/SRR7062495/SRR7062495_1.fastq.gz && \
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR706/005/SRR7062495/SRR7062495_2.fastq.gz && \
srst2 --input_pe SRR7062495*.fastq.gz --gene_db /vibrio-cholerae-db/vibrio_230224.fasta --output SRR7062495 && \
ls SRR7062495__genes__vibrio_230224__results.txt && \
cat SRR7062495__genes__vibrio_230224__results.txt
77 changes: 77 additions & 0 deletions srst2/0.2.0-vibrio-230224/README.md
@@ -0,0 +1,77 @@
# SRST2 container

Main tool : [SRST2](https://github.com/katholt/srst2) 0.2.0

Additional tools:

- Biopython 1.76
- [Bowtie2](https://github.com/BenLangmead/bowtie2) 2.2.6-2
- python 2.7
- [SAMtools](https://github.com/samtools/samtools) 0.1.18
- SciPy 0.16

Full documentation: [https://github.com/katholt/srst2](https://github.com/katholt/srst2)

SRST2 performs short read sequence typing for bacterial pathogens when given Illumina sequence data, an MLST database, and/or a database of gene sequences such as resistance genes, virulence genes, etc.

## Custom *Vibrio cholerae* database info

This Docker image includes a *Vibrio cholerae*-specific database of gene targets (traditionally used in PCR methods) for detecting the O1 and O139 serogroups, toxin-production markers, and biotype markers within the O1 serogroup ("El Tor" or "Classical" biotypes). These sequences were shared via personal communication by Dr. Christine Lee of the National Listeria, Yersinia, Vibrio and Enterobacterales Reference Laboratory within the Enteric Diseases Laboratory Branch at CDC.

The genes included in the database, and their purposes, are as follows:

- `ctxA` - cholera toxin, an indicator of toxigenic *V. cholerae*
- `ompW` - outer membrane protein, a *V. cholerae* species marker (presence of any allele of this gene distinguishes *V. cholerae* from *V. parahaemolyticus* and *V. vulnificus*)
- `tcpA` - toxin co-regulated pilus A, used to infer biotype, either "El Tor" or "Classical"
  - the database includes one allele for each biotype: `tcpA_classical` and `tcpA_ElTor`
- `toxR` - transcriptional activator (controls cholera toxin, pilus, and outer-membrane protein expression), a species marker (the allele distinguishes *V. cholerae* from *V. parahaemolyticus* and *V. vulnificus*)
- `wbeN` - O-antigen-encoding region, used to identify the O1 serogroup
- `wbfR` - O-antigen-encoding region, used to identify the O139 serogroup
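
The marker-to-call logic in the list above can be sketched as a small shell helper. This is a rough illustration rather than a validated typing scheme: the function name `summarise_vibrio` is hypothetical, the column names (`ctxA`, `ompW`, `tcpA_classical`, `tcpA_ElTor`, `toxR`, `wbeN_O1`, `wbfR_O139`) are taken from the cluster names in `vibrio_230224.fasta`, and it assumes that a gene which was not detected is either missing from the header of the summary table or reported as `-`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of SRST2): turn an SRST2 "genes" summary table
# into rough species-marker / serogroup / biotype / toxin calls.
# Usage: summarise_vibrio <genes results file>
summarise_vibrio() {
  awk -F'\t' '
    # A gene counts as detected when its column exists and holds an allele call.
    # Assumption: undetected genes are absent from the header or shown as "-".
    function has(gene) {
      return (gene in col) && $(col[gene]) != "" && $(col[gene]) != "-"
    }
    NR == 1 {
      for (i = 2; i <= NF; i++) col[$i] = i   # map gene name -> column index
      print "Sample\tspecies_marker\tserogroup\tbiotype\tctxA"
      next
    }
    {
      species = (has("ompW") || has("toxR")) ? "yes" : "no"

      if (has("wbeN_O1") && has("wbfR_O139")) { sero = "ambiguous" }
      else if (has("wbeN_O1"))                { sero = "O1" }
      else if (has("wbfR_O139"))              { sero = "O139" }
      else                                    { sero = "none" }

      if (has("tcpA_ElTor") && has("tcpA_classical")) { bio = "ambiguous" }
      else if (has("tcpA_ElTor"))                     { bio = "El Tor" }
      else if (has("tcpA_classical"))                 { bio = "Classical" }
      else                                            { bio = "none" }

      toxin = has("ctxA") ? "yes" : "no"
      printf "%s\t%s\t%s\t%s\t%s\n", $1, species, sero, bio, toxin
    }
  ' "$1"
}
```

After sourcing the function, it could be run as `summarise_vibrio SRR7062495_test__genes__vibrio_230224__results.txt`. Columns are looked up by name, so the helper does not depend on the order in which SRST2 writes the genes.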

The database's FASTA file and index files are located in `/vibrio-cholerae-db/` in the container's file system and can be used as shown in the Vibrio characterization example below.
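
To check what the database contains, the FASTA headers can be listed and split into their parts. The sketch below assumes the SRST2 gene-database header layout seen in `vibrio_230224.fasta` (`>clusterID__gene__allele__seqID annotation`). The function name `list_db_alleles` is hypothetical and not part of SRST2.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of SRST2): print one tab-separated line per
# database sequence: cluster ID, gene, allele, sequence ID, annotation.
# Usage: list_db_alleles <gene database FASTA>
list_db_alleles() {
  awk '
    /^>/ {
      header = substr($1, 2)                  # first word without the leading ">"
      n = split(header, part, "__")           # fields are separated by two underscores
      annotation = $0
      sub(/^[^ \t]+[ \t]*/, "", annotation)   # everything after the first word
      if (n == 4) {
        print part[1] "\t" part[2] "\t" part[3] "\t" part[4] "\t" annotation
      }
    }
  ' "$1"
}
```

Inside the container this could be run as `list_db_alleles /vibrio-cholerae-db/vibrio_230224.fasta`. Headers that do not split into exactly four parts are skipped rather than guessed at.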

## Basic usage - MLST

### 1 - Gather your input files

```bash
getmlst.py --species 'Escherichia coli#1'
```

### 2 - Run MLST

```bash
srst2 --input_pe strainA_1.fastq.gz strainA_2.fastq.gz --output strainA_test --log --mlst_db Escherichia_coli#1.fasta --mlst_definitions profiles_csv --mlst_delimiter _
```

### 3 - Check the outputs

MLST results are output in: `strainA_test__mlst__Escherichia_coli#1__results.txt`

## Basic usage - Vibrio characterization

### 1 - Run srst2

```bash
srst2 --input_pe SRR7062495_1.fastq.gz SRR7062495_2.fastq.gz --gene_db /vibrio-cholerae-db/vibrio_230224.fasta --output SRR7062495_test
```

### 2 - Check the outputs

Summary results are output in: `SRR7062495_test__genes__vibrio_230224__results.txt` and detailed results are found in `SRR7062495_test__fullgenes__vibrio_230224__results.txt`

```bash
# summary
$ column -t -s $'\t' -n SRR7062495_test__genes__vibrio_230224__results.txt
Sample      ctxA       ompW        tcpA_ElTor         toxR        wbeN_O1
SRR7062495  ctxA_O395  ompW_O395*  tcpA_ElTor_C6706*  toxR_O395*  wbeN_O1_INDRE

# detailed results
$ column -t -s $'\t' -n SRR7062495_test__fullgenes__vibrio_230224__results.txt
Sample      DB             gene        allele            coverage  depth    diffs  uncertainty  divergence  length  maxMAF  clusterid  seqid  annotation
SRR7062495  vibrio_230224  ctxA        ctxA_O395         100.0     103.877                      0.0         777     0.063   1          1      CP000627.1
SRR7062495  vibrio_230224  ompW        ompW_O395         100.0     78.414   6snp                0.917       654     0.04    2          3      CP000626.1
SRR7062495  vibrio_230224  toxR        toxR_O395         100.0     74.081   14snp               1.582       885     0.053   5          6      CP000627.1
SRR7062495  vibrio_230224  tcpA_ElTor  tcpA_ElTor_C6706  100.0     82.698   1snp                0.148       675     0.046   4          5      CP064350.1
SRR7062495  vibrio_230224  wbeN_O1     wbeN_O1_INDRE     100.0     112.119                      0.0         2478    0.091   6          7
```
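
Because the detailed table is tab-separated with a header row, calls can be filtered further by looking columns up by name. The sketch below is illustrative only: `filter_fullgenes` is a hypothetical helper, and the thresholds passed to it are arbitrary example values, not recommendations.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of SRST2): keep the header plus the rows whose
# coverage and depth are at or above the given thresholds.
# Usage: filter_fullgenes <fullgenes results file> <min coverage %> <min depth>
filter_fullgenes() {
  awk -F'\t' -v min_cov="$2" -v min_depth="$3" '
    NR == 1 {
      for (i = 1; i <= NF; i++) col[$i] = i   # map column name -> column index
      print
      next
    }
    # Adding 0 forces a numeric rather than a string comparison.
    ($(col["coverage"]) + 0) >= (min_cov + 0) && ($(col["depth"]) + 0) >= (min_depth + 0)
  ' "$1"
}
```

For example, `filter_fullgenes SRR7062495_test__fullgenes__vibrio_230224__results.txt 95 100` would keep only calls with at least 95% coverage and a mean depth of at least 100.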
16 changes: 16 additions & 0 deletions srst2/0.2.0-vibrio-230224/vibrio_230224.fasta
@@ -0,0 +1,16 @@
>1__ctxA__ctxA_O395__1 CP000627.1
ATGGTAAAGATAATATTTGTGTTTTTTATTTTCTTATCATCATTTTCATATGCAAATGATGATAAGTTATATCGGGCAGATTCTAGACCTCCTGATGAAATAAAGCAGTCAGGTGGTCTTATGCCAAGAGGACAGAGTGAGTACTTTGACCGAGGTACTCAAATGAATATCAACCTTTATGATCATGCAAGAGGAACTCAGACGGGATTTGTTAGGCACGATGATGGATATGTTTCCACCTCAATTAGTTTGAGAAGTGCCCACTTAGTGGGTCAAACTATATTGTCTGGTCATTCTACTTATTATATATATGTTATAGCCACTGCACCCAACATGTTTAACGTTAATGATGTATTAGGGGCATACAGTCCTCATCCAGATGAACAAGAAGTTTCTGCTTTAGGTGGGATTCCATACTCCCAAATATATGGATGGTATCGAGTTCATTTTGGGGTGCTTGATGAACAATTACATCGTAATAGGGGCTACAGAGATAGATATTACAGTAACTTAGATATTGCTCCAGCAGCAGATGGTTATGGATTGGCAGGTTTCCCTCCGGAGCATAGAGCTTGGAGGGAAGAGCCGTGGATTCATCATGCACCGCCGGGTTGTGGGAATGCTCCAAGATCATCGATGAGTAATACTTGCGATGAAAAAACCCAAAGTCTAGGTGTAAAATTCCTTGACGAATACCAATCTAAAGTTAAAAGACAAATATTTTCAGGCTATCAATCTGATATTGATACACATAATAGAATTAAGGATGAATTATGA
>2__ompW__ompW_RFB16__2 CP043556.1
ATGAAACAAACCATTTGCGGCCTAGCCGTACTTGCAGCCCTAAGCTCCGCTCCTGTATTTGCTCACCAAGAAGGTGACTTTATTGTGCGCGCGGGTATTGCCTCGGTAGTACCTAATGACAGTAGCGATAAAGTGTTAAACACTCAAAGTGAGTTGGCAGTTAATAGCAATACCCAGTTAGGGTTAACGCTTGGCTATATGTTTACTGACAACATCAGTTTTGAAGTCCTTGCTGCTACGCCATTTTCACATAAGATTTCTACCTCTGGTGGTGAGTTAGGTAGCCTTGGTGATATTGGTGAAACAAAACATTTGCCACCTACCTTTATGGTCCAATACTACTTTGGTGAAGCTAATTCGACTTTCCGTCCATATGTTGGTGCGGGTTTGAATTACACCACTTTCTTTGATGAAAGCTTTAATAGTACGGGTACTAATAATGCATTGAGTGATTTAAAACTGGACGACTCATGGGGACTTGCTGCTAACGTTGGCTTTGATTATATGCTCAATGATAGCTGGTTCCTCAACGCTTCTGTGTGGTATGCCAATATTGAAACAACGGCAACCTACAAAGCAGGTGCAGATGCCAAATCCACGGATGTTGAAATCAATCCTTGGGTATTTATGATCGCGGGTGGTTATAAGTTCTAA
>2__ompW__ompW_O395__3 CP000626.1
ATGAAACAAACCATTTGCGGCCTAGCCGTACTTGCAGCCCTAAGCTCCGCTCCTGTATTTGCTCACCAAGAAGGTGACTTTATTGTGCGCGCGGGTATTGCCTCGGTAGTACCTAATGACAGTAGCGATAAAGTGTTAAACACTCAAAGTGAGTTGGCAGTTAATAGCAATACCCAGTTAGGGTTAACGCTTGGCTATATGTTTACTGACAACATCAGTTTTGAAGTCCTCGCTGCTACGCCATTTTCACATAAGATTTCTACCTCTGGTGGTGAGTTAGGTAGCCTTGGTGATATTGGTGAAACAAAACATTTGCCACCTACCTTTATGGTCCAATACTACTTTGGTGAAGCTAATTCGACTTTCCGTCCATATGTTGGTGCGGGTTTGAATTACACCACTTTCTTTGATGAAAGCTTTAATAGTACGGGTACTAATAATGCATTGAGTGATTTAAAACTGGACGACTCATGGGGACTTGCTGCTAACGTTGGCTTTGATTATATGCTCAATGATAGCTGGTTCCTCAACGCTTCTGTGTGGTATGCCAATATTGAAACAACGGCAACCTACAAAGCAGGTGCAGATGCCAAATCCACGGATGTTGAAATCAATCCTTGGGTATTTATGATCGCGGGTGGTTATAAGTTCTAA
>3__tcpA_classical__tcpA_classical_395__4 AF325733.1
ATGCAATTATTAAAACAGCTTTTTAAGAAGAAATTTGTAAAAGAAGAACACGATAAGAAAACCGGTCAAGAGGGTATGACATTACTCGAAGTGATCATCGTTCTAGGCATTATGGGGGTGGTTTCGGCGGGGGTTGTTACTCTGGCGCAGCGTGCGATTGATTCGCAGAATATGACCAAGGCCGCGCAAAGTCTCAATAGTATCCAAGTTGCACTGACACAGACATACCGTGGTCTAGGTAATTATCCAGCAACAGCTGATGCGACAGCTGCTAGTAAGCTAACTTCAGGCTTGGTTAGTTTAGGTAAAATATCATCCGATGAGGCAAAAAACCCATTCATTGGTACAAATATGAATATTTTTTCATTTCCGCGTAATGCAGCAGCTAATAAAGCATTTGCAATTTCAGTGGATGGTCTGACACAGGCTCAATGCAAGACACTTATTACCAGTGTCGGTGATATGTTCCCATATATTGCAATCAAAGCTGGTGGCGCAGTAGCACTTGCAGATCTAGGTGATTTTGAGAATTCTGCAGCAGCGGCTGAGACAGGCGTTGGTGTGATCAAATCTATCGCTCCCGCTAGTAAGAATTTAGATCTAACGAACATCACTCACGTTGAGAAATTATGTAAAGGTACTGCTCCATTCGGCGTTGCATTTGGTAACAGCTAA
>4__tcpA_ElTor__tcpA_ElTor_C6706__5 CP064350.1
ATGCAATTATTAAAACAGCTTTTTAAGAAGAAGTTTGTAAAAGAAGAACACGATAAGAAAACCGGTCAAGAGGGTATGACATTACTCGAAGTAATCATTGTTCTGGGTATTATGGGTGTGGTCTCAGCGGGTGTTGTTACGCTGGCTCAGCGTGCGATTGATTCGCAGAATATGACTAAGGCTGCGCAAAATCTAAACAGCGTGCAAATTGCAATGACACAAACTTATCGTAGTCTTGGTAATTATCCAGCTACCGCAAACGCAAATGCTGCTACACAGCTAGCTAATGGTTTGGTCAGCCTTGGTAAGGTTTCAGCTGATGAGGCAAAGAATCCTTTCACTGGTACAGCTATGGGGATTTTCTCATTTCCACGAAACTCTGCAGCGAATAAAGCATTCGCAATTACAGTCGGTGGCTTGACCCAAGCACAATGTAAGACTTTGGTTACAAGCGTAGGGGATATGTTTCCATTTATCAACGTGAAAGAAGGTGCTTTCGCTGCTGTCGCTGATCTTGGTGATTTCGAAACGAGTGTCGCAGATGCTGCTACTGGCGCTGGCGTAATTAAGTCCATTGCACCAGGAAGTGCCAACTTAAACCTAACTAATATCACGCATGTTGAGAAGCTTTGTACAGGAACTGCTCCATTCACAGTAGCTTTTGGTAACAGTTAA
>5__toxR__toxR_O395__6 CP000627.1
ATGTTCGGATTAGGACACAACTCAAAAGAGATATCGATGAGTCATATTGGTACTAAATTCATTCTTGCTGAAAAATTTACCTTCGATCCCCTAAGCAATACTCTGATTGACAAAGAAGATAGTGAAGAGATCATTCGATTAGGCAGCAACGAAAGCCGAATTCTTTGGCTGCTGGCCCAACGTCCAAACGAGGTGATTTCTCGCAATGATTTGCATGACTTTGTTTGGCGAGAGCAAGGTTTTGAAGTCGATGATTCCAGCTTAACCCAAGCCATTTCGACTCTGCGCAAAATGCTCAAAGATTCGACAAAGTCCCCACAATACGTCAAAACGGTTCCGAAGCGCGGTTACCAATTGATCGCCCGAGTGGAAACGGTTGAAGAAGAGATGGCTCGCGAAAACGAAGCTGCTCATGACATCTCTCAGCCAGAATCTGTCAATGAATACGCAGAATCAAGCAGTGTGCCTTCATCAGCCACTGTAGTGAACACACCGCAGCCAGCCAATGTCGTGGCGAATAAATCGGCTCCAAACTTGGGGAATCGACTGTTTATTCTGATAGCGGTCTTACTTCCCCTCGCAGTATTACTGCTCACTAACCCAAGCCAATCCAGCTTTAAACCCCTAACGGTTGTCGATGGCGTAGCCGTCAATATGCCGAATAACCACCCTGATCTTTCAAATTGGCTACCGTCAATCGAACTGTGCGTTAAAAAATACAATGAAAAACATACTGGTGGACTCAAGCCGATAGAAGTGATTGCCACTGGTGGACAAAATAACCAGTTAACGCTGAATTACATTCACAGCCCTGAAGTTTCAGGGGAAAACATAACCTTACGCATCGTTGCTAACCCTAACGATGCCATCAAAGTGTGTGAGTAG
>6__wbeN_O1__wbeN_O1_INDRE__7
ATGCCTGTAAATAACGAAAATCTGACCAGTGTACTTGATGCTCGCCCTTTTGAATTATCAGAAGAGCAAAAATCTCCACTATTTAAAGCGAACTTACTTGCAGAGTTAGTACATCATTATCAATGCAACGAGATGTATCGCAAATTTTGTCAAAAAAACAAATTTGACCCTTTGGTATTTGATGGTGAGGTTGCAGATATTCCACCCATACCTGTGCACATCTTCAAAGCAATAGGACATAAATTATCTTCGGTAAGCGATGATACGATAAAAGCGAAGCTTCAATCTTCTGCTACCAGTGGCGTACCCAGTACCATATTGTTAGATAAGGTAACCGCTCGTCGACAGACTCGAGCAATGGCAAGAGTTATGCAGGAGGTGTTGGGGCCTAAACGTCGCCCGTTTTGCATTATGGATATTGATCCGACAAGCCCAAATGCCACTAACCTTGGGGCTCGTATTGCGGCGGTAAAAGGTTACCTAAACTTCGCCTCAACATCGAAGTATTTTATAGATGCTGATAGCCCAAGTGCTCCACTTGAATTTCTGGAGCAAAAGTTTGTTGAACATCTGAATTCACTTGCGAGTGAAGAGCCGCTCATAATTTTTGGATTCACGTTTGTACTTTATCACACGGTTTTTAAGACCCTTAAAGACAAGGGGATCTCGTTTCAATTGCCTAAAGGTTCTCAGGTTATTCATATTGGTGGTTGGAAAAAACTTGAGTCAGAGAAGGTGGATAAAATTACCTTTAATCGAGATATCGCCTCAGTATTGGGTATTTCTCCTGATGATGTTGTGGATATCTATGGTTTCACTGAACAGATGGGGCTTAATTACCCAGATTGTAAAGCAGGATGGAAACATATTCATGCCTATTCTGACGTAATTATTCGTGATGAATCGAACCTAGAAGTGTGTGGGCCAGGTAAAGTAGGCTTACTTGAGTTTGTAAGCCCACTACCGCATTCATATCCGGGGAATGTTGTACTTACAGATGACCTTGGTGTGATTGAAGAAAGTCTTTGTGAGTGTGGTAAAGCTGGAAAAAGATTCAAAGTCATTGGACGAGCAAAAAAAGCAGAAGTAAGAGGCTGTGGTGATGTTATGTCTGAGAAATTGACTAAAAAGCCATCGTATAAGCCACTTTCTCAACAAGAAGAGAGGTTGACTATCTACCACTCACCGATATTTCTCGATGATACTATGTCCGCATCTCAGCAGCTTGATCAAATCTTTTGTTCTTTAAAGAGGAAGCAAAAATGGCTGGCTAACCAACCATTAGAAGCTATTCTTGGTTTAATCAATGAAGCGCGCAAAAGCTGGTCGAGTACGCCGGAGCTTGACCCTTATCGACATACTGGATTGAACTTCCTAGCTGATTGGTGTGAACCCAATCGTTTGAAAAACCTGCTTGATTCAGCATTGAATGGTCAGCGAGCTTTTTTGGATAATTTTTTACCTCGTAAAGATATTAGCCATAGCTCTCAAAAAGCAATGCCAAGAGGTATCGTATCTCACTGGCTGTCGGGTAACGTACCGTTACTCGGCATGTTTGCGCTGGTACAGAGTATTTTAAGTAAAAATGCCAACATTCTGAAAGTTTCAGCAAGCGAATCGCAAGCTTTGCCAGTATTATTGGCGACTTTTAAAGGCCTTAGCTACACTACCCCAGGTGGTTACACTATCCACGGTGATGACTTATTAGGGACTCTCGCTGTTGTATATTTTGATCGACACCAAACTAAAATTGCAGAGAAGTTTTCGGCCAATGCTGATGTGCGTATAGCTTGGGGGGGACGAGAGGCAATCGAGTCTGTAAGTGGCCTTCCAAAGAAATATAATAGTCAAGATATCCTCTTTGGACCTAAGCTTTCTATGATGGTTGTTGGCAGCGATGCTCTAGACTCTGACAAGGCAATCAGAAAGTTGATTCGTCGGGCTGCAACTGACTCTAGTGTGTTCGATCAGTTTGCTTGCGCTTCTCCGCACACCATTTT
TGTTGAGAAGGGCGGTCTAATAACACCTAAAGAGTTTGCAGAGAAGCTTGCCTCAGCAATGGATAAGGCTCTTGTACGCTTACCAACTCAAGTACCAGACATTGGGCAAGCAAATAAGATTCGCTCAAAGATAGCGGAATATGCATTTATTGGCGAATATTGGCATGACAAGCACTTACGTTGGACGGTGTTGTTTGATGAAGGGATAGAGCTTGTTGAGCCGACATATCAACGTGTTATTACAGTAAAAGCAGTTGATAATGTATTTGATGTAGTCGACAGTGTACATGAAGATATCCAAACGGTCGGGTTGGCGATGAATGGTGAAAAGCGTCTTCGTTTTGCTAACGAGATAATGTTAAAAGGTGCGATGCGATGTCCAGATGTCGGCTACATGACCCATTTTGATTCCCCATGGGATGGGGTTGTAGCGCTAGATAGAATGGTTCGTTGGGTAACTCTAGGAGGACCGCTGTGA
>7__wbfR_O139__wbfR_MO45__8
ATGTGCGGTGTAGCGGGTTTTATTAGTAAGCGTTTATCGCCGGTCGACTGTTTAACTTCCATGGTCGAAAGTATTATGCATCGTGGACCGAATGATAGTGGTCTATGGGTTGATGATGACTTTGGTGTCTGTTTAGCGCACGCACGCTTATCAATACAGGATTTAAGTTCAGCTGGGCATCAGCCGATGCATTCAAAATCTGAGCGCTATGTTATGATTTTTAATGGTGAAATATACAATCATTTAACATTGCGTGAAGAACTGATCGAGATTGTACCAAGTTACTGGAATGGTCATTCAGATACCGAAACCTTGTTGGCTGGTTTTGAAGTGTGGGGAATAGAACAGACCATACAAAAATGTGTCGGTATGTTTGCTATCGTCCTATGGGATAAAGTACTTAAACAGTTGATCTTGATTCGGGATCGATTTGGTGAGAAGCCTCTTTATTACGGGTGGCAGCGCGATACTTTTCTGTTTGCTTCTGAGTTAAAAGCGCTTAAAGCTCATCCCAGTTTTGAAGGCAGCATTAATCGTCAGGCGTTATCGCATTTTTTTCGTTTGAATTACATACCAACGCCCTTATCCATTTATGAAGGTATCTTCAAGTTAGAGCCGGGTGTTATTGCTGTCTTTTCTCACGAGGGGCAGTTGCTCTCTAAACAAACATTTTGGGATGCCAGTCATGCTGTTTCTCTGCAAAATTTTTCCGATCATGATGCCGTTGATAAATTAGATGACTTAATTAAGCAGTCTATTCAAGATCAAGCGTTATCGGATGTTCCGTTAGGGGCTTTTTTATCCGGAGGGGTTGATTCGTCCACTGTGGTGGGTATTTTACAATCCCTCTCTACTCGTCCGGTCAAAACCTTTACGATCGGGTTTGACCACGCGGATTTTAATGAAGCGAGTGAGGCCTCAGACGTTGCAAAACACTTAGGAACGGATCATGTCGAGTTAATTGTCAGTGCAGAAGATGCTCTAGCGATTATTAATCAGTTACCTGTTATGTACGATGAACCTTTTGCTGACGCCTCTCAAGTGCCTACGTTTCTGGTTTCGAAGCTGGCTAAAAAAGAGGTCACTGTATGCTTGTCTGGTGATGGGGGCGATGAACTGTTTTGTGGTTATAACCGCTATCATTACACTGCTAAAGTTTGGTCGTATTTAGAAAAAATTCCCTTTCCAATCCGAAAAATGCTCTCAGTCTTTTTGTTGACGCTTTCGCCATCTTCTTGGGATGTTTTAAGTAAAACTTTAGGTTTGAATACCAGATTACCAAATTTAGGCAATAAAATTCAAAAAGGTGCCCAAGCTTTAAAGGCAAGAGATATTGAAGACCTTTATACACGGGTTGTCTCCAACTGGGATCTAGATGAGCCTTTGGTTAAAAATACTGCGGTTGAGAAATTACCGTTTTTGTCTGACTTAACAGAACTTTCCCATCTTAATGACTTAGAAAAAATGATGTTGTGGGATAAGCAATCTTATCTAATGGACGATGTTTTAGTGAAAACAGATCGTGCTACGATGGCGTGTTCATTAGAAGGGCGGGTTCCCTTGTTAGACCACCGCATTGCTGAGTTTGCTGCCAGTTTGCCGATCCATTTGAAATACCGAGGTGGAAAGGGAAAGTGGCTTTTACGAGAAGTACTGTATCGTTATGTACCTAAAAAATTAATTGAAAGGCCAAAAAAAGGGTTTAGTTTACCCATCGCTGAATGGTTGAGAGGACCGCTAAAAGATTGGGCGAATGTTTTGCTGGATTCTGATCGTATTGATAAAGAAGGCTTTTTGTCGTCTGAATTGGTTCAAAAGAAGTGGCGTGAACATTTAGCGGGTAAACGAGATTGGTCGTCGCAGTTGTGGAGCGTTCTAATGTTCCAATTATGGCTTGAGAAAAACAAATGA