Commit
Co-authored-by: Kutluhan Incekara <[email protected]>
Showing 4 changed files with 72 additions and 0 deletions.
@@ -0,0 +1,48 @@
+FROM ubuntu:jammy as app
+
+ARG PANQC_VER="0.4.0"
+
+LABEL base.image="ubuntu:jammy"
+LABEL dockerfile.version="1"
+LABEL software="panqc"
+LABEL software.version="${PANQC_VER}"
+LABEL description="A pan-genome quality control toolkit for evaluating nucleotide redundancy in pan-genome analyses."
+LABEL website="https://github.com/maxgmarin/panqc"
+LABEL license="https://github.com/maxgmarin/panqc/blob/main/LICENSE"
+LABEL maintainer="Erin Young"
+LABEL maintainer.email="[email protected]"
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+  wget \
+  ca-certificates \
+  procps \
+  python3 \
+  python3-pip \
+  python3-dev \
+  gcc && \
+  apt-get autoclean && rm -rf /var/lib/apt/lists/*
+
+RUN pip install cython
+
+RUN wget -q https://github.com/maxgmarin/panqc/archive/refs/tags/${PANQC_VER}.tar.gz && \
+  pip install --no-cache-dir ${PANQC_VER}.tar.gz && \
+  rm ${PANQC_VER}.tar.gz && \
+  mkdir /data
+
+ENV LC_ALL=C
+
+CMD panqc nrc --help && panqc utils --help
+
+WORKDIR /data
+
+FROM app as test
+
+WORKDIR /test
+
+RUN panqc nrc --help && \
+  panqc utils --help
+
+RUN wget -q https://github.com/maxgmarin/panqc/archive/refs/tags/${PANQC_VER}.tar.gz && \
+  tar -xvf ${PANQC_VER}.tar.gz && \
+  cd panqc-${PANQC_VER}/tests/data && \
+  panqc nrc -a TestSet1.InputAsmPaths.tsv -r TestSet1.pan_genome_reference.fa.gz -m TestSet1.gene_presence_absence.csv -o test_results/
@@ -0,0 +1,22 @@
+# panqc container
+
+Main tool: [panqc](https://github.com/maxgmarin/panqc)
+
+Code repository: https://github.com/maxgmarin/panqc
+
+Basic information on how to use this tool:
+- executable: panqc nrc || panqc utils
+- help: --help
+- version: NA
+- description: |
+
+> The panqc Nucleotide Redundancy Correction (NRC) pipeline adjusts for redundancy at the DNA level within pan-genome estimates in two steps. In step one, all genes predicted to be absent at the Amino Acid (AA) level are compared to their corresponding assembly at the nucleotide level. In cases where the nucleotide sequence is found with high coverage and sequence identity (Query Coverage & Sequence Identity > 90%), the gene is marked as “present at the DNA level”. Next, all genes are clustered and merged using a k-mer based metric of nucleotide similarity. Cases where two or more genes are divergent at the AA level but highly similar at the nucleotide level will be merged into a single “nucleotide similarity gene cluster”. After applying this method the pan-genome gene presence matrix is readjusted according to these results.
+
+Full documentation: [https://github.com/maxgmarin/panqc](https://github.com/maxgmarin/panqc)
+
+## Example Usage
+
+```bash
+panqc nrc --asms InputAsmPaths.tsv --pg-ref pan_genome_reference.fa --is-rtab gene_presence_absence.Rtab --results_dir results/
+```
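
Review note: the two-step NRC description quoted in the README can be illustrated with a toy sketch. This is not panqc's implementation. The function names, the use of Jaccard similarity over k-mer sets, the union-find merging, the tiny `k`, and the 0.8 merge threshold are all assumptions made for illustration; only the 90% coverage and identity cutoff comes from the README text. Alignment hits are passed in as precomputed values rather than produced by a real aligner.

```python
from itertools import combinations


def kmers(seq, k=5):
    """Return the set of k-mers in seq (k=5 is a toy value for short demo sequences)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two k-mer sets (an assumed stand-in for panqc's metric)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rescue_absent_genes(presence, hits, min_cov=0.90, min_ident=0.90):
    """Step one: mark a gene present at the DNA level when a nucleotide hit
    exceeds both thresholds.

    presence: {gene: {assembly: 0 or 1}}
    hits:     {(gene, assembly): (query_coverage, sequence_identity)}
    """
    adjusted = {gene: dict(row) for gene, row in presence.items()}
    for (gene, asm), (cov, ident) in hits.items():
        if adjusted[gene][asm] == 0 and cov > min_cov and ident > min_ident:
            adjusted[gene][asm] = 1
    return adjusted


def merge_similar_genes(presence, sequences, k=5, min_sim=0.8):
    """Step two: merge genes whose k-mer similarity reaches min_sim into one
    cluster and combine their presence rows with a logical OR."""
    parent = {gene: gene for gene in presence}

    def find(gene):
        # Union-find lookup with path halving.
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    ksets = {gene: kmers(sequences[gene], k) for gene in presence}
    for a, b in combinations(sorted(presence), 2):
        if jaccard(ksets[a], ksets[b]) >= min_sim:
            ra, rb = find(a), find(b)
            if ra != rb:
                # Keep the alphabetically first gene as the cluster name.
                parent[max(ra, rb)] = min(ra, rb)

    merged = {}
    for gene, row in presence.items():
        target = merged.setdefault(find(gene), {asm: 0 for asm in row})
        for asm, value in row.items():
            target[asm] = max(target[asm], value)
    return merged


if __name__ == "__main__":
    # Hypothetical toy data: geneA and geneB differ by one trailing base.
    sequences = {
        "geneA": "ATGGCTAGCTTGGATCCGAT",
        "geneB": "ATGGCTAGCTTGGATCCGATA",
        "geneC": "TTTTTTTTTTCCCCCCCCCC",
    }
    presence = {
        "geneA": {"asm1": 1, "asm2": 0},
        "geneB": {"asm1": 0, "asm2": 1},
        "geneC": {"asm1": 1, "asm2": 0},
    }
    hits = {("geneC", "asm2"): (0.95, 0.97), ("geneA", "asm2"): (0.50, 0.99)}
    rescued = rescue_absent_genes(presence, hits)
    print(merge_similar_genes(rescued, sequences))
```

In this sketch the presence matrix shrinks when near-identical genes are merged, which is the readjustment the README describes. The real tool reads assemblies, a pan-genome reference, and a presence/absence matrix from files, as in the test command in the Dockerfile.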