Add ToolDistillator #5967

Merged
merged 19 commits into from
Apr 30, 2024
Changes from 14 commits
Commits
3956353
Add ToolDistillator
clsiguret Apr 25, 2024
49ceab1
Change ToolDistillator command
clsiguret Apr 25, 2024
c840f09
Apply suggestions from code review for ToolDistillator
clsiguret Apr 25, 2024
90ce67c
Apply suggestions from code review for ToolDistillator
clsiguret Apr 25, 2024
3a68ae4
Apply suggestions from code review for ToolDistillator
clsiguret Apr 25, 2024
77e1ddb
Apply suggestions from code review for ToolDistillator
clsiguret Apr 25, 2024
a58618f
Remove files not used
clsiguret Apr 25, 2024
4fe5480
Remove tooldistillator_summary.json because size file was too large
clsiguret Apr 25, 2024
e72ec95
Update tooldistillator_summarize.xml: planemo test was not working wi…
clsiguret Apr 25, 2024
79fa47d
New release 0.8.4.1 ToolDistillator in Bioconda
clsiguret Apr 26, 2024
596a21c
Error planemo test: table in section Help of tooldistillator.xml had …
clsiguret Apr 26, 2024
abf6da7
Change test-data and tests for Bakta because the files were too large
clsiguret Apr 26, 2024
39d39fc
Change test-data and tests for Kraken2 because the files were too large
clsiguret Apr 26, 2024
b877947
Remove test-data html for MultiQC because the files were too large an…
clsiguret Apr 26, 2024
877a83a
Apply suggestions from code review
clsiguret Apr 26, 2024
560f36a
Change double-quotes to single-quotes
clsiguret Apr 26, 2024
e256bbd
Remove synthax: diff to None
clsiguret Apr 26, 2024
18eaff1
Remove empty or not used files
clsiguret Apr 26, 2024
a190a4d
Change tests in XML files + remove None and str term
clsiguret Apr 26, 2024
22 changes: 22 additions & 0 deletions tools/tooldistillator/.shed.yml
@@ -0,0 +1,22 @@
name: tooldistillator
owner: iuc
long_description: |
  ToolDistillator extracts and aggregates information from different tool outputs to JSON parsable files
categories:
  - Sequence Analysis
remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/master/tools/tooldistillator
homepage_url: https://gitlab.com/ifb-elixirfr/abromics/tooldistillator
type: unrestricted
auto_tool_repositories:
  name_template: "{{ tool_id }}"
  description_template: "{{ tool_name }}: Extract information from tool output(s) to JSON and/or aggregate several JSON reports"
suite:
  name: "suite_tooldistillator"
  description: "Tool to extract and aggregate information from different tool outputs to JSON parsable files"
  long_description: |
    ToolDistillator extracts and aggregates information from different tool outputs to JSON parsable files
suite:
  name: "suite_tooldistillator"
  description: "Tool to extract and aggregate information from different tool outputs to JSON parsable files"
  long_description: |
    ToolDistillator extracts and aggregates information from different tool outputs to JSON parsable files
38 changes: 38 additions & 0 deletions tools/tooldistillator/macro.xml
@@ -0,0 +1,38 @@
<?xml version="1.0"?>
<macros>
    <token name="@TOOL_VERSION@">0.8.4.1</token>
    <token name="@VERSION_SUFFIX@">0</token>
    <token name="@PROFILE@">21.05</token>
    <xml name="version_command">
        <version_command><![CDATA[tooldistillator --version]]></version_command>
    </xml>
    <xml name="analysis_software_version">
        <param argument="--analysis_software_version" type="text" optional="true" label="Analysis software version"/>
    </xml>
    <xml name="reference_database_version">
        <param argument="--reference_database_version" type="text" optional="true" label="Database software version"/>
    </xml>
    <xml name="biotools">
        <xrefs>
            <xref type="bio.tools">tooldistillator</xref>
        </xrefs>
    </xml>
    <xml name="requirements">
        <requirements>
            <requirement type="package" version="@TOOL_VERSION@">tooldistillator</requirement>
        </requirements>
    </xml>
    <xml name="citations">
        <citations>
            <citation type="doi">10.5281/zenodo.8282656</citation>
        </citations>
    </xml>
    <xml name="element_assert" token_name="" token_text="">
        <element name="@NAME@">
            <assert_contents>
                <has_text text="@TEXT@"/>
                <yield/>
            </assert_contents>
        </element>
    </xml>
</macros>
25 changes: 25 additions & 0 deletions tools/tooldistillator/test-data/abricate/report.tsv
@@ -0,0 +1,25 @@
#FILE SEQUENCE START END STRAND GENE COVERAGE COVERAGE_MAP GAPS %COVERAGE %IDENTITY DATABASE ACCESSION PRODUCT RESISTANCE
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00002 142975 143985 + bopD 1-1011/1011 =============== 0/0 100.00 98.91 vfdb NP_814691 (bopD) sugar-binding transcriptional regulator LacI family [BopD (VF0362)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00003 7884 11111 - fss3 1-3228/3228 =============== 0/0 100.00 99.38 vfdb NP_815578 (fss3) Enterococcus faecalis surface protein Fss3 fibrinogen binding protein [Fibrinogen binding protein (AI273)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00003 162314 163240 + efaA 1-927/927 =============== 0/0 100.00 99.89 vfdb NP_815739 (efaA) endocarditis specific antigen [EfaA (VF0354)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00006 33703 37014 + ebpA 1-3312/3312 =============== 0/0 100.00 99.49 vfdb NP_814821 (ebpA) endocarditis and biofilm-associated pilus tip protein EbpA [Ebp pili (VF0538)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00006 37018 38448 + ebpB 1-1431/1431 =============== 0/0 100.00 99.30 vfdb NP_814822 (ebpB) endocarditis and biofilm-associated pilus minor subunit EbpB [Ebp pili (VF0538)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00006 38445 40328 + ebpC 1-1878/1878 ========/====== 1/6 100.00 98.99 vfdb NP_814823 (ebpC) endocarditis and biofilm-associated pilus major subunit EbpC [Ebp pili (VF0538)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00006 40422 41276 + srtC 1-855/855 =============== 0/0 100.00 99.30 vfdb NP_814824 (srtC) sortase [Ebp pili (VF0538)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00006 44356 46374 + ace 1-2025/2025 ========/====== 2/6 99.70 96.74 vfdb NP_814829 (ace) collagen adhesin protein [Ace (VF0355)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00010 49447 50301 - sprE 1-855/855 =============== 0/0 100.00 98.36 vfdb NP_815515 (sprE) serine proteinase V8 family [SprE (VF0358)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00010 50350 51882 - gelE 1-1533/1533 =============== 0/0 100.00 98.96 vfdb NP_815516 (gelE) coccolysin [Gelatinase (VF0357)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00010 52117 53200 - fsrC 261-1344/1344 ..============= 0/0 80.65 99.26 vfdb NP_815517 (fsrC) histidine kinase putative [Fsr (VF0360)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00018 14420 15232 - cpsK 1-813/813 =============== 0/0 100.00 99.88 vfdb NP_816131 (cpsK) ABC transporter permease protein [Capsule (VF0361)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00018 15232 16575 - cpsJ 1-1344/1344 =============== 0/0 100.00 98.96 vfdb NP_816132 (cpsJ) ABC transporter ATP-binding protein [Capsule (VF0361)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00018 16601 17739 - cpsI 1-1140/1140 ========/====== 1/1 99.91 98.33 vfdb NP_816133 (cpsI) UDP-galactopyranose mutase [Capsule (VF0361)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00018 17798 18196 - cpsH 1-399/399 =============== 0/0 100.00 97.49 vfdb NP_816134 (cpsH) lipoprotein [Capsule (VF0361)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00018 18214 20666 - cpsG 1-2454/2454 ========/====== 1/1 99.96 99.51 vfdb NP_816135 (cpsG) MurB family protein [Capsule (VF0361)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00018 20736 23245 - cpsE 1-2510/2511 =============== 0/0 99.96 99.40 vfdb NP_816137 (cpsE) glycosyl transferase group 2 family protein [Capsule (VF0361)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00018 23315 24781 - cpsD 1-1467/1467 =============== 0/0 100.00 98.91 vfdb NP_816138 (cpsD) glycosyl transferase group 2 family protein [Capsule (VF0361)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00018 24800 25969 - cpsC 1-1170/1170 =============== 0/0 100.00 99.57 vfdb NP_816139 (cpsC) teichoic acid biosynthesis protein putative [Capsule (VF0361)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00018 26443 27243 - cpsB 1-801/801 =============== 0/0 100.00 99.50 vfdb NP_816140 (cpsB) phosphatidate cytidylyltransferase [Capsule (VF0361)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00018 27240 28055 - cpsA 1-816/816 =============== 0/0 100.00 99.51 vfdb NP_816141 (cpsA) undecaprenyl diphosphate synthase [Capsule (VF0361)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00018 36270 40580 - fss2 1-4305/4956 ========/=====. 11/22 86.70 92.92 vfdb NP_816151 (fss2) Enterococcus faecalis surface protein Fss2 fibrinogen binding protein [Fibrinogen binding protein (AI272)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00023 14087 18004 + prgB/asc10 1-3918/3918 ========/====== 2/4 99.95 94.26 vfdb NP_817031 (prgB/asc10) aggregation substance PrgB/Asc10 [AS (VF0352)] [Enterococcus faecalis V583]
/storage/scratch/piemari/60579281/abricate/10_Enterococcus_faecalis_S17_L001.fasta contig00042 2539 8502 - fss1 1-5964/5964 =============== 0/0 100.00 98.31 vfdb NP_813892 (fss1) Enterococcus faecalis surface protein Fss1 fibrinogen binding protein [Fibrinogen binding protein (AI271)] [Enterococcus faecalis V583]
this is empty, is it needed or should it be removed?

Empty file.
95 changes: 95 additions & 0 deletions tools/tooldistillator/test-data/bakta/bakta.json
@@ -0,0 +1,95 @@
{
"genome": {
"genus": null,
"species": null,
"strain": null,
"complete": true,
"gram": "?",
"translation_table": 11
},
"stats": {
"no_sequences": 1,
"size": 1330,
"gc": 0.4518796992481203,
"n_ratio": 0.0,
"n50": 1330,
"coding_ratio": 0.6203007518796992
},
"features": [
{
"type": "cds",
"contig": "contig_1",
"start": 413,
"stop": 736,
"strand": "+",
"frame": 2,
"gene": null,
"product": "hypothetical protein",
"db_xrefs": [],
"nt": "ATGACAAAACGAAGTGGAAGTAATACGCGCAGGCGGGCTATCAGTCGCCCTGTTCGTCTGACGGCAGAAGAAGACCAGGAAATCAGAAAAAGGGCTGCTGAATGCGGCAAGACCGTTTCTGGTTTTTTACGGGCGGCAGCTCTCGGTAAGAAAGTTAACTCACTGACTGATGACCGGGTGCTGAAAGAAGTTATGCGACTGGGGGCGTTGCAGAAAAAACTCTTTATCGACGGCAAGCGTGTCGGGGACAGAGAGTATGCGGAGGTGCTGATCGCTATTACGGAGTATCACCGTGCCCTGTTATCCAGGCTTATGGCAGATTAG",
"aa": "MTKRSGSNTRRRAISRPVRLTAEEDQEIRKRAAECGKTVSGFLRAAALGKKVNSLTDDRVLKEVMRLGALQKKLFIDGKRVGDREYAEVLIAITEYHRALLSRLMAD",
"aa_hexdigest": "d9bdebc84195542e775c3d22458b507e",
"start_type": "ATG",
"rbs_motif": "GGAG/GAGG",
"hypothetical": true,
"genes": [],
"seq_stats": {
"molecular_weight": 12072.90819999999,
"isoelectric_point": 10.367886161804197
},
"id": "IHHALPPJCH_1",
"locus": "IHHALP_00005"
},
{
"type": "cds",
"contig": "contig_1",
"start": 971,
"stop": 141,
"strand": "-",
"frame": 1,
"gene": null,
"product": "hypothetical protein",
"db_xrefs": [],
"nt": "ATGAACAAGCAGCAGCAAACTGCACTCAACATGGCGGGATTCATAAAAAGCCAGAGCCTGACGCTGCTCGAAAAACTGGACGCACTCGATGCTGACGAGCAGGCCACCATGTGTGAGAAGCTGCACGAACTCGCAGAAGAACAAATAGAAGCAATAAAAAATAAAGATAAAACTTTATTTATTGTCTATGCTACTGATATTTATAGCCCGAGCGAATTTTTCTCAAAAATCGAATCCGACTTGAAGAAAAAGAAAAGCAAGGGTGATGTTTTTTTTGATTTAATAATTCCTAACGGTGGAAAAAAAGATCGTTACGTCTATACGTCATTTAATGGCGAGAAGTTTTCAAGTTACACATTAAACAAAGTTACGAAAACTGATGAATATAATGATTTATCTGAGCTCTCGGCTTCGTTCTTTAAAAAAAACTTTGATAAGATCAACGTAAACCTTCTATCCAAAGCCACATCATTTGCTTTGAAAAAAGGCATTCCAATATAA",
"aa": "MNKQQQTALNMAGFIKSQSLTLLEKLDALDADEQATMCEKLHELAEEQIEAIKNKDKTLFIVYATDIYSPSEFFSKIESDLKKKKSKGDVFFDLIIPNGGKKDRYVYTSFNGEKFSSYTLNKVTKTDEYNDLSELSASFFKKNFDKINVNLLSKATSFALKKGIPI",
"aa_hexdigest": "1e7027cbe48346e06a83e802a9385584",
"start_type": "ATG",
"rbs_motif": "AGGA/GGAG/GAGG",
"edge": true,
"hypothetical": true,
"genes": [],
"seq_stats": {
"molecular_weight": 18866.325799999995,
"isoelectric_point": 7.696590614318848
},
"id": "IHHALPPJCH_2",
"locus": "IHHALP_00010"
}
],
"sequences": [
{
"id": "contig_1",
"description": "[gcode=11] [completeness=complete] [topology=circular] [plasmid-name=unnamed1]",
"sequence": "TTCTTCTGCGAGTTCGTGCAGCTTCTCACACATGGTGGCCTGCTCGTCAGCATCGAGTGCGTCCAGTTTTTCGAGCAGCGTCAGGCTCTGGCTTTTTATGAATCCCGCCATGTTGAGTGCAGTTTGCTGCTGCTTGTTCATCTTTCTGTTTTCTCCGTTCTGTCTGTCATCTGCGTCGTGTGATTATATCGCGCACCACTTTTCGACCGTCTTACCGCCGGTATTCTGCCGACGGACATTTCAGTCAGACAACACTGTCACTGCCAAAAAACAGCAGTGCTTTGTTGGTAATTCGAACTTGCAGACAGGACAGGATGTGCAATTGTTATACCGCGCATACATGCACGCTATTACAATTACCCTGGTCAGGGCTTCGCCCCGACACCCCATGTCAGATACGGAGCCATGTTTTATGACAAAACGAAGTGGAAGTAATACGCGCAGGCGGGCTATCAGTCGCCCTGTTCGTCTGACGGCAGAAGAAGACCAGGAAATCAGAAAAAGGGCTGCTGAATGCGGCAAGACCGTTTCTGGTTTTTTACGGGCGGCAGCTCTCGGTAAGAAAGTTAACTCACTGACTGATGACCGGGTGCTGAAAGAAGTTATGCGACTGGGGGCGTTGCAGAAAAAACTCTTTATCGACGGCAAGCGTGTCGGGGACAGAGAGTATGCGGAGGTGCTGATCGCTATTACGGAGTATCACCGTGCCCTGTTATCCAGGCTTATGGCAGATTAGCTTCCCGGAGAGAAACTGTCGAAAACAGACGGTATGAACGCCGTAAGCCCCCAAACCGATCGCCATTCACTTTCATGCATAGCTATGCAGTGAGCTGAAAGCGATCCTGACGCATTTTTCCGGTTTACCCCGGGGAAAACATCTCTTTTTGCGGTGTCTGCGTCAGAATCGCGTTCAGCGCGTTTTGGCGGTGCGCGTAATGAGACGTTATGGTAAATGTCTTCTGGCTTGATATTATATTGGAATGCCTTTTTTCAAAGCAAATGATGTGGCTTTGGATAGAAGGTTTACGTTGATCTTATCAAAGTTTTTTTTAAAGAACGAAGCCGAGAGCTCAGATAAATCATTATATTCATCAGTTTTCGTAACTTTGTTTAATGTGTAACTTGAAAACTTCTCGCCATTAAATGACGTATAGACGTAACGATCTTTTTTTCCACCGTTAGGAATTATTAAATCAAAAAAAACATCACCCTTGCTTTTCTTTTTCTTCAAGTCGGATTCGATTTTTGAGAAAAATTCGCTCGGGCTATAAATATCAGTAGCATAGACAATAAATAAAGTTTTATCTTTATTTTTTATTGCTTCTATTTG",
"length": 1330,
"complete": true,
"type": "plasmid",
"topology": "circular",
"simple_id": "contig_1",
"orig_id": "NC_002127.1",
"orig_description": "Escherichia coli O157:H7 str. Sakai plasmid pOSAK1, complete sequence",
"name": "unnamed1"
}
],
"run": {
"start": "2024-02-11 00:24:53",
"end": "2024-02-11 00:25:06"
},
"version": {
"bakta": "1.9.2",
"db": {
"version": "5.0",
"type": "full"
}
}
}
4 changes: 4 additions & 0 deletions tools/tooldistillator/test-data/bakta/bakta_aminoacid.faa
@@ -0,0 +1,4 @@
>IHHALP_00005 hypothetical protein
MTKRSGSNTRRRAISRPVRLTAEEDQEIRKRAAECGKTVSGFLRAAALGKKVNSLTDDRVLKEVMRLGALQKKLFIDGKRVGDREYAEVLIAITEYHRALLSRLMAD
>IHHALP_00010 hypothetical protein
MNKQQQTALNMAGFIKSQSLTLLEKLDALDADEQATMCEKLHELAEEQIEAIKNKDKTLFIVYATDIYSPSEFFSKIESDLKKKKSKGDVFFDLIIPNGGKKDRYVYTSFNGEKFSSYTLNKVTKTDEYNDLSELSASFFKKNFDKINVNLLSKATSFALKKGIPI
36 changes: 36 additions & 0 deletions tools/tooldistillator/test-data/bakta/bakta_annotation.gff3
@@ -0,0 +1,36 @@
##gff-version 3
##feature-ontology https://github.com/The-Sequence-Ontology/SO-Ontologies/blob/v3.1/so.obo
# Annotated with Bakta
# Software: v1.9.2
# Database: v5.0, full
# DOI: 10.1099/mgen.0.000685
# URL: github.com/oschwengers/bakta
##sequence-region contig_1 1 1330
contig_1 Bakta region 1 1330 . + . ID=contig_1;Name=contig_1;Is_circular=true
contig_1 Prodigal CDS 413 736 . + 0 ID=IHHALP_00005;Name=hypothetical protein;locus_tag=IHHALP_00005;product=hypothetical protein
contig_1 Prodigal CDS 971 1471 . - 0 ID=IHHALP_00010;Name=hypothetical protein;locus_tag=IHHALP_00010;product=hypothetical protein
##FASTA
>contig_1
TTCTTCTGCGAGTTCGTGCAGCTTCTCACACATGGTGGCCTGCTCGTCAGCATCGAGTGC
GTCCAGTTTTTCGAGCAGCGTCAGGCTCTGGCTTTTTATGAATCCCGCCATGTTGAGTGC
AGTTTGCTGCTGCTTGTTCATCTTTCTGTTTTCTCCGTTCTGTCTGTCATCTGCGTCGTG
TGATTATATCGCGCACCACTTTTCGACCGTCTTACCGCCGGTATTCTGCCGACGGACATT
TCAGTCAGACAACACTGTCACTGCCAAAAAACAGCAGTGCTTTGTTGGTAATTCGAACTT
GCAGACAGGACAGGATGTGCAATTGTTATACCGCGCATACATGCACGCTATTACAATTAC
CCTGGTCAGGGCTTCGCCCCGACACCCCATGTCAGATACGGAGCCATGTTTTATGACAAA
ACGAAGTGGAAGTAATACGCGCAGGCGGGCTATCAGTCGCCCTGTTCGTCTGACGGCAGA
AGAAGACCAGGAAATCAGAAAAAGGGCTGCTGAATGCGGCAAGACCGTTTCTGGTTTTTT
ACGGGCGGCAGCTCTCGGTAAGAAAGTTAACTCACTGACTGATGACCGGGTGCTGAAAGA
AGTTATGCGACTGGGGGCGTTGCAGAAAAAACTCTTTATCGACGGCAAGCGTGTCGGGGA
CAGAGAGTATGCGGAGGTGCTGATCGCTATTACGGAGTATCACCGTGCCCTGTTATCCAG
GCTTATGGCAGATTAGCTTCCCGGAGAGAAACTGTCGAAAACAGACGGTATGAACGCCGT
AAGCCCCCAAACCGATCGCCATTCACTTTCATGCATAGCTATGCAGTGAGCTGAAAGCGA
TCCTGACGCATTTTTCCGGTTTACCCCGGGGAAAACATCTCTTTTTGCGGTGTCTGCGTC
AGAATCGCGTTCAGCGCGTTTTGGCGGTGCGCGTAATGAGACGTTATGGTAAATGTCTTC
TGGCTTGATATTATATTGGAATGCCTTTTTTCAAAGCAAATGATGTGGCTTTGGATAGAA
GGTTTACGTTGATCTTATCAAAGTTTTTTTTAAAGAACGAAGCCGAGAGCTCAGATAAAT
CATTATATTCATCAGTTTTCGTAACTTTGTTTAATGTGTAACTTGAAAACTTCTCGCCAT
TAAATGACGTATAGACGTAACGATCTTTTTTTCCACCGTTAGGAATTATTAAATCAAAAA
AAACATCACCCTTGCTTTTCTTTTTCTTCAAGTCGGATTCGATTTTTGAGAAAAATTCGC
TCGGGCTATAAATATCAGTAGCATAGACAATAAATAAAGTTTTATCTTTATTTTTTATTG
CTTCTATTTG
8 changes: 8 additions & 0 deletions tools/tooldistillator/test-data/bakta/bakta_annotation.tsv
@@ -0,0 +1,8 @@
# Annotated with Bakta
# Software: v1.9.2
# Database: v5.0, full
# DOI: 10.1099/mgen.0.000685
# URL: github.com/oschwengers/bakta
#Sequence Id Type Start Stop Strand Locus Tag Gene Product DbXrefs
contig_1 cds 413 736 + IHHALP_00005 hypothetical protein
contig_1 cds 971 141 - IHHALP_00010 hypothetical protein
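
Note on the coordinates in the Bakta test data above: the second CDS lies on a circular plasmid (`topology=circular`, length 1330) and crosses the origin. That appears to be why bakta.json and bakta_annotation.tsv list it as 971..141 while bakta_annotation.gff3 lists it as 971..1471, since GFF3 represents an origin-spanning feature by adding the landmark length to its end coordinate. The following is a minimal sketch of that arithmetic, not code from ToolDistillator or Bakta; the helper names `unwrap_end` and `feature_length` are hypothetical.

```python
def unwrap_end(start: int, stop: int, seq_len: int) -> int:
    """Return a GFF3-style end coordinate for a feature on a circular sequence.

    Hypothetical helper: if stop < start, the feature is assumed to wrap
    around the origin, so the sequence length is added to the end.
    """
    return stop + seq_len if stop < start else stop


def feature_length(start: int, stop: int, seq_len: int) -> int:
    """Length in bases (1-based, inclusive), allowing for origin wrap."""
    return unwrap_end(start, stop, seq_len) - start + 1


SEQ_LEN = 1330  # "length" of contig_1 in bakta.json

# Second CDS (IHHALP_00010): 971..141 in the TSV/JSON, 971..1471 in the GFF3.
print(unwrap_end(971, 141, SEQ_LEN))      # 1471
print(feature_length(971, 141, SEQ_LEN))  # 501

# First CDS (IHHALP_00005) does not wrap, so its end is unchanged.
print(feature_length(413, 736, SEQ_LEN))  # 324
```

Both lengths are multiples of three, as expected for coding sequences, so the two coordinate styles describe the same features.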
24 changes: 24 additions & 0 deletions tools/tooldistillator/test-data/bakta/bakta_contigs_sequences.fna
@@ -0,0 +1,24 @@
>contig_1 [gcode=11] [completeness=complete] [topology=circular] [plasmid-name=unnamed1]
TTCTTCTGCGAGTTCGTGCAGCTTCTCACACATGGTGGCCTGCTCGTCAGCATCGAGTGC
GTCCAGTTTTTCGAGCAGCGTCAGGCTCTGGCTTTTTATGAATCCCGCCATGTTGAGTGC
AGTTTGCTGCTGCTTGTTCATCTTTCTGTTTTCTCCGTTCTGTCTGTCATCTGCGTCGTG
TGATTATATCGCGCACCACTTTTCGACCGTCTTACCGCCGGTATTCTGCCGACGGACATT
TCAGTCAGACAACACTGTCACTGCCAAAAAACAGCAGTGCTTTGTTGGTAATTCGAACTT
GCAGACAGGACAGGATGTGCAATTGTTATACCGCGCATACATGCACGCTATTACAATTAC
CCTGGTCAGGGCTTCGCCCCGACACCCCATGTCAGATACGGAGCCATGTTTTATGACAAA
ACGAAGTGGAAGTAATACGCGCAGGCGGGCTATCAGTCGCCCTGTTCGTCTGACGGCAGA
AGAAGACCAGGAAATCAGAAAAAGGGCTGCTGAATGCGGCAAGACCGTTTCTGGTTTTTT
ACGGGCGGCAGCTCTCGGTAAGAAAGTTAACTCACTGACTGATGACCGGGTGCTGAAAGA
AGTTATGCGACTGGGGGCGTTGCAGAAAAAACTCTTTATCGACGGCAAGCGTGTCGGGGA
CAGAGAGTATGCGGAGGTGCTGATCGCTATTACGGAGTATCACCGTGCCCTGTTATCCAG
GCTTATGGCAGATTAGCTTCCCGGAGAGAAACTGTCGAAAACAGACGGTATGAACGCCGT
AAGCCCCCAAACCGATCGCCATTCACTTTCATGCATAGCTATGCAGTGAGCTGAAAGCGA
TCCTGACGCATTTTTCCGGTTTACCCCGGGGAAAACATCTCTTTTTGCGGTGTCTGCGTC
AGAATCGCGTTCAGCGCGTTTTGGCGGTGCGCGTAATGAGACGTTATGGTAAATGTCTTC
TGGCTTGATATTATATTGGAATGCCTTTTTTCAAAGCAAATGATGTGGCTTTGGATAGAA
GGTTTACGTTGATCTTATCAAAGTTTTTTTTAAAGAACGAAGCCGAGAGCTCAGATAAAT
CATTATATTCATCAGTTTTCGTAACTTTGTTTAATGTGTAACTTGAAAACTTCTCGCCAT
TAAATGACGTATAGACGTAACGATCTTTTTTTCCACCGTTAGGAATTATTAAATCAAAAA
AAACATCACCCTTGCTTTTCTTTTTCTTCAAGTCGGATTCGATTTTTGAGAAAAATTCGC
TCGGGCTATAAATATCAGTAGCATAGACAATAAATAAAGTTTTATCTTTATTTTTTATTG
CTTCTATTTG