Hello, I used pharokka the other day to reorient and annotate the genome of a newly sequenced Cutibacterium phage. The reorientation was completed correctly, but in the GenBank file the CDS coordinates of terL were given as 130..1512 instead of the expected 1..1512. The attached image shows the beginning of the reoriented genome, with terL as predicted by PHANOTATE highlighted in yellow.
I am not sure whether this can be called a bug, though, as I understand that dnaapler uses tblastx, whilst PHANOTATE gene predictions are based on an analysis of the phage genome as a whole.
LOCUS       1                      29446 bp    DNA     linear   PHG 19-JAN-2025
DEFINITION  1 length=29446 depth=1.00x.
ACCESSION   1
VERSION     1
KEYWORDS    .
SOURCE      .
  ORGANISM  .
            .
FEATURES             Location/Qualifiers
     CDS             130..1512
                     /ID="FGUJEDQP_CDS_0001"
                     /transl_table=11
                     /phrog="9"
                     /top_hit="NC_023862_p15"
                     /locus_tag="FGUJEDQP_CDS_0001"
                     /function="head and packaging"
                     /product="terminase large subunit"
                     /source="PHANOTATE_1.5.1"
                     /score="-4561.899041514007"
                     /phase="0"
                     /translation="VLDDWLAIGSNGRLASGVCGVFVPRQNGKNAILEVVELFKATIQG
                     RRILHTAHELKSARKAFMRLRSFFENERQFPDLYRMVKTIRATNGQEAIVLHHPDCATF
                     ERKCGCPGWGSVEFVARSRGSARGFTVDDLVCDEAQELSDEQLEALLPTVSAAPSGDPQ
                     QIFLGTPPGPLADGSVVLRLRGQALSGGKRFAWTEFSIPDESDPDDLTRSWRKLAGDTN
                     PALGRRLNFGTVSDEHESMSAAGFARERLGWWDRGQSATSVIPADKWAQSAVDDVELVG
                     GKVFGVSFSRSGDRVALAGAGKADAGVHVEVIDGLSGTIVDGVGRLADWLAVRWGDTDR
                     IMVAGSGAVLLQKALTDRGVPGRGVVVADTGVYVEACQSFLEGVRSGVVSHPRADSRRD
                     MLDIAVRSAVQKRKGSAWGWGSTFKDGSEVPLEAVSLAYLGAKMAKARRRERSGRKRVS
TimSkvortsov changed the title from "Phage contig reorientation with dnaapler in pharokka - terL in does not begin at position 1 in the Genbank file" to "Phage contig reorientation with dnaapler in pharokka - terL in the reoriented contig does not begin at position 1 in the Genbank file" on Feb 6, 2025
What you have observed arises because gene predictors often cannot score ORFs in a circular fashion, and so are prone to miscalling CDSs near the contig endpoints. I am not sure how common this is with PHANOTATE, but it is something for me to look at. But yes, I'd trust the true terL to start at 1 here, given that dnaapler uses tblastx/mmseqs.
On the plus side, this is a known issue and the developer of pyrodigal is working on implementing a fix (see e.g. gbouras13/dnaapler#90 and althonos/pyrodigal#65), so hopefully it will soon be fixed, at least for pyrodigal (which is an option in pharokka). In any case, it may be interesting to try pharokka on this genome with -g prodigal as it stands.
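As a quick sanity check on the coordinates in this thread, the arithmetic below compares the PHANOTATE call (130..1512) with the expected start at position 1. This is only a minimal sketch: the function name `describe_truncation` is made up for illustration, and it assumes 1-based inclusive coordinates on the forward strand, as in the GenBank record above.

```python
def describe_truncation(predicted_start, expected_start, end):
    """Compare a predicted CDS start with an expected one.

    Coordinates are assumed to be 1-based, inclusive, forward strand.
    """
    missing_nt = predicted_start - expected_start
    in_frame = missing_nt % 3 == 0
    return {
        # nucleotides present in the expected CDS but absent from the prediction
        "missing_nt": missing_nt,
        # True if both starts share the same reading frame
        "in_frame": in_frame,
        # whole codons lost at the 5' end (only meaningful when in frame)
        "missing_codons": missing_nt // 3 if in_frame else None,
        "predicted_len_nt": end - predicted_start + 1,
        "expected_len_nt": end - expected_start + 1,
    }


# Coordinates taken from the issue: predicted 130..1512, expected 1..1512
result = describe_truncation(predicted_start=130, expected_start=1, end=1512)
print(result)
```

For these coordinates the difference is 129 nt, which is a multiple of three, so 130..1512 lies in the same reading frame as 1..1512 and is shorter by 43 codons. That is consistent with a truncated start call on the same ORF rather than a different gene call.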