From 1ab0e445f1c01a97a8248ca831fec4cd8d50dfeb Mon Sep 17 00:00:00 2001
From: Heng Li
Date: Mon, 12 Dec 2022 16:18:31 -0500
Subject: [PATCH] Release miniprot-0.6 (r185)

---
 NEWS.md          | 14 ++++++++++++++
 miniprot.1       |  2 +-
 miniprot.h       |  2 +-
 tex/miniprot.tex | 10 +++++-----
 4 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index fb3ab4d..b229200 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,3 +1,17 @@
+Release 0.6-r185 (12 December 2022)
+-----------------------------------
+
+Notable changes:
+
+ * Improvement: for each protein, only output alignments close to the best
+   alignment. Also added option --outs to tune the threshold.
+
+ * New feature: output GTF with option --gtf.
+
+(0.6: 12 December 2022, r185)
+
+
+
 Release 0.5-r179 (17 October 2022)
 ----------------------------------
 
diff --git a/miniprot.1 b/miniprot.1
index 8a203e2..27aede2 100644
--- a/miniprot.1
+++ b/miniprot.1
@@ -1,4 +1,4 @@
-.TH miniprot 1 "17 October 2022" "miniprot-0.5 (r179)" "Bioinformatics tools"
+.TH miniprot 1 "12 December 2022" "miniprot-0.6 (r185)" "Bioinformatics tools"
 .SH NAME
 .PP
 miniprot - protein-to-genome alignment with splicing and frameshifts
diff --git a/miniprot.h b/miniprot.h
index 5bb5438..4e298ca 100644
--- a/miniprot.h
+++ b/miniprot.h
@@ -3,7 +3,7 @@
 
 #include
 
-#define MP_VERSION "0.5-r182-dirty"
+#define MP_VERSION "0.6-r185"
 
 #define MP_F_NO_SPLICE 0x1
 #define MP_F_NO_ALIGN 0x2
diff --git a/tex/miniprot.tex b/tex/miniprot.tex
index 26864b6..34d53b9 100644
--- a/tex/miniprot.tex
+++ b/tex/miniprot.tex
@@ -438,7 +438,7 @@ \subsection{Evaluated tools}
 To evaluate what aligners can map proteins to a whole genome, we randomly
 sampled 1\% of zebrafish proteins and mapped with various aligners.
 Only
-miniprot-0.5, Spaln2-2.4.13c~\citep{Iwata:2012aa}, GeMoMa-1.9~\citep{Keilwagen:2019wz}
+miniprot-0.6, Spaln2-2.4.13c~\citep{Iwata:2012aa}, GeMoMa-1.9~\citep{Keilwagen:2019wz}
 and GenomeThreader-1.7.3~\citep{DBLP:journals/infsof/GremmeBSK05} could finish the
 alignment in an hour. GenomeThreader found less than 30\% of coding regions in
 Spaln2 or miniprot alignment. It is not sensitive enough for the human-fish
@@ -538,10 +538,10 @@ \subsection{Evaluating protein-to-genome alignment}
 careful algorithm.
 
 Table~\ref{tab:eval} only considers the best hit of each protein. Miniprot by
-default may output multiple suboptimal alignments. If we count all
-human-zebrafish alignments, we could improve the base sensitivity to 65.32\%
-but with junction accuracy dropped to 90.87\%. The base specificity drops
-further to 84.96\% because miniprot starts to report pseudogenes.
+default may output multiple suboptimal alignments per protein if their
+alignment scores are no less than 99\% of the best alignment. If we count all
+human-zebrafish alignments outputted by miniprot, we could improve the base
+sensitivity to 60.76\% with a minor cost on base specificity to 95.25\%.
 
 \section{Discussions}