duplicate and close variants of the same alignment in the output #37

Open
azat-badretdin opened this issue Mar 29, 2023 · 4 comments
Labels: enhancement (New feature or request), question (Further information is requested)

Comments

@azat-badretdin

When I use these parameters:

 ./miniprot -G 100 -O 10 -J 34 -F 30 --gff -ut32 nucleotide.fasta proteins.fasta

I get very close variants of the same alignment:

gpipedev21:issue-34$ grep WP_004242317 miniprot.gff  | grep PAF
##PAF   gi|490362554|ref|WP_004242317.1|        343     149     343     +       gi|545778205|gb|U00096.3|       4641652 3221864 3222446 402     582     0       AS:i:680        ms:i:680      np:i:159 da:i:-1 do:i:0  cg:Z:194M       cs:Z::2*accC*gacS*aatA*atcV:2*atcV:2*cacS*gaaD*cccR*ggcQ:1*ggtD:9*cgcY:1*agtA*aaaQ*gaaS*atcV*atcT:2*tatF:1*aacA:2*gttY*aatD:7*gaaQ:1*gagS:1*ggcA*aagA:8*gcgT:3*cgaS:1*aaaR*caaG:3*gaaG:3*tggY:2*ggtD:3*tcgA:3*gaaA:7*cggG:1*gacS:19*attL:2*cgaQ*ggcH*ctgI*aacA:2*cagE:2*tcgA:10*cgaK:2*tttI:1*ccgS:9*atgV:8*gtgL*tatF:1*aaaR*gccL:2*ggtE:1*gcgQ*ctgE:2*ttaQ*gtcI:1*gttA*cccA:1*aaaR:1*aaaI:5*cgtK
##PAF   gi|490362554|ref|WP_004242317.1|        343     154     343     +       gi|545778205|gb|U00096.3|       4641652 3221879 3222446 396     567     0       AS:i:675        ms:i:675      np:i:157 da:i:-1 do:i:0  cg:Z:189M       cs:Z:*atcV:2*atcV:2*cacS*gaaD*cccR*ggcQ:1*ggtD:9*cgcY:1*agtA*aaaQ*gaaS*atcV*atcT:2*tatF:1*aacA:2*gttY*aatD:7*gaaQ:1*gagS:1*ggcA*aagA:8*gcgT:3*cgaS:1*aaaR*caaG:3*gaaG:3*tggY:2*ggtD:3*tcgA:3*gaaA:7*cggG:1*gacS:19*attL:2*cgaQ*ggcH*ctgI*aacA:2*cagE:2*tcgA:10*cgaK:2*tttI:1*ccgS:9*atgV:8*gtgL*tatF:1*aaaR*gccL:2*ggtE:1*gcgQ*ctgE:2*ttaQ*gtcI:1*gttA*cccA:1*aaaR:1*aaaI:5*cgtK

This may also show up as duplicated alignment records in the output. For example:

gi|545778205|gb|U00096.3|       miniprot        CDS     729583  733323  6547    +       0       Parent=MP001848;Rank=18;Identity=0.9719;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3|       miniprot        mRNA    729583  733323  6547    +       .       ID=MP001849;Rank=19;Identity=0.9719;Positive=0.9783;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3|       miniprot        CDS     729583  733323  6547    +       0       Parent=MP001849;Rank=19;Identity=0.9719;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3|       miniprot        mRNA    729583  733323  6547    +       .       ID=MP001850;Rank=20;Identity=0.9719;Positive=0.9783;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3|       miniprot        CDS     729583  733323  6547    +       0       Parent=MP001850;Rank=20;Identity=0.9719;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3|       miniprot        mRNA    729583  733323  6547    +       .       ID=MP001851;Rank=21;Identity=0.9719;Positive=0.9783;Target=gi|15829983|ref|NP_308756.1| 1 1247
gi|545778205|gb|U00096.3|       miniprot        CDS     729583  733323  6547    +       0       Parent=MP001851;Rank=21;Identity=0.9719;Target=gi|15829983|ref|NP_308756.1| 1 1247

The alignments are the same, but the Rank=x value is different in each case.
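As a possible downstream workaround, records like these could be collapsed after the fact. The sketch below is not part of miniprot; `dedup_gff` is a hypothetical helper that assumes tab-separated GFF3 lines in which each mRNA record carries `ID` and `Target` attributes and each CDS record carries a `Parent` attribute, as in the excerpt above. It keeps the first mRNA for each (sequence, start, end, strand, Target) combination and drops later copies together with their child records.

```python
def dedup_gff(lines):
    """Drop mRNA records that repeat an earlier (seqid, start, end, strand, Target),
    along with any child records whose Parent is a dropped mRNA."""
    seen = set()          # keys of mRNA records already kept
    dropped_ids = set()   # IDs of mRNA records that were discarded
    kept = []
    for line in lines:
        # Pass comments (including ##PAF lines) and blank lines through unchanged.
        if line.startswith("#") or not line.strip():
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            kept.append(line)
            continue
        # Attribute column: semicolon-separated key=value pairs.
        attrs = dict(kv.split("=", 1) for kv in fields[8].split(";") if "=" in kv)
        if fields[2] == "mRNA":
            key = (fields[0], fields[3], fields[4], fields[6], attrs.get("Target"))
            if key in seen:
                dropped_ids.add(attrs.get("ID"))
                continue
            seen.add(key)
        elif attrs.get("Parent") in dropped_ids:
            continue
        kept.append(line)
    return kept
```

Note that this only removes exact coordinate duplicates; it does not merge hits whose boundaries differ by a few bases.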

@lh3
Owner

lh3 commented Apr 3, 2023

These are two different hits. For now, you have to filter them out yourself.
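A filtering step of this kind could be sketched as follows. This is an illustration, not miniprot functionality: the function names are hypothetical, the `##PAF` lines are assumed to be tab-separated and to follow the standard PAF column order with an `AS:i:` score tag, and the 0.9 overlap threshold is an arbitrary choice. For each (protein, contig, strand) group, hits are visited from highest to lowest score, and a hit is dropped if it overlaps an already kept hit by at least the threshold fraction of the shorter hit's length on the genome.

```python
from collections import defaultdict

def parse_paf(line):
    """Parse one PAF line, tolerating a leading '##PAF' column."""
    fields = line.rstrip("\n").split("\t")
    if fields[0] == "##PAF":
        fields = fields[1:]
    score = 0
    for tag in fields[12:]:
        if tag.startswith("AS:i:"):
            score = int(tag[5:])
    return {
        "query": fields[0], "strand": fields[4], "target": fields[5],
        "tstart": int(fields[7]), "tend": int(fields[8]),
        "score": score, "line": line,
    }

def overlap_fraction(a, b):
    """Overlap on the target, as a fraction of the shorter hit's length."""
    overlap = min(a["tend"], b["tend"]) - max(a["tstart"], b["tstart"])
    if overlap <= 0:
        return 0.0
    shorter = min(a["tend"] - a["tstart"], b["tend"] - b["tstart"])
    return overlap / shorter

def filter_near_duplicates(lines, min_overlap=0.9):
    """Keep the best-scoring hit among heavily overlapping hits of one protein."""
    groups = defaultdict(list)
    for line in lines:
        hit = parse_paf(line)
        groups[(hit["query"], hit["target"], hit["strand"])].append(hit)
    kept = []
    for hits in groups.values():
        hits.sort(key=lambda h: h["score"], reverse=True)
        chosen = []
        for hit in hits:
            if all(overlap_fraction(hit, c) < min_overlap for c in chosen):
                chosen.append(hit)
        kept.extend(chosen)
    return [hit["line"] for hit in kept]
```

Applied to the two WP_004242317 hits above, the second hit (AS 675) lies entirely inside the first (AS 680), so only the first would be kept.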

lh3 added the question (Further information is requested) and enhancement (New feature or request) labels on Apr 3, 2023
@azat-badretdin
Author

Thanks. Which example are you talking about? Or both?

@lh3
Owner

lh3 commented Apr 3, 2023

Both

@azat-badretdin
Author

> For now

Does this mean there is hope that hits will be reported one per region in the future?
