Limit to reported alignments? #13

Closed
rwhetten opened this issue Sep 26, 2022 · 1 comment
Labels
question Further information is requested

Comments

@rwhetten

The latest version was able to index a 22.5 Gb genome (1.75 million scaffolds) in 32 min using 16 cores and 99 GB RAM, and to align a file of 51,751 proteins to the index in 31 min using 16 cores and 42 GB RAM. Thanks to @lh3 for the quick fixes! The output GFF file reports multiple alignment positions for many proteins, which is expected given the abundance of pseudogenes in this assembly. However, the distribution of the number of alignment positions per protein appears to be truncated at 51: there are 2513 proteins with 51 reported alignment positions, and none with more. Is this the expected behavior? In this assembly, it would not be unreasonable to see hundreds of alignment positions for some proteins.

@lh3
Owner

lh3 commented Sep 26, 2022

Glad to know miniprot works on your 22 Gb fragmented assembly in reasonable time. Thanks for testing!

If you want to see more alignments, increase both -N and --outn to something like:

miniprot -N 1000 --outn=1000

-N controls how many hits miniprot evaluates internally; increasing it makes miniprot run slower. --outn controls how many hits are written to the output and doesn't affect performance much.
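
To check whether a rerun with larger -N and --outn actually lifts the cap at 51, the number of reported hits per protein can be tallied from the GFF output. The following is a minimal sketch, not part of miniprot. It assumes each alignment appears as one tab-separated `mRNA` record whose ninth column carries a `Target=<protein> <start> <end>` attribute; the function names and the file name `aln.gff` are made up for illustration.

```python
from collections import Counter


def hits_per_protein(gff_lines):
    """Count mRNA records per protein.

    Assumption: each alignment is one 'mRNA' feature whose attribute
    column contains 'Target=<protein> <start> <end>'.
    """
    counts = Counter()
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "mRNA":
            continue  # only count one record per alignment
        for attr in fields[8].split(";"):
            if attr.startswith("Target="):
                protein = attr[len("Target="):].split()[0]
                counts[protein] += 1
                break
    return counts


def hit_count_histogram(counts):
    """Map 'number of hits' to 'number of proteins with that many hits'."""
    return Counter(counts.values())


def print_histogram(path):
    """Print the hit-count distribution for a GFF file (hypothetical path)."""
    with open(path) as fh:
        hist = hit_count_histogram(hits_per_protein(fh))
    for n_hits in sorted(hist):
        print(n_hits, hist[n_hits], sep="\t")
```

Calling `print_histogram("aln.gff")` on the output of both runs should show whether the pile-up of proteins at exactly 51 positions disappears once the limits are raised.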

lh3 closed this as completed Sep 26, 2022
lh3 added the question (Further information is requested) label Sep 26, 2022
lh3 added a commit that referenced this issue Sep 27, 2022
It doesn't hurt to have a large default value. Related to #13.