The latest version indexed a 22.5 Gb genome (1.75 million scaffolds) in 32 min using 16 cores and 99 GB of RAM, and aligned a file of 51,751 proteins to the index in 31 min using 16 cores and 42 GB of RAM. Thanks to @lh3 for the quick fixes! The output GFF file reports multiple alignment positions for many proteins, which is expected given the abundance of pseudogenes in this assembly. However, the distribution of the number of alignment positions per protein appears to be truncated at 51: there are 2,513 proteins with exactly 51 reported alignment positions, and no proteins with more than that. Is this the expected behavior? In this assembly, it would not be unreasonable to see hundreds of alignment positions for some proteins.
Glad to know miniprot works on your 22 Gb fragmented assembly in reasonable time. Thanks for testing!
If you want to see more alignments, increase both -N and --outn to something like:
miniprot -N 1000 --outn=1000
-N controls how many hits miniprot evaluates internally; increasing it will make miniprot run slower. --outn controls how many hits are written to the output and doesn't affect performance much.
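To check whether the cap is gone after re-running with larger values, the per-protein hit counts can be tallied from the GFF output. Below is a minimal sketch, not part of miniprot. It assumes each alignment appears as one `mRNA` feature line whose `Target` attribute starts with the protein name; the function names and the script name `count_hits.py` are made up for illustration.

```python
import sys
from collections import Counter


def count_alignments_per_protein(gff_lines):
    """Count mRNA records per protein, keyed by the first token of Target.

    Assumption: one mRNA line per reported alignment, with a GFF3-style
    attribute column containing 'Target=<protein> <start> <end>'.
    """
    counts = Counter()
    for line in gff_lines:
        # Skip comments, directives, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "mRNA":
            continue
        # Parse 'key=value' pairs from the attribute column.
        attrs = dict(kv.split("=", 1) for kv in fields[8].split(";") if "=" in kv)
        target = attrs.get("Target")
        if target:
            counts[target.split()[0]] += 1
    return counts


def hits_histogram(counts):
    """Map 'number of alignment positions' -> 'number of proteins'."""
    return Counter(counts.values())


if __name__ == "__main__":
    # Usage (hypothetical file name): python count_hits.py out.gff
    with open(sys.argv[1]) as fh:
        hist = hits_histogram(count_alignments_per_protein(fh))
    for n_hits in sorted(hist):
        print(f"{n_hits}\t{hist[n_hits]}")
```

If the largest value in the first column still equals the --outn setting and many proteins sit at that value, the output is probably still being truncated and the limits may need to go higher.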