CallMolecularConsensusReads truncate reads #759
@ruolin I think this is a bug. I'll update shortly.
Buggy output, where the reads are shorter by exactly the number of bases inserted.
@ruolin we do some upfront quality trimming, but more importantly we trim based on the insert size, so that we don't include bases from one read of a pair that map past the end of its mate. But this calculation (I think) is wrong based on your example: it causes the read to be truncated by exactly the length of the insertion in your case, and more generally by the difference between mapped read bases and mapped reference bases. I have tested your test case on the fix. Thank you for the very clear report and test case. I am hoping we can include the fix soon.
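To illustrate the arithmetic described above, here is a minimal Python sketch. It is not fgbio's code: the function names, the `min`-based trim, and the CIGAR string are all made up for illustration. It only shows why a length computed from reference bases comes up short by the insertion length when applied to read bases.

```python
import re

# Which CIGAR operators consume aligned read bases and reference bases (per the SAM spec).
# Soft clips (S) consume read bases too, but are not aligned, so they are left out here.
ALIGNED_READ_OPS = set("MI=X")
REFERENCE_OPS = set("MDN=X")

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_lengths(cigar):
    """Return (mapped_read_bases, mapped_reference_bases) for a CIGAR string."""
    read_len = 0
    ref_len = 0
    for count, op in CIGAR_RE.findall(cigar):
        n = int(count)
        if op in ALIGNED_READ_OPS:
            read_len += n
        if op in REFERENCE_OPS:
            ref_len += n
    return read_len, ref_len


def naive_trimmed_length(cigar):
    """Hypothetical buggy trim: caps the number of read bases at the reference span."""
    read_len, ref_len = cigar_lengths(cigar)
    return min(read_len, ref_len)


if __name__ == "__main__":
    cigar = "100M5I45M"  # made-up alignment with a 5-base insertion
    read_len, ref_len = cigar_lengths(cigar)
    lost = read_len - naive_trimmed_length(cigar)
    print(f"read bases={read_len}, reference bases={ref_len}, bases lost={lost}")
```

For the made-up CIGAR `100M5I45M`, the read has 150 mapped bases over a 145-base reference span, so the hypothetical trim drops 5 bases, which is exactly the insertion length.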
@nh13 Hi, thanks for the quick update. The quality trimming and the past-the-mate-end trimming might explain some of the other truncations I have seen. But there is currently no option to turn the trimming off in the
@ruolin For quality trimming and mate-end trimming, it's always possible to add a command line option. It makes more sense for the former, but for the latter, I am not sure it makes a lot of sense to skip trimming when a read extends past the start of its mate. Rather than turning off the trimming, could you try running the fix to see if that works?
Sounds good. Thanks.
Hi fgbio community,
I have a single read-pair family (cD=1) whose reads were truncated after consensus calling. This is unexpected, because with only one read pair in the family the tool should do nothing and output the same reads. To reproduce this, run
java -jar fgbio-1.5.0.jar CallMolecularConsensusReads -i a.bam -o a.ccs.bam -M 1
The following is the a.bam.