Replies: 3 comments
-
Hi @harrismia, sure, I can look into this. Is there any publicly available data that I could use to optimise the variant calling pipeline?
-
Hi @jodyphelan, thanks a lot! The following BioProjects have some publicly available data: PRJNA719670, PRJNA480888, PRJNA436997, PRJNA421446, PRJEB8783, PRJEB31443, PRJEB27802 and PRJNA598991.
-
Thanks, I've had a look at these data and performed a small analysis to optimise the variant calling parameters, namely the frequency at which variants are called. As PacBio is fairly noisy compared to Illumina, a minimum alternate fraction is required to minimise false positive variants. Using Illumina as the gold standard, I looked at pairs of Illumina and PacBio data to characterise the performance at several cutoffs.

[Image: ROC curve]

Looking at the ROC curve, 0.6 looks like a decent tradeoff between the false positive rate and the true positive rate. I'll add in the code to enable
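The cutoff analysis described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used for the analysis in this thread: the function name `roc_points`, the variant representation and the toy data are all hypothetical, and the false positive rate is computed against PacBio candidate variants that Illumina does not support, which is one of several possible definitions of the negative set.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical variant key: (position, reference allele, alternate allele).
Variant = Tuple[int, str, str]


def roc_points(
    pacbio_af: Dict[Variant, float],
    illumina_truth: Set[Variant],
    cutoffs: Iterable[float],
) -> List[Tuple[float, float, float]]:
    """Return (cutoff, false_positive_rate, true_positive_rate) per cutoff.

    pacbio_af maps each PacBio candidate variant to its alternate fraction.
    illumina_truth is the set of variants called from the paired Illumina data.
    """
    positives = len(illumina_truth)
    # Assumption: negatives are PacBio candidates that Illumina does not support.
    negatives = sum(1 for v in pacbio_af if v not in illumina_truth)

    points = []
    for cutoff in cutoffs:
        # Keep only PacBio candidates at or above the minimum alternate fraction.
        called = {v for v, af in pacbio_af.items() if af >= cutoff}
        tp = len(called & illumina_truth)
        fp = len(called - illumina_truth)
        tpr = tp / positives if positives else 0.0
        fpr = fp / negatives if negatives else 0.0
        points.append((cutoff, fpr, tpr))
    return points


# Made-up toy data for one Illumina/PacBio pair (not taken from the BioProjects).
illumina_truth = {(100, "A", "G"), (250, "C", "T"), (400, "G", "A")}
pacbio_af = {
    (100, "A", "G"): 0.95,
    (250, "C", "T"): 0.70,
    (400, "G", "A"): 0.40,
    (120, "T", "C"): 0.30,  # not supported by Illumina
    (310, "G", "C"): 0.55,  # not supported by Illumina
}

points = roc_points(pacbio_af, illumina_truth, [0.2, 0.4, 0.6, 0.8])
for cutoff, fpr, tpr in points:
    print(f"cutoff={cutoff:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Plotting the false positive rate against the true positive rate for each cutoff gives a ROC curve, and the chosen cutoff is the point where raising it further loses true positives faster than it removes false positives.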
-
Hi Jody,
Michael