Enhancing somatic variant calling and execution speed #22
Comments
Hi longphase team, thanks for asking. We discussed this in more detail via email. ClairS is ready to use additional HP tags beyond the current HP1 and HP2; in fact, there is no limit to the number of HP categories ClairS can accept. For parallelization, supporting range processing sounds good. ClairS will most likely use it on a per-chromosome basis.
We have released version 1.7. The complete list of haplotag parameters
@sloth-eat-pudding ZX
In issue #18 (comment), it was mentioned that a "longer phaseset and an improved haplotagging ratio" were needed. I therefore tried incorporating indels into phasing and haplotagging.
Explanation of data sources
Would you be interested in trying to incorporate indels as well?
Yes, doing that in the next version.
Hello,
I am working with HCC1395 data, analyzing tumor samples at 75x coverage and normal samples at 45x coverage. I ran Clair3 on normal.bam to generate normal.vcf, then used that file to phase and haplotag tumor.bam before running a somatic mutation caller. The results showed a notable decrease in false positives.
In one instance where false positives became true negatives, the mutations were heterozygous in the normal sample but homozygous in the tumor sample. This suggests a loss of heterozygosity (LOH) event, in which case phasing and tagging most reads into the same haplotype seems correct. Have you considered this method?
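The heterozygous-in-normal, homozygous-in-tumor pattern can be illustrated with a small allele-fraction check. This is only a sketch: the function names and the thresholds (0.3 to 0.7 for a heterozygous site, within 0.1 of 0 or 1 for a homozygous one) are assumptions chosen for illustration, not values used by longphase, Clair3, or ClairS.

```python
def allele_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads supporting the alternate allele (0.0 if no coverage)."""
    return alt_reads / total_reads if total_reads > 0 else 0.0


def looks_like_loh(normal_af: float, tumor_af: float,
                   het_range=(0.3, 0.7), hom_margin=0.1) -> bool:
    """Return True when a site is heterozygous in normal but homozygous in tumor.

    The thresholds are illustrative assumptions, not values from any caller.
    """
    normal_is_het = het_range[0] <= normal_af <= het_range[1]
    tumor_is_hom = tumor_af <= hom_margin or tumor_af >= 1.0 - hom_margin
    return normal_is_het and tumor_is_hom
```

For example, a site with 22 of 45 alternate reads in the normal sample and 73 of 75 in the tumor sample would be flagged, while a site that stays near 0.5 in both would not.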
Moreover, I noted in the literature that the primary reason for choosing Longphase for phasing is its speed. We still have a speed advantage in haplotagging. ClairS parallelizes at the chromosome level, and we can introduce a feature to specify a range. Could this reduce the training costs for you? I also ran a haplotag test, and the results do not seem to show any significant differences.
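To make the range idea concrete, here is a rough sketch of how a caller might split work per chromosome and dispatch one job per range. Everything in it is hypothetical: the `chrom:start-end` region format is an assumption, and `fake_haplotag` is a stand-in that only builds an output name instead of invoking any real tool.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def chromosome_regions(contig_lengths: Dict[str, int]) -> List[str]:
    """Build one 1-based, inclusive region string per contig, e.g. 'chr1:1-1000'."""
    return [f"{name}:1-{length}" for name, length in contig_lengths.items()]


def run_per_region(regions: List[str], worker: Callable[[str], str],
                   max_workers: int = 4) -> List[str]:
    """Apply `worker` to every region in parallel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, regions))


def fake_haplotag(region: str) -> str:
    # Placeholder: a real worker would launch the haplotagging step for `region`.
    return f"tagged_{region.replace(':', '_')}.bam"
```

A call such as `run_per_region(chromosome_regions({"chr1": 1000, "chr2": 500}), fake_haplotag)` processes each chromosome as its own range, which is the per-chromosome usage pattern described in this thread.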