Question HipSTR + reference genome hg19 + ancientDNA #86

Open
arenvale opened this issue May 12, 2022 · 0 comments

Hi Thomas!
Thanks for the tool, I think it will be very useful to me. I'm fairly new to this, so my questions may be basic. I searched the existing issues for this question but didn't find it.
I am trying to run HipSTR to genotype the CODIS STRs in human whole genomes, but I am running into problems.
I downloaded the hg19 reference genome. As far as I understand, the FASTA contains the chromosome sequences plus additional sequences for each chromosome (patches, alternate locus groups, unlocalized genomic contigs). When I first ran HipSTR with this reference FASTA I got an error because the chromosome names in the .bed file did not match those in the FASTA, so I made them consistent. HipSTR now runs, but it does not recover any STR on any chromosome:

./HipSTR --bams CO001.bam --fasta hg19_refgenome.fa --regions str_codis-chrY_hg19.bed --str-vcf str_calls.vcf.gz --bam-samps CO001 --bam-libs CO001
Detected 1 BAM/CRAM files
User-specified read groups for 1 unique samples
Reading region file str_codis-chrY_hg19.bed
Region file contains 30 regions

Processing region 11 2192317 2192345
0 reads overlapped region, of which 
	0 were hard clipped
	0 had an 'N' base call
	0 had low base quality scores
	0 did not have a unique mapping
	0 did not have a mate pair
	0 PASSED ALL FILTERS
Found 0 fully paired reads and 0 unpaired reads for downstream analyses
Removed 0 sets of PCR duplicate reads
Phased SNPs add info for 0 out of 0 reads and 0 out of 0 samples
Skipping locus with too few reads: TOTAL=0, MIN=100

I don't know whether I am using the wrong reference FASTA or whether the problem lies elsewhere. I hope I have explained this clearly.
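
For reference, a minimal sketch of how the chromosome-name check described above could be done with the Python standard library. The function names and the toy FASTA/BED contents are hypothetical, made up for illustration; they are not part of HipSTR and are not taken from the files above. It only compares the first word of each FASTA header with the first column of the BED file and lists BED names that have no match.

```python
import io


def fasta_contig_names(handle):
    """Return the set of sequence names (first word after '>') in a FASTA stream."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def bed_contig_names(handle):
    """Return the set of names in the first column of a BED stream."""
    names = set()
    for line in handle:
        line = line.strip()
        # Skip blank lines, comments, and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names


def missing_contigs(bed_names, fasta_names):
    """Return the sorted BED names that do not appear in the FASTA."""
    return sorted(bed_names - fasta_names)


if __name__ == "__main__":
    # Toy inputs (assumed, for illustration only): the FASTA uses "chr11",
    # the BED uses "11", so the BED name is reported as missing.
    fasta = io.StringIO(">chr11 toy description\nACGT\n>chrY\nACGT\n")
    bed = io.StringIO("11\t2192317\t2192345\texample_locus\n")
    print(missing_contigs(bed_contig_names(bed), fasta_contig_names(fasta)))
```

With real files, the `io.StringIO` objects would be replaced by `open(...)` handles. The same kind of comparison could also be made against the `@SQ` sequence names in the BAM header, which are not shown in this post.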
I would also like to ask another question: do you know of any restrictions on using this tool with ancient whole genomes?
Thank you very much for your help!
Valeria
