You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I've been testing out the software, and I noticed a discrepancy in HLA-HFE between using a BAM and FASTQ input. Where the FASTQ input would report high abundance and quality, the BAM input would report nothing. I looked through all the read assignments from the FASTQ results, and each one maps to the HFE gene on chr6: https://useast.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000010704;r=6:26087226-26098343.
The reason for this lack of mapping is that the GENCODE and Ensembl GTFs just refer to this gene as HFE. AddGeneCoord.pl was looks for exactly HLA-HFE, comes up with no matches, and leaves the unmapped default.
When I manually alter the coordinates file to have the HLA-HFE mapping match the Ensembl HFE coordinates, the BAM and FASTQ runs agree on the abundance/quality.
To summarize:
When running with FASTQs, reads that were mapped to the HFE gene are assigned to HLA-HFE
Neither GENCODE nor Ensembl GTFs contain reference to a gene named HLA-HFE
Coordinate files do not map HLA-HFE to the HFE gene region
The mapped reads in the HFE gene region are not pulled during BAM extraction
BAM processing assigns no alleles/abundance/quality for HLA-HFE
I don't think this is an issue for any other HLA contig. The only others that remain unmapped are HLA-DRB3, HLA-DRB4, and HLA-Y. As far as I can tell none of these has a corresponding mapping in the genome.
The text was updated successfully, but these errors were encountered:
Thank you very much for finding this problem. I'll check whether there are annotations containing HLA-H/F/E, or create our own list of gene coordinates and use that as the input for AddGeneCoord.pl.
Thank you for finding this issue. I just added an option "--gtf-gene-name-mapping" to the AddGeneCoord.pl script. Its default value is "HFE:HLA-HFE" and we can use comma-split string to represent other gene name mappings. This will internally map the gene name in the GTF to the name specified by the user. Hope this can help resolve this issue.
I've been testing out the software, and I noticed a discrepancy in HLA-HFE between using a BAM and FASTQ input. Where the FASTQ input would report high abundance and quality, the BAM input would report nothing. I looked through all the read assignments from the FASTQ results, and each one maps to the HFE gene on chr6: https://useast.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000010704;r=6:26087226-26098343.
In your manuscript HLA files (https://github.com/mourisl/T1K_manuscript_evaluation/blob/master/hlaidx_3_44_0.tar.gz) as well as ones I made following the documentation directions, HLA-HFE receives no mapping in the coordinate files:
The reason for this lack of mapping is that the GENCODE and Ensembl GTFs just refer to this gene as
HFE
. AddGeneCoord.pl was looks for exactlyHLA-HFE
, comes up with no matches, and leaves the unmapped default.When I manually alter the coordinates file to have the HLA-HFE mapping match the Ensembl HFE coordinates, the BAM and FASTQ runs agree on the abundance/quality.
To summarize:
HFE
gene are assigned toHLA-HFE
HLA-HFE
HLA-HFE
to theHFE
gene regionHFE
gene region are not pulled during BAM extractionHLA-HFE
I don't think this is an issue for any other HLA contig. The only others that remain unmapped are HLA-DRB3, HLA-DRB4, and HLA-Y. As far as I can tell none of these has a corresponding mapping in the genome.
The text was updated successfully, but these errors were encountered: