Cancer virus and human decoy search use cases not supported #429

diekhans · 2015-09-30T16:32:32Z

One cancer use case is mapping reads to human and viral reference genomes simultaneously to detect the presence of viral DNA or RNA in samples.

A similar use case is including decoy sequences: known human genomic DNA sequences that have not been incorporated into the reference genome. This helps prevent incorrect mapping of reads to homologous DNA that is in the reference sequence.

The current approach to both of these issues with file-based references is to create a combined reference file. For instance, TCGA creates composite reference genomes that consist of GRCh37-lite and a number of viral genomes, for example GRCh37-lite_WUGSC_variant_2:

https://browser.cghub.ucsc.edu/help/assemblies/#GRCh37-lite_WUGSC_variant_2

This results in a proliferation of composite references that have no standard method of identification.

A cleaner and more robust approach would be to support multiple mapping targets in the API. Instead of a single ReferenceSet as the mapping target, a list of ReferenceSets could address this issue.
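To make the proposal concrete, here is a minimal Python sketch of mapping targets expressed as a list of reference set IDs rather than a single one. The `MappingTargets` class, its `reference_set_ids` field, and the ID strings are all hypothetical; none of them come from the current schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MappingTargets:
    """Hypothetical record: the reference sets a group of reads was aligned to."""

    # A list instead of a single ID, so the base assembly, viral genomes and
    # decoys can each stay a separately identified reference set.
    reference_set_ids: List[str] = field(default_factory=list)

    def includes(self, reference_set_id: str) -> bool:
        """Return True if the given reference set is one of the targets."""
        return reference_set_id in self.reference_set_ids


# Illustrative identifiers only; these are not real reference set IDs.
targets = MappingTargets(
    reference_set_ids=["GRCh37-lite", "viral-genomes", "human-decoys"]
)
```

With this shape, a composite such as GRCh37-lite plus viral genomes would be described by listing its parts instead of by minting a new, unstandardized composite name.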

david4096 · 2017-02-23T20:16:34Z

If we go with this approach, then we have to enforce that reference IDs are used when constructing position requests, because we no longer have uniqueness guarantees for the reference names used. #616

An alternative would be to document that when multiple reference sets are used, the reference names must be unique across the combined set.
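A rough Python sketch of what checking that uniqueness requirement could look like. The function name and the plain-dict input shape are assumptions made for illustration, not part of the API.

```python
from collections import defaultdict
from typing import Dict, List


def find_name_collisions(reference_sets: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return reference names that occur in more than one reference set.

    `reference_sets` maps a (hypothetical) reference set ID to the names of
    its references. The result maps each colliding name to the sorted IDs of
    the sets that contain it.
    """
    owners = defaultdict(set)
    for set_id, names in reference_sets.items():
        for name in names:
            owners[name].add(set_id)
    return {name: sorted(ids) for name, ids in owners.items() if len(ids) > 1}


example = {
    "human": ["1", "2", "MT"],
    "viral": ["HPV16", "MT"],  # "MT" deliberately clashes, for illustration
}
collisions = find_name_collisions(example)
```

An empty result would mean position requests by name stay unambiguous; any entry in the result is a name that could only be resolved safely by reference ID.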

To support this use case immediately, you might prepare a reference set that contains both the viral sequences and the base assembly. Although that link is now dead, I suspect that is what folks do in practice: make a FASTA with everything they want to align to and go.
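For what it's worth, the "one FASTA with everything" workaround can be sketched in a few lines of Python. This is only an illustration: `merge_fasta` is a made-up helper, the sequences are toy data, and the duplicate-name check is an added safeguard rather than something the thread prescribes.

```python
from typing import Iterable, Iterator, Set


def merge_fasta(sources: Iterable[Iterable[str]]) -> Iterator[str]:
    """Yield the lines of a combined FASTA, rejecting duplicate sequence names."""
    seen: Set[str] = set()
    for lines in sources:
        for line in lines:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines between records
            if line.startswith(">"):
                # The sequence name is the first whitespace-delimited token.
                header = line[1:].split()
                name = header[0] if header else ""
                if name in seen:
                    raise ValueError(f"duplicate sequence name: {name}")
                seen.add(name)
            yield line


# Toy records standing in for a base assembly and a viral genome.
base = [">1 human chromosome 1", "ACGT"]
viral = [">HPV16", "TTGA"]
combined = list(merge_fasta([base, viral]))
```

Each source can be any iterable of lines, so open file handles work as well as the lists used here.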

diekhans added enhancement API Consistency labels Apr 1, 2016

diekhans mentioned this issue Apr 2, 2016

dataset used inconsistently across API #595

Open

kozbo removed the API Consistency label Nov 14, 2016

kozbo mentioned this issue Feb 22, 2017

Integrate GA4GH Use Case review from EBI #822

Open
