This repository has been archived by the owner on Oct 28, 2022. It is now read-only.

Cancer virus and human decoy search use cases not supported #429

Open
diekhans opened this issue Sep 30, 2015 · 1 comment

Comments

@diekhans
Contributor

One cancer use case is mapping reads to both human and viral reference genomes simultaneously to detect the presence of viral DNA or RNA in samples.

A similar use case is including decoy sequences of known human genomic DNA that have not been incorporated into the reference genome. This helps prevent incorrect mapping of reads to homologous DNA that is in the reference sequence.

The current approach to both of these issues with file-based references is to create a combined reference file. For instance, TCGA creates composite reference genomes that consist of GRCh37-lite and a number of viral genomes, for example GRCh37-lite_WUGSC_variant_2:

https://browser.cghub.ucsc.edu/help/assemblies/#GRCh37-lite_WUGSC_variant_2

This results in a proliferation of composite references that have no standard method of identification.

A cleaner and more robust approach would be to support multiple mapping targets in the API. Instead of a single ReferenceSet as the mapping target, a list of ReferenceSets could address this issue.

@david4096
Member

david4096 commented Feb 23, 2017

If we go with this approach then we have to enforce that reference IDs are used when constructing position requests, because we would no longer have uniqueness guarantees for the reference names used. #616

An alternative would be to document that when multiple references are used, the reference names must be unique across the superset.
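To make the uniqueness concern concrete, here is a minimal sketch of a check for reference names shared between several reference sets. It does not use any API from this repository: the function name, the plain-dict data layout, and the IDs and names in the example are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def find_ambiguous_reference_names(reference_sets):
    """Return {reference_name: [reference_set_id, ...]} for shared names.

    `reference_sets` is a hypothetical mapping from a reference set ID to
    the reference names it contains; it is not a structure from the API.
    """
    owners = defaultdict(list)
    for set_id, names in reference_sets.items():
        # Deduplicate within a set so only cross-set collisions are reported.
        for name in set(names):
            owners[name].append(set_id)
    return {name: ids for name, ids in owners.items() if len(ids) > 1}


# Made-up mapping targets: a human base assembly plus a viral panel that
# happens to reuse the name "MT".
targets = {
    "human-base": ["1", "2", "MT"],
    "viral-panel": ["HPV16", "MT"],
}
ambiguous = find_ambiguous_reference_names(targets)
```

Any name reported here could not be resolved from a position request that carries only a reference name, which is why such a request would need either a reference ID or a documented guarantee that names do not collide.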

To immediately support this use case you might prepare a reference set that contains both the viral sequences and the base assembly. Although that link is now dead, I suspect that is what folks do in practice: make a FASTA with everything they want to align to and go.
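As a rough illustration of that workaround, the sketch below concatenates several FASTA files into one combined reference and refuses to continue if a sequence name appears twice. The function name and the file names in the usage comment are hypothetical, and it assumes the sequence name is the first whitespace-delimited token of each header line.

```python
def merge_fasta(input_paths, output_path):
    """Concatenate FASTA files into one file, rejecting duplicate names.

    Returns the sequence names in the order they were written.
    """
    seen = {}  # sequence name -> path of the file it first appeared in
    with open(output_path, "w") as out:
        for path in input_paths:
            with open(path) as handle:
                for line in handle:
                    if line.startswith(">"):
                        fields = line[1:].split()
                        name = fields[0] if fields else ""
                        if name in seen:
                            raise ValueError(
                                "duplicate sequence name %r in %s (first seen in %s)"
                                % (name, path, seen[name])
                            )
                        seen[name] = path
                    # Make sure a file without a trailing newline does not
                    # run into the next file's header line.
                    out.write(line if line.endswith("\n") else line + "\n")
    return list(seen)


# Hypothetical usage:
# merge_fasta(["base_assembly.fa", "viral_sequences.fa"], "combined.fa")
```

The combined file still has no standard identifier of its own, so this only covers the immediate need, not the identification problem raised above.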
