Adapt G2P to de novo assembly? #481

Closed
donkirkby opened this issue Oct 9, 2019 · 1 comment

Comments

@donkirkby
Member

One of the items left over from the de novo conversion in #442 is what to do with the G2P analysis.

The current de novo branch does the following with reads in the V3LOOP region:

  1. Merge all read pairs, and look for insert lengths that are much more common than others. These are probably amplicons.
  2. Build a consensus for each amplicon length from all the merged pairs of that length.
  3. Try to align all read pairs to the V3LOOP reference. If they match, trim them and do the G2P analysis.
  4. Any reads that don't align to V3LOOP go through the de novo assembly process, then we try to map them to the amplicon consensus sequences and any assembled contigs.
  5. Finally, we combine all the reads that aligned to V3LOOP with any that mapped.
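
To make steps 1 and 2 concrete, here is a minimal sketch and not the actual pipeline code. The function names and the `min_fraction` threshold are made up for illustration; the real branch may decide "much more common" differently. It assumes merged reads are plain strings, and that reads of the same length can be compared column by column without alignment.

```python
from collections import Counter, defaultdict


def find_amplicon_lengths(merged_reads, min_fraction=0.1):
    """Return merged-read lengths that account for at least min_fraction
    of all merged reads (hypothetical threshold, not from the pipeline)."""
    length_counts = Counter(len(read) for read in merged_reads)
    total = sum(length_counts.values())
    return sorted(length
                  for length, count in length_counts.items()
                  if count >= min_fraction * total)


def build_consensus(reads):
    """Majority-vote consensus of equal-length reads, one column at a time."""
    return ''.join(Counter(column).most_common(1)[0][0]
                   for column in zip(*reads))


def amplicon_consensuses(merged_reads, min_fraction=0.1):
    """Map each common insert length to the consensus of its merged reads."""
    by_length = defaultdict(list)
    for read in merged_reads:
        by_length[len(read)].append(read)
    return {length: build_consensus(by_length[length])
            for length in find_amplicon_lengths(merged_reads, min_fraction)}
```

Calling `amplicon_consensuses(merged_reads)` gives one consensus per likely amplicon length, which is what step 4 would map the unaligned reads against.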

If we just build the amplicon consensus, embed it in an HIV reference, and use that as the reference for mapping all the reads, could we drop steps 3 and 5? The G2P analysis could then run on any reads that map to the V3LOOP region, instead of aligning each read pair to the V3LOOP reference. Would that cause problems with insertions, or with a mixture of amplicon lengths? Can we report V3 overlap? (That's the number of reads that mapped to a V3LOOP position without being usable for G2P analysis.)
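
A rough sketch of how the V3 overlap count could be derived from mapped reads. This is only an illustration: `classify_v3_reads` is a hypothetical helper, and it assumes a read is "usable" for G2P only when it spans the whole V3LOOP region. The real criteria may be stricter, and this ignores insertions entirely, which is one of the open questions above.

```python
def classify_v3_reads(alignments, v3_start, v3_end):
    """Count mapped reads that touch the V3LOOP region.

    alignments: iterable of (ref_start, ref_end) half-open intervals in
        reference coordinates (assumed format, for illustration only).
    v3_start, v3_end: half-open interval of the V3LOOP region.
    Returns (usable_count, overlap_count), where overlap_count is the
    number of reads that touch V3LOOP without spanning all of it.
    """
    usable = overlap = 0
    for ref_start, ref_end in alignments:
        if ref_end <= v3_start or ref_start >= v3_end:
            continue  # read does not touch the region at all
        if ref_start <= v3_start and ref_end >= v3_end:
            usable += 1  # spans the whole region
        else:
            overlap += 1  # partial coverage only
    return usable, overlap
```

The second number in the returned pair is what would be reported as V3 overlap under these assumptions.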

@donkirkby
Member Author

As part of #486, we ended up reporting the V3LOOP coverage and counts based on the G2P aligned reads, and the GP120 coverage and counts based on the remapped or assembled reads. This was done in commit 0b02ccb.
