One of the items left over from the de novo conversion in #442 is what to do with the G2P analysis.
The current de novo branch does the following with reads in the V3LOOP region:
1. Merge all read pairs, and look for insert lengths that are much more common than others. These are probably amplicons.
2. Build a consensus for each amplicon length from all the merged pairs of that length.
3. Try to align all read pairs to the V3LOOP reference. If they match, trim them and do the G2P analysis.
4. Any reads that don't align to V3LOOP go through the de novo assembly process, then we try to map them to the amplicon consensus sequences and any assembled contigs.
5. Finally, we combine all the reads that aligned to V3LOOP with any that mapped.
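
To make the first two steps concrete, here is a minimal sketch of picking out common insert lengths and building a per-length consensus. It is not the code on the de novo branch: the function names, the `min_fraction` threshold, and the plain majority-vote consensus are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def find_amplicon_lengths(merged_reads, min_fraction=0.1):
    """Return merged-read lengths seen in at least min_fraction of the reads.

    min_fraction is a hypothetical threshold standing in for
    "much more common than others".
    """
    length_counts = Counter(len(read) for read in merged_reads)
    total = sum(length_counts.values())
    return sorted(length
                  for length, count in length_counts.items()
                  if count / total >= min_fraction)


def build_consensus(reads):
    """Take the most common base at each position of equal-length reads."""
    return ''.join(Counter(column).most_common(1)[0][0]
                   for column in zip(*reads))


def build_amplicon_consensuses(merged_reads, min_fraction=0.1):
    """Map each common insert length to the consensus of its merged reads."""
    by_length = defaultdict(list)
    for read in merged_reads:
        by_length[len(read)].append(read)
    return {length: build_consensus(by_length[length])
            for length in find_amplicon_lengths(merged_reads, min_fraction)}
```

Grouping by exact length keeps the consensus step trivial, because reads of the same length can be compared column by column without alignment. That only holds if reads of one length come from the same amplicon, which is also an assumption.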
If we just build the amplicon consensus and embed that in an HIV reference, then use that as the reference to map all the reads, could we drop steps 3 and 5? The G2P analysis could be run on any reads that map to the V3LOOP region instead of aligning each read pair to the V3LOOP reference. Would that cause problems with insertions? Maybe with a mixture of amplicon lengths? Can we report V3 overlap? (That's the number of reads that mapped to a V3LOOP position without being usable for G2P analysis.)
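
One possible way to count V3 overlap under the proposed approach is sketched below. It assumes each mapped read is reduced to a half-open `(start, end)` interval in reference coordinates, and that "usable for G2P" simply means the read spans the whole V3LOOP region. Both are simplifications for illustration; the real usability check may involve more than coverage, and `classify_v3_reads` is a hypothetical name.

```python
def classify_v3_reads(mapped_reads, v3_start, v3_end):
    """Count reads that span V3LOOP and reads that only partly overlap it.

    mapped_reads: iterable of (read_start, read_end) half-open intervals.
    v3_start, v3_end: half-open interval of the V3LOOP region within the
    mapping reference (hypothetical coordinates supplied by the caller).
    Returns (spanning_count, overlap_count).
    """
    spanning = overlap = 0
    for read_start, read_end in mapped_reads:
        if read_start <= v3_start and v3_end <= read_end:
            # Covers the whole region: assumed usable for G2P.
            spanning += 1
        elif read_start < v3_end and v3_start < read_end:
            # Touches the region without covering it: counts as V3 overlap.
            overlap += 1
    return spanning, overlap
```

Reads with insertions inside V3LOOP would still span the region in reference coordinates, so this count alone would not reveal the insertion problem raised above.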
As part of #486, we ended up reporting the V3LOOP coverage and counts based on the G2P aligned reads, and the GP120 coverage and counts based on the remapped or assembled reads. This was done in commit 0b02ccb.