Speeding up the duplex consensus caller #493

nh13 · 2019-06-21T23:09:02Z

This can speed up the consensus caller by 1.5x and 2.15x with 2 and 4 threads, respectively, and by 1.15x with a single thread. With very high per-molecule coverage, the speedup can be 20x or more (single-threaded).

codecov-io · 2019-07-05T23:58:49Z

Codecov Report

Merging #493 into master will increase coverage by 0.01%.
The diff coverage is 91.24%.

@@            Coverage Diff            @@
##           master    #493      +/-   ##
=========================================
+ Coverage   95.59%   95.6%   +0.01%     
=========================================
  Files          92      91       -1     
  Lines        5448    5512      +64     
  Branches      702     691      -11     
=========================================
+ Hits         5208    5270      +62     
- Misses        240     242       +2

| Impacted Files | Coverage Δ | |
|---|---|---|
| .../scala/com/fulcrumgenomics/util/NumericTypes.scala | `95.94% <ø> (ø)` | ⬆️ |
| ...cala/com/fulcrumgenomics/umi/ConsensusCaller.scala | `94.44% <100%> (+0.1%)` | ⬆️ |
| ...crumgenomics/umi/CallMolecularConsensusReads.scala | `100% <100%> (ø)` | ⬆️ |
| .../scala/com/fulcrumgenomics/bam/api/SamRecord.scala | `86.55% <100%> (+1.08%)` | ⬆️ |
| ...om/fulcrumgenomics/umi/DuplexConsensusCaller.scala | `96% <100%> (+0.72%)` | ⬆️ |
| ...fulcrumgenomics/umi/CallDuplexConsensusReads.scala | `100% <100%> (ø)` | ⬆️ |
| ...fulcrumgenomics/umi/ConsensusCallingIterator.scala | `100% <100%> (+5.88%)` | ⬆️ |
| ...ulcrumgenomics/umi/VanillaUmiConsensusCaller.scala | `89.77% <12.5%> (-5.53%)` | ⬇️ |
| ...a/com/fulcrumgenomics/umi/UmiConsensusCaller.scala | `93.93% <88.23%> (-0.47%)` | ⬇️ |
| ...om/fulcrumgenomics/umi/SimpleConsensusCaller.scala | `96% <88.88%> (+0.76%)` | ⬆️ |
... and 1 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b4f2d06...da69e3c.

tfenne

Lots of comments. In general I'm on board, but there are a couple of places where the optimized implementation is much less readable than the original. In those places I'd really like to either a) revert the implementation, b) find a more readable optimized version, or c) be convinced that the trade-off is worth it.

src/main/scala/com/fulcrumgenomics/umi/CallDuplexConsensusReads.scala

src/main/scala/com/fulcrumgenomics/umi/ConsensusCaller.scala

src/main/scala/com/fulcrumgenomics/umi/ConsensusCallingIterator.scala

src/main/scala/com/fulcrumgenomics/umi/UmiConsensusCaller.scala

tfenne · 2019-07-15T18:55:19Z

src/main/scala/com/fulcrumgenomics/umi/UmiConsensusCaller.scala

@@ -313,36 +320,33 @@ trait UmiConsensusCaller[C <: SimpleRead] {
    * NOTE: filtered out reads are sent to the [[rejectRecords]] method and do not need further handling
    */
  protected[umi] def filterToMostCommonAlignment(recs: Seq[SourceRead]): Seq[SourceRead] = {


I find this new implementation impossible to follow. I've literally been staring at it side-by-side in IntelliJ with the old implementation for ~10 minutes and I still struggle to follow it. Two thoughts:

Unless this makes a very significant difference, I would just revert to the old method, which I think was easier to follow.

If it does make a significant difference, then I think we need to find a clearer implementation. I wonder if something like the following would work:

    case class AlignmentGroup(cigar: Cigar, reads: mutable.Buffer[SourceRead])

    protected[umi] def filterToMostCommonAlignment(recs: Seq[SourceRead]): Seq[SourceRead] = {
      val groups = new ArrayBuffer[AlignmentGroup]
      recs.sortBy(r => -r.length).foreach { rec =>
        val simpleCigar = simplifyCigar(rec.cigar)
        var found = false
        groups.foreach { g => if (simpleCigar.isPrefixOf(g.cigar)) { g.reads += rec; found = true } }
        if (!found) {
          val newGroup = AlignmentGroup(simpleCigar, new ArrayBuffer[SourceRead](recs.size))
          newGroup.reads += rec
          groups += newGroup
        }
      }

      if (groups.isEmpty) {
        Seq.empty
      }
      else {
        val sorted  = groups.sortBy(g => -g.reads.size)
        val keepers = sorted.head.reads
        val rejects = recs.filter(r => !keepers.contains(r))
        rejectRecords(rejects.flatMap(_.sam), FilterMinorityAlignment)
        keepers
      }
    }

The problem with both the previous implementation and yours is that the keepers.contains method can be really, really slow when many raw reads share the same cigar (think 20 cigar groups, each with 1000 reads). This is the point of optimizing, and it really does make a difference. I'll try to clean it up.

I see. I'm curious if you tried either a) replacing the recs.filter with a recs.diff(keepers) or b) creating a val keepSet = keepers.toSet and then calling contains on that? You'd still pay the cost of the conversion to a set, but then the lookup time would be constant.
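For readers following the thread, option b) above can be sketched as follows. This is only an illustration under stated assumptions, not the code that was merged: `Read` is a hypothetical stand-in for fgbio's `SourceRead`, and `partitionByKeepers` is a made-up helper name. It assumes the read type has sensible `equals`/`hashCode` (a case class does).

```scala
object KeepSetSketch {
  // Hypothetical stand-in for SourceRead: just an id and a cigar string.
  final case class Read(id: Int, cigar: String)

  /** Splits `recs` into (kept, rejected) using a hash set for membership checks,
    * instead of calling `contains` on a Seq for every read.
    */
  def partitionByKeepers(recs: Seq[Read], keepers: Seq[Read]): (Seq[Read], Seq[Read]) = {
    val keepSet = keepers.toSet        // one pass over the keepers to build the set
    recs.partition(keepSet.contains)   // each lookup is effectively constant time
  }
}
```

With 20 cigar groups of 1000 reads each, the Seq-based `contains` scans up to 1000 keepers for each of 20,000 reads, whereas the set-based version does a single hash lookup per read. Option a), `recs.diff(keepers)`, is the standard-library multiset difference and yields only the rejects.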

src/main/scala/com/fulcrumgenomics/umi/VanillaUmiConsensusCaller.scala

nh13

@tfenne I fixed up everything that I agreed with, and left some of your comments open so we can discuss them offline, as I'd like more input. All but one are good ideas.

src/main/scala/com/fulcrumgenomics/umi/ConsensusCaller.scala

src/main/scala/com/fulcrumgenomics/umi/ConsensusCallingIterator.scala

nh13 · 2019-07-16T18:55:01Z

src/main/scala/com/fulcrumgenomics/umi/ConsensusCallingIterator.scala

+    }
+    else {
+      val pool          = new ForkJoinPool(threads, ForkJoinPool.defaultForkJoinWorkerThreadFactory, null, true)
+      val bufferedIter  = groupingIterator.bufferBetter


I actually tried this, but it slowed things down for some reason! You can see evidence of that via the unused import on line 29. I'd certainly welcome you helping figure out why.
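For context, the hunk above builds a dedicated `ForkJoinPool` and a buffered view of the grouping iterator. A minimal sketch of that general pattern (pull a chunk of items, process the chunk on the pool, emit results in input order) might look like the following. It is not the implementation in this PR: `parMap` and `chunkSize` are hypothetical names, and the sketch uses only `java.util.concurrent` rather than fgbio or commons utilities.

```scala
import java.util.concurrent.{Callable, ForkJoinPool}

object ParallelChunksSketch {
  /** Applies `f` to every item of `iter`, `chunkSize` items at a time, on `threads` workers.
    * Results come back in the same order as the input.
    */
  def parMap[A, B](iter: Iterator[A], threads: Int, chunkSize: Int)(f: A => B): Iterator[B] = {
    require(threads > 0 && chunkSize > 0, "threads and chunkSize must be positive")
    // Same constructor arguments as in the hunk: default factory, no handler, asyncMode = true.
    val pool = new ForkJoinPool(threads, ForkJoinPool.defaultForkJoinWorkerThreadFactory, null, true)

    new Iterator[B] {
      private val results: Iterator[B] = iter.grouped(chunkSize).flatMap { chunk =>
        // Submit the whole chunk first, then wait for each result in submission order.
        val tasks = chunk.toVector.map(a => pool.submit(new Callable[B] { def call(): B = f(a) }))
        tasks.map(_.get())
      }

      def hasNext: Boolean = {
        val more = results.hasNext
        if (!more) pool.shutdown()   // release the workers once the input is exhausted
        more
      }

      def next(): B = results.next()
    }
  }
}
```

Chunking bounds how many items are held in memory at once, at the cost of idle workers while the slowest item in each chunk finishes.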

src/main/scala/com/fulcrumgenomics/umi/DuplexConsensusCaller.scala

src/main/scala/com/fulcrumgenomics/umi/VanillaUmiConsensusCaller.scala

nh13 · 2019-07-16T19:19:42Z

src/main/scala/com/fulcrumgenomics/umi/UmiConsensusCaller.scala

@@ -313,36 +320,33 @@ trait UmiConsensusCaller[C <: SimpleRead] {
    * NOTE: filtered out reads are sent to the [[rejectRecords]] method and do not need further handling
    */
  protected[umi] def filterToMostCommonAlignment(recs: Seq[SourceRead]): Seq[SourceRead] = {


The problem with both the previous implementation and yours is that the keepers.contains method can be really, really slow when many raw reads share the same cigar (think 20 cigar groups, each with 1000 reads). This is the point of optimizing, and it really does make a difference. I'll try to clean it up.

Not faster in my hands and we already depend on apache math

@nh13

Changes to CallDuplexConsensusReads:
- added the --threads option to support multi-threading; 4-8 threads seems like a decent trade-off.
- added the --max-reads-per-strand option, for when the per-molecule coverage is very high, thus causing the tool to run slowly.

Consensus calling API

Implemented many performance optimizations found during profiling for consensus calling. Notable examples include:
- multi-threaded support in the consensus calling iterator; non-duplex consensus callers could support this in the future.
- faster grouping of raw reads based on simplified cigars
- caching of the expensive-to-retrieve per-read molecular identifier
- caching some expensive-to-compute log-probabilities in the core consensus caller

Both @nh13 and @tfenne contributed to this commit.
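The commit message mentions caching log-probabilities but does not say which quantities are cached. As an illustration only, one common form of this optimization is a lookup table indexed by Phred score. The sketch below assumes the standard Phred definition, P(error) = 10^(-Q/10), and a maximum score of 93; the object and method names are hypothetical and are not fgbio's API.

```scala
object LogProbCacheSketch {
  /** Highest Phred score covered by the tables (assumption: 93, the usual printable maximum). */
  val MaxPhred: Int = 93
  private val Ln10: Double = math.log(10.0)

  /** ln(P(error)) for a Phred score, computed directly. */
  def lnErrorProb(phred: Int): Double = -phred * Ln10 / 10.0

  /** ln(1 - P(error)) for a Phred score, computed directly (the expensive part: pow and log1p). */
  def lnCorrectProb(phred: Int): Double = math.log1p(-math.pow(10.0, -phred / 10.0))

  // Tables are filled once when the object is first used; later lookups are array reads.
  private val lnErrorTable: Array[Double]   = Array.tabulate(MaxPhred + 1)(lnErrorProb)
  private val lnCorrectTable: Array[Double] = Array.tabulate(MaxPhred + 1)(lnCorrectProb)

  /** Cached ln(P(error)); `phred` must be in 0..MaxPhred. */
  def cachedLnError(phred: Int): Double = lnErrorTable(phred)

  /** Cached ln(1 - P(error)); `phred` must be in 0..MaxPhred. */
  def cachedLnCorrect(phred: Int): Double = lnCorrectTable(phred)
}
```

Because base qualities take only a small number of distinct integer values, the table replaces a transcendental function call per base with an array read, which matters when the same computation runs for every base of every raw read.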

nh13 · 2019-07-23T20:01:24Z

A TODO includes a reference to fulcrumgenomics/commons#51

See: fulcrumgenomics/fgbio#493

* Add new options to CallDuplexConsensusReads See: fulcrumgenomics/fgbio#493

nh13 force-pushed the nh_par_cc branch from 97061a3 to b5cc0af Compare June 21, 2019 23:10

nh13 requested a review from tfenne July 8, 2019 21:29

nh13 assigned tfenne Jul 8, 2019

nh13 marked this pull request as ready for review July 8, 2019 21:30

tfenne requested changes Jul 15, 2019

nh13 commented Jul 16, 2019

nh13 added 2 commits July 23, 2019 12:19

Deprecating fgbio Writer in favor of commons writer

a0920ca

Using apache FastMath instead of jafama

ea40d22

Not faster in my hands and we already depend on apache math

nh13 force-pushed the nh_par_cc branch 2 times, most recently from 4f3d545 to 2cf9d3f Compare July 23, 2019 19:38

nh13 force-pushed the nh_par_cc branch from 2cf9d3f to da69e3c Compare July 23, 2019 20:00

nh13 merged commit 426a258 into master Jul 24, 2019

nh13 deleted the nh_par_cc branch July 24, 2019 17:59

nh13 added a commit to fulcrumgenomics/dagr that referenced this pull request Jul 24, 2019

Add new options to CallDuplexConsensusReads

e472e50

See: fulcrumgenomics/fgbio#493

nh13 mentioned this pull request Jul 24, 2019

Add new options to CallDuplexConsensusReads fulcrumgenomics/dagr#353

Merged

nh13 added a commit to fulcrumgenomics/dagr that referenced this pull request Jul 24, 2019

Add new options to CallDuplexConsensusReads (#353)

32d96a9

* Add new options to CallDuplexConsensusReads See: fulcrumgenomics/fgbio#493

clintval mentioned this pull request Oct 22, 2023

Fix reference to transient MI tag in DuplexConsensusCaller #946

Merged
