New: added Barcode qualities keeping in FastqToBam #932

Closed
wants to merge 1 commit into from
3 changes: 3 additions & 0 deletions src/main/scala/com/fulcrumgenomics/fastq/FastqToBam.scala
@@ -92,6 +92,7 @@ class FastqToBam
@arg(flag='s', doc="If true, queryname sort the BAM file, otherwise preserve input order.") val sort: Boolean = false,
@arg(flag='u', doc="Tag in which to store molecular barcodes/UMIs.") val umiTag: String = ConsensusTags.UmiBases,
@arg(flag='q', doc="Tag in which to store molecular barcode/UMI qualities.") val umiQualTag: Option[String] = None,
@arg(flag='b', doc="Tag in which to store sample barcode qualities.") val smpQualTag: Option[String] = None,
Member

Since we have a short flag, it's fine to give the option a fully qualified long name.

Suggested change
- @arg(flag='b', doc="Tag in which to store sample barcode qualities.") val smpQualTag: Option[String] = None,
+ @arg(flag='b', doc="Tag in which to store sample barcode qualities.") val sampleBarcodeQualTag: Option[String] = None,

Member

What do you think about just making this a boolean value, and then hard-coding QT below (like we do for BC), since that's the recommended tag from the SAM spec?
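
To make the boolean-flag idea concrete, here is a minimal sketch, not the tool's actual code. The object `QualTagSketch`, the method `setBarcodeTags`, the parameter `storeSampleBarcodeQuals`, and the plain mutable map standing in for a SAM record are all hypothetical names chosen for illustration; only the `BC` and `QT` tag names come from the discussion above.

```scala
import scala.collection.mutable

object QualTagSketch {
  /** Writes the sample barcode to BC and, only when the boolean option is on,
    * the barcode qualities to the hard-coded QT tag.
    * `tags` is a hypothetical stand-in for the record's tag storage.
    */
  def setBarcodeTags(tags: mutable.Map[String, String],
                     sampleBarcode: String,
                     sampleBarcodeQuals: String,
                     storeSampleBarcodeQuals: Boolean): Unit = {
    // BC is always written when a barcode was extracted
    if (sampleBarcode.nonEmpty) tags("BC") = sampleBarcode
    // QT is written only on request, and only when there is something to store
    if (storeSampleBarcodeQuals && sampleBarcodeQuals.nonEmpty) tags("QT") = sampleBarcodeQuals
  }
}
```

With this shape the user only decides whether qualities are kept; the tag name itself is no longer configurable.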

@arg(flag='n', doc="Extract UMI(s) from read names and prepend to UMIs from reads.") val extractUmisFromReadNames: Boolean = false,
@arg( doc="Read group ID to use in the file header.") val readGroupId: String = "A",
@arg( doc="The name of the sequenced sample.") val sample: String,
@@ -157,6 +158,7 @@ class FastqToBam
try {
val subs = fqs.iterator.zip(structures.iterator).flatMap { case(fq, rs) => rs.extract(fq.bases, fq.quals) }.toIndexedSeq
val sampleBarcode = subs.iterator.filter(_.kind == SampleBarcode).map(_.bases).mkString("-")
val smpQual = subs.iterator.filter(_.kind == SampleBarcode).map(_.quals).mkString(" ")
Member

  1. Updated the name as per above.
  2. Assuming you'll use QT as per the SAM spec, we want it hyphen-delimited:

The recommended implementation concatenates all the barcodes and places a hyphen (‘-’) between the barcodes from the same template.

Suggested change
- val smpQual = subs.iterator.filter(_.kind == SampleBarcode).map(_.quals).mkString(" ")
+ val sampleBarcodeQualTag = subs.iterator.filter(_.kind == SampleBarcode).map(_.quals).mkString("-")
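
The filter-map-join pattern under discussion can be tried in isolation with the sketch below. `BarcodeJoinSketch`, `Segment`, `joinBases`, and `joinQuals` are hypothetical stand-ins, not fgbio types; the segment kinds mirror the names used in the diff. The quality delimiter is left as a parameter because the thread is still settling on it, and because `-` is itself a valid Phred+33 quality character, the SAM tags specification's wording for QT is worth checking before fixing it.

```scala
object BarcodeJoinSketch {
  // Hypothetical segment kinds mirroring the names in the diff
  sealed trait Kind
  case object SampleBarcode extends Kind
  case object MolecularBarcode extends Kind
  case object Template extends Kind

  // Hypothetical stand-in for one extracted sub-read
  final case class Segment(kind: Kind, bases: String, quals: String)

  /** Concatenates the bases of all segments of the given kind. */
  def joinBases(segs: Seq[Segment], kind: Kind, delim: String = "-"): String =
    segs.iterator.filter(_.kind == kind).map(_.bases).mkString(delim)

  /** Concatenates the quality strings of all segments of the given kind. */
  def joinQuals(segs: Seq[Segment], kind: Kind, delim: String): String =
    segs.iterator.filter(_.kind == kind).map(_.quals).mkString(delim)
}
```

Whichever delimiter is chosen, joining bases and qualities with same-length delimiters keeps the two tag values the same length.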

val umi = subs.iterator.filter(_.kind == MolecularBarcode).map(_.bases).mkString("-")
val umiQual = subs.iterator.filter(_.kind == MolecularBarcode).map(_.quals).mkString(" ")
val templates = subs.iterator.filter(_.kind == Template).toList
@@ -181,6 +183,7 @@ class FastqToBam
}

if (sampleBarcode.nonEmpty) rec("BC") = sampleBarcode
if (smpQual.nonEmpty) smpQualTag.foreach(rec(_) = smpQual)
Member

I'd hard-code this to QT, as per above.


// Set the UMI on the read depending on whether we got UMIs from the read names, reads or both
(umi, umiFromReadName) match {