New: added Barcode qualities keeping in FastqToBam #932
@galaxy001: this is fantastic, and you've done a really good job.
I'd change the command line option to be a boolean (default false) to opt-in to adding the sample barcode base qualities, and then hardcode the tag as QT as per the SAM spec (just like BC). Thoughts?
And then once we have this approved, I can add a unit test so don't worry about that.
@@ -92,6 +92,7 @@ class FastqToBam
 @arg(flag='s', doc="If true, queryname sort the BAM file, otherwise preserve input order.") val sort: Boolean = false,
 @arg(flag='u', doc="Tag in which to store molecular barcodes/UMIs.") val umiTag: String = ConsensusTags.UmiBases,
 @arg(flag='q', doc="Tag in which to store molecular barcode/UMI qualities.") val umiQualTag: Option[String] = None,
+@arg(flag='b', doc="Tag in which to store sample barcode qualities.") val smpQualTag: Option[String] = None,
since we have a short flag, it's not the worst to have a fully qualified long name
-@arg(flag='b', doc="Tag in which to store sample barcode qualities.") val smpQualTag: Option[String] = None,
+@arg(flag='b', doc="Tag in which to store sample barcode qualities.") val sampleBarcodeQualTag: Option[String] = None,
What do you think about just making this a boolean value, and then hard coding QT below (like we do for BC), since that's the recommended tag from the SAM spec?
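A minimal sketch of what the boolean opt-in could look like, written as a standalone function rather than against the real `FastqToBam` record type. The names `QualTagSketch`, `barcodeTags`, and `storeQuals` are hypothetical and exist only for illustration; only the tag names BC and QT come from the discussion above.

```scala
// Hypothetical stand-in for the tagging step; this is not the fgbio API.
object QualTagSketch {
  // Tag names hard-coded, as suggested in the review.
  val SampleBarcodeTag: String     = "BC"
  val SampleBarcodeQualTag: String = "QT"

  /** Returns the tags that would be set on a record.
    *
    * @param bases      the concatenated sample barcode bases (may be empty)
    * @param quals      the concatenated sample barcode qualities (may be empty)
    * @param storeQuals the opt-in boolean flag; qualities are dropped when false
    */
  def barcodeTags(bases: String, quals: String, storeQuals: Boolean): Map[String, String] = {
    val bc = if (bases.nonEmpty) Map(SampleBarcodeTag -> bases) else Map.empty[String, String]
    val qt = if (storeQuals && quals.nonEmpty) Map(SampleBarcodeQualTag -> quals) else Map.empty[String, String]
    bc ++ qt
  }
}
```

With the flag defaulting to false, existing invocations would keep producing only BC, which is the backwards-compatible behaviour an opt-in is meant to preserve.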
@@ -157,6 +158,7 @@ class FastqToBam
 try {
 val subs = fqs.iterator.zip(structures.iterator).flatMap { case(fq, rs) => rs.extract(fq.bases, fq.quals) }.toIndexedSeq
 val sampleBarcode = subs.iterator.filter(_.kind == SampleBarcode).map(_.bases).mkString("-")
+val smpQual = subs.iterator.filter(_.kind == SampleBarcode).map(_.quals).mkString(" ")
- updated the name as per above
- Assuming you'll use QT as per the SAM spec, in which case we want it hyphen delimited:
  "The recommended implementation concatenates all the barcodes and places a hyphen (‘-’) between the barcodes from the same template."
-val smpQual = subs.iterator.filter(_.kind == SampleBarcode).map(_.quals).mkString(" ")
+val sampleBarcodeQualTag = subs.iterator.filter(_.kind == SampleBarcode).map(_.quals).mkString("-")
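Since the choice of delimiter is the open question here, a small sketch that keeps it as a parameter may help when comparing the options. `Segment`, `JoinSketch`, and `joinSampleBarcode` are hypothetical stand-ins for the extracted sub-reads in the hunk above, with `kind` reduced to a plain string.

```scala
// Hypothetical model of an extracted read segment; not the fgbio type.
final case class Segment(kind: String, bases: String, quals: String)

object JoinSketch {
  /** Joins one field of every sample-barcode segment with the given delimiter.
    *
    * @param segs  all segments extracted from the reads of one template
    * @param delim the delimiter placed between segments
    * @param field selects which field to join (bases or qualities)
    */
  def joinSampleBarcode(segs: Seq[Segment], delim: String)(field: Segment => String): String =
    segs.iterator.filter(_.kind == "SampleBarcode").map(field).mkString(delim)
}

object JoinSketchDemo extends App {
  val segs = Seq(
    Segment("SampleBarcode", "ACGT", "IIII"),
    Segment("Template", "TTTT", "####"),
    Segment("SampleBarcode", "GG", "FF")
  )
  println(JoinSketch.joinSampleBarcode(segs, "-")(_.bases)) // bases joined with a hyphen
  println(JoinSketch.joinSampleBarcode(segs, " ")(_.quals)) // qualities joined with a space
}
```

One point worth checking against the spec before settling on a delimiter for qualities: '-' is itself a valid Phred+33 quality character (ASCII 45, i.e. Q12), whereas a space (ASCII 32) can never appear in a quality string.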
@@ -181,6 +183,7 @@ class FastqToBam
 }

 if (sampleBarcode.nonEmpty) rec("BC") = sampleBarcode
+if (smpQual.nonEmpty) smpQualTag.foreach(rec(_) = smpQual)
I'd hard code to QT as per above
Hard code to |
And for |
Let's follow the spec as closely as possible, and I misread it! I also want to keep backwards compatibility at this point. How about:
|
If we choose to follow the spec recommendation, the I think the So, there are two options:
|
It's probably easier for me to explain in code, how about this: #933?
It is fine to consider only the barcode this time, and a hard-coded tag name is preferred for the barcode. I made a comment on L95.
For issue #931.
This is my first Java code.
The unit test is a bit difficult for me. I tried it on real data with and without -b, and the output BAM seems fine.