Read Length fix #78

kachulis · 2023-03-20T20:29:34Z

Max read length is tested against alignment span (end - start), so can end up being incorrect. This PR fixes this, but testing against true read length.

agraubert · 2023-05-17T14:21:58Z

alignmentSize is really only used for checking an edge case in legacy mode.
You may notice that on line 277 we use alignmentSize rather than alignment.Length() for overriding the readLength, which I also admit is strange, but I can't convince myself that it changes the behavior in any meaningful way (other than perhaps overwriting readLength more often than is necessary). readLength itself is then only used at the end where it is reported directly.

kachulis · 2023-05-17T15:03:28Z

@agraubert I think that is correct, IF the data being used has a constant read length. If that is not the case, due to trimming, or a technology which produces variable length reads, then the readLength metric will end up being somewhat randomly chosen from the distribution of observed read lengths.

I guess the question is what a good behavior would be in these variable read length cases. This PR is more intended to address the trimming case, with the thought being that probably the length of the longest observed read is the most meaningful value for "read length", with all shorter reads having been trimmed down from there. In the case where the sequencing tech itself produces variable read length, this still doesn't quite work, but I'm not sure there's really a good option for representing read length with a single metric in that case anyway.

agraubert · 2023-05-17T15:12:33Z

I see your point. It's intended to be the maximum observed read length, but as currently implemented, I suppose it's the read length of the maximum spanning read. We never ran into problems during testing because this was designed against fixed-length 150bp illumina reads. I'll patch that line when I get a chance

readLength

9b26803

kachulis mentioned this pull request Apr 7, 2023

Faulty estimates of Read Length #77

Open

agraubert closed this May 17, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Read Length fix #78

Read Length fix #78

kachulis commented Mar 20, 2023

agraubert commented May 17, 2023

kachulis commented May 17, 2023

agraubert commented May 17, 2023

Read Length fix #78

Read Length fix #78

Conversation

kachulis commented Mar 20, 2023

agraubert commented May 17, 2023

kachulis commented May 17, 2023

agraubert commented May 17, 2023