Negative Value of "Input Read Pairs" and "Both Surviving" in log file #44

lxwgcool · 2022-12-01T16:58:10Z

Hi,

We are using Trimmomatic as a benchmark tool in our pipeline for the trimming of Illumina reads.

We recently found that the "Input Read Pairs" and "Both Surviving" metrics reported in the Trimmomatic log file are negative for one of our flowcells (the others are fine).

Please check the screenshot below (line 9):

The reads are human whole-genome sequencing reads. Each read file is around 160 GB.

Based on our previous work, these two metrics should always be positive. I have several questions:
(1) Why are we getting negative values for these two metrics?
(2) If it is not a bug, how should we interpret these negative values?
(3) How do we get the real numbers for "Input Read Pairs" and "Both Surviving"? Should we simply flip the sign from negative to positive?

Thanks so much for your help.
Best regards
Xin

TonyBolger · 2022-12-02T08:05:16Z

This looks like a 32-bit integer wrap. I guess you have over 2bn read pairs? The real numbers are the values shown there plus 2^32.

Input: (2^32) + -1953136673 => 2341830623
Both Surviving: (2^32) + -2013735491 => 2281231805
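The arithmetic above can be checked with a short Python sketch. The function names `wrap_to_int32` and `recover_count` are made up for illustration and are not part of Trimmomatic. The recovery assumes the counter wrapped exactly once, that is, the true count is below 2^32.

```python
INT32_MODULUS = 2 ** 32  # number of distinct values a 32-bit integer can hold


def wrap_to_int32(true_count: int) -> int:
    """Simulate storing a non-negative count in a signed 32-bit integer."""
    wrapped = true_count % INT32_MODULUS
    # Values at or above 2**31 are read back as negative numbers.
    return wrapped - INT32_MODULUS if wrapped >= 2 ** 31 else wrapped


def recover_count(logged: int) -> int:
    """Undo a single wrap (assumes the true count is below 2**32)."""
    return logged + INT32_MODULUS if logged < 0 else logged


print(recover_count(-1953136673))  # Input Read Pairs -> 2341830623
print(recover_count(-2013735491))  # Both Surviving   -> 2281231805
```

Simply flipping the sign would give a different, wrong number. Note also that a count of 2^32 or more would wrap back into the positive range and could not be detected this way.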

I hadn't really considered the possibility of >2bn reads in a dataset 10 years ago :)

lxwgcool · 2022-12-02T22:28:14Z

Thank you so much for your prompt reply, Tony. That makes sense.

We are currently using the Illumina HiSeq platform and have generated a lot of sequencing data for different WGS projects. As the technology develops and funding remains strong, I expect we will generate more datasets that contain more than 2 billion read pairs for a single subject.

Thanks for your solution. I will keep using the rule (2^32 + "the negative number") to convert the negative values to the real number of read pairs in our pipeline. However, as more datasets of this size are generated, it would be more convenient for us if you could update the Trimmomatic source code to use 64-bit integers for these counters and release a new version that fixes this issue.

Thank you so much for your help
Best
Xin
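As a stopgap in a pipeline, the rule above could be applied while parsing the log. This is only a sketch: the `label: number` layout of the sample line is an assumption built from the two metric names quoted in this thread, and `corrected_counts` is a hypothetical helper, not something Trimmomatic provides.

```python
import re

INT32_MODULUS = 2 ** 32

# Matches one of the two metric labels followed by a possibly negative integer.
# The "label: number" layout is an assumption about the log format.
COUNT_PATTERN = re.compile(r"(Input Read Pairs|Both Surviving):\s*(-?\d+)")


def corrected_counts(log_line: str) -> dict:
    """Return the two metrics with a single 32-bit wrap undone."""
    counts = {}
    for label, raw in COUNT_PATTERN.findall(log_line):
        value = int(raw)
        # Negative means the counter wrapped; assume it wrapped only once.
        counts[label] = value + INT32_MODULUS if value < 0 else value
    return counts


# Hypothetical log line using the values reported in this thread.
sample = "Input Read Pairs: -1953136673 Both Surviving: -2013735491"
print(corrected_counts(sample))
```

Any percentages in the same log line may have been computed from the wrapped values, so it is safer to recompute them from the corrected counts.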
