GUAVA pipeline stops at "Remove duplicates" #7
Comments
Hi @jarleba, would you mind sharing the log file? Mayur
Here you go. :) It produces an unfinished result file as well. E_C3_TKD181002942_H7NK7DSXX_L2_1_log.txt
I have checked the log file. Try using multiple cores and extra RAM, and let it finish. By the way, how long did you allow it to run? If you don't mind, could you please share sample data with me via Google Drive? Thanks,
Hi Mayur,
The other replicates that worked finished after about 6 hours, using 8 cores and 32 GB RAM. I ran the ones that did not finish for a few more hours, but I don’t remember exactly. I will see if I can share the fastq files on google drive. :)
Thanks
From: Mayur Divate
Sent: Wednesday, 2 January 2019, 01:24
To: MayurDivate/GUAVASourceCode
Cc: jarleba; Mention
Subject: Re: [MayurDivate/GUAVASourceCode] GUAVA pipeline stops at "Remove duplicates" (#7)
@jarleba<https://github.com/jarleba>
I have checked the log file.
I could not find any error. I think it is taking much longer because you are analyzing 50+ million reads.
Plus, it is 150 bp sequencing data.
Try using multiple cores and extra RAM, and let it finish.
By the way, how long did you allow it to run?
If you don't mind, could you please share sample data with me via Google Drive,
so that I can also test GUAVA on 150 bp data?
Thanks,
Mayur
I think you should wait until it finishes. Cheers,
I have let it run for a couple of days now, but I noticed that the CPU stops processing when the pipeline is at "Remove duplicates". It's really strange. I ran the analysis with a sample of just 10k reads, and the whole analysis seems to work at that size. I will share the 10k.fq files as well as the log files. https://1drv.ms/f/s!AjzE0xKMj_y5g9Uko3gZQ1WrpdJDdA Cheers,
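For anyone who wants to build a similar small test set, here is a minimal sketch of subsampling paired FASTQ files while keeping mates together. It is an illustration only: how the 10k files above were produced is not stated in this thread, and the names `read_fastq` and `subsample_pairs` are hypothetical helpers, not part of GUAVA. It assumes plain 4-line FASTQ records and that both files list the reads in the same order.

```python
import io
import random
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as 4-line tuples (header, sequence, plus, quality)."""
    while True:
        record = tuple(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def subsample_pairs(r1_handle, r2_handle, n, seed=0):
    """Reservoir-sample up to n read pairs so that mates stay together.

    Assumption: R1 and R2 contain the same reads in the same order.
    zip() stops at the shorter file, so unequal inputs are not detected here.
    """
    rng = random.Random(seed)
    reservoir = []
    pairs = zip(read_fastq(r1_handle), read_fastq(r2_handle))
    for i, pair in enumerate(pairs):
        if i < n:
            reservoir.append(pair)
        else:
            # Keep the new pair with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = pair
    return reservoir


if __name__ == "__main__":
    # Tiny in-memory demo; real input would be opened files instead.
    r1 = io.StringIO("".join(f"@read{i}/1\nACGT\n+\nIIII\n" for i in range(100)))
    r2 = io.StringIO("".join(f"@read{i}/2\nTGCA\n+\nIIII\n" for i in range(100)))
    for mate1, mate2 in subsample_pairs(r1, r2, 3, seed=42):
        print(mate1[0].strip(), mate2[0].strip())
```

The sampled pairs come back in reservoir order rather than file order, so both output files should be written from the same returned list to keep mates aligned.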
Hi Mayur, I also encountered the same problem! GUAVA could complete an analysis using your sample data, and also using my "truncated data". The truncated data was generated by randomly extracting 1 million paired reads from the original data, which contains 45 million paired reads. However, when running the original large data set, GUAVA silently stopped at the "Remove duplicates" step without any error message, and the CPU usage showed 0%. It stayed stopped for one day until I closed it, so neither _aligned_duplicate_filtered.bam nor _aligned_duplicate_filtered.bam_matrix.txt was generated. I used the System Monitor tool on Ubuntu to check the state of the stopped process, and it showed "xxx/miniconda2/bin/java -Xms512 -Xmx1g -jar xxx/minconda2/share/picard-2.18.7-0/picard.jar MarkDuplicates REMOVE_DUPLICATES=true I=xxx/xxx_aligned_csrt.bam O=xxx/xxx_aligned_duplicate_filtered.bam M=xxx/xxx_aligned_duplicate_filtered.bam_matrix.txt". (xxx is the file path or name.) Then I copied this command line and ran it independently. It took only 11.37 minutes to finish and correctly generated the filtered.bam and matrix files. My PC is an HP Z640 workstation with two CPUs (24 threads in total) and 32 GB RAM. You may like to download the 45 million paired reads for testing. Cheers,
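The standalone run described above can be scripted so that it is timed and its output is kept in a log file to attach here. This is only a sketch: the Picard arguments are copied verbatim from the command quoted in the comment, every path passed in is a placeholder for the elided `xxx` parts, and `markduplicates_command` and `run_logged` are hypothetical helper names, not GUAVA functions.

```python
import shlex
import subprocess
import time


def markduplicates_command(java, picard_jar, in_bam, out_bam, metrics):
    """Build the MarkDuplicates argument list as quoted in the comment above."""
    return [
        java, "-Xms512", "-Xmx1g", "-jar", picard_jar,
        "MarkDuplicates", "REMOVE_DUPLICATES=true",
        f"I={in_bam}", f"O={out_bam}", f"M={metrics}",
    ]


def run_logged(cmd, log_path):
    """Run cmd with stdout and stderr written to log_path.

    Returns (exit code, elapsed seconds).
    """
    start = time.monotonic()
    with open(log_path, "wb") as log:
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode, time.monotonic() - start


if __name__ == "__main__":
    # Placeholder paths; substitute the real locations before running.
    cmd = markduplicates_command(
        "java", "picard.jar",
        "sample_aligned_csrt.bam",
        "sample_aligned_duplicate_filtered.bam",
        "sample_aligned_duplicate_filtered.bam_matrix.txt",
    )
    print(shlex.join(cmd))
    # Uncomment once the paths are real:
    # code, seconds = run_logged(cmd, "markduplicates.log")
    # print(f"exit code {code} after {seconds / 60:.2f} minutes")
```

Comparing the elapsed time and log from such a run with the stalled GUAVA run would show whether the step itself completes on the same input outside the pipeline.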
Hi Roc, When we tested GUAVA on big data from various papers, it worked fine. However, when the computer went into sleep mode, it stopped processing data. Please make sure that your computer does not go to sleep and automatically stop GUAVA. Thanks,
Hi Mayur, I am sure the Automatic Suspend function on my computer is off. I also tried moving the mouse the whole time GUAVA was running to prevent any chance of the computer sleeping. But it still silently stopped at the "Remove duplicates" step. Would you mind running my data to test it on your computer, or giving me someone else's ATAC-seq .fastq data (> 45 million reads) so that I can test it on my computer? Thanks,
I am having the same problem. Have there been any recent fixes for this, other than just waiting for the data to finish? My wall time has been 5 days so far. Same thing: it stops at removing duplicates and shows no memory use from that point onward.
Hi, everything seems to be working fine with my GUAVA now, but on a couple of samples the pipeline stops at "Remove duplicates". The program doesn't close or anything, it just says "Remove duplicates" forever, and my CPUs aren't doing anything. As far as I can tell, the mapping went fine. Has anyone else experienced this?