Question #8
Comments
Hi, can you clone the repo again and retry the installation? I have just updated some dependencies. |
I tried, but it was not working, so I installed all the dependencies manually, one by one. However, it does not let me install snippy and centrifuge. I am trying these on Anaconda. |
It shows the following error after running for a while.
Touching output file fastq/SRR1210481.snippy.
[Wed Jul 27 14:59:49 2022]
Finished job 755.
2 of 1221 steps (0.16%) done
[Wed Jul 27 14:59:49 2022]
Job 512: Taxonomic classification of processed reads using centrifuge
/usr/bin/bash: CENTRIFUGE_DEFAULT_DB: unbound variable
[Wed Jul 27 14:59:49 2022]
Error in rule centrifuge:
jobid: 512
output: taxonomy/fastq/SRR1210481-report.txt, taxonomy/fastq/SRR1210481-result.txt
shell:
centrifuge -p 4 -x $CENTRIFUGE_DEFAULT_DB -1 fastp/fastq/SRR1210481_R1.fastq.gz.fastp -2 fastp/fastq/SRR1210481_R2.fastq.gz.fastp --report-file taxonomy/fastq/SRR1210481-report.txt -S taxonomy/fastq/SRR1210481-result.txt
(exited with non-zero exit code)
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log:
/media/mmk6053/Data/Manoj_data/Entrococcus_project/BAGEP/.snakemake/log/2022-07-27T145131.709376.snakemake.log |
I reinstalled snippy but it is still showing the following error.
/usr/bin/bash: CENTRIFUGE_DEFAULT_DB: unbound variable
Shutting down, this might take some time. |
The error message is with Centrifuge, not snippy. Before running the pipeline, you need to download the centrifuge database, then set it up as shown in the README.md file and also set up Krona taxonomy. Can you confirm that you have completed these steps? |
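For anyone following along, here is a minimal sketch of those setup steps, based on the commands that appear later in this thread; the extraction directory, the conda prefix (anaconda3 vs miniconda3) and the index name are assumptions that should be adjusted to your own system:

# download and unpack a centrifuge index
wget -c ftp://ftp.ccb.jhu.edu/pub/infphilo/centrifuge/data/p_compressed+h+v.tar.gz
mkdir -p ~/centrifuge_db
tar -xzf p_compressed+h+v.tar.gz -C ~/centrifuge_db
# point the variable the pipeline expects at the index prefix (without the .1.cf/.2.cf/.3.cf suffixes)
export CENTRIFUGE_DEFAULT_DB=~/centrifuge_db/p_compressed+h+v
# set up the Krona taxonomy used for the HTML reports
rm -rf ~/anaconda3/envs/bagep/opt/krona/taxonomy
mkdir -p ~/krona/taxonomy
ln -s ~/krona/taxonomy ~/anaconda3/envs/bagep/opt/krona/taxonomy
ktUpdateTaxonomy.sh ~/krona/taxonomy

Note that the export has to be in effect in the same shell session that later runs snakemake (or be placed in ~/.bashrc); otherwise the centrifuge rule will still see an unbound variable.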
I performed all these steps; the centrifuge database is installed:
wget -c ftp://ftp.ccb.jhu.edu/pub/infphilo/centrifuge/data/p_compressed+h+v.tar.gz |
I also set up Krona with the following steps:
rm -rf ~/anaconda3/envs/bagep/opt/krona/taxonomy
mkdir -p ~/krona/taxonomy
ln -s ~/krona/taxonomy/ ~/miniconda3/envs/bagep/opt/krona/taxonomy
ktUpdateTaxonomy.sh ~/krona/taxonomy
snakemake --config ref=enterococcus_genome.fasta
However, it is showing an error:
Error in rule snippy:
jobid: 755
output: fastq/SRR1210481/, fastq/SRR1210481.snippy
shell:
snippy --force --cleanup --outdir fastq/SRR1210481/ --ref enterococcus_genome.fasta --R1 fastp/fastq/SRR1210481_R1.fastq.gz.fastp --R2 fastp/fastq/SRR1210481_R2.fastq.gz.fastp
(exited with non-zero exit code)
Removing output files of failed job snippy since they might be corrupted:
fastq/SRR1210481/
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log:
/media/mmk6053/Data/Manoj_data/Entrococcus_project/BAGEP/.snakemake/log/2022-07-28T122529.877538.snakemake.log |
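Since Snakemake removes the output directory of the failed snippy job, one way to see the underlying error is to re-run the failing shell command by hand, copied verbatim from the rule above, and read its messages directly; this is a debugging step, not part of the pipeline:

snippy --force --cleanup --outdir fastq/SRR1210481/ --ref enterococcus_genome.fasta \
  --R1 fastp/fastq/SRR1210481_R1.fastq.gz.fastp \
  --R2 fastp/fastq/SRR1210481_R2.fastq.gz.fastp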
What version of snippy do you have installed, and can you share the snippy log file? |
I am using snippy 4.4.3. Where can I find a snippy log file, since the pipeline has not even started snippy? |
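For what it is worth, snippy normally writes a log into its output directory (snps.log with the default output prefix), so if the rule got far enough to create fastq/SRR1210481/ before failing, something like the following might show it; this relies on an assumption about snippy's defaults, and because the rule uses --cleanup and Snakemake deletes failed outputs, the file may already be gone:

ls fastq/SRR1210481/
cat fastq/SRR1210481/snps.log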
Did the other steps run without issues? Also, is your reference genome in the same directory as the Snakefile? |
The pipeline starts by printing the message "Filtering fastQ files by trimming low quality reads using fastp". It generates a folder "fastp" and the two R1 and R2 files, after which it stops. |
Here is the complete run.
(bagep) mmk53@A8-VT-MMK53-U1:/media/Data/Manoj_data/Entrococcus_project/BAGEP$ snakemake --config ref=enterococcus_genome.fasta
[Thu Jul 28 14:33:11 2022]
Read1 before filtering:
Read2 before filtering:
Read1 after filtering:
Read2 after filtering:
Filtering result:
Duplication rate: 0.754108%
Insert size peak (evaluated by paired-end reads): 144
JSON report: fastp.json
fastp -i fastq/SRR1210481_R1.fastq.gz -I fastq/SRR1210481_R2.fastq.gz -o fastp/fastq/SRR1210481_R1.fastq.gz.fastp -O fastp/fastq/SRR1210481_R2.fastq.gz.fastp
[Thu Jul 28 14:34:05 2022]
/usr/bin/bash: CENTRIFUGE_DEFAULT_DB: unbound variable
Shutting down, this might take some time. |
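The unbound-variable message is back, so it is worth confirming that CENTRIFUGE_DEFAULT_DB is actually set in the shell that launches snakemake and that it points at real index files. A quick check, assuming the p_compressed+h+v index downloaded earlier and centrifuge's usual .cf index suffixes:

echo "$CENTRIFUGE_DEFAULT_DB"
ls "${CENTRIFUGE_DEFAULT_DB}".*.cf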
It seems to be showing some error while filtering low-quality reads. However, I have used these data with Snippy separately and it ran successfully. |
It appears you are running the workflow with 1 core. You can split the jobs across multiple threads depending on how many you have available; try passing --cores to snakemake, as sketched below. Also, the log message shows that if a stage in the pipeline fails, it will stop the entire process. |
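A minimal example of the suggestion above; the core count is an assumption and should match the threads actually available on your machine:

snakemake --cores 8 --config ref=enterococcus_genome.fasta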
Hi, |
Hi, sorry for the slow response. I will come back to my BAGEP analysis; I got stuck with some other tasks. |
Hi, when I run the pipeline with --cores 40, it occupies all of my system's memory (~124G). Eventually, the system freezes.
snakemake --cores 40 --config ref=enterococcus_genome.fasta |
Hi, I am guessing that's because of the huge number of samples you're running and the intermediary files that were generated. You might want to free up some space or run it on an external drive with extra room. The pipeline will clean up the majority of the files at the end of the analysis, and you can also delete the files generated by fastp at the end of the run. |
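A small sketch of the clean-up suggested above, assuming the trimmed reads live in the fastp/ directory created earlier in the thread and that the run has already finished (do not delete it mid-run):

# reclaim the space taken by the fastp intermediates once the analysis is done
rm -r fastp/
# check how much space the working directory still uses
du -sh .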
Hi,
Filtering result:
fastp -i fastq/ERR4230412_R1.fastq.gz -I fastq/ERR4230412_R2.fastq.gz -o fastp/fastq/ERR4230412_R1.fastq.gz.fastp -O fastp/fastq/ERR4230412_R2.fastq.gz.fastp |
What does the log file look like? The fastp step completed successfully, but as a rule of thumb you want to pinpoint exactly why the pipeline failed. |
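For reference, Snakemake keeps one log per run under .snakemake/log/ in the working directory (the "Complete log" paths quoted earlier in this thread point there), so the most recent one can be inspected with something like:

ls -t .snakemake/log/*.snakemake.log | head -n 1
less "$(ls -t .snakemake/log/*.snakemake.log | head -n 1)"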
I could not find the log file. I suspect the pipeline was not installed properly. Let me go through the installation process again. |
Hi, this error is because you haven't assigned your centrifuge database to that variable. Have you downloaded the database to your computer? |
I am trying to install BAGEP with the following command, but it has been stuck at "solving environment" for a very long time.
conda env create -f environment.yml
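A common workaround for a conda solve that hangs at "solving environment", not specific to BAGEP, is to build the environment with mamba instead; this assumes you are willing to install mamba into your base environment, and that the environment name in environment.yml is bagep, as the prompts elsewhere in this thread suggest:

conda install -n base -c conda-forge mamba
mamba env create -f environment.yml
conda activate bagep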