Bins in CheckM summary do not match bins in bin depths summary #603

Open
QingDAI0225 opened this issue Mar 17, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@QingDAI0225

Description of the bug

Nextflow workflow report

Workflow execution completed unsuccessfully!
The exit status of the task that caused the workflow execution to fail was: 1.

The full error message was:

Error executing process > 'NFCORE_MAG:MAG:BIN_SUMMARY (1)'

Caused by:
Process NFCORE_MAG:MAG:BIN_SUMMARY (1) terminated with an error exit status (1)

Command executed:

combine_tables.py --depths_summary bin_depths_summary.tsv --checkm_summary checkm_summary.tsv --quast_summary quast_summary.tsv --gtdbtk_summary gtdbtk_summary.tsv --out bin_summary.tsv

cat <<-END_VERSIONS > versions.yml
"NFCORE_MAG:MAG:BIN_SUMMARY":
python: $(python --version 2>&1 | sed 's/Python //g')
pandas: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('pandas').version)")
END_VERSIONS

Command exit status:
1

Command output:
(empty)

Command error:
INFO: Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
INFO: Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
Bins in CheckM summary do not match bins in bin depths summary!

Command used and terminal output

command used:

nextflow run nf-core/mag -c /work/qd33/nanopore/QD_ptrap_20230908/nextflow.config -profile singularity --input /work/qd33/nanopore/QD_ptrap_20230908/nf_core_mag/samplesheet_102K_2.csv --outdir /work/qd33/nanopore/QD_ptrap_20230908/nf_core_mag/result_102K_2 --skip_spadeshybrid true --skip_concoct true --run_virus_identification true --gtdb_db /work/qd33/nanopore/QD_ptrap_20230908/Mgnify_db/gtdbtk_r214_data.tar.gz --bin_domain_classification true --skip_prokka false --kraken2_db /work/qd33/nanopore/QD_ptrap_20230908/Mgnify_db/k2_pluspfp_20231009.tar.gz --skip_prokka false --skip_metaeuk false --skip_krona false --skip_gtdbtk false --binqc_tool checkm --save_checkm_data true

terminal output:

executor >  slurm (403)
[1b/8f47be] process > NFCORE_MAG:MAG:ARIA2_UNTAR ... [100%] 1 of 1 ✔
[13/d9b9cf] process > NFCORE_MAG:MAG:FASTQC_RAW (... [100%] 1 of 1 ✔
[f3/0bcc19] process > NFCORE_MAG:MAG:FASTP (102k_... [100%] 1 of 1 ✔
[67/a93324] process > NFCORE_MAG:MAG:BOWTIE2_PHIX... [100%] 1 of 1 ✔
[2f/f617c4] process > NFCORE_MAG:MAG:BOWTIE2_PHIX... [100%] 1 of 1 ✔
[74/98b15c] process > NFCORE_MAG:MAG:FASTQC_TRIMM... [100%] 1 of 1 ✔
[-        ] process > NFCORE_MAG:MAG:CAT_FASTQ       -
[-        ] process > NFCORE_MAG:MAG:NANOPLOT_RAW    -
[-        ] process > NFCORE_MAG:MAG:PORECHOP        -
[-        ] process > NFCORE_MAG:MAG:NANOLYSE        -
[-        ] process > NFCORE_MAG:MAG:FILTLONG        -
[-        ] process > NFCORE_MAG:MAG:NANOPLOT_FIL... -
[-        ] process > NFCORE_MAG:MAG:CENTRIFUGE      -
[84/5ed55d] process > NFCORE_MAG:MAG:KRAKEN2_DB_P... [100%] 1 of 1 ✔
[ba/88c596] process > NFCORE_MAG:MAG:KRAKEN2 (102... [ 66%] 2 of 3, failed: 2...
[88/4b72c6] process > NFCORE_MAG:MAG:KRONA_DB        [100%] 1 of 1 ✔
[-        ] process > NFCORE_MAG:MAG:KRONA           -
[81/be339b] process > NFCORE_MAG:MAG:MEGAHIT (102... [100%] 1 of 1 ✔
[7a/cfaf8e] process > NFCORE_MAG:MAG:SPADES (102k... [100%] 1 of 1 ✔
[e6/12e4c9] process > NFCORE_MAG:MAG:QUAST (SPAde... [100%] 2 of 2 ✔
[58/06ba0b] process > NFCORE_MAG:MAG:PRODIGAL (10... [100%] 2 of 2 ✔
[9c/740921] process > NFCORE_MAG:MAG:VIRUS_IDENTI... [100%] 1 of 1 ✔
[57/3e95e5] process > NFCORE_MAG:MAG:VIRUS_IDENTI... [100%] 2 of 2 ✔
[f5/ea5a04] process > NFCORE_MAG:MAG:BINNING_PREP... [100%] 2 of 2 ✔
[9a/f0b473] process > NFCORE_MAG:MAG:BINNING_PREP... [100%] 2 of 2 ✔
[20/499dcd] process > NFCORE_MAG:MAG:BINNING:META... [100%] 2 of 2 ✔
[ef/7ecf86] process > NFCORE_MAG:MAG:BINNING:CONV... [100%] 2 of 2 ✔
[73/28a4a6] process > NFCORE_MAG:MAG:BINNING:META... [100%] 2 of 2 ✔
[9f/397968] process > NFCORE_MAG:MAG:BINNING:MAXB... [100%] 2 of 2 ✔
[96/a9bc84] process > NFCORE_MAG:MAG:BINNING:ADJU... [100%] 2 of 2 ✔
[09/54440a] process > NFCORE_MAG:MAG:BINNING:SPLI... [100%] 4 of 4 ✔
[e8/ec9f8b] process > NFCORE_MAG:MAG:BINNING:GUNZ... [100%] 166 of 166 ✔
[-        ] process > NFCORE_MAG:MAG:BINNING:GUNZ... -
[e6/cbd4a0] process > NFCORE_MAG:MAG:DOMAIN_CLASS... [100%] 2 of 2 ✔
[20/d05229] process > NFCORE_MAG:MAG:DOMAIN_CLASS... [100%] 4 of 4 ✔
[c6/c64c45] process > NFCORE_MAG:MAG:DOMAIN_CLASS... [100%] 4 of 4 ✔
[4c/1b7d2b] process > NFCORE_MAG:MAG:DOMAIN_CLASS... [100%] 1 of 1 ✔
[4d/0bc6e4] process > NFCORE_MAG:MAG:DEPTHS:MAG_D... [100%] 4 of 4 ✔
[-        ] process > NFCORE_MAG:MAG:DEPTHS:MAG_D... -
[fc/1cc9ce] process > NFCORE_MAG:MAG:DEPTHS:MAG_D... [100%] 1 of 1 ✔
[b7/d353e4] process > NFCORE_MAG:MAG:CHECKM_QC:CH... [100%] 4 of 4 ✔
[6d/642c0e] process > NFCORE_MAG:MAG:CHECKM_QC:CH... [100%] 4 of 4 ✔
[7b/29c475] process > NFCORE_MAG:MAG:CHECKM_QC:CO... [100%] 1 of 1 ✔
[74/9c07c8] process > NFCORE_MAG:MAG:QUAST_BINS (... [100%] 7 of 7 ✔
[17/9bb268] process > NFCORE_MAG:MAG:QUAST_BINS_S... [100%] 1 of 1 ✔
[-        ] process > NFCORE_MAG:MAG:CAT             -
[-        ] process > NFCORE_MAG:MAG:CAT_SUMMARY     -
[96/f9f8cd] process > NFCORE_MAG:MAG:GTDBTK:GTDBT... [100%] 1 of 1 ✔
[b6/07b82f] process > NFCORE_MAG:MAG:GTDBTK:GTDBT... [100%] 4 of 4 ✔
[ec/dd8b76] process > NFCORE_MAG:MAG:GTDBTK:GTDBT... [100%] 1 of 1 ✔
[d6/ea361f] process > NFCORE_MAG:MAG:BIN_SUMMARY (1) [100%] 1 of 1, failed: 1 ✘
[ec/fff8c2] process > NFCORE_MAG:MAG:PROKKA (SPAd... [100%] 160 of 160 ✔
[-        ] process > NFCORE_MAG:MAG:CUSTOM_DUMPS... -
[-        ] process > NFCORE_MAG:MAG:MULTIQC         -
-[nf-core/mag] Pipeline completed with errors-
[ba/88c596] NOTE: Process `NFCORE_MAG:MAG:KRAKEN2 (102k_P_1_Pool_2-k2_pluspfp_20231009)` terminated with an error exit status (140) -- Execution is retried (2)
ERROR ~ Error executing process > 'NFCORE_MAG:MAG:BIN_SUMMARY (1)'

Caused by:
  Process `NFCORE_MAG:MAG:BIN_SUMMARY (1)` terminated with an error exit status (1)

Command executed:

  combine_tables.py --depths_summary bin_depths_summary.tsv                                          --checkm_summary checkm_summary.tsv                     --quast_summary quast_summary.tsv                     --gtdbtk_summary gtdbtk_summary.tsv                                          --out bin_summary.tsv
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_MAG:MAG:BIN_SUMMARY":
      python: $(python --version 2>&1 | sed 's/Python //g')
      pandas: $(python -c "import pkg_resources; print(pkg_resources.get_distribution('pandas').version)")
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
  INFO:    Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
  Bins in CheckM summary do not match bins in bin depths summary!

Work dir:
  /work/qd33/nanopore/QD_ptrap_20230908/nf_working/d6/ea361f0f6ef1ab87281a9f42cd2715

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

 -- Check '.nextflow.log' file for details

Relevant files

No response

System information

Nextflow version
version 23.04.3, build 5875 (11-08-2023 18:37 UTC)
Hardware
HPC
Executor
slurm
Container engine:
Singularity
Version of nf-core/mag 2.5.4

@QingDAI0225 QingDAI0225 added the bug Something isn't working label Mar 17, 2024
@jfy133
Member

jfy133 commented Mar 18, 2024

Hi @QingDAI0225 do you mind sharing with me the files within the /work/qd33/nanopore/QD_ptrap_20230908/nf_working/d6/ea361f0f6ef1ab87281a9f42cd2715 directory? You can send them to me privately if you prefer to [email protected] (will be kept confidential) - I need to understand what is in the various tables

@QingDAI0225
Author

> Hi @QingDAI0225 do you mind sharing with me the files within the /work/qd33/nanopore/QD_ptrap_20230908/nf_working/d6/ea361f0f6ef1ab87281a9f42cd2715 directory? You can send them to me privately if you prefer to [email protected] (will be kept confidential) - I need to understand what is in the various tables

I have already sent them to you by email. Thank you so much.

@jfy133
Member

jfy133 commented Mar 19, 2024

OK I think I've identified the problem: the bin IDs in the checkm_summary file do not have the file extension, while the others do.

I'm not sure why this is at the moment, but I'm currently at a hackathon working on something else - I will try to come back to this next week.
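
To illustrate the kind of mismatch described above, here is a minimal sketch (not the pipeline's actual code) that strips file extensions from bin IDs before comparing two lists. The function names, the set of extensions, and the example bin names are assumptions made for illustration only.

```python
from pathlib import PurePath

# Extensions assumed for illustration; adjust to whatever your binners emit.
FASTA_EXTENSIONS = {".fa", ".fasta", ".fna"}


def normalize_bin_id(bin_id: str) -> str:
    """Drop a trailing .gz and a FASTA extension, if present."""
    name = bin_id.strip()
    if name.endswith(".gz"):
        name = name[:-3]
    suffix = PurePath(name).suffix
    if suffix in FASTA_EXTENSIONS:
        name = name[: -len(suffix)]
    return name


def compare_bin_ids(depth_bins, checkm_bins):
    """Return (bins missing from CheckM, bins missing from depths)."""
    depth = {normalize_bin_id(b) for b in depth_bins}
    checkm = {normalize_bin_id(b) for b in checkm_bins}
    return sorted(depth - checkm), sorted(checkm - depth)


if __name__ == "__main__":
    # Hypothetical bin names, not taken from the reporter's data.
    depth_bins = ["A.1.fa", "A.2.fa", "B.001.fa.gz"]
    checkm_bins = ["A.1", "B.001"]
    print(compare_bin_ids(depth_bins, checkm_bins))
```

If the only difference were the extension, both returned lists would be empty; any bin names that remain point to bins that are genuinely absent from one of the tables.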

@jfy133
Member

jfy133 commented Mar 22, 2024

Testing locally, this does not actually seem to be an issue, at least when running with BUSCO.

@jfy133
Member

jfy133 commented Mar 22, 2024

@QingDAI0225 we also looked at this with @maxibor and discussed why you may be missing the 6 CheckM bins. The hypothesis is that some of your CHECKM jobs failed (e.g., no marker genes could be found) and thus were not exported.

If this is valid behaviour (assuming CheckM simply did not find anything for those bins, rather than failing for some other reason), we will update the combine_tables.py script.

Could you please send me your .nextflow.log (again via email if you prefer). The file will be wherever you ran the nextflow run command from.

@jfy133
Member

jfy133 commented Mar 22, 2024

It would also be helpful if you could send the work/ directory of the CheckM process for one of the bins that is not in the table.

@jfy133
Member

jfy133 commented Mar 22, 2024

So we strongly suspect this is it, and it can be confirmed if you send the work/ directory of the CheckM process for your 'missing bins'.

Technical details: the CHECKM module has all outputs set to 'optional: true', meaning the process will not fail if no output file is found. We think this is intentional: we suspect CheckM itself does not exit with an error if it finds nothing, but instead just reports 'nothing found' in the console and produces no output file. In Nextflow terms, it is also fine if a process emits no files: that sample simply 'stops' and does not continue through that subworkflow. In this specific case, then, only the outputs of the CHECKM processes that did emit files are combined into the table for merging.

If we confirm this is the case, then we just need to update the combine_tables.py
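
As a rough sketch of what such an update could look like in principle, the example below keeps every bin from the depths table and fills the CheckM columns with a placeholder when a bin has no CheckM row, instead of exiting on a mismatch. It is not the actual combine_tables.py logic: it uses only the standard library rather than pandas, and the helper names, the column names ("bin", "Bin Id", "Completeness", "Contamination"), and the "NA" fill value are all assumptions for illustration.

```python
import csv
import io


def read_tsv(handle, key):
    """Read a TSV into a dict mapping the key column's value to its row."""
    return {row[key]: row for row in csv.DictReader(handle, delimiter="\t")}


def left_join_checkm(depths, checkm, checkm_columns, fill="NA"):
    """Keep every bin from the depths table; fill CheckM columns when absent."""
    merged = []
    for bin_id, depth_row in depths.items():
        checkm_row = checkm.get(bin_id, {})
        row = dict(depth_row)
        for col in checkm_columns:
            row[col] = checkm_row.get(col, fill)
        merged.append(row)
    return merged


if __name__ == "__main__":
    # Hypothetical tables; bin A.2 has no CheckM result.
    depths_tsv = "bin\tDepth sample1\nA.1\t10.5\nA.2\t3.2\n"
    checkm_tsv = "Bin Id\tCompleteness\tContamination\nA.1\t98.1\t0.5\n"
    depths = read_tsv(io.StringIO(depths_tsv), "bin")
    checkm = read_tsv(io.StringIO(checkm_tsv), "Bin Id")
    for row in left_join_checkm(depths, checkm, ["Completeness", "Contamination"]):
        print(row)
```

The design choice here is a left join on the depths table, so bins for which CheckM emitted nothing still appear in the summary with placeholder values rather than causing the whole step to fail.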
