MultiQC-Additional overview stat table #134

tillenglert · 2024-09-06T11:51:35Z

This PR adds an overview summary table to the MultiQC report, so users who are inexperienced with the pipeline can get an idea of what the pipeline output contains.

A full-size test is currently running; the results will be posted afterwards.

A current multiqc report is attached.
multiqc.zip
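
For readers unfamiliar with how a table can end up in a MultiQC report, here is a minimal sketch that uses MultiQC's custom-content mechanism (a TSV file with `#`-prefixed header comments). It is not the code from this PR: the function name `render_mqc_table`, the section id, and the column names are all hypothetical and chosen only for illustration.

```python
import csv
import io


def render_mqc_table(rows, section_id="pipeline_overview"):
    """Return TSV text preceded by MultiQC custom-content header comments.

    All names here (section id, section name, columns) are illustrative
    assumptions, not the values used by the pipeline.
    """
    header = [
        f"# id: '{section_id}'",
        "# section_name: 'Pipeline overview'",
        "# plot_type: 'table'",
    ]
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    columns = list(rows[0].keys())
    writer.writerow(columns)  # column header line
    for row in rows:
        writer.writerow([row[c] for c in columns])
    return "\n".join(header) + "\n" + buf.getvalue()


# Hypothetical per-condition summary values.
rows = [
    {"condition_name": "cond_A", "unique_peptides": 1200, "unique_binders": 85},
    {"condition_name": "cond_B", "unique_peptides": 950, "unique_binders": 60},
]
text = render_mqc_table(rows)
print(text)
```

Written to a file whose name matches MultiQC's custom-content pattern (by convention something like `*_mqc.tsv`), such output would typically be picked up as an extra table section; how the pipeline actually wires its stats file into MultiQC may differ.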

PR checklist


github-actions · 2024-09-06T11:53:29Z

`nf-core lint` overall result: Passed ✅ ⚠️

Posted for pipeline commit 5f00c9b

+| ✅ 195 tests passed       |+
#| ❔   2 tests were ignored |#
!| ❗   3 tests had warnings |!

❗ Test warnings:

nextflow_config - Config manifest.version should end in dev: 1.0.0
readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
nfcore_yml - nf-core version not set in .nf-core.yml

❔ Tests ignored:

files_exist - File is ignored: conf/igenomes.config
actions_ci - actions_ci

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierignore
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-metapep_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/images/nf-core-metapep_logo_light.png
files_exist - File found: docs/images/nf-core-metapep_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: .github/workflows/awstest.yml
files_exist - File found: .github/workflows/awsfulltest.yml
files_exist - File found: modules.json
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: docs/images/nf-core-metapep_logo.png
files_exist - File not found check: lib/Checks.groovy
files_exist - File not found check: lib/Completion.groovy
files_exist - File not found check: lib/NfcoreTemplate.groovy
files_exist - File not found check: lib/Utils.groovy
files_exist - File not found check: lib/Workflow.groovy
files_exist - File not found check: lib/WorkflowMain.groovy
files_exist - File not found check: lib/WorkflowMetapep.groovy
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: pipeline_template.yml
files_exist - File not found check: Singularity
files_exist - File not found check: lib/nfcore_external_java_deps.jar
files_exist - File not found check: .travis.yml
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: params.validationShowHiddenParams
nextflow_config - Config variable found: params.validationSchemaIgnoreParams
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config variable (correctly) not found: params.enable_conda
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
nextflow_config - nextflow.config contains configuration profile test
nextflow_config - Config default value correct: params.custom_config_version= master
nextflow_config - Config default value correct: params.custom_config_base= https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Config default value correct: params.max_cpus= 16
nextflow_config - Config default value correct: params.max_memory= 128.GB
nextflow_config - Config default value correct: params.max_time= 240.h
nextflow_config - Config default value correct: params.publish_dir_mode= copy
nextflow_config - Config default value correct: params.max_multiqc_email_size= 25.MB
nextflow_config - Config default value correct: params.validate_params= true
nextflow_config - Config default value correct: params.pipelines_testdata_base_path= https://raw.githubusercontent.com/nf-core/test-datasets/
nextflow_config - Config default value correct: params.min_pep_len= 9
nextflow_config - Config default value correct: params.max_pep_len= 11
nextflow_config - Config default value correct: params.pred_method= syfpeithi
nextflow_config - Config default value correct: params.syfpeithi_score_threshold= 0.5
nextflow_config - Config default value correct: params.mhcflurry_mhcnuggets_score_threshold= 0.426
nextflow_config - Config default value correct: params.prodigal_mode= meta
nextflow_config - Config default value correct: params.prediction_chunk_size= 4000000
nextflow_config - Config default value correct: params.pred_chunk_size_scaling= 10
nextflow_config - Config default value correct: params.downstream_chunk_size= 7500000
nextflow_config - Config default value correct: params.max_task_num= 1000
nextflow_config - Config default value correct: params.pred_buffer_files= 1000
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - CODE_OF_CONDUCT.md matches the template
files_unchanged - LICENSE matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/CONTRIBUTING.md matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - .github/workflows/linting.yml matches the template
files_unchanged - assets/email_template.html matches the template
files_unchanged - assets/email_template.txt matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - assets/nf-core-metapep_logo_light.png matches the template
files_unchanged - docs/images/nf-core-metapep_logo_light.png matches the template
files_unchanged - docs/images/nf-core-metapep_logo_dark.png matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - .gitignore matches the template
files_unchanged - .prettierignore matches the template
actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
readme - README Nextflow minimum version badge matched config. Badge: 23.04.0, Config: 23.04.0
pipeline_todos - No TODO strings found
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (168 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_lint - Input mimetype lint passed: 'text/csv'
schema_params - Schema matched params returned from nextflow config
system_exit - No System.exit calls found
actions_schema_validation - Workflow validation passed: ci.yml
actions_schema_validation - Workflow validation passed: awsfulltest.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
actions_schema_validation - Workflow validation passed: release-announcements.yml
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: download_pipeline.yml
actions_schema_validation - Workflow validation passed: awstest.yml
actions_schema_validation - Workflow validation passed: clean-up.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: branch.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
multiqc_config - assets/multiqc_config.yml found and not ignored.
multiqc_config - assets/multiqc_config.yml contains report_section_order
multiqc_config - assets/multiqc_config.yml contains export_plots
multiqc_config - assets/multiqc_config.yml contains report_comment
multiqc_config - assets/multiqc_config.yml follows the ordering scheme of the minimally required plugins.
multiqc_config - assets/multiqc_config.yml contains a matching 'report_comment'.
multiqc_config - assets/multiqc_config.yml contains 'export_plots: true'.
modules_structure - modules directory structure is correct 'modules/nf-core/TOOL/SUBTOOL'
base_config - conf/base.config found and not ignored.
modules_config - conf/modules.config found and not ignored.
modules_config - EPYTOPE_SHOW_SUPPORTED_MODELS found in conf/modules.config and Nextflow scripts.
modules_config - CHECK_SAMPLESHEET_CREATE_TABLES found in conf/modules.config and Nextflow scripts.
modules_config - UNIFY_MODEL_LENGTHS found in conf/modules.config and Nextflow scripts.
modules_config - UNPACK_BIN_ARCHIVES found in conf/modules.config and Nextflow scripts.
modules_config - DOWNLOAD_PROTEINS found in conf/modules.config and Nextflow scripts.
modules_config - PRODIGAL found in conf/modules.config and Nextflow scripts.
modules_config - CREATE_PROTEIN_TSV found in conf/modules.config and Nextflow scripts.
modules_config - SPLIT_PRED_TASKS found in conf/modules.config and Nextflow scripts.
modules_config - PREDICT_EPITOPES found in conf/modules.config and Nextflow scripts.
modules_config - MERGE_PREDICTIONS_BUFFER found in conf/modules.config and Nextflow scripts.
modules_config - MERGE_PREDICTIONS found in conf/modules.config and Nextflow scripts.
modules_config - PREPARE_SCORE_DISTRIBUTION found in conf/modules.config and Nextflow scripts.
modules_config - PLOT_SCORE_DISTRIBUTION found in conf/modules.config and Nextflow scripts.
modules_config - PREPARE_ENTITY_BINDING_RATIOS found in conf/modules.config and Nextflow scripts.
modules_config - PLOT_ENTITY_BINDING_RATIOS found in conf/modules.config and Nextflow scripts.
modules_config - MULTIQC found in conf/modules.config and Nextflow scripts.
nfcore_yml - Repository type in .nf-core.yml is valid: pipeline

Run details

nf-core/tools version 2.14.1
Run at 2024-09-30 05:59:51


skrakau · 2024-09-10T12:45:03Z

bin/collect_stats.py


-    print("Done!", flush=True)
+        # peptide_id, condition_name, condition_peptide_count, highest_prediction_score, prediction_score_allele_0, prediction_score_allele_1,...prediction_score_allele_n
+        conditions_peptides = conditions_peptides.set_index("peptide_id").join(best_scored_peptides).join(predictions)


I am not sure I understood all the details of the new code, but since it caused considerable memory usage, I have a few questions and suggestions, as I am wondering whether you really need that many joins:

conditions_peptides: if the resulting df becomes too large, you could drop condition_name (it is the same within this context anyway) and count before joining to the predictions

filter predictions directly after reading them in, keeping only entries that are above the threshold, and drop the prediction_score column to reduce the size

merge/join for each allele to get only peptides that are "binders" and count them

This brings me to the next question: you only provide the count of unique binders, right? Maybe that should be stated more clearly somewhere.
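
The suggestions above could be sketched roughly as follows with pandas. This is only an illustration under stated assumptions, not the code in `bin/collect_stats.py`: the function name `count_binders_per_allele`, the column `allele_name`, the toy data, and the use of `>=` for the threshold comparison are all hypothetical.

```python
import pandas as pd


def count_binders_per_allele(peptide_ids, predictions, threshold):
    """Count unique binder peptides per allele for one condition.

    Assumes `predictions` has the columns peptide_id, allele_name and
    prediction_score (allele_name is a hypothetical column name).
    """
    # Filter early and drop the score column to keep the frame small.
    binders = predictions.loc[
        predictions["prediction_score"] >= threshold,
        ["peptide_id", "allele_name"],
    ]
    # Within a single condition the condition name is constant,
    # so only the unique peptide ids are needed.
    peptides = pd.DataFrame(
        {"peptide_id": pd.Series(list(peptide_ids)).drop_duplicates()}
    )
    counts = {}
    # One small inner merge per allele instead of one wide join.
    for allele, group in binders.groupby("allele_name"):
        allele_binders = group[["peptide_id"]].drop_duplicates()
        merged = peptides.merge(allele_binders, on="peptide_id", how="inner")
        counts[allele] = len(merged)
    return counts


# Toy data: peptide 4 is a binder but does not occur in this condition.
predictions = pd.DataFrame(
    {
        "peptide_id": [1, 1, 2, 3, 3, 4],
        "allele_name": ["A*01:01", "B*07:02", "A*01:01",
                        "A*01:01", "B*07:02", "B*07:02"],
        "prediction_score": [0.9, 0.2, 0.6, 0.1, 0.7, 0.8],
    }
)
counts = count_binders_per_allele([1, 2, 3, 3], predictions, threshold=0.5)
print(counts)
```

Because duplicates are dropped on both sides before merging, each count is a number of unique binders, which is the quantity asked about in the comment above.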

I managed to reduce the memory usage to around 50 GB, which is only 10 GB more than the previous version, which did not include peptide predictions!


skrakau

Minor comment regarding names, other than that it looks great!

Shift COLLECT_STATS module to after predictions, add binders, chang…

0521047

…e output to table and add table to multiqc

tillenglert mentioned this pull request Sep 6, 2024

[Do not merge!] Pseudo PR for first release #133

Closed

change colname

2d42736

skrakau reviewed Sep 10, 2024


skrakau reviewed Sep 10, 2024


tillenglert added 7 commits September 13, 2024 11:54

Review comments and change calculation of binders to save ressources

8d7e5c5

Remove TODO (solved by PR nf-core#135)

eddab30

Add bugfix from nf-core#135 for testing on cfc

4d52218

Update all tests to include the stats.tsv instead of stats.txt

922682c

Update all test snapshots

1d998e0

Merge branch 'dev' into multiqc_additional_stats

879e41b

fix multiqc for model info branch

9cf2794

tillenglert marked this pull request as ready for review September 16, 2024 14:45

tillenglert requested a review from skrakau September 16, 2024 14:46

skrakau reviewed Sep 17, 2024


skrakau reviewed Sep 17, 2024


tillenglert added 2 commits September 20, 2024 11:14

Change according to suggestion of @skrakau and fix alleles not assign…

2fb5df0

…ed to condition

drop prediction_score column

529182c

skrakau reviewed Sep 20, 2024


skrakau reviewed Sep 20, 2024


skrakau approved these changes Sep 20, 2024


tillenglert added 3 commits September 20, 2024 15:21

code and comment refactorings

1dff437

Update all snapshots for new stats module

a9a3d8e

update test bins

5f00c9b

tillenglert merged commit 68258d0 into nf-core:dev Sep 30, 2024
21 checks passed

tillenglert deleted the multiqc_additional_stats branch October 18, 2024 11:49
