
Maximum number of prediction jobs #116

Merged
merged 11 commits into nf-core:dev from maximum_chunk_number on May 8, 2024

Conversation

tillenglert
Collaborator

I added a dynamic increase of the chunk sizes to limit the number of generated PREDICT_EPITOPES processes. The default value for maximum_process_num is 1000, which is tuned to the CFC2.0 HPC Cluster of Tübingen.
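A minimal sketch of the idea described above, assuming hypothetical names (`computeChunkSize`, `totalEntries`, `requestedChunkSize`, `maxProcessNum`); this is illustrative only, not the merged pipeline code:

```groovy
// Sketch: grow the chunk size whenever the requested size would produce
// more chunks (i.e. prediction jobs) than the configured maximum.
// All names here are assumptions for illustration.
def computeChunkSize(long totalEntries, long requestedChunkSize, long maxProcessNum) {
    long chunks = Math.ceil(totalEntries / (double) requestedChunkSize) as long
    if (chunks <= maxProcessNum) {
        // Requested chunk size already respects the job limit.
        return requestedChunkSize
    }
    // Otherwise enlarge the chunk size so that at most maxProcessNum chunks result.
    return Math.ceil(totalEntries / (double) maxProcessNum) as long
}

assert computeChunkSize(10_000, 5, 1000) == 10   // 5 would yield 2000 chunks -> grow to 10
```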

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/metapep branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nf-test test main.nf.test -profile test,docker).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@tillenglert tillenglert changed the title Maximum_chunk_number Maximum number of prediction jobs Apr 19, 2024

github-actions bot commented Apr 19, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit b2947e8

  • ✅ 178 tests passed
  • ❗ 6 tests had warnings

❗ Test warnings:

  • files_exist - File not found: conf/igenomes.config
  • readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
  • pipeline_todos - TODO string in README.md: Add citation for pipeline after first release. Uncomment lines below and update Zenodo doi and badge at the top of this file.
  • pipeline_todos - TODO string in output.md: Write this documentation describing your workflow's output
  • pipeline_todos - TODO string in usage.md: Add documentation about anything specific to running your pipeline. For general topics, please point to (and add to) the main nf-core website.
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required

✅ Tests passed:

Run details

  • nf-core/tools version 2.13.1
  • Run at 2024-05-08 10:32:08

@tillenglert tillenglert requested a review from skrakau April 19, 2024 07:43
@tillenglert
Collaborator Author

tillenglert commented Apr 26, 2024

@skrakau I addressed your code review comments, but I now needed to implement another check for the parameter max_task_num, because using a value below the number of chosen alleles makes the pipeline fail with a divide-by-zero error.

I originally wanted to add this check within the Nextflow logic, e.g. in the utils subworkflow, as a general test before any task is executed. However, as the alleles may not be unique between rows and multiple alleles can be assigned per row, I needed to use the alleles.tsv file created by the check_samplesheet_and_create_tables process. Since I had to read in another file and didn't want to obscure the checking process too much, I created a file within that process containing the number of alleles, which is now used to throw an error within the Process_Input subworkflow.

If you have suggestions to handle this error differently please let me know! Otherwise this PR is ready for review! 👍
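A hypothetical illustration of the reported failure mode (the pipeline's actual arithmetic may differ); variable names are assumptions, not pipeline code:

```groovy
// If max_task_num is below the number of alleles, an integer job budget per
// allele can become zero, and a later division by that budget throws an error.
def maxTaskNum = 5       // value chosen below the number of alleles
def nAlleles   = 8
def jobsPerAllele = maxTaskNum.intdiv(nAlleles)   // == 0
assert jobsPerAllele == 0
// Any subsequent "x / jobsPerAllele" or "x.intdiv(jobsPerAllele)" then fails
// with an ArithmeticException: Division by zero.
```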

@skrakau
Copy link
Member

skrakau commented May 3, 2024

I originally wanted to add this check within the Nextflow logic, e.g. in the utils subworkflow, as a general test before any task is executed. However, as the alleles may not be unique between rows and multiple alleles can be assigned per row, I needed to use the alleles.tsv file created by the check_samplesheet_and_create_tables process. Since I had to read in another file and didn't want to obscure the checking process too much, I created a file within that process containing the number of alleles, which is now used to throw an error within the Process_Input subworkflow.

If you have suggestions to handle this error differently please let me know! Otherwise this PR is ready for review! 👍

Would it somehow be possible to avoid writing another file within check_samplesheet_and_create_tables? For example, by checking the number of alleles in alleles.tsv directly within PROCESS_INPUT, e.g. using the countLines operator or similar?

@tillenglert
Copy link
Collaborator Author

tillenglert commented May 3, 2024

Would it somehow be possible to avoid writing another file within check_samplesheet_and_create_tables? For example, by checking the number of alleles in alleles.tsv directly within PROCESS_INPUT, e.g. using the countLines operator or similar?

Absolutely true and way cleaner than my solution! I adjusted it to use the already existing alleles.tsv! Thank you!
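For reference, a hedged sketch of what such a check could look like inside PROCESS_INPUT, assuming a channel named ch_alleles carrying the alleles.tsv and the parameter name max_task_num; subworkflow, channel, and error wording are illustrative, not the merged code:

```groovy
// Illustrative only: names are assumptions, not the actual pipeline code.
workflow CHECK_MAX_TASK_NUM {
    take:
    ch_alleles          // channel emitting the alleles.tsv produced upstream

    main:
    ch_alleles.map { alleles_tsv ->
        // Subtract the header line of alleles.tsv from the line count.
        def n_alleles = alleles_tsv.countLines() - 1
        if (params.max_task_num < n_alleles) {
            error("--max_task_num (${params.max_task_num}) must be at least the number of alleles (${n_alleles}).")
        }
        return alleles_tsv
    }
}
```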

@tillenglert tillenglert force-pushed the maximum_chunk_number branch from fc52d45 to f5cecb9 on May 8, 2024 07:54
@tillenglert tillenglert requested a review from skrakau May 8, 2024 10:23
@skrakau skrakau left a comment
Member


Thanks, looks good! :)

@tillenglert tillenglert merged commit ec9292e into nf-core:dev May 8, 2024
13 checks passed
@tillenglert tillenglert deleted the maximum_chunk_number branch May 8, 2024 15:21