Pipeline uses cached process despite edited eval script #5470

CormacKinsella · 2024-11-05T10:01:49Z

Bug report

After an eval script is edited, the cached version of the process is still used when resume is enabled on the command line or in nextflow.config.

Expected behavior and actual behavior

Expected: editing an eval script (e.g. changing multiqc --version to multiqc --version | sed "s/version//") is detected as a change, so the cached task is not reused.
Actual: this works correctly if resume = true is set within the main workflow script. However, if -resume is passed on the command line, or resume = true is set in nextflow.config, the cached task is reused even after the eval script has changed.

Steps to reproduce the problem

Generate nextflow.config and main.nf from the example below
Run the pipeline
Edit the eval statement to 'multiqc --version | sed "s/version//"'
Rerun the pipeline: the output does not change

nextflow.config:

resume = true
apptainer.enabled = true
apptainer.autoMounts = true

main.nf:

nextflow.enable.dsl=2
nextflow.preview.topic = true

process FOO {
    tag "${id}"
    container "quay.io/biocontainers/multiqc:1.14--pyhdfd78af_0"

    input:
    val(id)

    output:
    tuple val(task.process), eval('multiqc --version'), topic: versions

    script:
    """
    """
}

workflow {
    input_ch = Channel.from("Sample1") 

    FOO(input_ch)

    channel.topic('versions').view()
}

Program output

Before edit:
[FOO, multiqc, version 1.14]

After edit:
[FOO, multiqc, version 1.14]

Should be:
[FOO, multiqc, 1.14] # Note that this behaviour is achieved by moving resume = true from nextflow.config to main.nf

Environment

Nextflow version: 24.10.0 build 5928
Java version: openjdk 11.0.24 2024-07-16
Operating system: Ubuntu 22.04.5 LTS
Bash version: GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu)

Additional context

bentsherman · 2024-11-05T11:50:29Z

Good point, the eval script was never added to the task hash.
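To illustrate why a missing hash entry leads to a cache hit, here is a toy model in Python. It is not Nextflow's actual hashing code: the `task_key` function and its choice of SHA-256 are assumptions made purely for illustration. The key is built from the script body and the input name-value pairs only, so two runs that differ only in the eval command of the output declaration get the same key.

```python
import hashlib


def task_key(script: str, inputs: dict) -> str:
    """Toy cache key (hypothetical): hashes the script body and inputs only."""
    h = hashlib.sha256()
    h.update(script.encode())
    for name, value in sorted(inputs.items()):
        h.update(name.encode())
        h.update(repr(value).encode())
    return h.hexdigest()


# The eval command sits in the output declaration, so it never reaches the key.
# Run 1 declared eval('multiqc --version'); run 2 declared the sed variant.
before = task_key("", {"id": "Sample1"})
after = task_key("", {"id": "Sample1"})

# Identical keys mean the second run would be treated as already cached.
print(before == after)
```

Under this model, the edit to the eval command is invisible to the cache, which matches the behaviour reported above.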

bentsherman · 2024-11-14T13:46:51Z

@jorgee this one should be pretty easy if you'd like to try. Basically we need to add the eval outputs to the task hash here:

nextflow/modules/nextflow/src/main/groovy/nextflow/processor/TaskProcessor.groovy

Lines 2200 to 2205 in 8041a57

    
        // add all the input name-value pairs to the key generator
        for( Map.Entry<InParam,Object> it : task.inputs ) {
            keys.add( it.key.name )
            keys.add( it.value )
        }

Something like this:

        // add inputs ...
        // ...

        // add eval outputs
        for( Map.Entry<OutParam,Object> it : task.outputs ) {
            if( it.key instanceof CmdEvalParam )
                keys.add( it.key.getTarget(task.context) )
        }
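The effect of that suggestion can be sketched outside Nextflow. The Python below is a hedged toy model, not the real `TaskProcessor` code: `OutParam` and `CmdEvalParam` are simplified stand-ins that only borrow the class names from the snippet above, and the `command` field, `task_key` function and SHA-256 hashing are assumptions for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class OutParam:
    """Stand-in for a generic output declaration (hypothetical)."""
    name: str


@dataclass(frozen=True)
class CmdEvalParam(OutParam):
    """Stand-in for an eval(...) output; `command` is an assumed field."""
    command: str


def task_key(inputs: dict, outputs: list) -> str:
    keys = []
    # add all the input name-value pairs
    for name, value in inputs.items():
        keys.append(name)
        keys.append(value)
    # add eval outputs: only eval commands contribute, other outputs are skipped
    for param in outputs:
        if isinstance(param, CmdEvalParam):
            keys.append(param.command)
    return hashlib.sha256(repr(keys).encode()).hexdigest()


inputs = {"id": "Sample1"}
key_a = task_key(inputs, [CmdEvalParam("versions", "multiqc --version")])
key_b = task_key(inputs, [CmdEvalParam("versions", 'multiqc --version | sed "s/version//"')])

# With the eval command in the key, editing it invalidates the cached entry.
print(key_a != key_b)
```

Filtering by type keeps the key stable for non-eval outputs, so only a change to an eval command forces a rerun in this model.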

bentsherman added the bug label Nov 5, 2024

jorgee self-assigned this Nov 14, 2024

jorgee linked a pull request Nov 18, 2024 that will close this issue

Fix hash when CmdEvalParam as output #5517

Open
