Fix and enhance job resume functionality #5247

mvdbeek · 2017-12-27T09:51:14Z

This should fix #5222.

The problem was that HDA ids in nested parameters were not always updated
properly. In the case of bowtie2 the input dataset is provided via a
conditional, but the conditional prefix is not stored in the
JobToInputDatasetAssociation, so the `update_param` function could not
update the dataset id in these instances. The approach now is to simply
replace all occurrences of the old dataset id with the new dataset id.
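To illustrate the idea, here is a minimal sketch of replacing every reference to an old dataset id in a nested parameter structure, regardless of which conditional prefix it sits under. This is not the code in this PR: the function name `replace_dataset_id`, the `{"src": ..., "id": ...}` reference shape, and the example parameter names are assumptions made for illustration.

```python
def replace_dataset_id(params, old_id, new_id, src="hda"):
    """Return a copy of params in which every {'src': src, 'id': old_id}
    reference points at new_id instead.

    Hypothetical helper: the reference shape is an assumption, not
    necessarily how Galaxy serializes job parameters.
    """
    if isinstance(params, dict):
        # A dict that looks like a dataset reference gets its id swapped.
        if params.get("src") == src and params.get("id") == old_id:
            return {**params, "id": new_id}
        # Any other dict (e.g. a conditional) is walked recursively.
        return {
            key: replace_dataset_id(value, old_id, new_id, src)
            for key, value in params.items()
        }
    if isinstance(params, list):
        return [replace_dataset_id(item, old_id, new_id, src) for item in params]
    # Scalars are returned untouched, even if they happen to equal old_id.
    return params


# The input sits behind a conditional, as in the bowtie2 case (names are made up).
job_params = {
    "library": {"type": "single", "input_1": {"values": [{"src": "hda", "id": 7}]}},
    "min_score": 7,
}
updated = replace_dataset_id(job_params, old_id=7, new_id=12)
```

Matching on the whole reference rather than on the bare number keeps unrelated parameters that happen to share the value (here `min_score`) from being rewritten.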

The PR includes two test cases for this functionality, one for HDA input and one for HDCA input.

With this PR we also replace the failed elements of a collection when a job is rerun (including a test case), which fixes #2235.

This should fix galaxyproject#5222. The problem was that HDA ids in nested parameters were not always updated properly. In the case of bowtie2 the input dataset is provided via a conditional, but the conditional prefix is not stored in the JobToInputDatasetAssociation, so the `update_param` function could not update the dataset id in these instances. The approach now is to simply replace all occurrences of the old dataset id with the new dataset id. TODO: add tests, refactor the replacement functionality into a separate function, and make this work for JobToInputDatasetCollectionAssociation.

This specifically addresses the problem where some jobs of a mapped-over collection have failed. Instead of filtering the failed collection and restarting the workflow at this position (involving a lot of copy-paste ...), the user can now limit the rerun to the problematic jobs and the workflow should resume from there. Should fix galaxyproject#2235. This is one possible implementation. It would also be feasible to leave the original collection untouched and instead copy the HDCA, replace the collection elements in the copy, and update all references for jobs that depend on the HDCA, as we do for HDAs. The current implementation seems simpler, but let me know if you see problems with this approach.
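As a rough illustration of the in-place approach, the sketch below swaps the dataset behind each failed element for the output of its rerun job while leaving the collection itself, and therefore every reference to it, untouched. The `Element` class, the `replace_failed_elements` function, and the state strings are hypothetical stand-ins, not Galaxy's model classes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Element:
    """Hypothetical stand-in for one element of a dataset collection."""
    identifier: str
    dataset_id: int
    state: str  # assumed states: "ok", "error", "queued"


def replace_failed_elements(
    elements: List[Element], rerun_outputs: Dict[str, int]
) -> List[str]:
    """Point failed elements at the datasets produced by their rerun jobs.

    rerun_outputs maps an element identifier to the new dataset id.
    The elements are modified in place; the identifiers that were
    replaced are returned.
    """
    replaced = []
    for element in elements:
        if element.state == "error" and element.identifier in rerun_outputs:
            element.dataset_id = rerun_outputs[element.identifier]
            element.state = "queued"  # the rerun output is not finished yet
            replaced.append(element.identifier)
    return replaced


collection = [
    Element("sample1", dataset_id=1, state="ok"),
    Element("sample2", dataset_id=2, state="error"),
]
replaced = replace_failed_elements(collection, {"sample2": 9})
```

Because only the failed element changes, downstream jobs that reference the collection need no remapping in this sketch, which is the simplification argued for above.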

jmchilton · 2017-12-31T14:14:41Z

Wow - this is totally awesome. Great fix and great feature to finally have some test coverage of - this is very exciting! Thanks so much @mvdbeek.

Replacing failed collection elements was already possible when dependent jobs were found (galaxyproject#5247). This commit restructures the remapping so that remapping is possible when no dependent jobs are available. This also simplifies the replacement of HDAs between old and new jobs.

mvdbeek added area/jobs kind/bug status/WIP labels Dec 27, 2017

mvdbeek added this to the 18.01 milestone Dec 27, 2017

mvdbeek added status/review and removed status/WIP labels Dec 29, 2017

mvdbeek changed the title ~~[WIP] Fix job resume functionality for non-prefixed input data~~ Fix job resume functionality for non-prefixed input data Dec 29, 2017

mvdbeek force-pushed the job_rerun_fixes branch from d30e0e2 to 42bbda1 Compare December 30, 2017 19:03

mvdbeek added the kind/enhancement label Dec 30, 2017

mvdbeek changed the title ~~Fix job resume functionality for non-prefixed input data~~ Fix and enhance job resume functionality Dec 30, 2017

mvdbeek added 5 commits December 31, 2017 14:07

Add test for resume job functionality

39073b4

This test runs a workflow whose first step fails, followed by a tool that uses the first step's output as an input, which is behind a nested conditional. This recapitulates the bug described in galaxyproject#5222.

Move job remap functionality in separate function

2610341

Add job resume test with HDCA input to paused dataset

9fdf0a6

I am slightly surprised that this worked without change, but it appears that the remapping occurs via the HDAs that the HDCA is composed of.

mvdbeek force-pushed the job_rerun_fixes branch from 42bbda1 to c0dbace Compare December 31, 2017 12:09

jmchilton merged commit 8e93c2e into galaxyproject:dev Dec 31, 2017

mvdbeek mentioned this pull request Jan 17, 2018

Add option to replace failed elements on job rerun #5321

Merged

mvdbeek deleted the job_rerun_fixes branch June 12, 2018 12:38
