Fix and enhance job resume functionality #5247

Merged: 5 commits, Dec 31, 2017

Conversation

@mvdbeek mvdbeek commented Dec 27, 2017

This should fix #5222.

The problem was that HDA ids in nested parameters were not always updated
properly. In the case of bowtie2 the input dataset is provided via a
conditional, but the conditional prefix is not stored in the
JobToInputDatasetAssociation, so the update_param function could not
update the dataset id in these instances. The approach now is to simply
replace all occurrences of the old dataset id with the new dataset id.
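
To illustrate the idea of replacing ids without relying on the parameter prefix, here is a minimal sketch. It is not the code from this PR: the function name `replace_dataset_id` and the `{"src": "hda", "id": ...}` reference shape are assumptions made for the example, and the parameter state is modelled as plain nested dicts and lists.

```python
from typing import Any


def replace_dataset_id(params: Any, old_id: int, new_id: int, src: str = "hda") -> Any:
    """Return a copy of params in which every {"src": src, "id": old_id}
    reference points at new_id, no matter how deeply it is nested.

    Hypothetical helper for illustration only, not Galaxy's implementation.
    """
    if isinstance(params, dict):
        # A dict that looks like a dataset reference is repointed directly.
        if params.get("src") == src and params.get("id") == old_id:
            updated = dict(params)
            updated["id"] = new_id
            return updated
        # Any other dict (e.g. a conditional) is walked recursively.
        return {key: replace_dataset_id(value, old_id, new_id, src)
                for key, value in params.items()}
    if isinstance(params, list):
        return [replace_dataset_id(item, old_id, new_id, src) for item in params]
    # Scalars are returned unchanged, so unrelated integers are not touched.
    return params


# The input dataset sits behind a conditional ("library"), as in the bowtie2 case.
job_params = {
    "library": {
        "type": "single",
        "input_1": {"values": [{"src": "hda", "id": 7}]},
    },
    "threshold": 7,
}
remapped = replace_dataset_id(job_params, old_id=7, new_id=12)
```

Because the walk never needs to know the conditional's name, the nested reference is found regardless of prefix, while the unrelated `threshold` value that happens to equal the old id is left alone.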

The PR includes two test cases for this functionality, one for HDA input and one for HDCA input.

With this PR, rerunning a job also replaces the failed element of a collection with the rerun's output (including a test case), which fixes #2235.

@mvdbeek mvdbeek added this to the 18.01 milestone Dec 27, 2017
@mvdbeek mvdbeek changed the title [WIP] Fix job resume functionality for non-prefixed input data Fix job resume functionality for non-prefixed input data Dec 29, 2017
@mvdbeek mvdbeek changed the title Fix job resume functionality for non-prefixed input data Fix and enhance job resume functionality Dec 30, 2017
This should fix galaxyproject#5222.

The problem was that HDA ids in nested parameters were not always updated
properly. In the case of bowtie2 the input dataset is provided via a
conditional, but the conditional prefix is not stored in the
JobToInputDatasetAssociation, so the `update_param` function could not
update the dataset id in these instances. The approach now is to simply
replace all occurrences of the old dataset id with the new dataset id.

TODO: tests, refactor the replacement functionality into a separate function
and make this work for JobToInputDatasetCollectionAssociation.
This test runs a workflow whose first step fails, followed by a tool that uses
the first step's output as an input, which is behind a nested conditional. This
recapitulates the bug described in galaxyproject#5222.
I am slightly surprised that this worked without change,
but it appears that the remapping occurs via the HDAs that
the HDCA is composed of.
This specifically addresses the problem where some jobs of a mapped-over
collection have failed. Instead of filtering the failed collection and
restarting the workflow at this position (involving a lot of copy-paste ...)
the user can now limit the rerun to the problematic jobs and the workflow
should resume from there.
Should fix galaxyproject#2235.

This is one possible implementation. It would also be feasible to leave the
original collection untouched, copy the HDCA, and then replace the
collection elements and all references in jobs that depend on the HDCA,
as we do for HDAs. This implementation seems simpler, but let me know if you
see problems with this approach.
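
As a rough sketch of the in-place variant described above (not the code in this PR), the replacement amounts to repointing the failed element at the rerun's output. The classes `Dataset`, `Element` and `Collection` and the function `replace_failed_element` are hypothetical stand-ins, not Galaxy model classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    id: int
    state: str  # e.g. "ok" or "error"


@dataclass
class Element:
    identifier: str
    dataset: Dataset


@dataclass
class Collection:
    elements: List[Element] = field(default_factory=list)


def replace_failed_element(collection: Collection, old_dataset_id: int,
                           new_dataset: Dataset) -> bool:
    """Point the element that holds old_dataset_id at new_dataset.

    The collection is modified in place, so anything that already refers
    to the collection sees the replacement without being remapped itself.
    Returns True if an element was replaced, False otherwise.
    """
    for element in collection.elements:
        if element.dataset.id == old_dataset_id:
            element.dataset = new_dataset
            return True
    return False


# A mapped-over collection in which the job for "sample2" failed.
mapped = Collection([
    Element("sample1", Dataset(1, "ok")),
    Element("sample2", Dataset(2, "error")),
    Element("sample3", Dataset(3, "ok")),
])
# Rerunning only the failed job produces a new dataset that takes its place.
replaced = replace_failed_element(mapped, old_dataset_id=2, new_dataset=Dataset(4, "ok"))
states = [element.dataset.state for element in mapped.elements]
```

The element identifiers and order stay the same; only the dataset behind the failed element changes, which is why this variant needs no copy of the collection.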
@jmchilton

Wow - this is totally awesome. Great fix and great feature to finally have some test coverage of - this is very exciting! Thanks so much @mvdbeek.

@jmchilton jmchilton merged commit 8e93c2e into galaxyproject:dev Dec 31, 2017
mvdbeek added a commit to mvdbeek/galaxy that referenced this pull request Jan 17, 2018
Replacing failed collection elements was already possible when dependent jobs
were found (galaxyproject#5247).
This commit restructures the remapping so that remapping is possible when no
dependent jobs are available. This also simplifies the replacement of HDAs
between old and new jobs.
@mvdbeek mvdbeek deleted the job_rerun_fixes branch June 12, 2018 12:38