Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UI & API for Supporting Best Practice Workflows #10988

Merged
merged 6 commits into from
Dec 30, 2020

Conversation

jmchilton
Copy link
Member

Overview

The PR adds a workflow editor panel for displaying information and options aimed at improving workflows and having that conform to best practices.

Quick Walkthrough

Here is a legacy workflow that uses older-style "workflow parameters" and has a disconnected, non-optional input. Both of these things make this workflow inappropriate for the new simplified workflow UI, usage via the API, and Planemo workflow testing.

Screen Shot 2020-12-23 at 3 52 32 PM

The PR adds an option for a best practices panel.

Screen Shot 2020-12-23 at 3 52 58 PM

This panel explains these issues in detail and lets the user fix them in bulk or one at a time.

Screen Shot 2020-12-23 at 3 53 14 PM

After clicking the fix all button (and I cheated and auto laid out the result), I have a workflow with formal workflow inputs and parameters.

Screen Shot 2020-12-23 at 3 53 48 PM

The parameters still don't have annotations and the input doesn't have an annotation or a label - and so we can see there is more manual work to be done in the "Best Practices" panel.

Workflow Parameters

Ostensibly, the goal here is to replace older-style "workflow parameters" with steps of type "parameter_input". The latter works with the API, is easier to reason about based on the workflow representation, and can be marked up with type information, unique label, and annotations - all improving usability (fixes #6961).

Part of the difficulty here is that older style "workflow parameters", that I'm going to refer to now as "legacy parameters", work very differently when used to replace tool inputs versus when used as parameters in post job actions.

"Legacy parameters" when used in post job actions as replacement variables are relatively well behaved. They are added to the database in a structured way and are usable via the API in a very reasonable fashion.

To support this post job action use case with "step parameters", I've modified the workflow runtime to just add these to the workflow replacement dictionary if they aren't explicitly sent via the existing replacement dict submission mechanism. This works pretty transparently. Now post job action variables can reference step parameters. They can now have annotations, etc.. and workflow API consumers just need to understand one uniform way of submitting parameters.

On the other hand, when legacy parameters are used as tool inputs, only the older-style workflow run GUI worked and only worked by replacing the complete state of every step at submission time. We want to get away from that behavior - the UI wasn't performant, the UI was complicated, these workflows couldn't be executed via the API, and for traceability workflows should be submitted only with annotated inputs that are tracked in a structured manner in the database.

The bulk of this work is about providing support to users to replace their legacy parameters with step parameters in the workflow editor. They could have been replaced manually before, but if the parameter was used in multiple places this would have been an error prone process.

To provide this support, this pull request introduces a "Workflow Best Practices" panel capable of performing this complex refactoring, as well as others, via a new fully typed workflow refactoring API.

Fixes #9096

Workflow Best Practices Panel

The workflow menu now has a "Check for Best Practices" option. This will replace the right pane with a series of collapsible sections labeled pass/warn for various non-best practice behavior. (I use the word linting a lot in the code, but I think I have managed to avoid exposing it to the user.)

The sections each have a significant amount of help text explaining issues as well as the justification for they are not best practice.

The sections include:

  • "Parameters" - this ensures no legacy parameters are used and if they are used, they are listed along with an option to convert them to a step input parameter.
  • "Disconnected Inputs" - this ensures no workflow steps have disconnected, non-optional inputs. These should all be attached to formal workflow input data parameters - and an option to convert them is presented for each disconnected input.
  • "Input Metadata" - this displays any workflow inputs that don't have a label or annotation to the user along with which is missing.
  • "Outputs" - this ensures the workflow has at least one explicit workflow output and has no unlabeled workflow outputs. An option is provided to remove all unlabeled outputs as formal workflow outputs.
  • "Metadata - License" - this ensures the workflow has a license and provides the user a link to the attributes page to add one if it isn't provided.
  • "Metadata - Creator" - this ensures the workflow has creator information annotated and provides the user a link to the attributes page to add it if it isn't provided.

All of the modifications this panel can make automatically are made through the type workflow refactoring API. This API makes it very easy to combine multiple refactoring steps into lists of actions - so this panel also has a link at the top to simply make all the automatic changes it knows how to to fix issues if the user wants to quickly fix up everything.
The refactoring API calls are all mitigated through Index.vue that ensures the workflow is saved and such before executing them. The API also has a dry_run parameter - so the UI attempts the refactoring first in dry run mode before actually making the changes.

Implements #8546

Typed Workflow Refactoring API

The workflow refactoring API takes in a list of actions to perform (along with a dry_run parameter) and then simply executes them in order to produce the next version of a Workflow for a StoredWorkflow object.
I've implemented a handful of types and tests for each of them. The ability to compose them and execute them in order is powerful as demonstrated by how clean the ability to just execute all refactorings steps in order from Lint panel is.

The actions are all typed using Pydantic, so the API has incredibly intuitive and uniform validation and should be trivial to create documentation from.

This API is the first big piece toward implementing #9166. Eventually, I'd like all workflow modifications to come through structured, tested steps that we can just store along with the workflow version in the database. Having such structured edits has numerous benefits including allowing us to generate changelogs, implement undo functionality, and apply simultaneous edits to the database and human written workflow YAML files.

Copy link
Member

@mvdbeek mvdbeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really love this work, this is awesome and a great foundation to build on!

I do see some issues from my testing though:
For tool inputs the refactoring is works fine (it doesn't handle {$var1} test yet), but I'm worried about the PJA replacements. I think it would be better if they didn't continue to use the ${input|basename} extra text syntax. There's the minor issue that the best practices UI still wants to fix these, but I also think this is confusing to the user.

From the UI/UX viewpoint I think it would be better to use the same toggle we use for connecting workflow parameters, so that there is just one way to define them. Which also has the advantage that you can see in the editor where your parameters are consumed. For PJAs that can reference their own inputs this could be augmented to a select list of available inputs.

I am unsure that we already want to expose the legacy parameter fixing to users ... I can see this being frustrating if you can't actually fix your workflow yet because you're composing your parameter values in a way we can't fix. Or maybe we can offer this, but divide the fixes between essential (maybe in red) for disconnected inputs, missing labels, no workflow outputs and recommended (in orange/yellow?) for legacy parameters and mention the caveat that this doesn't work for every scenario yet?

@jmchilton
Copy link
Member Author

Would it be useful to break the step parameters reworking and refactoring API commits out into their own PRs?

@mvdbeek
Copy link
Member

mvdbeek commented Dec 28, 2020

I think it makes sense to keep them in one.

Functionality-wise, should refactoring ${param} in a PJA insert the workflow parameter if it doesn't exist yet ? Or should refactoring exclude PJAs?
If you do define a workflow like that the best practices panel wants to fix it, but will fail with Failed to find {target_value} in the tool state or any workflow steps.

@mvdbeek
Copy link
Member

mvdbeek commented Dec 28, 2020

And do you have a plan for dealing with ${param} and some more text here ? That could maybe become a compose_text_param step ?

@jmchilton
Copy link
Member Author

Functionality-wise, should refactoring ${param} in a PJA insert the workflow parameter if it doesn't exist yet ? Or should refactoring exclude PJAs?

I think it should create the parameter, I don't have any particular objection to legacy parameters just used in PJA - they work fine with the API and and the simplified workflow form renders them okay but I suppose they can't be considered best practice without annotations. They will work with workflow test cases better also I think. So yeah... I will make that change.

@jmchilton
Copy link
Member Author

And do you have a plan for dealing with ${param} and some more text here ? That could maybe become a compose_text_param step ?

I was planning to leave these as a TODO, with that being exactly the right fix. In #10989 I had annotating why they can't be replaced, maybe it is worth the time to handle them though? I wish I knew how many people would be using this.

@mvdbeek
Copy link
Member

mvdbeek commented Dec 28, 2020

I'd say this is probably quite common, but I could be wrong there. It was the first thing I tried, but that doesn't mean much.

@jmchilton
Copy link
Member Author

Functionality-wise, should refactoring ${param} in a PJA insert the workflow parameter if it doesn't exist yet ? Or should refactoring exclude PJAs?

Okay - I think it does now! I added that and two new test cases.

I toyed with the idea of compose_text_param but then realized it was in the tool shed and lost interest 😆. It shouldn't be hard per se - it would just introduce a bunch of conditional logic in the UI and I would have to... use the tool shed. I have a bunch of production workflows that make heavy use of these I'm converting - my first scan through them all just had bare parameters - but I didn't pay too close attention. I'll let you know if this dialog proves wildly insufficient.

@mvdbeek
Copy link
Member

mvdbeek commented Dec 28, 2020

We could move compose_text_param into core ... it does provide some important functionality that maybe one would expect to ship with Galaxy? I put it in the IUC because I thought it'd be cool to maintain expression tools that way.

@jmchilton jmchilton mentioned this pull request Dec 28, 2020
13 tasks
@jmchilton
Copy link
Member Author

I would say it isn't worth the effort at this time - to backport the tool or rewrite the parameters? There is some chance we will want to pull it out someday right? And there is some chance we would want to have like native workflow expressions - maybe on inputs or maybe as a stand-alone module type. If you did open the PR to add it to Galaxy though, I'd be happy to merge it.

@mvdbeek
Copy link
Member

mvdbeek commented Dec 30, 2020

This is working great, thanks so much!

@mvdbeek mvdbeek merged commit e5ffb06 into galaxyproject:dev Dec 30, 2020
jmchilton added a commit to jmchilton/galaxy that referenced this pull request Dec 30, 2020
The ever careless @jmchilton did the same thing in two slightly different ways between galaxyproject#10988 and galaxyproject#11001 - creating his own bizarre race condition.
jmchilton added a commit to jmchilton/galaxy that referenced this pull request Dec 30, 2020
The ever careless @jmchilton did the same thing in two slightly different ways between galaxyproject#10988 and galaxyproject#11001 - creating his own bizarre race condition.
jmchilton added a commit to jmchilton/galaxy that referenced this pull request Dec 30, 2020
The ever careless @jmchilton did the same thing in two slightly different ways between galaxyproject#10988 and galaxyproject#11001 - creating his own bizarre race condition.
@nsoranzo nsoranzo deleted the workflow_lint branch June 8, 2022 23:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants