-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Making the orchestrator work with arbitrary CLIs #174
Comments
brianhenn
pushed a commit
that referenced
this issue
Mar 19, 2020
Adds a bunch of edits to the workflow orchestrator to redesign its syntax and argument handling to be more generalizable and enhance the orchestrator maintainability. This includes: general handling of arguments for both inputs and other types of arguments, with ability to specify from/location, optional arguments, and the ability to pass multiple argument values available for all arguments. This resolves #174 . changed the argument passing syntax to reflect the merging of inputs/extra arguments; arguments are now passed in *positional_args + output_location + **{optional_arg: arg_value} order. updated all of the current workflow step submission scripts to work with the new syntax removed all of the intermediate orchestrator_submit.sh scripts as they are no longer needed by the orchestrator (all orchestrator calls are now directly to python), while maintaining any single-step shell script functionality in individual steps. added the ability to launch Dataflow jobs from the orchestrator without an intermediate script using the --runner argument These changes are included in this PR because they (among many other things) allow for the prognostic run to be called with the ML model path argument as an optional input from the model training step, to support running the ML and baseline prognostic runs together under the orchestrator. Also introduces some changes to allow running baseline prognostic simulations with ML included.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
After considering how to configure a baseline simulation, I realized a maintainability issue with the current strategy of the orchestrator not working with general command line interfaces.
To submit a prognostic baseline simulation without an ML model, I would like to use the same script as for the prognostic runs. Perhaps the simplest way to do this is to make "model_url" an optional argument.
Unfortunately, I don't think the orchestrator "from" capability will work with "extra_args". It would be nice to do something like this:
Of course, I could write another wrapper which calls orchestrate_submit_job.py, but adding wrapper layers causes a significant maintainability problem. Any changes to the prognostic run CLI would require editing all the wrappers calling it. I also think the indirection with e.g. bash wrappers calling python scripts which submit K8s jobs is making the code pretty complicated and hard to follow.
To sum up, I think we should move towards having 1 command line interface per workflow, and the orchestrator should be expanded to work with most command line interfaces. I don't think this should be too hard. For simplicity, I think it is probably still OK to assume that the output is the last positional argument. I also think "extra_args" could be merged with "input" since IIRC the only real difference is the ordering of the generate command line arguments.
cc @frodre @brianhenn
The text was updated successfully, but these errors were encountered: