Allow configuring object stores on a per-job-destination basis. #6552
Set up a hierarchical object store with ids. These ids can be given zero weight for object stores that should only be used on a per-job-destination basis. Job destinations can then set object_store_id as a param to force a particular object store from the hierarchical configuration to be used. This can be set in job_conf.xml directly for static mappings, or combined with dynamic job destinations for dynamic mapping to object stores. Dynamic job destinations can dispatch on cluster conditions, user, user preferences, etc. Additionally, job resource parameters can be used with dynamic job destinations to allow user input on which object store to use.
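As a rough illustration of the dynamic case, here is a minimal sketch of rule logic that turns a user's resource parameter choice into an object_store_id destination param. Only the object_store_id param name comes from this PR. The `Destination` class is a stand-in for Galaxy's real destination object, and the `storage` parameter name, the store ids, and the `local` runner are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical mapping from a user-facing storage choice to object store ids.
# The ids would have to match ids declared in the object store configuration.
STORAGE_CHOICE_TO_OBJECT_STORE_ID = {
    "slow": "cheap_disk",
    "fast": "fast_disk",
}


@dataclass
class Destination:
    """Stand-in for a job destination: a runner name plus destination params."""
    runner: str
    params: Dict[str, str] = field(default_factory=dict)


def route_by_storage_choice(resource_params: Optional[Dict[str, str]]) -> Destination:
    """Build a destination, forcing an object store only when the user chose one."""
    params: Dict[str, str] = {}
    # "storage" is an assumed resource parameter name, not one defined by Galaxy.
    choice = (resource_params or {}).get("storage")
    object_store_id = STORAGE_CHOICE_TO_OBJECT_STORE_ID.get(choice)
    if object_store_id is not None:
        # object_store_id is the destination param this PR adds support for.
        params["object_store_id"] = object_store_id
    return Destination(runner="local", params=params)
```

When no choice (or an unrecognized one) is given, the sketch leaves object_store_id unset, so the object store's normal selection would apply.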
An integration test case is included that tests and demonstrates all of these pieces. It sets up a Galaxy instance with a job and object store configuration that routes one tool to a specific disk store (called "static") and routes another tool dynamically, using job resource parameters that let the user pick between "slow, cheap" and "fast, expensive" storage options. For the test these are all just disk stores, but you can imagine choosing between EBS and S3 or something similar this way.
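The zero-weight idea behind this setup can be modelled in a few lines. This is a simplified sketch, not Galaxy's object store code: it assumes a store is normally picked at random in proportion to its weight, and that a forced id bypasses the weights entirely. The ids other than "static" and all of the weights are hypothetical.

```python
import random
from typing import Dict, Optional

# Hypothetical ids and weights. Zero-weight stores are never picked by default
# and are only reachable when a job destination forces them by id.
BACKEND_WEIGHTS: Dict[str, int] = {
    "default": 1,
    "static": 0,
    "cheap_disk": 0,
    "fast_disk": 0,
}


def choose_backend(weights: Dict[str, int], forced_id: Optional[str] = None) -> str:
    """Return the forced store id if given, else a weighted random pick."""
    if forced_id is not None:
        if forced_id not in weights:
            raise KeyError(f"unknown object store id: {forced_id}")
        return forced_id
    # Only stores with a positive weight take part in default selection.
    candidates = [store_id for store_id, weight in weights.items() if weight > 0]
    if not candidates:
        raise ValueError("no object store with a positive weight")
    candidate_weights = [weights[store_id] for store_id in candidates]
    return random.choices(candidates, weights=candidate_weights, k=1)[0]
```

With these weights, jobs without a forced id always land on "default", while a destination that sets the id to "static" gets the otherwise unused "static" store.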
I'd be happy to see something more formal than job resource parameters or user preferences developed to help users route job outputs to particular object stores, as well as more formal support in the object store configuration file for selecting by ID and such. This is just a demonstration of what can be done today after these changes, using extension points that admins and users already rely on.