Allow configuring object stores on a per-job-destination basis. #6552

Merged: 1 commit, Aug 22, 2018

Conversation

jmchilton
Member

Set up a hierarchical objectstore with ids - object stores that should only be used on a per-job-destination basis can be given zero weight. Job destinations can then set object_store_id as a param to force a particular object store from the hierarchical configuration to be used. This can be set in job_conf.xml directly for static mappings or combined with dynamic job destinations for dynamic mapping to object stores. Dynamic job destinations can dispatch on cluster conditions, user, user preferences, etc. Additionally, job resource parameters can be used with dynamic job destinations to allow user input on which object store to use.
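To make the selection behaviour described above concrete, here is a minimal toy model in Python. It is not Galaxy's implementation: the `Backend` class, the `choose_backend` function, and the ids `default` and `static` are all hypothetical names made up for illustration. It only shows the idea that a zero-weight store is never picked by default but can still be forced by id.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Backend:
    """Hypothetical stand-in for one configured object store backend."""
    id: str
    weight: int


def choose_backend(backends: Sequence[Backend],
                   object_store_id: Optional[str] = None) -> Backend:
    """Pick a backend: an explicit id wins, otherwise choose by weight."""
    by_id = {b.id: b for b in backends}
    if object_store_id is not None:
        # A job destination asked for a specific store; weight is ignored.
        if object_store_id not in by_id:
            raise KeyError(f"unknown object store id: {object_store_id}")
        return by_id[object_store_id]
    # Default path: zero-weight backends are never candidates.
    candidates = [b for b in backends if b.weight > 0]
    if not candidates:
        raise ValueError("no backend with positive weight configured")
    weights = [b.weight for b in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]


# Assumed example configuration: "static" is only reachable by explicit id.
BACKENDS = [Backend(id="default", weight=1), Backend(id="static", weight=0)]
```

With this configuration, `choose_backend(BACKENDS)` always returns the `default` backend, while `choose_backend(BACKENDS, "static")` returns the zero-weight one.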

An integration test case is included that tests and demonstrates all these different pieces. It sets up a Galaxy with a job and object store configuration that routes one tool to a specific disk store (called "static"), and routes another tool dynamically using job resource parameters that allow the user to pick between "slow, cheap" and "fast, expensive" storage options (the test uses only disk stores, but you can imagine choosing between EBS and S3 or something similar this way).
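A rough sketch of what such a dynamic rule could look like follows. This is an illustration under assumptions, not the code from the test case: the `JobDestination` class here is a local stand-in so the snippet runs without Galaxy installed, and the resource parameter name `storage_class`, the option values `slow` and `fast`, and the store ids `slow_store` and `fast_store` are hypothetical. Only the `object_store_id` destination param comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class JobDestination:
    """Local stand-in for a job destination (runner plus params)."""
    runner: str = "local"
    params: Dict[str, str] = field(default_factory=dict)


# Hypothetical mapping from the user's choice to an object store id.
STORAGE_CHOICES = {
    "slow": "slow_store",   # "slow, cheap" option
    "fast": "fast_store",   # "fast, expensive" option
}
DEFAULT_CHOICE = "slow"


def route_by_storage(resource_params: Optional[Dict[str, str]]) -> JobDestination:
    """Build a destination whose object_store_id reflects the user's choice."""
    # Fall back to the default when the user supplied no resource parameters.
    choice = (resource_params or {}).get("storage_class", DEFAULT_CHOICE)
    if choice not in STORAGE_CHOICES:
        raise ValueError(f"unsupported storage_class: {choice}")
    return JobDestination(
        runner="local",
        params={"object_store_id": STORAGE_CHOICES[choice]},
    )
```

For example, `route_by_storage({"storage_class": "fast"})` yields a destination whose params carry `object_store_id` set to `fast_store`, and calling it with `None` falls back to the slow store.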

I'd be happy to see something more formal than job resource parameters or user preferences developed to help users route job outputs to particular object stores as well as more formal support in the object store configuration file for selecting by ID and such - this is just a demonstration of what can be done today after these changes using extension points already being used by admins and users.

@jmchilton jmchilton force-pushed the per_destination_object_store branch from 2801657 to 7882a06 Compare August 6, 2018 15:29
Member

@bgruening bgruening left a comment

Awesome! In the end it's just a few lines of code? John magic!

@bgruening
Member

ping @erasche can you have a look as well?
