-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP][STANDING] Standing PR to track job setup/finish optimizations. #6548
Closed
jmchilton
wants to merge
13
commits into
galaxyproject:dev
from
jmchilton:random_job_submission_opts_good
Closed
[WIP][STANDING] Standing PR to track job setup/finish optimizations. #6548
jmchilton
wants to merge
13
commits into
galaxyproject:dev
from
jmchilton:random_job_submission_opts_good
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Previously object_store_id on datasets was set as part of create - and create requires an object ID and therefore by extension a database flush. Together, this required a flush per output dataset while creating jobs. This commit splits the object_store_id setting out from create - so that datasets can set an object_store_id during dataset creation and before flushing. The upshot is that flushing the database can now be done once for all datasets instead of once per dataset. This optimization will in particular help executing tools with many outputs. For instance, the time to create a job for the sample tool create_10.xml which has 10 outputs dropped from 1086.074 ms to 230.539 ms with this change. Before: ``` galaxy.tools DEBUG 2016-03-21 01:28:51,785 Validated and populated state for tool request (37.990 ms) galaxy.tools.actions INFO 2016-03-21 01:28:51,885 Handled output named out_file1 for tool create_10 (85.515 ms) galaxy.tools.actions INFO 2016-03-21 01:28:51,972 Handled output named out_file2 for tool create_10 (86.611 ms) galaxy.tools.actions INFO 2016-03-21 01:28:52,045 Handled output named out_file3 for tool create_10 (72.657 ms) galaxy.tools.actions INFO 2016-03-21 01:28:52,135 Handled output named out_file4 for tool create_10 (90.179 ms) galaxy.tools.actions INFO 2016-03-21 01:28:52,210 Handled output named out_file5 for tool create_10 (74.831 ms) galaxy.tools.actions INFO 2016-03-21 01:28:52,277 Handled output named out_file6 for tool create_10 (66.720 ms) galaxy.tools.actions INFO 2016-03-21 01:28:52,349 Handled output named out_file7 for tool create_10 (71.524 ms) galaxy.tools.actions INFO 2016-03-21 01:28:52,429 Handled output named out_file8 for tool create_10 (79.768 ms) galaxy.tools.actions INFO 2016-03-21 01:28:52,517 Handled output named out_file9 for tool create_10 (87.794 ms) galaxy.tools.actions INFO 2016-03-21 01:28:52,627 Handled output named out_file10 for tool create_10 (110.615 ms) galaxy.tools.actions INFO 2016-03-21 01:28:52,720 Verified access to datasets for Job[unflushed,tool_id=create_10] (7.573 ms) galaxy.tools.execute DEBUG 2016-03-21 01:28:52,845 Tool [create_10] created job [19] (1060.070 ms) galaxy.tools.execute DEBUG 2016-03-21 01:28:52,871 Executed 1 job(s) for tool create_10 request: (1086.074 ms) ``` After: ``` galaxy.tools DEBUG 2016-03-21 01:30:05,284 Validated and populated state for tool request (44.121 ms) galaxy.tools.actions INFO 2016-03-21 01:30:05,299 Handled output named out_file1 for tool create_10 (1.674 ms) galaxy.tools.actions INFO 2016-03-21 01:30:05,301 Handled output named out_file2 for tool create_10 (0.983 ms) galaxy.tools.actions INFO 2016-03-21 01:30:05,302 Handled output named out_file3 for tool create_10 (1.046 ms) galaxy.tools.actions INFO 2016-03-21 01:30:05,302 Handled output named out_file4 for tool create_10 (0.490 ms) galaxy.tools.actions INFO 2016-03-21 01:30:05,303 Handled output named out_file5 for tool create_10 (0.469 ms) galaxy.tools.actions INFO 2016-03-21 01:30:05,304 Handled output named out_file6 for tool create_10 (0.512 ms) galaxy.tools.actions INFO 2016-03-21 01:30:05,304 Handled output named out_file7 for tool create_10 (0.465 ms) galaxy.tools.actions INFO 2016-03-21 01:30:05,305 Handled output named out_file8 for tool create_10 (0.462 ms) galaxy.tools.actions INFO 2016-03-21 01:30:05,305 Handled output named out_file9 for tool create_10 (0.488 ms) galaxy.tools.actions INFO 2016-03-21 01:30:05,306 Handled output named out_file10 for tool create_10 (0.446 ms) galaxy.tools.actions INFO 2016-03-21 01:30:05,393 Created output datasets in object store (0.958 ms) galaxy.tools.actions INFO 2016-03-21 01:30:05,395 Verified access to datasets for Job[unflushed,tool_id=create_10] (0.359 ms) galaxy.tools.execute DEBUG 2016-03-21 01:30:05,491 Tool [create_10] created job [20] (207.226 ms) galaxy.tools.execute DEBUG 2016-03-21 01:30:05,515 Executed 1 job(s) for tool create_10 request: (230.539 ms) ``` Conflicts: lib/galaxy/tools/actions/__init__.py
We were checking permissions twice it seems like to me, only do it once. Avoid loading dataset permissions for each individual dataset in a collection - work with summary sets of role ids.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Work on #4959 - like #6380 but a better set of optimizations - dropped a few things that didn't pan out and added some nice newer optimizations that are better.