With metadata handled as part of the remote process - Galaxy still struggles a bit when creating and finishing hundreds of jobs at a time. I spent a good amount of time working on this problem in 2016 as part of scaling up collections, but I think I need to take some more passes. Here are some things I think should be done:
TODO:
job_wrapper.change_state should just be a single database statement execution - at least in some threads, some of the time. There is probably no need to load multiple objects and mutate them through SA; see the first sketch after this list. (EDIT: done in Implement HDCA update_time #9722)
Implement a pool of pre-created threads that work on creating jobs for a single workflow scheduling thread (see the second sketch after this list) - this is almost ready, but we should implement the last few pieces, add test cases, and document the entry point. Multi-threaded job scheduling in workflows. #3903
job.mark_deleted should become an atomic database statement once change_state above is - but I guess it isn't used in the main creation and finish threads.
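As a rough illustration of the "one database statement" idea for change_state (and eventually mark_deleted), here is a minimal sketch that issues a single UPDATE through SQLAlchemy instead of loading and mutating ORM objects. The table and column names are simplified placeholders, not Galaxy's actual schema:

```python
from datetime import datetime, timezone

from sqlalchemy import (
    Column, DateTime, Integer, MetaData, String, Table, create_engine, update,
)

metadata = MetaData()

# Hypothetical, heavily simplified job table for illustration only.
job_table = Table(
    "job",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("state", String(64)),
    Column("update_time", DateTime),
)


def change_state(engine, job_id, new_state):
    """Set the job's state with one UPDATE, without loading the ORM object."""
    stmt = (
        update(job_table)
        .where(job_table.c.id == job_id)
        .values(state=new_state, update_time=datetime.now(timezone.utc))
    )
    with engine.begin() as conn:
        conn.execute(stmt)


if __name__ == "__main__":
    engine = create_engine("sqlite://")
    metadata.create_all(engine)
    with engine.begin() as conn:
        conn.execute(job_table.insert().values(id=1, state="new"))
    change_state(engine, 1, "queued")
```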
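And a minimal sketch of the pre-created worker threads from #3903, assuming the workflow scheduling thread only enqueues job-creation requests while a fixed pool of threads (started at startup) drains the queue; create_job_for_step and the pool size are placeholders, not the actual entry point:

```python
import queue
import threading

job_creation_queue = queue.Queue()


def create_job_for_step(step):
    """Placeholder for the expensive per-step job creation work."""
    print(f"creating job for step {step}")


def worker():
    while True:
        step = job_creation_queue.get()
        if step is None:  # shutdown sentinel
            job_creation_queue.task_done()
            break
        try:
            create_job_for_step(step)
        finally:
            job_creation_queue.task_done()


# Pre-create a small, fixed pool of workers at startup.
workers = [threading.Thread(target=worker, daemon=True) for _ in range(4)]
for t in workers:
    t.start()

# The workflow scheduling thread only enqueues work and waits for it to drain.
for step in range(10):
    job_creation_queue.put(step)
job_creation_queue.join()
for _ in workers:
    job_creation_queue.put(None)
```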
I'm gonna call this one complete (even though we didn't do batch inserts yet). I think the next step is to break pieces down into tasks we can schedule with Celery, where we're not limited by the GIL. Job handlers are often close to 100% CPU utilization, so I don't think threading is going to help much there. It'll also be much easier to scale Celery tasks based on load, and they can be retried and resumed more easily.
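A hedged sketch of what a Celery-based finish step could look like, assuming a configured Celery app; finish_job_task and finish_job are illustrative names, not Galaxy's real task API. The point is that tasks can be retried on transient failures and scaled by load independently of the handler process:

```python
from celery import shared_task


def finish_job(job_id):
    """Placeholder for collecting outputs, setting final state, etc."""
    ...


@shared_task(bind=True, max_retries=3, acks_late=True)
def finish_job_task(self, job_id):
    """Finish a single job outside the handler process."""
    try:
        finish_job(job_id)
    except Exception as exc:
        # Retry with a delay instead of losing the job on a transient failure.
        raise self.retry(exc=exc, countdown=30)
```

With something like this, the handler would just enqueue finish_job_task.delay(job.id) and move on.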