Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consolidate Batch client usage and decrease API hits #147

Merged
merged 8 commits into from
Feb 24, 2023

Conversation

lossyrob
Copy link
Contributor

This PR provides improvements on how PCTasks utilizes Azure Batch. This includes reducing the number of Batch clients created and the number of times the API is hit during a workflow run.

It modifies the RemoteWorkflowExecutor and TaskRunner to be context managers so that things like client connections can be managed by the class. The BatchTaskRunner now creates a batch client and uses a single client instead of creating and closing clients per call. This client is used across multiple threads, and thread safety from the client is assumed. This has been tested in multiple runs of a ~50 partition workflow.

In addition, a thread-locked cache is used to minimize the Batch API calls in the method that find failed Batch tasks. This method was being called once per thread, normally at roughly the same moment, and so would put a lot of pressure on the API. With the cache, the Batch API is hit at most a couple of times every 5 seconds for this purpose.

This commit converts the RemoteWorkflowExecutor and TaskRunner
to context managers, and re-uses the batch client in the
BatchTaskRunner.
Prior to this change, terminating a Batch job through
the Batch API would cause the workflow to hang. This
checks the job status to ensure that the job is still
running, and if not, considers it failed.
New release of azure-storage-queue broke mypy without
explicit return type annotation
@lossyrob lossyrob force-pushed the fix/rde/batch-remote branch from 08992da to 2fe5a11 Compare February 23, 2023 15:26
@lossyrob lossyrob merged commit 38bd6a4 into main Feb 24, 2023
@TomAugspurger TomAugspurger deleted the fix/rde/batch-remote branch March 7, 2023 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants