Make cancel() synchronous in DBMAsyncJob #14717
@@ -809,9 +809,6 @@ def test_async_job_cancel(aggregator, dd_run_check, dbm_instance):
    mysql_check = MySql(common.CHECK_NAME, {}, [dbm_instance])
    dd_run_check(mysql_check)
    mysql_check.cancel()
    # wait for it to stop and make sure it doesn't throw any exceptions
    mysql_check._statement_samples._job_loop_future.result()
    mysql_check._statement_metrics._job_loop_future.result()
nice
        self._cancel_event.set()

        if self._job_loop_future is not None:
            self._job_loop_future.result(10)
Hey @jmeunier28 @edengorevoy 👋 Keep in mind that you'll have to bump the minimum base package version in your integrations if they now rely on this change 🙂 (e.g. https://github.com/DataDog/integrations-core/blob/master/mysql/pyproject.toml#L33)
What does this PR do?
This PR adds some logic to the cancel() method of DBMAsyncJob to ensure that cancel() is a synchronous operation.
This PR also edits some tests in the postgres, MySQL, and sqlserver agent integrations (those that inherit from DBMAsyncJob) to try to make them less flaky. The change adds a run_one_check function in the postgres test utils that runs a check, cancels it, and waits for all threads to finish, so that assertions on metrics happen in a single-threaded environment.
Motivation
We are running into some testing issues in this PR. The errors are potentially race conditions that arise at runtime because the metrics list in the Aggregator is iterated over while separate threads owned by DBM are still adding to it.
We should make sure cancellation is synchronous in DBMAsyncJob to further debug this issue and verify that it is the source of these errors.
DBMAsyncJob is only used by objects owned by DBM, so making this change should not cause issues for other teams.
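For illustration, the synchronous cancellation described above can be sketched roughly as follows. This is a minimal stand-in, not the real DBMAsyncJob: the class name SketchAsyncJob, the start() method, the interval parameter, and the iterations counter are all hypothetical. Only the cancel event, the job loop future, and the 10-second wait mirror the diff in this PR.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class SketchAsyncJob:
    """Hypothetical stand-in for a DBMAsyncJob-like class (not the real code)."""

    def __init__(self, interval=0.01):
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._cancel_event = threading.Event()
        self._job_loop_future = None
        self._interval = interval  # assumed polling interval, in seconds
        self.iterations = 0

    def start(self):
        # Run the loop on a background thread and keep its future around.
        self._job_loop_future = self._executor.submit(self._job_loop)

    def _job_loop(self):
        # Event.wait() returns True once the event is set, which ends the loop.
        while not self._cancel_event.wait(self._interval):
            self.iterations += 1

    def cancel(self):
        self._cancel_event.set()
        if self._job_loop_future is not None:
            # Block until the loop has exited. result() re-raises any exception
            # from the loop and raises a timeout error after 10 seconds.
            self._job_loop_future.result(10)


job = SketchAsyncJob()
job.start()
time.sleep(0.05)
job.cancel()
# By the time cancel() returns, the background loop has finished.
assert job._job_loop_future.done()
```

Because cancel() only returns after the future resolves, a caller no longer needs to wait on the future separately, which is why the explicit .result() calls become unnecessary in the test shown earlier.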
Additional Notes
I intend to rebase the above PR on this branch so I can see how changing the condition for starting the job loop affects the tests, and I want to use this modified cancel() function to facilitate that investigation.
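As a rough sketch of how a run-then-cancel test helper like the run_one_check function mentioned in the description might behave, assuming a much simpler check than the real integrations: FakeCheck, its metrics list, and the helper's signature here are hypothetical, and the example joins a plain thread rather than waiting on a future.

```python
import threading
import time


class FakeCheck:
    """Hypothetical check with one background job appending to a shared list."""

    def __init__(self):
        self.metrics = []
        self._cancel_event = threading.Event()
        self._thread = None

    def run(self):
        self.metrics.append("check.run")
        if self._thread is None:
            self._thread = threading.Thread(target=self._job_loop, daemon=True)
            self._thread.start()

    def _job_loop(self):
        # Keep emitting samples until cancellation is requested.
        while not self._cancel_event.wait(0.005):
            self.metrics.append("job.sample")

    def cancel(self):
        self._cancel_event.set()
        if self._thread is not None:
            # Wait for the background thread to stop before returning.
            self._thread.join(10)


def run_one_check(check):
    """Run the check once, then cancel it and wait for its thread to stop."""
    check.run()
    check.cancel()
    return check


check = run_one_check(FakeCheck())
snapshot = list(check.metrics)
time.sleep(0.02)
# No background thread is still writing, so the list no longer changes.
assert check.metrics == snapshot
```

Once the helper returns, only the test thread touches the metrics, so assertions cannot race with a background writer.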
Review checklist (to be filled by reviewers)
changelog/ and integration/ labels attached
qa/skip-qa label.