Fix flaky BWC test #296

Merged
merged 1 commit into from
Feb 8, 2022
@@ -59,18 +59,17 @@ class AlertingBackwardsCompatibilityIT : AlertingRestTestCase() {
ClusterType.MIXED -> {
assertTrue(pluginNames.contains("opensearch-alerting"))
verifyMonitorExists(LEGACY_OPENDISTRO_ALERTING_BASE_URI)
// Waiting a minute to ensure the Monitor ran again at least once before checking if the job is running
// on time
// TODO: Should probably change the next execution time of the Monitor manually instead since this inflates
// the test execution by a lot
Thread.sleep(60000)
// TODO: Need to move the base URI being used here into a constant and rename ALERTING_BASE_URI to
// MONITOR_BASE_URI
verifyMonitorStats("/_opendistro/_alerting")
}
ClusterType.UPGRADED -> {
assertTrue(pluginNames.contains("opensearch-alerting"))
verifyMonitorExists(ALERTING_BASE_URI)
// TODO: Change the next execution time of the Monitor manually instead since this inflates
// the test execution by a lot (might have to wait for Job Scheduler plugin integration first)
// Waiting a minute to ensure the Monitor ran again at least once before checking if the job is running
// on time
Thread.sleep(60000)
Comment on lines +69 to 73
Member

A consistent way to avoid sleeps is usually assertBusy, provided we expect some change of state within a stipulated time threshold. It runs the code block repeatedly for the provided interval, waiting until no assertions trip, and returns as soon as all the assertions are met. So I'm wondering if there exists some state related to monitor execution, such as a count, that we can rely on as a trigger here before verifying the stats response.

Contributor Author

@qreshi qreshi Feb 1, 2022

The reason for the sleep was that the minimum Monitor execution interval is 1 minute, and we wanted to ensure the Monitor executed again before checking the stats so that any late-running Monitor would manifest itself. The test itself is not perfect. We'd normally just fast-forward the job execution like we do for the Index Management test, but Alerting uses an older job scheduling implementation where the next job execution time is not stored on the document but in some ConcurrentHashMap object that is not test friendly (it is not easily mutable from written tests).

Member

Thanks for clarifying @qreshi. My suggestion was: is there a way (API/object) to tell whether the monitor has run again (such as a run counter)? If yes, then we can use assertBusy, which provides an exponential backoff mechanism to wait until the condition is met.
If not, I understand that sleep is the only valid option.
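
For illustration, here is a minimal, self-contained Kotlin sketch of the retry-with-backoff pattern discussed in this thread. It is not the test framework's assertBusy and does not use any Alerting API: `assertBusySketch`, `waitForSimulatedExecution`, and the `executionCount` counter are all hypothetical names, and whether Alerting exposes a comparable execution counter is exactly the open question above.

```kotlin
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for an assertBusy-style helper (NOT the framework's implementation).
// Re-runs `block` until it stops throwing AssertionError or the time budget is spent,
// doubling the pause between attempts.
fun assertBusySketch(maxWaitMillis: Long, block: () -> Unit) {
    val deadline = System.nanoTime() + maxWaitMillis * 1_000_000
    var delayMillis = 1L
    while (true) {
        try {
            block()
            return // all assertions passed, so return immediately
        } catch (e: AssertionError) {
            val remainingMillis = (deadline - System.nanoTime()) / 1_000_000
            if (remainingMillis <= 0) throw e // budget exhausted: surface the last failure
            Thread.sleep(minOf(delayMillis, remainingMillis))
            delayMillis *= 2 // exponential backoff
        }
    }
}

// Simulates a monitor that runs once after a short delay and bumps a counter.
// `executionCount` is an assumed piece of observable state, not a real Alerting field.
fun waitForSimulatedExecution(): Int {
    val executionCount = AtomicInteger(0)
    val worker = Thread {
        Thread.sleep(50)
        executionCount.incrementAndGet()
    }
    worker.start()
    assertBusySketch(maxWaitMillis = 5_000) {
        if (executionCount.get() < 1) {
            throw AssertionError("monitor has not run again yet")
        }
    }
    worker.join()
    return executionCount.get()
}
```

With observable state like this, the wait ends as soon as the condition holds instead of always costing a fixed 60 seconds. Without it, a fixed sleep remains the fallback.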

verifyMonitorStats("/_plugins/_alerting")
}