Fix flaky SearchCancellationIT tests to avoid race condition #5656

Merged: 4 commits, Dec 29, 2022

Conversation

dbwiddis
Member

Signed-off-by: Daniel Widdis [email protected]

Description

Thread.sleep() does not sleep for a precise amount of time. From its Javadoc:

Causes the currently executing thread to sleep (temporarily cease execution) for the specified number of milliseconds, subject to the precision and accuracy of system timers and schedulers.

Tests in the SearchCancellationIT class relied on sleeping for exactly the cancellation timeout, which creates potential race conditions and occasional test failures.

This PR adds an extra 100ms of sleep time before checking the cancellation status.
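
The race described above can be reproduced in isolation. The following is a minimal sketch, not the code from SearchCancellationIT: the class name `CancellationRaceSketch`, the method `cancelledAfterSleep`, and the use of a `ScheduledExecutorService` to stand in for the scheduled cancellation task are all assumptions made for illustration. With a margin of zero, the sleeping thread and the scheduled task become due at roughly the same instant, so the check may or may not see the flag set. A positive margin makes the check far more likely to run after the task, though it cannot strictly guarantee it on a heavily loaded machine.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a test that waits on a scheduled cancellation.
final class CancellationRaceSketch {

    /**
     * Schedules a task that sets a flag after timeoutMillis, sleeps for
     * timeoutMillis + marginMillis on the calling thread, and reports
     * whether the flag was set by the time the sleep returned.
     */
    static boolean cancelledAfterSleep(long timeoutMillis, long marginMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicBoolean cancelled = new AtomicBoolean(false);
        try {
            // Stand-in for the scheduled cancellation task.
            scheduler.schedule(() -> cancelled.set(true), timeoutMillis, TimeUnit.MILLISECONDS);
            // With marginMillis == 0 this races against the task above.
            Thread.sleep(timeoutMillis + marginMillis);
            return cancelled.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sleeping", e);
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("no margin:     " + cancelledAfterSleep(200, 0));
        System.out.println("100 ms margin: " + cancelledAfterSleep(200, 100));
    }
}
```

The "no margin" result is deliberately left unspecified because it depends on timer and scheduler behavior, which is the point of the fix.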

Issues Resolved

Fixes #2242
Fixes #2311
Fixes #2763

Check List

  • New functionality includes testing.
    • All tests pass
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@@ -241,7 +241,8 @@ public void testCancellationDuringQueryPhaseUsingRequestParameter() throws Excep
.execute();
awaitForBlock(plugins);
// sleep for cancellation timeout to ensure scheduled cancellation task is actually executed
Member

This comment is no longer accurate. How about a static sleepForTimeout() or similar method so you only have to write the comment once?

Also, any idea why 2 seconds is used everywhere here? Would be great to make that something shorter (like 100ms) to speed up the test.

Member Author

But I was having so much fun copying and pasting with The Key. Sure, I'll make a static call. Not sure what the best minimum should be, but I'm guessing I could make it 1 second at least?

Member

Sure. You'll make it twice as fast!
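
The helper suggested in this thread might look roughly like the sketch below. Only the method name `sleepForTimeout()` comes from the review comment. The class name, the constant names, and the exact values (a 1 second timeout, as floated above, plus the 100 ms margin from the description) are assumptions for illustration, and the merged code may differ.

```java
// Hypothetical sketch of a shared helper; not the merged implementation.
final class SleepForTimeoutSketch {

    // Assumed values: 1 s cancellation timeout, 100 ms safety margin.
    static final long CANCELLATION_TIMEOUT_MILLIS = 1_000;
    static final long SLEEP_MARGIN_MILLIS = 100;

    /**
     * Sleeps slightly longer than the cancellation timeout so the scheduled
     * cancellation task has very likely run before the caller checks its
     * status. Thread.sleep() is only as accurate as the system timers and
     * scheduler, so sleeping for exactly the timeout would race with the task.
     */
    static void sleepForTimeout() {
        try {
            Thread.sleep(CANCELLATION_TIMEOUT_MILLIS + SLEEP_MARGIN_MILLIS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and fail loudly instead of continuing early.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for cancellation", e);
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        sleepForTimeout();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("slept for about " + elapsedMillis + " ms");
    }
}
```

Keeping the timeout and the margin in named constants means the explanatory comment lives in one place, and shortening the wait later is a one-line change.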

codecov-commenter commented Dec 29, 2022

Codecov Report

Merging #5656 (243a102) into main (f2b5044) will increase coverage by 0.14%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##               main    #5656      +/-   ##
============================================
+ Coverage     70.96%   71.10%   +0.14%     
- Complexity    58550    58648      +98     
============================================
  Files          4760     4760              
  Lines        279515   279515              
  Branches      40348    40348              
============================================
+ Hits         198367   198762     +395     
+ Misses        65037    64576     -461     
- Partials      16111    16177      +66     
Impacted Files Coverage Δ
...pensearch/client/cluster/RemoteConnectionInfo.java 0.00% <0.00%> (-73.18%) ⬇️
...a/org/opensearch/client/cluster/SniffModeInfo.java 0.00% <0.00%> (-58.83%) ⬇️
...a/org/opensearch/client/cluster/ProxyModeInfo.java 0.00% <0.00%> (-55.00%) ⬇️
.../opensearch/client/indices/CloseIndexResponse.java 42.50% <0.00%> (-48.75%) ⬇️
.../admin/cluster/reroute/ClusterRerouteResponse.java 60.00% <0.00%> (-40.00%) ⬇️
.../opensearch/client/cluster/RemoteInfoResponse.java 61.53% <0.00%> (-38.47%) ⬇️
...luster/routing/allocation/RoutingExplanations.java 62.06% <0.00%> (-37.94%) ⬇️
...ations/bucket/terms/heuristic/ScriptHeuristic.java 5.55% <0.00%> (-31.49%) ⬇️
...cluster/routing/allocation/RerouteExplanation.java 70.00% <0.00%> (-30.00%) ⬇️
...pensearch/client/core/MultiTermVectorsRequest.java 41.17% <0.00%> (-29.42%) ⬇️
... and 468 more

@dbwiddis dbwiddis requested a review from andrross December 29, 2022 18:49
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      2 org.opensearch.cluster.service.MasterServiceTests.classMethod
      1 org.opensearch.cluster.service.MasterServiceTests.testThrottlingForMultipleTaskTypes

Signed-off-by: Daniel Widdis <[email protected]>
@andrross andrross merged commit d248643 into opensearch-project:main Dec 29, 2022
@andrross andrross added the backport 2.x Backport to 2.x branch label Dec 29, 2022
opensearch-trigger-bot bot pushed a commit that referenced this pull request Dec 29, 2022
* Add waiting time to account for Thread.sleep inaccuracy

Signed-off-by: Daniel Widdis <[email protected]>
(cherry picked from commit d248643)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
andrross pushed a commit that referenced this pull request Dec 29, 2022
…5657)

* Add waiting time to account for Thread.sleep inaccuracy

Signed-off-by: Daniel Widdis <[email protected]>
(cherry picked from commit d248643)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Signed-off-by: Daniel Widdis <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@dbwiddis dbwiddis deleted the searchCancellation branch December 29, 2022 21:48
kotwanikunal pushed a commit that referenced this pull request Jan 25, 2023
Labels
backport 2.x Backport to 2.x branch skip-changelog