Fix flaky test SegmentReplicationWithRemoteStorePressureIT.testAddReplicaWhileWritesBlocked #9501

Merged
merged 1 commit into from
Aug 23, 2023

Conversation

mch2
Member

@mch2 mch2 commented Aug 23, 2023

Description

This test fails on ensureGreen after adding a replica. The ensureGreen call runs inside the try-with-resources block that blocks operations. The block works by mocking transport calls so that segrep cannot complete until the block is released. That prevents force-sync recovery of the replica from completing, which stalls recovery and eventually causes ensureGreen to time out. Fixed by moving ensureGreen to after blockOperations is released.
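
The ordering problem can be illustrated with a small, self-contained sketch. This is not the test's code: `OperationBlock`, `startRecovery`, and the `ensureGreen` helper below are hypothetical stand-ins built on plain `java.util.concurrent` types, and the timeouts are arbitrary. It only models the idea that a health wait cannot succeed while the resource that stalls recovery is still open.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BlockedRecoveryDemo {

    /** Hypothetical stand-in for blockOperations: holds recovery until closed. */
    static final class OperationBlock implements AutoCloseable {
        private final CountDownLatch released = new CountDownLatch(1);

        void awaitRelease() throws InterruptedException {
            released.await();
        }

        @Override
        public void close() {
            released.countDown();
        }
    }

    /** Starts a fake replica recovery that cannot finish until the block is released. */
    static CompletableFuture<Void> startRecovery(OperationBlock block) {
        return CompletableFuture.runAsync(() -> {
            try {
                block.awaitRelease();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        });
    }

    /** Stand-in for ensureGreen: true only if recovery completes within the timeout. */
    static boolean ensureGreen(CompletableFuture<Void> recovery, long timeoutMillis) {
        try {
            recovery.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Old ordering: wait for green while the block is still held, so the wait times out. */
    static boolean waitInsideBlock() {
        try (OperationBlock block = new OperationBlock()) {
            return ensureGreen(startRecovery(block), 200);
        }
    }

    /** Fixed ordering: leave the try-with-resources first, then wait for green. */
    static boolean waitAfterRelease() {
        CompletableFuture<Void> recovery;
        try (OperationBlock block = new OperationBlock()) {
            recovery = startRecovery(block);
        }
        return ensureGreen(recovery, 5_000);
    }

    public static void main(String[] args) {
        System.out.println("inside block: " + waitInsideBlock());   // prints "inside block: false"
        System.out.println("after release: " + waitAfterRelease()); // prints "after release: true"
    }
}
```

In this model the first ordering can never report green, because the only thing that lets recovery finish is closing the block, and that happens after the wait has already given up.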

This PR also reduces the doc count used while indexing, down from a max of 200. Writes with the remote store version of this test take much longer to execute when performed serially, and we don't need this many docs indexed to create the needed checkpoints.

Related Issues

Resolves #8887

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…licaWhileWritesBlocked.

This test fails on ensureGreen after adding a replica. This is run inside of the try with resources that blocks operations.  The block works by mocking transport calls to prevent segrep from completing until released.

Fixed by moving the ensureGreen until after releasing blockOperations. Also reduced the doc count that is used while indexing down from max 200.  Writes with the remote store version of this test take a much longer time to execute when performed serially, and we don't need this many docs indexed to create needed checkpoints.

Signed-off-by: Marc Handalian <[email protected]>
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@opensearch-trigger-bot
Contributor

Compatibility status:

Checks if related components are compatible with change 5d3633c

Incompatible components

Incompatible components: [https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/security-analytics.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@mch2
Member Author

mch2 commented Aug 23, 2023

Gradle Check (Jenkins) Run Completed with:

Task :test:fixtures:krb5kdc-fixture:composeBuild FAILED

@mch2
Member Author

mch2 commented Aug 23, 2023

Gradle Check (Jenkins) Run Completed with:

[org.opensearch.indices.replication.SegmentReplicationSuiteIT.testDropRandomNodeDuringReplication](https://build.ci.opensearch.org/job/gradle-check/23276/testReport/junit/org.opensearch.indices.replication/SegmentReplicationSuiteIT/testDropRandomNodeDuringReplication/)
[org.opensearch.indices.replication.SegmentReplicationSuiteIT.testFullRestartDuringReplication](https://build.ci.opensearch.org/job/gradle-check/23276/testReport/junit/org.opensearch.indices.replication/SegmentReplicationSuiteIT/testFullRestartDuringReplication/)
[org.opensearch.indices.replication.SegmentReplicationSuiteIT.testDeleteIndexWhileReplicating](https://build.ci.opensearch.org/job/gradle-check/23276/testReport/junit/org.opensearch.indices.replication/SegmentReplicationSuiteIT/testDeleteIndexWhileReplicating/)
[org.opensearch.indices.replication.SegmentReplicationSuiteIT.classMethod](https://build.ci.opensearch.org/job/gradle-check/23276/testReport/junit/org.opensearch.indices.replication/SegmentReplicationSuiteIT/classMethod/)

#9499

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@codecov

codecov bot commented Aug 23, 2023

Codecov Report

Merging #9501 (67bd594) into main (ebdffbb) will decrease coverage by 0.08%.
Report is 5 commits behind head on main.
The diff coverage is n/a.

@@             Coverage Diff              @@
##               main    #9501      +/-   ##
============================================
- Coverage     71.20%   71.12%   -0.08%     
+ Complexity    57474    57466       -8     
============================================
  Files          4776     4776              
  Lines        270815   270815              
  Branches      39584    39584              
============================================
- Hits         192835   192625     -210     
- Misses        61741    61983     +242     
+ Partials      16239    16207      -32     

see 451 files with indirect coverage changes

Collaborator

@tlfeng tlfeng left a comment

Thanks for the fix with detailed explanation!

@tlfeng tlfeng added the ":test" (Adding or fixing a test) and "backport 2.x" (Backport to 2.x branch) labels Aug 23, 2023
@mch2 mch2 merged commit d1678ba into opensearch-project:main Aug 23, 2023
19 of 43 checks passed
@mch2 mch2 deleted the 8887 branch August 23, 2023 18:50
opensearch-trigger-bot bot pushed a commit that referenced this pull request Aug 23, 2023
…licaWhileWritesBlocked. (#9501)

Signed-off-by: Marc Handalian <[email protected]>
(cherry picked from commit d1678ba)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
neetikasinghal pushed a commit to neetikasinghal/OpenSearch that referenced this pull request Aug 23, 2023
…licaWhileWritesBlocked. (opensearch-project#9501)

Signed-off-by: Marc Handalian <[email protected]>
mch2 pushed a commit that referenced this pull request Aug 23, 2023
…licaWhileWritesBlocked. (#9501) (#9518)

(cherry picked from commit d1678ba)

Signed-off-by: Marc Handalian <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
austintlee pushed a commit to austintlee/OpenSearch that referenced this pull request Aug 25, 2023
…licaWhileWritesBlocked. (opensearch-project#9501)

Signed-off-by: Marc Handalian <[email protected]>
Gaganjuneja pushed a commit to Gaganjuneja/OpenSearch that referenced this pull request Aug 28, 2023
…licaWhileWritesBlocked. (opensearch-project#9501)

Signed-off-by: Marc Handalian <[email protected]>
Gaganjuneja pushed a commit to Gaganjuneja/OpenSearch that referenced this pull request Aug 28, 2023
…licaWhileWritesBlocked. (opensearch-project#9501)

Signed-off-by: Marc Handalian <[email protected]>
Signed-off-by: Gagan Juneja <[email protected]>
kkmr pushed a commit to kkmr/OpenSearch that referenced this pull request Aug 28, 2023
…licaWhileWritesBlocked. (opensearch-project#9501)

Signed-off-by: Marc Handalian <[email protected]>
Signed-off-by: Kiran Reddy <[email protected]>
kaushalmahi12 pushed a commit to kaushalmahi12/OpenSearch that referenced this pull request Sep 12, 2023
…licaWhileWritesBlocked. (opensearch-project#9501)

Signed-off-by: Marc Handalian <[email protected]>
Signed-off-by: Kaushal Kumar <[email protected]>
brusic pushed a commit to brusic/OpenSearch that referenced this pull request Sep 25, 2023
…licaWhileWritesBlocked. (opensearch-project#9501)

Signed-off-by: Marc Handalian <[email protected]>
Signed-off-by: Ivan Brusic <[email protected]>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…licaWhileWritesBlocked. (opensearch-project#9501)

Signed-off-by: Marc Handalian <[email protected]>
Signed-off-by: Shivansh Arora <[email protected]>
Labels
backport 2.x (Backport to 2.x branch), skip-changelog, :test (Adding or fixing a test)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] SegmentReplicationWithRemoteStorePressureIT.testAddReplicaWhileWritesBlocked flaky test failure
3 participants