-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix SegmentReplicationIT.testReplicaHasDiffFilesThanPrimary for node-node replication #8912
Conversation
Gradle Check (Jenkins) Run Completed with:
|
|
Gradle Check (Jenkins) Run Completed with:
|
|
server/src/main/java/org/opensearch/indices/replication/OngoingSegmentReplications.java
Show resolved
Hide resolved
server/src/main/java/org/opensearch/indices/replication/OngoingSegmentReplications.java
Show resolved
Hide resolved
Could we also start specifying number of successful runs we did for flaky tests before merging them in for confidence? |
Gradle Check (Jenkins) Run Completed with:
|
ResourceAwareTasksTests.testBasicTaskResourceTracking test failure. #8213
|
Gradle Check (Jenkins) Run Completed with:
|
This test is now failing for node-node replication. On the primary shard the prepareSegmentReplication method should cancel any ongoing replication if it is running and start a new sync. Thisis incorrectly using Map.compute which will not replace the existing handler entry in the allocationIdToHandlers map. It will only cancel the existing source handler. As a result this can leave the copyState map with an entry and hold an index commit while the test is cleaning up. The copyState is only cleared when a handler is cancelled directly or from a cluster state update. Signed-off-by: Marc Handalian <[email protected]>
Signed-off-by: Marc Handalian <[email protected]>
Gradle Check (Jenkins) Run Completed with:
|
Not sure of failure, retrying again.
|
Gradle Check (Jenkins) Run Completed with:
|
Codecov Report
@@ Coverage Diff @@
## main #8912 +/- ##
============================================
- Coverage 71.05% 70.95% -0.10%
+ Complexity 57198 57160 -38
============================================
Files 4758 4758
Lines 269863 269869 +6
Branches 39484 39487 +3
============================================
- Hits 191755 191491 -264
- Misses 61982 62236 +254
- Partials 16126 16142 +16
|
FWIW I've run this ~500 times locally and don't see it failing in other ways, also have a unit for the specific way this was failing. |
Precommit is failing here on windows with:
|
…node replication (#8912) * Fix SegmentReplicationIT.testReplicahasDiffFilesThanPrimary This test is now failing for node-node replication. On the primary shard the prepareSegmentReplication method should cancel any ongoing replication if it is running and start a new sync. Thisis incorrectly using Map.compute which will not replace the existing handler entry in the allocationIdToHandlers map. It will only cancel the existing source handler. As a result this can leave the copyState map with an entry and hold an index commit while the test is cleaning up. The copyState is only cleared when a handler is cancelled directly or from a cluster state update. Signed-off-by: Marc Handalian <[email protected]> * PR feedback. Signed-off-by: Marc Handalian <[email protected]> --------- Signed-off-by: Marc Handalian <[email protected]> (cherry picked from commit 4ad4182) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…node replication (#8912) (#8936) * Fix SegmentReplicationIT.testReplicahasDiffFilesThanPrimary This test is now failing for node-node replication. On the primary shard the prepareSegmentReplication method should cancel any ongoing replication if it is running and start a new sync. Thisis incorrectly using Map.compute which will not replace the existing handler entry in the allocationIdToHandlers map. It will only cancel the existing source handler. As a result this can leave the copyState map with an entry and hold an index commit while the test is cleaning up. The copyState is only cleared when a handler is cancelled directly or from a cluster state update. * PR feedback. --------- (cherry picked from commit 4ad4182) Signed-off-by: Marc Handalian <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…node replication (opensearch-project#8912) * Fix SegmentReplicationIT.testReplicahasDiffFilesThanPrimary This test is now failing for node-node replication. On the primary shard the prepareSegmentReplication method should cancel any ongoing replication if it is running and start a new sync. Thisis incorrectly using Map.compute which will not replace the existing handler entry in the allocationIdToHandlers map. It will only cancel the existing source handler. As a result this can leave the copyState map with an entry and hold an index commit while the test is cleaning up. The copyState is only cleared when a handler is cancelled directly or from a cluster state update. Signed-off-by: Marc Handalian <[email protected]> * PR feedback. Signed-off-by: Marc Handalian <[email protected]> --------- Signed-off-by: Marc Handalian <[email protected]>
…node replication (opensearch-project#8912) * Fix SegmentReplicationIT.testReplicahasDiffFilesThanPrimary This test is now failing for node-node replication. On the primary shard the prepareSegmentReplication method should cancel any ongoing replication if it is running and start a new sync. Thisis incorrectly using Map.compute which will not replace the existing handler entry in the allocationIdToHandlers map. It will only cancel the existing source handler. As a result this can leave the copyState map with an entry and hold an index commit while the test is cleaning up. The copyState is only cleared when a handler is cancelled directly or from a cluster state update. Signed-off-by: Marc Handalian <[email protected]> * PR feedback. Signed-off-by: Marc Handalian <[email protected]> --------- Signed-off-by: Marc Handalian <[email protected]>
…node replication (opensearch-project#8912) * Fix SegmentReplicationIT.testReplicahasDiffFilesThanPrimary This test is now failing for node-node replication. On the primary shard the prepareSegmentReplication method should cancel any ongoing replication if it is running and start a new sync. Thisis incorrectly using Map.compute which will not replace the existing handler entry in the allocationIdToHandlers map. It will only cancel the existing source handler. As a result this can leave the copyState map with an entry and hold an index commit while the test is cleaning up. The copyState is only cleared when a handler is cancelled directly or from a cluster state update. Signed-off-by: Marc Handalian <[email protected]> * PR feedback. Signed-off-by: Marc Handalian <[email protected]> --------- Signed-off-by: Marc Handalian <[email protected]> Signed-off-by: Kaushal Kumar <[email protected]>
…node replication (opensearch-project#8912) * Fix SegmentReplicationIT.testReplicahasDiffFilesThanPrimary This test is now failing for node-node replication. On the primary shard the prepareSegmentReplication method should cancel any ongoing replication if it is running and start a new sync. Thisis incorrectly using Map.compute which will not replace the existing handler entry in the allocationIdToHandlers map. It will only cancel the existing source handler. As a result this can leave the copyState map with an entry and hold an index commit while the test is cleaning up. The copyState is only cleared when a handler is cancelled directly or from a cluster state update. Signed-off-by: Marc Handalian <[email protected]> * PR feedback. Signed-off-by: Marc Handalian <[email protected]> --------- Signed-off-by: Marc Handalian <[email protected]> Signed-off-by: Ivan Brusic <[email protected]>
…node replication (opensearch-project#8912) * Fix SegmentReplicationIT.testReplicahasDiffFilesThanPrimary This test is now failing for node-node replication. On the primary shard the prepareSegmentReplication method should cancel any ongoing replication if it is running and start a new sync. Thisis incorrectly using Map.compute which will not replace the existing handler entry in the allocationIdToHandlers map. It will only cancel the existing source handler. As a result this can leave the copyState map with an entry and hold an index commit while the test is cleaning up. The copyState is only cleared when a handler is cancelled directly or from a cluster state update. Signed-off-by: Marc Handalian <[email protected]> * PR feedback. Signed-off-by: Marc Handalian <[email protected]> --------- Signed-off-by: Marc Handalian <[email protected]> Signed-off-by: Shivansh Arora <[email protected]>
Description
This test is now failing for node-node replication and is a bug. On the primary shard the prepareSegmentReplication method should cancel any ongoing replication if it is running and start a new sync. This is incorrectly using Map.compute which will cancel the existing handler but not replace its entry in the allocationIdToHandlers map. As a result this can leave the copyState map with an entry which holds an index commit while the test is cleaning up. The copyState is only cleared when a handler is cancelled directly or from a cluster state update.
Related Issues
Resolves #7643
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.