[Segment Replication] Remove Doc Parsing for segment replication enabled replica shard during translog replay from recovery #9002

Rishikesh1159
Member

Description

This PR removes the doc parsing step for segment replication enabled replica shards, which is called during translog replay in phase 2 of recovery.

There are two ways an operation can be written into a replica shard's translog:
-> First way: from applyIndexOperationOnReplica(), where we skip doc parsing and set routing to null.
-> Second way: from applyTranslogOperation(), where we do not skip doc parsing and parse the actual routing value from the input.

When phase 2 of recovery starts for a replica shard, the shard opens its engine so it can write new translog operations (the first way mentioned above), and later the translog ops from the primary shard are replayed during recovery (the second way mentioned above). Sometimes a translog operation has already been written to the replica shard but is still replayed from the primary. Usually the translog writer recognizes that the same operation is being written again and does a no-op if it is already present.
But in the case of a segrep replica shard, the translog writer is sometimes unable to recognize the two as the same operation, because the routing value in one operation is null while the routing value in the other is the actual parsed routing value. As a result, we sometimes hit the error reported in #8848.
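The mismatch described above can be illustrated with a minimal sketch. It assumes a toy writer that keys operations by sequence number and compares the full operation on a repeat write. `IndexOp` and `ToyTranslogWriter` are hypothetical names invented for this illustration and are not OpenSearch classes; the real translog writer's duplicate check may differ in detail.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class TranslogReplaySketch {

    /** Hypothetical stand-in for a translog index operation. */
    static final class IndexOp {
        final long seqNo;
        final String id;
        final String routing; // null when doc parsing was skipped
        final String source;

        IndexOp(long seqNo, String id, String routing, String source) {
            this.seqNo = seqNo;
            this.id = id;
            this.routing = routing;
            this.source = source;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof IndexOp)) {
                return false;
            }
            IndexOp other = (IndexOp) o;
            return seqNo == other.seqNo
                && Objects.equals(id, other.id)
                && Objects.equals(routing, other.routing)
                && Objects.equals(source, other.source);
        }

        @Override
        public int hashCode() {
            return Objects.hash(seqNo, id, routing, source);
        }
    }

    /** Hypothetical writer: a repeat of an identical op is a no-op, a differing op is a conflict. */
    static final class ToyTranslogWriter {
        private final Map<Long, IndexOp> seen = new HashMap<>();

        /** Returns true if the op was recorded, false if it was an identical duplicate. */
        boolean add(IndexOp op) {
            IndexOp previous = seen.get(op.seqNo);
            if (previous == null) {
                seen.put(op.seqNo, op);
                return true;
            }
            if (previous.equals(op)) {
                return false; // same operation seen again: no-op
            }
            throw new IllegalStateException("seqNo [" + op.seqNo + "] written twice with different data");
        }
    }

    public static void main(String[] args) {
        ToyTranslogWriter writer = new ToyTranslogWriter();
        // First way: written on the replica with parsing skipped, so routing is null.
        writer.add(new IndexOp(7L, "doc-1", null, "{\"f\":1}"));
        try {
            // Second way: the same op replayed from the primary, with a parsed routing value.
            writer.add(new IndexOp(7L, "doc-1", "r1", "{\"f\":1}"));
        } catch (IllegalStateException e) {
            System.out.println("conflict: " + e.getMessage());
        }
    }
}
```

In this sketch the two writes carry the same sequence number, id, and source, and only the routing field differs, which is enough for the duplicate check to treat them as different operations.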

To avoid this situation, this PR also removes doc parsing in the second way on a segrep enabled replica shard. This way the doc parsing step is removed completely for a segrep enabled replica shard.
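The effect of the change can be sketched as a small decision function. This is only an illustration under stated assumptions: `WritePath`, `routingBeforeFix`, and `routingAfterFix` are hypothetical names, and the sketch models only which routing value ends up in the translog, not the actual IndexShard code.

```java
import java.util.Objects;

class ReplayRoutingSketch {

    /** Hypothetical labels for the two write paths described in the PR. */
    enum WritePath { REPLICA_WRITE, TRANSLOG_REPLAY }

    /** Before the change: only the replica write path skips parsing on a segrep replica. */
    static String routingBeforeFix(boolean segRepReplica, WritePath path, String parsedRouting) {
        if (segRepReplica && path == WritePath.REPLICA_WRITE) {
            return null; // parsing skipped, routing recorded as null
        }
        return parsedRouting; // parsing happens, routing comes from the input
    }

    /** After the change: a segrep replica skips parsing on both paths, so path is not consulted. */
    static String routingAfterFix(boolean segRepReplica, WritePath path, String parsedRouting) {
        if (segRepReplica) {
            return null;
        }
        return parsedRouting;
    }

    /** True when both paths would record the same routing value for the same operation. */
    static boolean pathsAgree(String viaReplicaWrite, String viaReplay) {
        return Objects.equals(viaReplicaWrite, viaReplay);
    }

    public static void main(String[] args) {
        boolean before = pathsAgree(
            routingBeforeFix(true, WritePath.REPLICA_WRITE, "r1"),
            routingBeforeFix(true, WritePath.TRANSLOG_REPLAY, "r1"));
        boolean after = pathsAgree(
            routingAfterFix(true, WritePath.REPLICA_WRITE, "r1"),
            routingAfterFix(true, WritePath.TRANSLOG_REPLAY, "r1"));
        System.out.println("paths agree before fix: " + before);
        System.out.println("paths agree after fix: " + after);
    }
}
```

With both paths recording a null routing on a segrep replica, an operation that arrives twice carries the same data each time, so the translog writer can treat the second write as a no-op.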

Related Issues

Resolves #8848

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…ng translog replay from recovery.

Signed-off-by: Rishikesh1159 <[email protected]>
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.smoketest.SmokeTestMultiNodeClientYamlTestSuiteIT.test {yaml=pit/10_basic/Delete all}

@Rishikesh1159
Member Author

Gradle Check (Jenkins) Run Completed with:

#8887

@mch2
Member

mch2 commented Aug 1, 2023

Thanks for catching this @Rishikesh1159. While we are still working on these tests randomly using segrep, can we add a segrep version of the IT that caught this that always runs?

…cation enabled replica shard.

Signed-off-by: Rishikesh1159 <[email protected]>
@github-actions
Contributor

github-actions bot commented Aug 2, 2023

Gradle Check (Jenkins) Run Completed with:

@opensearch-trigger-bot
Contributor

Compatibility status:



> Task :checkCompatibility
Incompatible components: [https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/security-analytics.git]
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

BUILD SUCCESSFUL in 27m 14s

@opensearch-trigger-bot
Contributor

Compatibility status:



> Task :checkCompatibility
Incompatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/performance-analyzer.git]
Compatible components: [https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

BUILD SUCCESSFUL in 36m 48s

@github-actions
Contributor

github-actions bot commented Aug 2, 2023

Gradle Check (Jenkins) Run Completed with:

@opensearch-trigger-bot
Contributor

Compatibility status:



> Task :checkCompatibility
Incompatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/security-analytics.git]
Compatible components: [https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

BUILD SUCCESSFUL in 35m 2s

@github-actions
Contributor

github-actions bot commented Aug 3, 2023

Gradle Check (Jenkins) Run Completed with:

@github-actions
Contributor

github-actions bot commented Aug 3, 2023

Gradle Check (Jenkins) Run Completed with:

@codecov

codecov bot commented Aug 3, 2023

Codecov Report

Merging #9002 (b0ebb6f) into main (8afb22a) will increase coverage by 0.08%.
Report is 14 commits behind head on main.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main    #9002      +/-   ##
============================================
+ Coverage     71.03%   71.12%   +0.08%     
- Complexity    57233    57313      +80     
============================================
  Files          4765     4765              
  Lines        270334   270335       +1     
  Branches      39538    39539       +1     
============================================
+ Hits         192040   192278     +238     
+ Misses        62089    61842     -247     
- Partials      16205    16215      +10     
Files Changed Coverage Δ
...in/java/org/opensearch/index/shard/IndexShard.java 69.11% <100.00%> (-0.15%) ⬇️

... and 468 files with indirect coverage changes

Member

@mch2 mch2 left a comment

Thanks for finding this @Rishikesh1159 ! Only a nit so approved.

@opensearch-trigger-bot
Contributor

Compatibility status:


> Task :checkCompatibility
Checking compatibility for: https://github.com/opensearch-project/reporting.git with ref: main
Incompatible components: [https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/ml-commons.git]
Compatible components: [https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git]

BUILD SUCCESSFUL in 22m 38s

@github-actions
Contributor

github-actions bot commented Aug 6, 2023

Gradle Check (Jenkins) Run Completed with:

@github-actions
Contributor

github-actions bot commented Aug 6, 2023

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      3 org.opensearch.indices.replication.SegmentReplicationIT.testDropPrimaryDuringReplication
      1 org.opensearch.snapshots.DedicatedClusterSnapshotRestoreIT.testIndexDeletionDuringSnapshotCreationInQueue

@Rishikesh1159 Rishikesh1159 added the backport 2.x Backport to 2.x branch label Aug 6, 2023
@Rishikesh1159 Rishikesh1159 merged commit 435408b into opensearch-project:main Aug 6, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Aug 6, 2023
…led replica shard during translog replay from recovery (#9002)

* Remove Doc Parsing for segment replication enabled replica shard during translog replay from recovery.

Signed-off-by: Rishikesh1159 <[email protected]>

* Adding unit test to verify document is not parsed on an segment replication enabled replica shard.

Signed-off-by: Rishikesh1159 <[email protected]>

* remove unnecessary unit tests and address comments.

Signed-off-by: Rishikesh1159 <[email protected]>

* address comments on PR.

Signed-off-by: Rishikesh1159 <[email protected]>

---------

Signed-off-by: Rishikesh1159 <[email protected]>
(cherry picked from commit 435408b)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Rishikesh1159 pushed a commit that referenced this pull request Aug 7, 2023
…led replica shard during translog replay from recovery (#9002) (#9140)

* Remove Doc Parsing for segment replication enabled replica shard during translog replay from recovery.

* Adding unit test to verify document is not parsed on an segment replication enabled replica shard.

* remove unnecessary unit tests and address comments.

* address comments on PR.

---------

(cherry picked from commit 435408b)

Signed-off-by: Rishikesh1159 <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
kaushalmahi12 pushed a commit to kaushalmahi12/OpenSearch that referenced this pull request Sep 12, 2023
…led replica shard during translog replay from recovery (opensearch-project#9002)

Signed-off-by: Kaushal Kumar <[email protected]>
brusic pushed a commit to brusic/OpenSearch that referenced this pull request Sep 25, 2023
…led replica shard during translog replay from recovery (opensearch-project#9002)

Signed-off-by: Ivan Brusic <[email protected]>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…led replica shard during translog replay from recovery (opensearch-project#9002)

Signed-off-by: Shivansh Arora <[email protected]>
Labels
backport 2.x Backport to 2.x branch skip-changelog
3 participants