
Enable indexing optimization using sequence numbers on replicas #43616

Merged
13 commits merged on Jul 5, 2019

Conversation

dnhatn
Member

@dnhatn dnhatn commented Jun 26, 2019

This PR enables the indexing optimization using sequence numbers on replicas. With this optimization, indexing on replicas should be faster and use less memory as it can forgo the version lookup when possible. This change also deactivates the append-only optimization on replicas.

Relates #34099
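
To illustrate the decision the description refers to, here is a minimal, self-contained sketch (class and method names are illustrative, not Elasticsearch's actual API) of the rule discussed later in this thread: a replica may index an operation as a plain append, skipping the version lookup, when its local checkpoint (LCP) has caught up with the max seq_no of updates or deletes (MSU); otherwise it falls back to a lookup.

```java
// Illustrative sketch only; not Elasticsearch's InternalEngine.
// MSU = max seq_no of update/delete operations, LCP = local checkpoint.
public class ReplicaIndexingSketch {

    /** Decide whether a replica can index an operation as a plain append,
     *  skipping the version/_id lookup, per the rule quoted later in this thread. */
    static boolean canAppendWithoutLookup(long localCheckpoint, long maxSeqNoOfUpdates) {
        // If every update/delete with seq_no <= MSU has already been processed
        // locally (LCP >= MSU), the incoming operation cannot conflict with an
        // unseen update of the same _id, so it is safe to append directly.
        return localCheckpoint >= maxSeqNoOfUpdates;
    }

    public static void main(String[] args) {
        System.out.println(canAppendWithoutLookup(10, 7)); // true  -> append, no lookup
        System.out.println(canAppendWithoutLookup(5, 7));  // false -> versionMap/Lucene lookup
    }
}
```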

@dnhatn dnhatn added the >enhancement, :Distributed Indexing/Engine, v8.0.0, and v7.3.0 labels Jun 26, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

@ywelsch
Copy link
Contributor

ywelsch commented Jun 26, 2019

Thanks for tackling this. I wonder if we should go one step further and remove the maxSeqNoOfNonAppendOnlyOperations optimization for replicas at the same time. I think that adding the optimization here justifies removing the other optimization, as it will largely cover the functionality of that one, and substantially simplify the InternalEngine implementation (which is too complex IMO). I've prototyped that change here: https://github.com/elastic/elasticsearch/compare/master...ywelsch:replica-index-opt?expand=1

@dnhatn
Member Author

dnhatn commented Jun 26, 2019

I wonder if we should go one step further and remove the maxSeqNoOfNonAppendOnlyOperations optimization for replicas at the same time.

I remember that we discussed and agreed to proceed with this change. However, when I was working on this PR I realized we should not remove the append-only optimization in 7.x, as 6.x indices don't have soft-deletes. Are you okay with removing the append-only optimization in 8.x only, as a follow-up?

@ywelsch
Contributor

ywelsch commented Jun 26, 2019

However, when I was working on this PR I realized we should not remove the append-only optimization on 7.x as 6.x indices don't have soft-deletes.

Can't we activate the new optimization also for the case where there are no soft-deletes?

@dnhatn
Member Author

dnhatn commented Jun 26, 2019

Can't we activate the new optimization also for the case where there are no soft-deletes?

Sadly not. The new optimization requires the ability to test if an operation has been processed. This method, in turn, requires soft-deletes to restore the local checkpoint tracker.
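
As a rough illustration of that "has this operation been processed?" check, here is a toy sequence-number tracker (the names and the BitSet representation are hypothetical simplifications, not Elasticsearch's LocalCheckpointTracker). The relevant point is that such a tracker has to be rebuilt from the operations retained in the index when the engine opens, which is what soft-deletes make possible.

```java
import java.util.BitSet;

// Hypothetical, simplified stand-in for a local checkpoint tracker.
// It records which sequence numbers have been processed so a replica can
// answer "have I already seen operation O?" without a version lookup.
class SimpleSeqNoTracker {
    private final BitSet processed = new BitSet();
    private long checkpoint = -1; // highest seq_no n such that all ops <= n are processed

    void markProcessed(long seqNo) {
        processed.set((int) seqNo);
        while (processed.get((int) (checkpoint + 1))) {
            checkpoint++; // advance the local checkpoint over contiguous seq_nos
        }
    }

    boolean hasProcessed(long seqNo) {
        return seqNo <= checkpoint || processed.get((int) seqNo);
    }

    public static void main(String[] args) {
        SimpleSeqNoTracker tracker = new SimpleSeqNoTracker();
        // On engine open, the tracker would be rebuilt by replaying the seq_nos of
        // retained operations -- which is why soft-deletes matter for this check.
        for (long seqNo : new long[] {0, 1, 2, 4}) {
            tracker.markProcessed(seqNo);
        }
        System.out.println(tracker.hasProcessed(3)); // false: a gap, op 3 not yet seen
        System.out.println(tracker.hasProcessed(4)); // true
    }
}
```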

@dnhatn
Member Author

dnhatn commented Jun 26, 2019

With #41161 where we initialize max_seq_no_of_updates in the constructor of InternalEngine, we don't strictly need the LocalCheckpointTracker containing all operations when we open an engine. Thus, we can activate the new optimization without soft-deletes.

@dnhatn
Member Author

dnhatn commented Jun 26, 2019

@ywelsch I have deactivated the append-only optimization and removed the related code on replicas. Sadly we can't remove the logic that transfers the auto_id timestamp from primary to replicas in peer recovery or resync, as this will be required (when a replica becomes primary) if we switch peer recovery to use soft-deletes instead of the translog (see #33693). Please take another look when you have some cycles. Thank you!

@ywelsch
Contributor

ywelsch left a comment

@ywelsch I have deactivated the append-only optimization and removed the related code on replicas. Sadly we can't remove the logic that transfers the auto_id timestamp from primary to replicas in peer recovery or resync, as this will be required (when a replica becomes primary) if we switch peer recovery to use soft-deletes instead of the translog (see #33693).

I'm not sure I'm following this. Would it only be a problem if the original append-only request went to the newly promoted primary whereas the retry had gone to the previous primary? Is that a scenario that can happen?

Perhaps we should disable the append-only optimization for any other case than where origin == Primary?

@dnhatn
Member Author

dnhatn commented Jun 27, 2019

Would it only be a problem if the original append-only request went to the newly promoted primary whereas the retry had gone to the previous primary? Is that a scenario that can happen?

Yeah, I was thinking of that scenario. If we are sure that can't happen, we can remove the transfer timestamp logic.

@ywelsch
Contributor

ywelsch commented Jun 27, 2019

Yeah, I was thinking of that scenario. If we are sure that can't happen, we can remove the transfer timestamp logic.

It's very difficult to construct such a scenario, at least. On the off-chance that it can happen, however, it's best to keep it in for the time being...

Not in this PR, but I wonder if we should simplify the append-only optimization logic so that when we generate the timestamp, we also attach it to the identity of the primary shard instance that we want to send this write to, and only apply the optimization if this matches. This would make our life easier, as there is no need to persist or transfer the max unsafe timestamp. Writes would only be deoptimized during the short time of primary failover or relocation. WDYT?
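
A purely hypothetical sketch of that idea (nothing below is implemented by this PR): the auto-generated-id timestamp is bound to the identity of the primary shard instance it was generated for, e.g. its allocation id, and the optimization is applied only while that identity still matches.

```java
import java.util.Objects;

// Hypothetical sketch of the proposal above -- not implemented in this PR.
// The auto-generated-id timestamp is bound to the primary shard instance the
// write was routed to; if the write lands on a different instance (failover,
// relocation), the append-only optimization is simply skipped.
final class AutoIdWrite {
    final long autoIdTimestamp;
    final String targetPrimaryAllocationId; // identity captured when the timestamp was generated

    AutoIdWrite(long autoIdTimestamp, String targetPrimaryAllocationId) {
        this.autoIdTimestamp = autoIdTimestamp;
        this.targetPrimaryAllocationId = targetPrimaryAllocationId;
    }

    /** Only optimize if this write reached the primary instance it was generated for. */
    boolean mayUseAppendOnlyOptimization(String currentPrimaryAllocationId) {
        return Objects.equals(targetPrimaryAllocationId, currentPrimaryAllocationId);
    }

    public static void main(String[] args) {
        AutoIdWrite write = new AutoIdWrite(System.nanoTime(), "alloc-id-A");
        System.out.println(write.mayUseAppendOnlyOptimization("alloc-id-A")); // true
        System.out.println(write.mayUseAppendOnlyOptimization("alloc-id-B")); // false: deoptimized after failover
    }
}
```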

@dnhatn
Member Author

dnhatn commented Jun 27, 2019

I wonder if we should simplify the append-only optimization logic so that when we generate the timestamp, we also attach it to the identity of the primary shard instance that we want to send this write to, and only apply the optimization if this matches.

This is a great idea :).

@ywelsch
Contributor

ywelsch left a comment

LGTM. Let's benchmark this for both an append-only workload with _id and without _id.

@s1monw
Contributor

s1monw left a comment

LGTM 2

@henningandersen
Contributor

henningandersen left a comment

LGTM.

@jpountz jpountz added v7.4.0 and removed v7.3.0 labels Jul 3, 2019
@dnhatn
Member Author

dnhatn commented Jul 5, 2019

This PR increases indexing throughput by around 13% for append-only indices with external IDs. There's no regression for append-only indices with auto-generated IDs. The details are below.

1. geopoint with external IDs

  • System: 2 GCP instances with 4 cores, 8GB RAM, OpenJDK12, 4GB JVM heap
  • Track: geopoint, challenge: append-fast-with-conflicts
  • Params: conflict_probability=0, conflicts=sequential, number_of_replicas=1
|                                                        Metric |         Task |    Baseline |   Contender |     Diff |   Unit |
|--------------------------------------------------------------:|-------------:|------------:|------------:|---------:|-------:|
|                    Cumulative indexing time of primary shards |              |     219.947 |     218.072 | -1.87513 |    min |
|             Min cumulative indexing time across primary shard |              |           0 |     6.64352 |  6.64352 |    min |
|          Median cumulative indexing time across primary shard |              |     3.35715 |     7.39208 |  4.03492 |    min |
|             Max cumulative indexing time across primary shard |              |     8.03352 |     7.85798 | -0.17553 |    min |
|           Cumulative indexing throttle time of primary shards |              |           0 |           0 |        0 |    min |
|    Min cumulative indexing throttle time across primary shard |              |           0 |           0 |        0 |    min |
| Median cumulative indexing throttle time across primary shard |              |           0 |           0 |        0 |    min |
|    Max cumulative indexing throttle time across primary shard |              |           0 |           0 |        0 |    min |
|                       Cumulative merge time of primary shards |              |     34.9673 |     32.9505 | -2.01678 |    min |
|                      Cumulative merge count of primary shards |              |         189 |         140 |      -49 |        |
|                Min cumulative merge time across primary shard |              |           0 |    0.839783 |  0.83978 |    min |
|             Median cumulative merge time across primary shard |              |    0.498025 |     1.10948 |  0.61146 |    min |
|                Max cumulative merge time across primary shard |              |      1.3509 |      1.3867 |   0.0358 |    min |
|              Cumulative merge throttle time of primary shards |              |     9.06393 |      8.0764 | -0.98753 |    min |
|       Min cumulative merge throttle time across primary shard |              |           0 |    0.195433 |  0.19543 |    min |
|    Median cumulative merge throttle time across primary shard |              |    0.132317 |    0.268858 |  0.13654 |    min |
|       Max cumulative merge throttle time across primary shard |              |    0.428483 |     0.33205 | -0.09643 |    min |
|                     Cumulative refresh time of primary shards |              |      17.168 |     16.6416 | -0.52637 |    min |
|                    Cumulative refresh count of primary shards |              |        1430 |         865 |     -565 |        |
|              Min cumulative refresh time across primary shard |              | 0.000116667 |      0.4789 |  0.47878 |    min |
|           Median cumulative refresh time across primary shard |              |    0.259625 |    0.549917 |  0.29029 |    min |
|              Max cumulative refresh time across primary shard |              |    0.689217 |    0.639017 |  -0.0502 |    min |
|                       Cumulative flush time of primary shards |              |           0 |           0 |        0 |    min |
|                      Cumulative flush count of primary shards |              |          30 |           0 |      -30 |        |
|                Min cumulative flush time across primary shard |              |           0 |           0 |        0 |    min |
|             Median cumulative flush time across primary shard |              |           0 |           0 |        0 |    min |
|                Max cumulative flush time across primary shard |              |           0 |           0 |        0 |    min |
|                                            Total Young Gen GC |              |     1376.83 |     900.146 | -476.684 |      s |
|                                              Total Old Gen GC |              |      15.015 |      11.213 |   -3.802 |      s |
|                                                    Store size |              |     12.7345 |     6.55953 | -6.17501 |     GB |
|                                                 Translog size |              |     9.23828 |     6.61937 | -2.61891 |     GB |
|                                        Heap used for segments |              |     14.5722 |     10.5155 | -4.05669 |     MB |
|                                      Heap used for doc values |              |   0.0798683 |   0.0133057 | -0.06656 |     MB |
|                                           Heap used for terms |              |      11.388 |     8.51849 | -2.86954 |     MB |
|                                           Heap used for norms |              |   0.0881958 |           0 |  -0.0882 |     MB |
|                                          Heap used for points |              |     1.07124 |    0.855252 | -0.21599 |     MB |
|                                   Heap used for stored fields |              |      1.9402 |     1.12296 | -0.81725 |     MB |
|                                                 Segment count |              |         220 |         109 |     -111 |        |
|                                                Min Throughput | index-update |       58880 |     66893.9 |  8013.86 | docs/s |
|                                             Median Throughput | index-update |     64387.5 |     73326.7 |  8939.19 | docs/s |
|                                                Max Throughput | index-update |     79102.5 |     88066.6 |  8964.11 | docs/s |
|                                       50th percentile latency | index-update |      591.14 |     504.508 | -86.6321 |     ms |
|                                       90th percentile latency | index-update |     927.495 |     828.247 | -99.2477 |     ms |
|                                       99th percentile latency | index-update |     1750.14 |     1811.14 |  61.0053 |     ms |
|                                     99.9th percentile latency | index-update |     2362.94 |     2423.28 |  60.3359 |     ms |
|                                    99.99th percentile latency | index-update |     2870.79 |     3074.58 |  203.788 |     ms |
|                                      100th percentile latency | index-update |     3856.69 |      3501.5 | -355.186 |     ms |
|                                  50th percentile service time | index-update |      591.14 |     504.508 | -86.6321 |     ms |
|                                  90th percentile service time | index-update |     927.495 |     828.247 | -99.2477 |     ms |
|                                  99th percentile service time | index-update |     1750.14 |     1811.14 |  61.0053 |     ms |
|                                99.9th percentile service time | index-update |     2362.94 |     2423.28 |  60.3359 |     ms |
|                               99.99th percentile service time | index-update |     2870.79 |     3074.58 |  203.788 |     ms |
|                                 100th percentile service time | index-update |     3856.69 |      3501.5 | -355.186 |     ms |
|                                                    error rate | index-update |           0 |           0 |        0 |      % |

2. geonames with external IDs

  • System: 2 GCP instances with 8 cores, 32GB RAM, OpenJDK12, 12GB JVM heap
  • Track: geonames, challenge: append-fast-with-conflicts
  • Params: conflict_probability=0, conflicts=sequential, number_of_replicas=1
|                                                        Metric |         Task |   Baseline |   Contender |     Diff |   Unit |
|--------------------------------------------------------------:|-------------:|-----------:|------------:|---------:|-------:|
|                    Cumulative indexing time of primary shards |              |    120.002 |     114.185 | -5.81718 |    min |
|             Min cumulative indexing time across primary shard |              |    3.77007 |     3.57435 | -0.19572 |    min |
|          Median cumulative indexing time across primary shard |              |    3.98953 |     3.81185 | -0.17768 |    min |
|             Max cumulative indexing time across primary shard |              |    4.38675 |     4.07968 | -0.30707 |    min |
|           Cumulative indexing throttle time of primary shards |              |          0 |           0 |        0 |    min |
|    Min cumulative indexing throttle time across primary shard |              |          0 |           0 |        0 |    min |
| Median cumulative indexing throttle time across primary shard |              |          0 |           0 |        0 |    min |
|    Max cumulative indexing throttle time across primary shard |              |          0 |           0 |        0 |    min |
|                       Cumulative merge time of primary shards |              |    4.36247 |     7.32043 |  2.95797 |    min |
|                      Cumulative merge count of primary shards |              |          7 |          14 |        7 |        |
|                Min cumulative merge time across primary shard |              |          0 |           0 |        0 |    min |
|             Median cumulative merge time across primary shard |              |          0 |           0 |        0 |    min |
|                Max cumulative merge time across primary shard |              |   0.715967 |       0.768 |  0.05203 |    min |
|              Cumulative merge throttle time of primary shards |              |     0.8336 |     1.32458 |  0.49098 |    min |
|       Min cumulative merge throttle time across primary shard |              |          0 |           0 |        0 |    min |
|    Median cumulative merge throttle time across primary shard |              |          0 |           0 |        0 |    min |
|       Max cumulative merge throttle time across primary shard |              |   0.134833 |    0.152633 |   0.0178 |    min |
|                     Cumulative refresh time of primary shards |              |    12.8071 |     11.9964 | -0.81068 |    min |
|                    Cumulative refresh count of primary shards |              |        271 |         281 |       10 |        |
|              Min cumulative refresh time across primary shard |              |    0.38185 |      0.3405 | -0.04135 |    min |
|           Median cumulative refresh time across primary shard |              |   0.415742 |    0.400575 | -0.01517 |    min |
|              Max cumulative refresh time across primary shard |              |   0.528183 |    0.503033 | -0.02515 |    min |
|                       Cumulative flush time of primary shards |              |          0 |           0 |        0 |    min |
|                      Cumulative flush count of primary shards |              |          0 |           0 |        0 |        |
|                Min cumulative flush time across primary shard |              |          0 |           0 |        0 |    min |
|             Median cumulative flush time across primary shard |              |          0 |           0 |        0 |    min |
|                Max cumulative flush time across primary shard |              |          0 |           0 |        0 |    min |
|                                            Total Young Gen GC |              |    231.216 |     182.913 |  -48.303 |      s |
|                                              Total Old Gen GC |              |      2.795 |       2.631 |   -0.164 |      s |
|                                                    Store size |              |     6.6438 |     6.94349 |  0.29968 |     GB |
|                                                 Translog size |              |     6.3776 |     6.37725 | -0.00035 |     GB |
|                                        Heap used for segments |              |     3.5388 |     3.55654 |  0.01773 |     MB |
|                                      Heap used for doc values |              |  0.0130653 |   0.0349731 |  0.02191 |     MB |
|                                           Heap used for terms |              |    2.38223 |     2.37961 | -0.00262 |     MB |
|                                           Heap used for norms |              |  0.0844116 |   0.0836182 | -0.00079 |     MB |
|                                          Heap used for points |              |   0.279282 |    0.279658 |  0.00038 |     MB |
|                                   Heap used for stored fields |              |   0.784355 |    0.784592 |  0.00024 |     MB |
|                                                 Segment count |              |        107 |         107 |        0 |        |
|                                                Min Throughput | index-update |      35887 |     36820.5 |  933.447 | docs/s |
|                                             Median Throughput | index-update |      40416 |     45718.2 |  5302.27 | docs/s |
|                                                Max Throughput | index-update |      45436 |     50353.2 |  4917.21 | docs/s |
|                                       50th percentile latency | index-update |    819.383 |     715.413 | -103.969 |     ms |
|                                       90th percentile latency | index-update |    1129.09 |     1099.37 | -29.7251 |     ms |
|                                       99th percentile latency | index-update |    5367.39 |     5563.97 |   196.58 |     ms |
|                                     99.9th percentile latency | index-update |    6791.13 |     6884.51 |  93.3792 |     ms |
|                                      100th percentile latency | index-update |     9474.6 |     8217.37 | -1257.23 |     ms |
|                                  50th percentile service time | index-update |    819.383 |     715.413 | -103.969 |     ms |
|                                  90th percentile service time | index-update |    1129.09 |     1099.37 | -29.7251 |     ms |
|                                  99th percentile service time | index-update |    5367.39 |     5563.97 |   196.58 |     ms |
|                                99.9th percentile service time | index-update |    6791.13 |     6884.51 |  93.3792 |     ms |
|                                 100th percentile service time | index-update |     9474.6 |     8217.37 | -1257.23 |     ms |
|                                                    error rate | index-update |          0 |           0 |        0 |      % |

3. geonames with auto-generated IDs

  • System: 2 GCP instances with 4 cores, 8GB RAM, OpenJDK12, 4GB JVM heap
  • Track: geonames, challenge: append-no-conflicts-index-only
  • Params: number_of_replicas=1
|                                                        Metric |         Task |   Baseline |   Contender |     Diff |   Unit |
|--------------------------------------------------------------:|-------------:|-----------:|------------:|---------:|-------:|
|                    Cumulative indexing time of primary shards |              |    134.761 |     135.954 |  1.19237 |    min |
|             Min cumulative indexing time across primary shard |              |    5.04263 |     5.09677 |  0.05413 |    min |
|          Median cumulative indexing time across primary shard |              |    5.39588 |      5.4205 |  0.02462 |    min |
|             Max cumulative indexing time across primary shard |              |    5.63322 |     5.96473 |  0.33152 |    min |
|           Cumulative indexing throttle time of primary shards |              |          0 |           0 |        0 |    min |
|    Min cumulative indexing throttle time across primary shard |              |          0 |           0 |        0 |    min |
| Median cumulative indexing throttle time across primary shard |              |          0 |           0 |        0 |    min |
|    Max cumulative indexing throttle time across primary shard |              |          0 |           0 |        0 |    min |
|                       Cumulative merge time of primary shards |              |     26.929 |     28.2222 |  1.29322 |    min |
|                      Cumulative merge count of primary shards |              |         37 |          37 |        0 |        |
|                Min cumulative merge time across primary shard |              |      0.664 |    0.178233 | -0.48577 |    min |
|             Median cumulative merge time across primary shard |              |     1.0443 |     1.14372 |  0.09942 |    min |
|                Max cumulative merge time across primary shard |              |    1.57423 |     1.67512 |  0.10088 |    min |
|              Cumulative merge throttle time of primary shards |              |    2.18337 |     2.16043 | -0.02293 |    min |
|       Min cumulative merge throttle time across primary shard |              |  0.0653167 |           0 | -0.06532 |    min |
|    Median cumulative merge throttle time across primary shard |              |     0.0866 |   0.0884333 |  0.00183 |    min |
|       Max cumulative merge throttle time across primary shard |              |   0.101433 |    0.104583 |  0.00315 |    min |
|                     Cumulative refresh time of primary shards |              |    20.2955 |     21.0669 |  0.77143 |    min |
|                    Cumulative refresh count of primary shards |              |        404 |         406 |        2 |        |
|              Min cumulative refresh time across primary shard |              |   0.708917 |    0.685433 | -0.02348 |    min |
|           Median cumulative refresh time across primary shard |              |   0.823067 |    0.830633 |  0.00757 |    min |
|              Max cumulative refresh time across primary shard |              |    0.90535 |     0.96625 |   0.0609 |    min |
|                       Cumulative flush time of primary shards |              |    2.28273 |      2.0981 | -0.18463 |    min |
|                      Cumulative flush count of primary shards |              |         25 |          25 |        0 |        |
|                Min cumulative flush time across primary shard |              |     0.0262 |   0.0286167 |  0.00242 |    min |
|             Median cumulative flush time across primary shard |              |    0.08835 |     0.08305 |  -0.0053 |    min |
|                Max cumulative flush time across primary shard |              |    0.16055 |    0.158867 | -0.00168 |    min |
|                                            Total Young Gen GC |              |    350.314 |     365.575 |   15.261 |      s |
|                                              Total Old Gen GC |              |     10.839 |       9.583 |   -1.256 |      s |
|                                                    Store size |              |    6.31862 |     6.77183 |   0.4532 |     GB |
|                                                 Translog size |              |    5.58504 |     5.58571 |  0.00067 |     GB |
|                                        Heap used for segments |              |    4.94931 |     4.88006 | -0.06926 |     MB |
|                                      Heap used for doc values |              |  0.0471001 |   0.0530624 |  0.00596 |     MB |
|                                           Heap used for terms |              |    3.72886 |     3.66448 | -0.06437 |     MB |
|                                           Heap used for norms |              |  0.0928955 |    0.093689 |  0.00079 |     MB |
|                                          Heap used for points |              |   0.279241 |    0.284407 |  0.00517 |     MB |
|                                   Heap used for stored fields |              |   0.803864 |    0.798782 | -0.00508 |     MB |
|                                                 Segment count |              |        119 |         120 |        1 |        |
|                                                Min Throughput | index-append |    21640.9 |       21332 | -308.918 | docs/s |
|                                             Median Throughput | index-append |    23251.7 |     23059.6 | -192.051 | docs/s |
|                                                Max Throughput | index-append |    25633.3 |     25914.7 |  281.398 | docs/s |
|                                       50th percentile latency | index-append |       1599 |     1539.38 | -59.6191 |     ms |
|                                       90th percentile latency | index-append |    2433.62 |     2441.67 |  8.04555 |     ms |
|                                       99th percentile latency | index-append |    6787.85 |     7086.41 |  298.559 |     ms |
|                                     99.9th percentile latency | index-append |    9226.66 |     10113.8 |   887.16 |     ms |
|                                      100th percentile latency | index-append |    14138.6 |     16591.3 |  2452.65 |     ms |
|                                  50th percentile service time | index-append |       1599 |     1539.38 | -59.6191 |     ms |
|                                  90th percentile service time | index-append |    2433.62 |     2441.67 |  8.04555 |     ms |
|                                  99th percentile service time | index-append |    6787.85 |     7086.41 |  298.559 |     ms |
|                                99.9th percentile service time | index-append |    9226.66 |     10113.8 |   887.16 |     ms |
|                                 100th percentile service time | index-append |    14138.6 |     16591.3 |  2452.65 |     ms |
|                                                    error rate | index-append |          0 |           0 |        0 |      % |

@dnhatn
Member Author

dnhatn commented Jul 5, 2019

Unrelated test failure (tracked at #43889).

@elasticmachine run elasticsearch-ci/2

@dnhatn
Member Author

dnhatn commented Jul 5, 2019

Thanks everyone!

@dnhatn dnhatn merged commit 688cf83 into elastic:master Jul 5, 2019
@dnhatn dnhatn deleted the grad_msu branch July 5, 2019 22:55
dnhatn added a commit that referenced this pull request Jul 6, 2019
This PR enables the indexing optimization using sequence numbers on
replicas. With this optimization, indexing on replicas should be faster
and use less memory as it can forgo the version lookup when possible.
This change also deactivates the append-only optimization on replicas.

Relates #34099
@conicliu

conicliu commented Apr 2, 2022

Hi, I have some questions to ask:

When a replica receives an index operation O, it first ensures its own MSU is at least MSU(O), and then compares its MSU to its local checkpoint (LCP).

  • If LCP >= MSU, then we process the index operation as an append.

  • If LCP < MSU, then we will search the versionMap and Lucene segments.

But if there is a gap, meaning LCP < MSU(O) < seqNo(O), what errors would be caused by appending directly? Any response would be appreciated!

@henningandersen
Contributor

@LiuDui it is not clear whether you are after extending the optimization to handle more cases, or whether you think there is a problem here. Can you clarify the intention of your question?

But if there is a gap, meaning LCP < MSU(O) < seqNo(O), what errors would be caused by appending directly?

If LCP < MSU, a check will be made. If we were to append directly in a case where the document has already been updated, we would risk having two documents for the same _id in Lucene, i.e., an inconsistency.
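
To make that inconsistency concrete, here is a toy model (hypothetical; "Lucene" is modeled as a plain list of live id/seq_no pairs): a newer operation on _id "A" has already been indexed on the replica, and an older operation on the same _id then arrives. The lookup path recognizes it as stale, while a blind append leaves two live documents for "A".

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: "Lucene" is a list of live (id, seqNo) docs. It shows why a
// replica must not blindly append when LCP < MSU: a newer operation on the
// same _id may already be present.
class DuplicateIdSketch {
    record LiveDoc(String id, long seqNo) {}

    static final List<LiveDoc> lucene = new ArrayList<>();

    // The safe path: look up the _id first and ignore the op if a newer
    // operation on the same document has already been indexed.
    static void indexWithLookup(String id, long seqNo) {
        for (LiveDoc doc : lucene) {
            if (doc.id().equals(id) && doc.seqNo() >= seqNo) {
                return; // stale op: a newer version of this _id is already live
            }
        }
        lucene.removeIf(doc -> doc.id().equals(id));
        lucene.add(new LiveDoc(id, seqNo));
    }

    // The unsafe path being asked about: append without any lookup.
    static void blindAppend(String id, long seqNo) {
        lucene.add(new LiveDoc(id, seqNo));
    }

    public static void main(String[] args) {
        // A newer operation on "A" (seq_no 6) has already been processed,
        // so MSU >= 6 while the local checkpoint still lags behind.
        indexWithLookup("A", 6);

        // The older operation on "A" (seq_no 3) now arrives out of order.
        // Safe path: the lookup sees seq_no 6 >= 3 and ignores the stale op.
        indexWithLookup("A", 3);
        System.out.println(lucene); // [LiveDoc[id=A, seqNo=6]]

        // Unsafe path: appending without the lookup duplicates the _id.
        blindAppend("A", 3);
        System.out.println(lucene); // [LiveDoc[id=A, seqNo=6], LiveDoc[id=A, seqNo=3]]
    }
}
```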

@conicliu

conicliu commented Apr 2, 2022

@henningandersen Thanks so much for your reply! I understand that when LCP >= MSU, we can append the operation on the replica without any problems. That's safe, and it is the current logic.

According to the documentation, if LCP < MSU then there's a gap: there may be some operations that act on docID(O) about which we do not yet know, so we cannot perform an add.

https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/index/engine/Engine.java/#L1904

But I can't find a situation that would leave two documents for the same _id if we were to append directly in the case of seqNo(O) > MSU(O) > LCP. Maybe I have some misunderstanding. After thinking about it for a long time without finding an answer, I wanted to ask here.

If we were to append directly in a case where the document has already been updated.

What does "the document has already been updated" mean? Is there an example?

@conicliu

conicliu commented Apr 2, 2022

@henningandersen I've suddenly figured it out! Thank you very much!
