[Segment Replication] Reduce/remove document parsing on replicas #7164

andrross · 2023-04-14T18:05:01Z

When using segment replication, TransportShardBulkAction.performOpOnReplica() is still invoked on replicas so that each document is persisted to the replica's translog, which gives the same durability guarantees as document replication. Here are two CPU profiles of the same workload, one for document replication (docrep) and one for segment replication (segrep):

[Image: CPU profile, Document Replication]

[Image: CPU profile, Segment Replication]

The profiles show that NRTReplicationEngine does not invoke IndexWriter, whereas InternalEngine does. This is expected, since with segment replication the replicas do not need to index the document. However, a lot of CPU time is spent in DocumentMapper#parse in both cases. The question is: can this be refactored so that, with segment replication, the replicas avoid some or all of that document parsing?
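To make the question concrete, here is a minimal, self-contained sketch of the idea. It is not OpenSearch code and not necessarily how a fix would be implemented: `Translog`, `TranslogOp`, `parse`, and `applyOnReplica` are hypothetical stand-ins, with `parse` playing the role of DocumentMapper#parse. The assumption it illustrates is that a segrep replica only needs the raw source bytes for its translog, so the parse step could be skipped there.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ReplicaTranslogSketch {

    // Hypothetical stand-in for a translog entry: only the id, sequence
    // number, and raw source bytes are stored, no parsed fields.
    static final class TranslogOp {
        final String id;
        final long seqNo;
        final byte[] source;

        TranslogOp(String id, long seqNo, byte[] source) {
            this.id = id;
            this.seqNo = seqNo;
            this.source = source;
        }
    }

    // Hypothetical in-memory translog.
    static final class Translog {
        private final List<TranslogOp> ops = new ArrayList<>();

        void add(TranslogOp op) {
            ops.add(op);
        }

        int size() {
            return ops.size();
        }
    }

    // Counts how often the (expensive) parse step runs.
    static int parseCalls = 0;

    // Stand-in for the document parsing step; real parsing would build
    // the indexable fields from the source.
    static void parse(byte[] source) {
        parseCalls++;
    }

    // Applies a write on a replica. With document replication the replica
    // indexes the document and therefore needs it parsed; with segment
    // replication (in this sketch) it only appends the raw source.
    static void applyOnReplica(Translog translog, String id, long seqNo,
                               byte[] source, boolean segmentReplication) {
        if (!segmentReplication) {
            parse(source);
        }
        translog.add(new TranslogOp(id, seqNo, source));
    }

    public static void main(String[] args) {
        byte[] doc = "{\"field\":\"value\"}".getBytes(StandardCharsets.UTF_8);

        Translog segrep = new Translog();
        applyOnReplica(segrep, "1", 0L, doc, true);
        System.out.println("segrep: parseCalls=" + parseCalls + ", translog ops=" + segrep.size());

        Translog docrep = new Translog();
        applyOnReplica(docrep, "1", 0L, doc, false);
        System.out.println("docrep: parseCalls=" + parseCalls + ", translog ops=" + docrep.size());
    }
}
```

In both modes the operation reaches the translog, so durability is unchanged in this sketch; only the docrep path pays for parsing. Whether the real replica path has other dependencies on the parsed document is exactly what would need checking.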


anasalkouz · 2023-04-25T20:19:18Z

@mch2 Can you share a new CPU profile now that the issue is fixed?

andrross added the enhancement and untriaged labels Apr 14, 2023

minalsha added the distributed framework label Apr 15, 2023

mch2 mentioned this issue Apr 23, 2023

[Segment Replication] - Remove redundant replica doc parsing on writes. #7279

Merged

6 tasks

nknize closed this as completed in #7279 Apr 24, 2023

anasalkouz removed the untriaged label Apr 25, 2023
