Make org.elasticsearch.action.admin.cluster.state.ClusterStateResponse Compress the Cluster State #79906

original-brownbear · 2021-10-27T12:52:22Z

Unlike publication, which uses an org.elasticsearch.transport.BytesTransportRequest containing the cluster state as a compressed bytes reference, org.elasticsearch.action.admin.cluster.state.ClusterStateResponse does not compress the cluster state.

This is not ideal for very large cluster states requested from another node via the get-cluster-state REST API, because these messages can become quite sizable. We should align the behavior here and use the same serialization approach for this message that we use for publication, to limit the size of a full cluster state transport message. A full cluster state will generally compress very well because settings and mappings tend to be duplicated heavily across indices.
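To give a rough feel for why duplicated settings and mappings compress so well, here is a minimal sketch using the JDK's `java.util.zip` classes. It is not Elasticsearch's serialization code: the class `ClusterStateCompressionSketch`, the helper `fakeClusterState`, and the JSON-like payload are all hypothetical stand-ins invented for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration only; none of these names exist in Elasticsearch.
final class ClusterStateCompressionSketch {

    // Builds a stand-in payload: many indices that share identical settings and mappings.
    static byte[] fakeClusterState(int indexCount) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < indexCount; i++) {
            sb.append("{\"index\":\"logs-").append(i).append("\",")
              .append("\"settings\":{\"number_of_shards\":1,\"number_of_replicas\":1},")
              .append("\"mappings\":{\"properties\":{\"@timestamp\":{\"type\":\"date\"},")
              .append("\"message\":{\"type\":\"text\"}}}}\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Compresses the whole input with DEFLATE at the default level.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    // Reverses compress(); malformed or truncated input is reported as an unchecked exception.
    static byte[] decompress(byte[] input) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(input);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported input");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] raw = fakeClusterState(1000);
        byte[] packed = compress(raw);
        System.out.println("raw=" + raw.length + " bytes, compressed=" + packed.length + " bytes");
    }
}
```

The exact ratio depends on the payload, so treat the printed numbers as indicative only. A real cluster state is a binary format and will behave differently from this text stand-in.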

relates #77466


elasticmachine · 2021-10-27T12:52:25Z

Pinging @elastic/es-data-management (Team:Data Management)

elasticmachine · 2021-10-27T12:52:25Z

Pinging @elastic/es-distributed (Team:Distributed)

DaveCTurner · 2021-10-29T16:57:43Z

We discussed this in another channel and think the simplest approach would be to use transport compression. If we always compress cluster-state requests then the responses will also always be compressed, so the cluster state will always be materialised in compressed form before sending. This also avoids compressing it twice when transport compression is in use. Even better, this is already automatically wire-compatible. Something like this looks like the right sort of idea:

diff --git a/server/src/main/java/org/elasticsearch/transport/TcpTransport.java b/server/src/main/java/org/elasticsearch/transport/TcpTransport.java
index 6bed6c7fe61..23905b68ca7 100644
--- a/server/src/main/java/org/elasticsearch/transport/TcpTransport.java
+++ b/server/src/main/java/org/elasticsearch/transport/TcpTransport.java
@@ -268,7 +268,8 @@ public abstract class TcpTransport extends AbstractLifecycleComponent implements
             final boolean shouldCompress = compress == Compression.Enabled.TRUE
                 || (compress == Compression.Enabled.INDEXING_DATA
                     && request instanceof RawIndexingDataTransportRequest
-                    && ((RawIndexingDataTransportRequest) request).isRawIndexingData());
+                    && ((RawIndexingDataTransportRequest) request).isRawIndexingData())
+                || (request instanceof ClusterStateRequest);
             final Compression.Scheme schemeToUse = shouldCompress ? compressionScheme : null;
             outboundHandler.sendRequest(node, channel, requestId, action, request, options, getVersion(), schemeToUse, false);
         }
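The decision in the diff above can be read in isolation as a small predicate. The following is a self-contained sketch of that predicate only: `ShouldCompressSketch`, `Enabled`, `TransportRequest`, `PingRequest`, and `BulkRequest` are hypothetical stand-ins declared here so the example compiles on its own, and the nested `RawIndexingDataTransportRequest` and `ClusterStateRequest` merely mirror the names in the diff rather than reusing the real Elasticsearch types.

```java
// Hypothetical stand-in types; only the boolean logic mirrors the proposed diff.
final class ShouldCompressSketch {

    // Stand-in for the transport compression setting referenced in the diff.
    enum Enabled { TRUE, INDEXING_DATA, FALSE }

    interface TransportRequest {}

    interface RawIndexingDataTransportRequest extends TransportRequest {
        boolean isRawIndexingData();
    }

    static final class ClusterStateRequest implements TransportRequest {}

    // An arbitrary request that carries neither indexing data nor cluster state.
    static final class PingRequest implements TransportRequest {}

    // An arbitrary request that carries raw indexing data.
    static final class BulkRequest implements RawIndexingDataTransportRequest {
        @Override
        public boolean isRawIndexingData() {
            return true;
        }
    }

    // Compress when compression is enabled globally, when it is enabled for
    // indexing data and the request carries such data, or (the proposed
    // addition) whenever the request asks for the cluster state.
    static boolean shouldCompress(Enabled compress, TransportRequest request) {
        return compress == Enabled.TRUE
            || (compress == Enabled.INDEXING_DATA
                && request instanceof RawIndexingDataTransportRequest
                && ((RawIndexingDataTransportRequest) request).isRawIndexingData())
            || request instanceof ClusterStateRequest;
    }

    public static void main(String[] args) {
        System.out.println(shouldCompress(Enabled.FALSE, new ClusterStateRequest()));
        System.out.println(shouldCompress(Enabled.FALSE, new PingRequest()));
        System.out.println(shouldCompress(Enabled.INDEXING_DATA, new BulkRequest()));
    }
}
```

Because the third clause does not consult the setting at all, a cluster-state request is compressed even when transport compression is otherwise disabled, which is the behavior the comment describes.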

We send cluster states over the wire in response to cluster state requests and when validating join requests. Today there's no special treatment of these messages so we just materialize the whole cluster state into a sequence of buffers which can be tens or hundreds of MBs in a big cluster, and it's quite possible that a really big cluster would exceed the 2GiB limit on message size.

This commit forces transport compression on these messages even if it's usually disabled, which means that the materialized cluster state is always compressed. It tends to compress well, so this gives us a bit more headroom in terms of maximum cluster size.

Relates elastic#77466
Closes elastic#79906

DaveCTurner · 2022-08-18T09:56:42Z

Today we compress the cluster state sent for join validation (#85380) and publication, and we can avoid the need to send cluster states over the wire within a cluster for other reasons (#86888), and to remote clusters (#89456), so I think we can close this.

original-brownbear added the >enhancement, :Data Management/Stats, and :Distributed Coordination/Cluster Coordination labels Oct 27, 2021

elasticmachine added the Team:Data Management and Team:Distributed (Obsolete) labels Oct 27, 2021

original-brownbear mentioned this issue Oct 27, 2021

Fix Large Shard Count Scalability Issues #77466

Open

97 tasks

DaveCTurner self-assigned this Oct 29, 2021

DaveCTurner mentioned this issue Oct 29, 2021

Always compress cluster state on transport layer #80104

Closed

DaveCTurner closed this as completed Aug 18, 2022
