Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make use of chunked REST response infrastructure in more APIs #89838

Open
15 of 19 tasks
Tracked by #77466
original-brownbear opened this issue Sep 6, 2022 · 6 comments
Open
15 of 19 tasks
Tracked by #77466
Assignees
Labels
:Distributed Coordination/Network Http and internode communication implementations Meta Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.

Comments

@original-brownbear
Copy link
Member

original-brownbear commented Sep 6, 2022

Now that #88311 has landed and we have the infrastructure for serializing chunked REST responses, we should make use of it to fix the massive memory usage of APIs that are known to return large responses:

@original-brownbear original-brownbear added :Distributed Coordination/Network Http and internode communication implementations Meta labels Sep 6, 2022
@original-brownbear original-brownbear self-assigned this Sep 6, 2022
@elasticsearchmachine elasticsearchmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Sep 6, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

original-brownbear added a commit to original-brownbear/elasticsearch that referenced this issue Sep 12, 2022
Turning this into a chunked response since it can become very large.

relates elastic#89838
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this issue Sep 12, 2022
These can be huge, make them chunked to be nice to the coordinating node.

relates elastic#89838
original-brownbear added a commit that referenced this issue Sep 12, 2022
Turning this into a chunked response since it can become very large.

relates #89838
original-brownbear added a commit that referenced this issue Sep 12, 2022
These can be huge, make them chunked to be nice to the coordinating node.

relates #89838
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this issue Nov 21, 2022
These responses can become huge, lets chunk them by index.

relates elastic#89838
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Nov 24, 2022
This response can reach a few MiB in size in an overwhelmed cluster,
let's use chunking so as not to make things worse than they already are.

Relates elastic#89838
elasticsearchmachine pushed a commit that referenced this issue Nov 24, 2022
This response can reach a few MiB in size in an overwhelmed cluster,
let's use chunking so as not to make things worse than they already are.

Relates #89838
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Nov 24, 2022
This response can reach many MiB in size in a large and busy cluster,
let's use chunking here.

Relates elastic#89838
DaveCTurner added a commit that referenced this issue Nov 28, 2022
This response can reach many MiB in size in a large and busy cluster,
let's use chunking here.

Relates #89838
@DaveCTurner
Copy link
Contributor

DaveCTurner commented Nov 28, 2022

I took a look at adding chunking to the cluster state API and it is not a small task. Yet I think it's important, we still have clients that monitor things by requesting the whole routing table sometimes and they're not going away any time soon. I don't think we can just add chunking to the routing table part, making everything else a single chunk, since you pointed out elsewhere that this moves all the serialization work back onto the transport threads. So we have to do it properly. I think we can do it in stages tho, starting at the bottom and using wrapAsXContentObject as needed to keep the work off of transport threads until we get to the top:

  • Make Metadata.Custom implement ChunkedToXContent instead of ToXContentFragment. (25 implementations of toXContent as I write this, some of them pretty large).
  • Make Metadata implement ChunkedToXContent (likely impacts PersistedClusterStateService).
  • Make ClusterState.Custom implement ChunkedToXContent instead of ToXContentFragment. (Add chunking to ClusterState.Custom impls #91963)
  • Finally make ClusterState fully chunked.

DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Nov 28, 2022
Still combines the chunks together at the upper level, but this is a
step towards full chunking support for `GET _cluster/state`.

Relates elastic#89838
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Dec 7, 2022
Various places that implement response chunking involving some kind of
nested structure (e.g. a list of lists) need to call something like
`flatMap`, and today this means they must must convert between iterators
and streams to make use of `Stream#flatMap`. This is noisy and awkward,
so with this commit we introduce a way to `flatMap` directly on
iterators.

Relates elastic#89838
elasticsearchmachine pushed a commit that referenced this issue Dec 7, 2022
Various places that implement response chunking involving some kind of
nested structure (e.g. a list of lists) need to call something like
`flatMap`, and today this means they must must convert between iterators
and streams to make use of `Stream#flatMap`. This is noisy and awkward,
so with this commit we introduce a way to `flatMap` directly on
iterators.

Relates #89838
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Dec 12, 2022
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Dec 22, 2022
The CCR info and stats APIs can send fairly sizeable responses that
scale as `O(#shards)`. This commit makes them chunked.

Relates elastic#89838
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Dec 22, 2022
Most chunking implementations have a corresponding test regarding the
number of chunks emitted, and they're all pretty much copies of each
other. Extracting a test utility to capture this pattern is long
overdue.

Relates elastic#89838
elasticsearchmachine pushed a commit that referenced this issue Dec 22, 2022
The CCR info and stats APIs can send fairly sizeable responses that
scale as `O(#shards)`. This commit makes them chunked.

Relates #89838
elasticsearchmachine pushed a commit that referenced this issue Jan 3, 2023
Most chunking implementations have a corresponding test regarding the
number of chunks emitted, and they're all pretty much copies of each
other. Extracting a test utility to capture this pattern is long
overdue.

Relates #89838
@DaveCTurner
Copy link
Contributor

I think that's everything but the various search APIs, for which I think we should ask for help from the search team. Should we open a separate issue for the search team about that, and then close this one?

@javanna javanna added the :Search/Search Search-related issues that do not fall into other categories label Mar 10, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Mar 10, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@javanna
Copy link
Member

javanna commented Mar 10, 2023

Heya, I added the Search label so that this is on our radar, given that the only remaining task is ours.

@javanna
Copy link
Member

javanna commented May 3, 2023

We now have a Search meta issue (#95661), so I am removing the search area label from this one.

@javanna javanna removed :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team labels May 3, 2023
romseygeek added a commit that referenced this issue May 12, 2023
This commit adds xcontent chunking to SearchResponse and MultiSearchResponse
by making SearchHits implement ChunkedToXContent.

Relates to #89838
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Network Http and internode communication implementations Meta Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
Projects
None yet
Development

No branches or pull requests

4 participants