Speed up toXContent Collection Serialization in some Spots #78742
Found this when benchmarking large cluster states. When serializing collections we'd mostly not take any advantage of what we know about the collection contents (like we do in `StreamOutput`). This PR adds a couple of helpers to the x-content-builder similar to what we have on `StreamOutput` to allow for faster serializing by avoiding the writer lookup and some self-reference checks.
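To make the "writer lookup" point concrete, here is a minimal sketch, not the actual `XContentBuilder` code. `TinyBuilder`, `genericArray`, the `WRITERS` map and the `lookups` counter are all hypothetical names; only `stringListField` borrows its name from this PR. The sketch assumes non-null elements and does no string escaping. The generic path resolves a writer per element from its runtime class, while the typed helper writes strings directly.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical stand-in for a JSON builder; NOT the real XContentBuilder.
final class TinyBuilder {
    // Generic path: a writer is looked up by each element's runtime class.
    private static final Map<Class<?>, BiConsumer<StringBuilder, Object>> WRITERS = new HashMap<>();
    static {
        WRITERS.put(String.class, (sb, v) -> sb.append('"').append(v).append('"'));
        WRITERS.put(Integer.class, (sb, v) -> sb.append(v));
    }

    private final StringBuilder out = new StringBuilder();
    int lookups = 0; // counts per-element writer lookups, for illustration only

    private void startArray(String name) {
        out.append('"').append(name).append("\":[");
    }

    // Element type unknown: every element pays for a map lookup.
    TinyBuilder genericArray(String name, Iterable<?> values) {
        startArray(name);
        boolean first = true;
        for (Object v : values) {
            if (!first) out.append(',');
            first = false;
            lookups++;
            BiConsumer<StringBuilder, Object> writer = WRITERS.get(v.getClass());
            if (writer == null) throw new IllegalArgumentException("no writer for " + v.getClass());
            writer.accept(out, v);
        }
        out.append(']');
        return this;
    }

    // Elements are known to be strings: write them directly, no lookup.
    TinyBuilder stringListField(String name, Collection<String> values) {
        startArray(name);
        boolean first = true;
        for (String v : values) {
            if (!first) out.append(',');
            first = false;
            out.append('"').append(v).append('"');
        }
        out.append(']');
        return this;
    }

    String result() {
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> roles = List.of("data", "master");
        System.out.println(new TinyBuilder().genericArray("roles", roles).result());
        System.out.println(new TinyBuilder().stringListField("roles", roles).result());
    }
}
```

Both calls produce the same output; the typed helper just skips the per-element dispatch.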
Pinging @elastic/es-distributed (Team:Distributed)
Pinging @elastic/es-core-infra (Team:Core/Infra)
LGTM
How much does this save us out of interest?
LGTM
public XContentBuilder xContentList(String name, ToXContent... values) throws IOException {
    startArray(name);
    for (ToXContent value : values) {
Maybe check for a non-null array here.
Huh, right, TIL. I always figured I'd get `[null]` here if passed a `null`, but it turns out you actually get the `null` :)
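The varargs behaviour discussed here can be reproduced in isolation. This is a small sketch with a hypothetical `describe` helper (not part of `XContentBuilder`): a `null` that is assignable to the array type is passed as the array itself, whereas a `null` of the element type is wrapped into a one-element array.

```java
import java.util.Arrays;

final class VarargsNullDemo {
    // Hypothetical helper, used only to show what the varargs parameter receives.
    static String describe(String... values) {
        return values == null ? "null array" : Arrays.toString(values);
    }

    public static void main(String[] args) {
        String[] noArray = null;   // null typed as the array
        String noElement = null;   // null typed as a single element
        System.out.println(describe(noArray));   // prints "null array"
        System.out.println(describe(noElement)); // prints "[null]"
        System.out.println(describe());          // prints "[]"
    }
}
```

A bare `describe(null)` resolves the same way as the `noArray` case, which is why a loop over `values` would throw a `NullPointerException` without a guard.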
@@ -921,6 +923,42 @@ private XContentBuilder value(ToXContent value, ToXContent.Params params) throws
    // Maps & Iterable
    //////////////////////////////////

    public XContentBuilder stringListField(String name, Collection<String> values) throws IOException {
Nit: maybe just `array(String name, Collection<String> values)`? Same remark for the other new methods.
Can't do that unfortunately because we have:
    public XContentBuilder field(String name, Iterable<?> values) throws IOException {
        return field(name).value(values);
    }
which collides with it. Same with the other cases: these super generic `?` or `Object` type methods collide with everything.
Oh right, I did not see this one. Let's keep it as it is then.
Thanks David & Tanguy!
It's not entirely trivial to isolate the effect because this changes the profile quite a bit and we seemingly also see inlining in more spots now, but I'd say we're at roughly O(5%) savings, and I have a follow-up based on this in the pipeline that should have a bigger impact :)
…8742) Found this when benchmarking large cluster states. When serializing collections we'd mostly not take any advantage of what we know about the collection contents (like we do in `StreamOutput`). This PR adds a couple of helpers to the x-content-builder similar to what we have on `StreamOutput` to allow for faster serializing by avoiding the writer lookup and some self-reference checks.
…78755) Found this when benchmarking large cluster states. When serializing collections we'd mostly not take any advantage of what we know about the collection contents (like we do in `StreamOutput`). This PR adds a couple of helpers to the x-content-builder similar to what we have on `StreamOutput` to allow for faster serializing by avoiding the writer lookup and some self-reference checks.
@original-brownbear is this related to #26907 as well?
@dliappis yea to some degree for sure. A good chunk of the speedup here is from simply not having to do the self-reference check now in the changed scenarios.
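For readers unfamiliar with the term, a self-reference check guards against a container that (directly or indirectly) contains itself, which would otherwise recurse forever during serialization. The following is a rough sketch of such a check under the assumption of an identity-based walk over maps and iterables; the class and method names are hypothetical and this is not Elasticsearch's implementation. It illustrates why the check costs something per nested value, and why a flat collection of strings does not need it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SelfReferenceCheck {
    // Throws if the value contains itself somewhere along a nesting path.
    static void ensureAcyclic(Object value) {
        // Identity-based set: two equal but distinct lists are not a cycle.
        Set<Object> ancestors = Collections.newSetFromMap(new IdentityHashMap<>());
        walk(value, ancestors);
    }

    private static void walk(Object value, Set<Object> ancestors) {
        Iterable<?> children;
        if (value instanceof Map) {
            children = ((Map<?, ?>) value).values();
        } else if (value instanceof Iterable) {
            children = (Iterable<?>) value;
        } else {
            return; // leaf values (strings, numbers, null) cannot form a cycle
        }
        if (!ancestors.add(value)) {
            throw new IllegalArgumentException("self-referencing container detected");
        }
        for (Object child : children) {
            walk(child, ancestors);
        }
        ancestors.remove(value); // done with this branch; sharing across branches is fine
    }

    public static void main(String[] args) {
        ensureAcyclic(List.of("a", "b")); // passes silently
        List<Object> loop = new ArrayList<>();
        loop.add(loop);
        try {
            ensureAcyclic(loop);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```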
Found this when benchmarking large cluster states. When serializing collections we'd mostly not take any advantage of what we know about the collection contents (like we do in `StreamOutput`).
This PR adds a couple of helpers to the x-content-builder similar to what we have on `StreamOutput` to allow for faster serializing by avoiding the writer lookup and some self-reference checks and uses them in obvious spots I could quickly identify in profiling or via the IDE.
relates #77466