Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use "jsonv2"/go-json-experimental to marshal OpenAPI v2 #325

Merged
merged 1 commit into from
Feb 24, 2023

Conversation

apelisse
Copy link
Member

@apelisse apelisse commented Oct 28, 2022

Performance boost is also pretty impressive:

name                                    old time/op    new time/op    delta
SwaggerSpec_ExperimentalMarshal/json-8     102ms ± 1%      25ms ± 1%  -75.12%  (p=0.016 n=4+5)

name                                    old alloc/op   new alloc/op   delta
SwaggerSpec_ExperimentalMarshal/json-8    52.4MB ± 4%    19.9MB ± 2%  -62.02%  (p=0.008 n=5+5)

name                                    old allocs/op  new allocs/op  delta
SwaggerSpec_ExperimentalMarshal/json-8      210k ± 0%       76k ± 0%  -63.82%  (p=0.008 n=5+5)

Mostly because this removes the need to deserialize json into buffers that are then thrown away when the jsons are concatenated together (due to lots of embedded objects).

@dsnet @alexzielenski @liggitt

Fixes #324

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 28, 2022
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 28, 2022
@apelisse
Copy link
Member Author

Running this in Kubernetes to see if it works, already found a couple of changes that need to be updated in this PR.

kubernetes/kubernetes#113445

@apelisse
Copy link
Member Author

apelisse commented Nov 2, 2022

Updated with WIP that breaks tests because marshal doesn't serialize in the same order. There's also a slight different change in how unicodes are encoded apparently which is causing the length to not always be strictly equal (though unmarshaling seems to restore to the previous state).

Also, so far I've only found one location in kubernetes/kubernetes that need to be updated: staging/src/k8s.io/apiserver/pkg/util/openapi/proto.go

@dsnet
Copy link

dsnet commented Nov 2, 2022

If you're not already, you could consider a semantic comparison of JSON objects with:

var got, want []byte = ... // where x and y are raw JSON values
if d := cmp.Diff(got, want, cmp.Transform(in []byte) (out any) {
    // Note that integers larger than 2^53 will lose precision
    // since they will be represented using a Go float64.
    // There are workarounds for this, but calling it out.
    if err := json.Unmarshal(in, &out); err != nil {
        return in
    }
    return out
}); d != "" {
    t.Errorf("JSON mismatch (-got +want):\n%s", d)
}

@liggitt
Copy link
Member

liggitt commented Nov 2, 2022

There's also a slight different change in how unicodes are encoded apparently which is causing the length to not always be strictly equal (though unmarshaling seems to restore to the previous state).

o_O

@dsnet
Copy link

dsnet commented Nov 2, 2022

In v1, the characters in "<>&" were escaped for dubious security reasons where a server embeds JSON within HTML without the proper content encoding. In v2, we stop doing that, and follow RFC recommendations to use the minimal escaped string output.

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 6, 2022
@apelisse
Copy link
Member Author

apelisse commented Nov 6, 2022

Tests should be passing, it's a little bit yucky how the structs are duplicated (more than once), suggestions welcome, ready to merge other than that. Thanks!

pkg/validation/spec/swagger.go Outdated Show resolved Hide resolved
pkg/validation/spec/swagger.go Outdated Show resolved Hide resolved
pkg/util/jsontesting/json_roundtrip.go Outdated Show resolved Hide resolved
pkg/validation/spec/header.go Outdated Show resolved Hide resolved
pkg/validation/spec/header.go Outdated Show resolved Hide resolved
@apelisse apelisse force-pushed the marshal-v2 branch 3 times, most recently from 0392d9d to ba8c768 Compare November 22, 2022 05:53
@apelisse
Copy link
Member Author

OK I think that's good, PTAL!

pkg/validation/spec/swagger_test.go Show resolved Hide resolved
pkg/util/jsontesting/json_roundtrip.go Outdated Show resolved Hide resolved
pkg/validation/spec/info.go Show resolved Hide resolved
)

func JsonCompare(got, want []byte) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know there were normalization differences with unicode and escaping, but we're still marshaling deterministically, right? I'm thinking specifically of map marshal order. Switching to use this semantic comparison would mask deterministic marshaling issues if they crept in, so I wanted to check

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad you ask ...

According to the doc

JSONv2 has several semantic changes relative to JSONv1 that impacts performance:

  • When marshaling, JSONv2 no longer sorts the keys of a Go map. This will improve performance.

This doesn't mean it's not deterministic, we just know it's not sorted. Also it says "go map", not sure about structures. Go maps are definitely not deterministic. @dsnet what are the actual guarantees here?

"Definitions" is a map, so they will get reshuffled every-time, let's see if we can use the Canonicalize method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked offline with @apelisse, the new json library explicitly does not marshal maps in sorted order. I think we need our json marshaling to remain deterministic, so I don't think this should merge until we can upstream an optional "output maps in sorted-key order" behavior (which I think is pretty important, actually) or modify our invocation of the marshaling to accomplish the same thing (in a way that doesn't modify the internal copy of the json-experimental lib).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dsnet any chance we can have a marshal option to get maps sorted? I have a tentative unoptimized piece of code that does the sorting and it's still a massive improvement compared to json v1.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a use case we need at @tailscale as well, so I might be able to implement it in O(weeks).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @dsnet, any update on this? Can I help somehow? I'd love to get this in the next k8s version, code freeze date is March 14th FYI.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies for the delay. I've been on paternity leave. My return date in January 30th, so I'll start looking at these things more in depth then.

pkg/validation/spec/operation.go Outdated Show resolved Hide resolved
pkg/validation/spec/paths.go Outdated Show resolved Hide resolved
pkg/validation/spec/paths.go Outdated Show resolved Hide resolved
Comment on lines 92 to 97
// sanitized either uses:
// - the map that was passed as an argument
// - or, if the map is nil, returns the extension map if all the extensions were valid
// - or allocates a new map.
// In other words, if you specify a nil map as a parameter, don't edit the returned map.
func (e Extensions) sanitized(m map[string]any) Extensions {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this signature is too convoluted, I should probably split that into sanitized and sanitizedWithMap ... But I don't have a strong enough opinion ...

@@ -4,7 +4,10 @@

package json

import "reflect"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dsnet FYI Here and below is my very sad attempt at getting deterministic behavior. It's definitely not great performance-wise (see below), but I'm assuming a few of these arrays could be re-used via a pool.

> benchstat unsorted sorted
name                                                    old time/op    new time/op    delta
SwaggerSpec_ExperimentalMarshal/jsonv2_via_jsonv1-8       47.8ms ± 6%    51.7ms ± 8%   +8.21%  (p=0.001 n=9+10)
SwaggerSpec_ExperimentalMarshal/jsonv2-8                  31.7ms ± 4%    33.3ms ± 2%   +5.23%  (p=0.000 n=10+9)
SwaggerSpec_ExperimentalMarshal/jsonv2-canonicalized-8    49.9ms ± 4%    49.6ms ± 5%     ~     (p=0.481 n=10+10)

name                                                    old alloc/op   new alloc/op   delta
SwaggerSpec_ExperimentalMarshal/jsonv2_via_jsonv1-8       36.9MB ±11%    40.0MB ±12%     ~     (p=0.075 n=10+10)
SwaggerSpec_ExperimentalMarshal/jsonv2-8                  22.1MB ± 2%    24.7MB ± 2%  +11.78%  (p=0.000 n=10+10)
SwaggerSpec_ExperimentalMarshal/jsonv2-canonicalized-8    29.7MB ±16%    29.3MB ± 0%     ~     (p=0.220 n=10+6)

name                                                    old allocs/op  new allocs/op  delta
SwaggerSpec_ExperimentalMarshal/jsonv2_via_jsonv1-8        91.3k ± 0%    111.8k ± 0%  +22.40%  (p=0.000 n=10+10)
SwaggerSpec_ExperimentalMarshal/jsonv2-8                   91.3k ± 0%    111.8k ± 0%  +22.41%  (p=0.000 n=10+10)
SwaggerSpec_ExperimentalMarshal/jsonv2-canonicalized-8     92.9k ± 0%    113.4k ± 0%  +22.08%  (p=0.000 n=10+8)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies for the delay. I just merged go-json-experiment/json@540f014, which adds MarshalOptions.Deterministic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much! That's absolutely great

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 7, 2022
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 16, 2023
@apelisse apelisse force-pushed the marshal-v2 branch 2 times, most recently from 9e3d10f to cab3aa9 Compare February 17, 2023 18:40
@apelisse
Copy link
Member Author

apelisse commented Feb 17, 2023

OK, tests pass, should be working now, pretty good improvement considering this is definitely the worst thing going on in an empty api-server right now. Numbers at the top have been updated.

@liggitt
Copy link
Member

liggitt commented Feb 17, 2023

/lgtm
/approve
/hold if you want any other eyes

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 17, 2023
@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Feb 17, 2023
@apelisse
Copy link
Member Author

@Jefftree @alexzielenski either of you has something to add? If not, feel free to cancel the hold, thanks!

Summary string `json:"summary,omitempty"`
ExternalDocs *ExternalDocumentation `json:"externalDocs,omitzero"`
ID string `json:"operationId,omitempty"`
Deprecated bool `json:"deprecated,omitempty,omitzero"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the behavior here with both omitempty and omitzero?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In "go-json-experiment", the semantics of omitempty was changed to mean "omit any empty JSON value" which is currently defined as a JSON null, empty string, object, or array; note that it excludes false. It also supports omitzero which means "omit any zero Go value" according to the zero value definition in the Go language; which includes false.

Have both omitzero and omitempty specified takes the OR of the two and provides a way for this to operate the same under both "encoding/json" and "go-json-experiment".

@@ -39,7 +55,12 @@ type RoundTripTestCase struct {
ExpectedUnmarshalError string
}

func (t RoundTripTestCase) RoundTripTest(example json.Unmarshaler) error {
type MarshalerUnmarsaheler interface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Performance boost is also pretty impressive:
```
name                                    old time/op    new time/op    delta
SwaggerSpec_ExperimentalMarshal/json-8     102ms ± 1%      25ms ± 1%  -75.12%  (p=0.016 n=4+5)

name                                    old alloc/op   new alloc/op   delta
SwaggerSpec_ExperimentalMarshal/json-8    52.4MB ± 4%    19.9MB ± 2%  -62.02%  (p=0.008 n=5+5)

name                                    old allocs/op  new allocs/op  delta
SwaggerSpec_ExperimentalMarshal/json-8      210k ± 0%       76k ± 0%  -63.82%  (p=0.008 n=5+5)
```

Mostly because this removes the need to deserialize json into buffers
that are then thrown away when the jsons are concatenated together (due
to lots of embedded objects).
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 24, 2023
@Jefftree
Copy link
Member

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 24, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: apelisse, Jefftree, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@apelisse
Copy link
Member Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 24, 2023
@k8s-ci-robot k8s-ci-robot merged commit 66828de into kubernetes:master Feb 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

json.Marshal allocates a lot of unnecessary memory
5 participants