-
Notifications
You must be signed in to change notification settings - Fork 211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use "jsonv2"/go-json-experimental to marshal OpenAPI v2 #325
Conversation
Running this in Kubernetes to see if it works, already found a couple of changes that need to be updated in this PR. |
Updated with WIP that breaks tests because marshal doesn't serialize in the same order. There's also a slight different change in how unicodes are encoded apparently which is causing the length to not always be strictly equal (though unmarshaling seems to restore to the previous state). Also, so far I've only found one location in kubernetes/kubernetes that need to be updated: staging/src/k8s.io/apiserver/pkg/util/openapi/proto.go |
If you're not already, you could consider a semantic comparison of JSON objects with: var got, want []byte = ... // where x and y are raw JSON values
if d := cmp.Diff(got, want, cmp.Transform(in []byte) (out any) {
// Note that integers larger than 2^53 will lose precision
// since they will be represented using a Go float64.
// There are workarounds for this, but calling it out.
if err := json.Unmarshal(in, &out); err != nil {
return in
}
return out
}); d != "" {
t.Errorf("JSON mismatch (-got +want):\n%s", d)
} |
o_O |
In v1, the characters in "<>&" were escaped for dubious security reasons where a server embeds JSON within HTML without the proper content encoding. In v2, we stop doing that, and follow RFC recommendations to use the minimal escaped string output. |
Tests should be passing, it's a little bit yucky how the structs are duplicated (more than once), suggestions welcome, ready to merge other than that. Thanks! |
0392d9d
to
ba8c768
Compare
OK I think that's good, PTAL! |
) | ||
|
||
func JsonCompare(got, want []byte) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know there were normalization differences with unicode and escaping, but we're still marshaling deterministically, right? I'm thinking specifically of map marshal order. Switching to use this semantic comparison would mask deterministic marshaling issues if they crept in, so I wanted to check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm glad you ask ...
According to the doc
JSONv2 has several semantic changes relative to JSONv1 that impacts performance:
- When marshaling, JSONv2 no longer sorts the keys of a Go map. This will improve performance.
This doesn't mean it's not deterministic, we just know it's not sorted. Also it says "go map", not sure about structures. Go maps are definitely not deterministic. @dsnet what are the actual guarantees here?
"Definitions" is a map, so they will get reshuffled every-time, let's see if we can use the Canonicalize
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked offline with @apelisse, the new json library explicitly does not marshal maps in sorted order. I think we need our json marshaling to remain deterministic, so I don't think this should merge until we can upstream an optional "output maps in sorted-key order" behavior (which I think is pretty important, actually) or modify our invocation of the marshaling to accomplish the same thing (in a way that doesn't modify the internal copy of the json-experimental lib).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dsnet any chance we can have a marshal option to get maps sorted? I have a tentative unoptimized piece of code that does the sorting and it's still a massive improvement compared to json v1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a use case we need at @tailscale as well, so I might be able to implement it in O(weeks).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @dsnet, any update on this? Can I help somehow? I'd love to get this in the next k8s version, code freeze date is March 14th FYI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My apologies for the delay. I've been on paternity leave. My return date in January 30th, so I'll start looking at these things more in depth then.
pkg/validation/spec/info.go
Outdated
// sanitized either uses: | ||
// - the map that was passed as an argument | ||
// - or, if the map is nil, returns the extension map if all the extensions were valid | ||
// - or allocates a new map. | ||
// In other words, if you specify a nil map as a parameter, don't edit the returned map. | ||
func (e Extensions) sanitized(m map[string]any) Extensions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this signature is too convoluted, I should probably split that into sanitized
and sanitizedWithMap
... But I don't have a strong enough opinion ...
@@ -4,7 +4,10 @@ | |||
|
|||
package json | |||
|
|||
import "reflect" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dsnet FYI Here and below is my very sad attempt at getting deterministic behavior. It's definitely not great performance-wise (see below), but I'm assuming a few of these arrays could be re-used via a pool.
> benchstat unsorted sorted
name old time/op new time/op delta
SwaggerSpec_ExperimentalMarshal/jsonv2_via_jsonv1-8 47.8ms ± 6% 51.7ms ± 8% +8.21% (p=0.001 n=9+10)
SwaggerSpec_ExperimentalMarshal/jsonv2-8 31.7ms ± 4% 33.3ms ± 2% +5.23% (p=0.000 n=10+9)
SwaggerSpec_ExperimentalMarshal/jsonv2-canonicalized-8 49.9ms ± 4% 49.6ms ± 5% ~ (p=0.481 n=10+10)
name old alloc/op new alloc/op delta
SwaggerSpec_ExperimentalMarshal/jsonv2_via_jsonv1-8 36.9MB ±11% 40.0MB ±12% ~ (p=0.075 n=10+10)
SwaggerSpec_ExperimentalMarshal/jsonv2-8 22.1MB ± 2% 24.7MB ± 2% +11.78% (p=0.000 n=10+10)
SwaggerSpec_ExperimentalMarshal/jsonv2-canonicalized-8 29.7MB ±16% 29.3MB ± 0% ~ (p=0.220 n=10+6)
name old allocs/op new allocs/op delta
SwaggerSpec_ExperimentalMarshal/jsonv2_via_jsonv1-8 91.3k ± 0% 111.8k ± 0% +22.40% (p=0.000 n=10+10)
SwaggerSpec_ExperimentalMarshal/jsonv2-8 91.3k ± 0% 111.8k ± 0% +22.41% (p=0.000 n=10+10)
SwaggerSpec_ExperimentalMarshal/jsonv2-canonicalized-8 92.9k ± 0% 113.4k ± 0% +22.08% (p=0.000 n=10+8)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My apologies for the delay. I just merged go-json-experiment/json@540f014, which adds MarshalOptions.Deterministic
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much! That's absolutely great
9e3d10f
to
cab3aa9
Compare
OK, tests pass, should be working now, pretty good improvement considering this is definitely the worst thing going on in an empty api-server right now. Numbers at the top have been updated. |
/lgtm |
@Jefftree @alexzielenski either of you has something to add? If not, feel free to cancel the hold, thanks! |
Summary string `json:"summary,omitempty"` | ||
ExternalDocs *ExternalDocumentation `json:"externalDocs,omitzero"` | ||
ID string `json:"operationId,omitempty"` | ||
Deprecated bool `json:"deprecated,omitempty,omitzero"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the behavior here with both omitempty and omitzero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In "go-json-experiment", the semantics of omitempty
was changed to mean "omit any empty JSON value" which is currently defined as a JSON null, empty string, object, or array; note that it excludes false. It also supports omitzero
which means "omit any zero Go value" according to the zero value definition in the Go language; which includes false.
Have both omitzero
and omitempty
specified takes the OR of the two and provides a way for this to operate the same under both "encoding/json" and "go-json-experiment".
@@ -39,7 +55,12 @@ type RoundTripTestCase struct { | |||
ExpectedUnmarshalError string | |||
} | |||
|
|||
func (t RoundTripTestCase) RoundTripTest(example json.Unmarshaler) error { | |||
type MarshalerUnmarsaheler interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Performance boost is also pretty impressive: ``` name old time/op new time/op delta SwaggerSpec_ExperimentalMarshal/json-8 102ms ± 1% 25ms ± 1% -75.12% (p=0.016 n=4+5) name old alloc/op new alloc/op delta SwaggerSpec_ExperimentalMarshal/json-8 52.4MB ± 4% 19.9MB ± 2% -62.02% (p=0.008 n=5+5) name old allocs/op new allocs/op delta SwaggerSpec_ExperimentalMarshal/json-8 210k ± 0% 76k ± 0% -63.82% (p=0.008 n=5+5) ``` Mostly because this removes the need to deserialize json into buffers that are then thrown away when the jsons are concatenated together (due to lots of embedded objects).
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: apelisse, Jefftree, liggitt The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold cancel |
Performance boost is also pretty impressive:
Mostly because this removes the need to deserialize json into buffers that are then thrown away when the jsons are concatenated together (due to lots of embedded objects).
@dsnet @alexzielenski @liggitt
Fixes #324