
Deserializing into spec.Swagger is almost 20x slower than deserializing into map[string]interface{} #315

Closed
Tracked by #118136
apelisse opened this issue Sep 6, 2022 · 17 comments · Fixed by kubernetes/kubernetes#112988

@apelisse
Member

apelisse commented Sep 6, 2022

Running the following benchmark on my laptop:

const openapipath = "api/openapi-spec/swagger.json" // https://github.com/kubernetes/kubernetes/blob/master/api/openapi-spec/swagger.json

func BenchmarkJsonUnmarshalSwagger(b *testing.B) {
	content, err := os.ReadFile(openapipath)
	if err != nil {
		b.Fatalf("Failed to open file: %v", err)
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		t := spec.Swagger{}
		err := json.Unmarshal(content, &t)
		if err != nil {
			b.Fatalf("Failed to unmarshal: %v", err)
		}
	}
}

func BenchmarkJsonUnmarshalInterface(b *testing.B) {
	content, err := os.ReadFile(openapipath)
	if err != nil {
		b.Fatalf("Failed to open file: %v", err)
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		t := map[string]interface{}{}
		err := json.Unmarshal(content, &t)
		if err != nil {
			b.Fatalf("Failed to unmarshal: %v", err)
		}
	}
}

Yields the following results:

apelisse ~/code/kubernetes/bench_api_parsing> go test -bench=. .
goos: darwin
goarch: amd64
pkg: k8s.io/kubernetes/bench_api_parsing
cpu: Intel(R) Core(TM) i7-8559U CPU @ 2.70GHz
BenchmarkJsonUnmarshalSwagger-8     	       3	 529833040 ns/op
BenchmarkJsonUnmarshalInterface-8   	      37	  29475772 ns/op
PASS
ok  	k8s.io/kubernetes/bench_api_parsing	5.013s

For readability purposes, that's 529ms and 29ms respectively.

For context, this is about spec.Swagger, the OpenAPI v2 definition which is mostly a clone of go-openapi

After a short investigation, the problem seems fairly obvious: the arbitrary vendor extensions (as defined by OpenAPI) force the JSON to be deserialized multiple times, at many different levels within the object, causing the deserialization into spec.Swagger to reach O(n²) complexity (my maths is probably dubious).

Vendor extensions can appear at many different layers in the OpenAPI object.

The problem, or lack of good solutions, comes from the rigid API (UnmarshalJSON(data []byte) error) that forces the custom unmarshaler to receive a raw byte slice rather than an already decoded, intermediate representation. Deserialization methods that use more flexible APIs, like the YAML v3 parser's (UnmarshalYAML(value *yaml.Node) error), do not suffer from the same problem, as highlighted in #279 by @alexzielenski.

This bug, which was poorly understood until now, has had various consequences across the entire Kubernetes ecosystem for the last 5 years:

  1. Because deserializing into spec.Swagger was unacceptably slow for frequently invoked command-line tools, kubectl decided to use gnostic/protobuf even though the gnostic type is grossly unusable.
  2. Add direct conversion from Gnostic v2 types to spec.Swagger #283 was written to transform gnostic types into spec.Swagger efficiently, but the Swagger-to-gnostic direction would also be needed, as well as an OpenAPI v3 version.
  3. Many issues have been filed about poor Kubernetes apiserver performance related to parsing/serializing OpenAPI into/from JSON within the server, with various workarounds like lazy marshaling, e.g. Lazy marshaling for OpenAPI v2 spec #251.

Much of this has been noticed by customers, users, and Kubernetes providers, as the evidence shows:

Note: The Custom Resource Definition suggested maximum limit was selected not due to the above SLI/SLOs, but instead due to the latency of OpenAPI publishing, which is a background process that occurs asynchronously each time a Custom Resource Definition schema is updated. For 500 Custom Resource Definitions it takes slightly over 35 seconds for a definition change to be visible via the OpenAPI spec endpoint.

For now, the solution discussed with @liggitt is to create a new UnmarshalUnstructured(interface{}) error interface that could replace the slow UnmarshalJSON interface, maybe like the following:

// UnmarshalJSON unmarshals a swagger spec from json
func (s *Swagger) UnmarshalJSON(data []byte) error {
	var sw Swagger
	var i interface{}
	if err := json.Unmarshal(data, &i); err != nil {
		return err
	}
	if err := FromUnstructured(i, &sw.SwaggerProps); err != nil {
		return err
	}
	if err := FromUnstructured(i, &sw.VendorExtensible); err != nil {
		return err
	}
	*s = sw
	return nil
}

And FromUnstructured would automatically call the UnmarshalUnstructured method when available. One drawback is that it forces the decoder to deserialize into a map[string]interface{} first and then convert, which is possibly slower than deserializing into the object directly.
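A minimal sketch of what such an interface and dispatcher could look like (names and signatures follow the proposal above but are illustrative, not the final API; a real FromUnstructured would fall back to reflection for plain structs):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Unstructurer is the hypothetical interface: types that know how to
// populate themselves from an already-decoded value implement it, so the
// JSON text is tokenized exactly once at the top level.
type Unstructurer interface {
	UnmarshalUnstructured(in interface{}) error
}

// info is a stand-in for a small piece of SwaggerProps.
type info struct {
	Title string
}

func (i *info) UnmarshalUnstructured(in interface{}) error {
	m, ok := in.(map[string]interface{})
	if !ok {
		return fmt.Errorf("expected object, got %T", in)
	}
	if t, ok := m["title"].(string); ok {
		i.Title = t
	}
	return nil
}

// fromUnstructured dispatches to UnmarshalUnstructured when available.
func fromUnstructured(in interface{}, out interface{}) error {
	if u, ok := out.(Unstructurer); ok {
		return u.UnmarshalUnstructured(in)
	}
	return fmt.Errorf("no UnmarshalUnstructured on %T", out)
}

func main() {
	// Decode the JSON once into a generic value...
	var i interface{}
	if err := json.Unmarshal([]byte(`{"title": "Kubernetes"}`), &i); err != nil {
		panic(err)
	}
	// ...then convert without re-parsing any bytes.
	var inf info
	if err := fromUnstructured(i, &inf); err != nil {
		panic(err)
	}
	fmt.Println(inf.Title) // Kubernetes
}
```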

A final remark: the exact same problem also applies to serialization/marshaling, though it is less critical there.

@alexzielenski
Contributor

/assign @alexzielenski

@alexzielenski
Contributor

I will try out Jordan's idea with your benchmark for spec.Swagger and post results back here

@rsc

rsc commented Sep 6, 2022

Can you please attach the api/openapi-spec/swagger.json file to the issue? Or link to it?

@alexzielenski
Contributor

@rsc

https://github.com/kubernetes/kubernetes/blob/master/api/openapi-spec/swagger.json

@justinsb
Member

justinsb commented Sep 6, 2022

protobuf has a nice JSON streaming API, albeit hidden in an internal package... maybe we should benchmark & consider using (a copy of) that?

https://github.com/protocolbuffers/protobuf-go/blob/6875c3d7242d1a3db910ce8a504f124cb840c23a/encoding/protojson/decode.go#L116

@alexzielenski
Contributor

alexzielenski commented Sep 7, 2022

Tried an experiment with Jordan's suggestion. Here is a comparison of my changes along with Antoine's benchmark:

master...alexzielenski:kube-openapi:80e24238cea362c6977be5ab3b0bf025e5a6d9e9

On my machine it runs in 72ms to unmarshal JSON into spec.Swagger via map[string]interface{}.

goos: darwin
goarch: arm64
pkg: k8s.io/kube-openapi/pkg/validation/spec
BenchmarkJsonUnmarshalUnstructured-20    	      14	  72451152 ns/op	21621302 B/op	  260473 allocs/op
PASS
ok  	k8s.io/kube-openapi/pkg/validation/spec	1.229s

@liggitt
Member

liggitt commented Sep 7, 2022

On my machine runs in 72ms to unmarshal JSON to spec.Swagger via map[string]interface{}.

that seems way faster... how does that compare on your machine to the unmarshal times for:

  • old spec.Swagger via UnmarshalJSON
  • directly into map[string]interface{}

@alexzielenski
Contributor

on my same machine

goos: darwin
goarch: arm64
pkg: k8s.io/kube-openapi/pkg/validation/spec
BenchmarkJsonUnmarshalSwagger-20    	       3	 423300958 ns/op	93727725 B/op	 1323738 allocs/op
PASS
ok  	k8s.io/kube-openapi/pkg/validation/spec	2.804s

goos: darwin
goarch: arm64
pkg: k8s.io/kube-openapi/pkg/validation/spec
BenchmarkJsonUnmarshalInterface-20    	      40	  26009425 ns/op	11194959 B/op	  173655 allocs/op
PASS
ok  	k8s.io/kube-openapi/pkg/validation/spec	1.206s

423ms to spec.Swagger
26ms to map[string]interface{}

@apelisse
Member Author

apelisse commented Sep 7, 2022

Yeah, I get about the same thing

> go test -bench=BenchmarkJsonUnmarshal -run=NONE .
goos: darwin
goarch: amd64
pkg: k8s.io/kube-openapi/pkg/validation/spec
cpu: Intel(R) Core(TM) i7-8559U CPU @ 2.70GHz
BenchmarkJsonUnmarshalUnstructured-8   	      12	  84184019 ns/op
BenchmarkJsonUnmarshalInterface-8      	      40	  28336500 ns/op
PASS
ok  	k8s.io/kube-openapi/pkg/validation/spec	3.059s

We're still ~3x slower than unmarshaling directly into map[string]interface{}, but that's much better; we can probably live with that.

@rsc

rsc commented Sep 7, 2022

FWIW, a more precise, less noisy way to present and compare benchmark results is:

go install golang.org/x/perf/benchstat@latest
go test -bench=. -count 10 |tee bench.txt
benchstat bench.txt

@dsnet

dsnet commented Sep 7, 2022

I got nerd-sniped by @rsc into looking at this.

The behavior of JSON unmarshaling in Swagger is even worse than O(n²) since it recursively unmarshals the same JSON input up to 5 times in some places.

There's an experimental JSON module that permits custom types to unmarshal themselves in a truly streaming manner.
I modified the swagger code to use that experimental module.

With the experimental JSON API, we can unmarshal spec.Swagger ~38.6x faster than the current implementation.

Benchmarks:

name                        time/op
UnmarshalJSON/V1/Swagger    1070ms ± 1%
UnmarshalJSON/V1/Interface  52.4ms ± 0%
UnmarshalJSON/V2/Swagger    27.7ms ± 1%
UnmarshalJSON/V2/Interface  15.2ms ± 1%

EDIT: I realized that I modified the wrong module. Here's the same thing applied to k8s.io/kube-openapi.
The performance is about the same as reported earlier.
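For illustration only — using the standard library's json.Decoder rather than the experimental module, and a hypothetical helper name — a single-pass split of fixed fields and x- extensions shows the streaming idea: the input is tokenized exactly once, and keys are routed as they appear instead of re-unmarshaling the buffer per level.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// splitTopLevel walks the top-level object in a single pass, routing
// "x-..." keys into extensions and everything else into fields. The raw
// values could then be decoded recursively without ever re-reading bytes
// that were already consumed.
func splitTopLevel(data []byte) (fields, extensions map[string]json.RawMessage, err error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	if _, err = dec.Token(); err != nil { // consume the opening '{'
		return nil, nil, err
	}
	fields = map[string]json.RawMessage{}
	extensions = map[string]json.RawMessage{}
	for dec.More() {
		keyTok, err := dec.Token()
		if err != nil {
			return nil, nil, err
		}
		key := keyTok.(string)
		var raw json.RawMessage
		if err := dec.Decode(&raw); err != nil { // value, still one pass
			return nil, nil, err
		}
		if strings.HasPrefix(key, "x-") {
			extensions[key] = raw
		} else {
			fields[key] = raw
		}
	}
	return fields, extensions, nil
}

func main() {
	f, x, err := splitTopLevel([]byte(`{"swagger": "2.0", "x-foo": 1}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(f), len(x)) // 1 1
}
```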

@dsnet

dsnet commented Sep 8, 2022

@justinsb regarding #315 (comment):

The go-json-experiment module is based on lessons learned from implementing a tokenized JSON parser in the protobuf module (where the JSON implementation in protobuf itself was re-implemented 2 or 3 times during its lifetime). go-json-experiment is faster and more general purpose.

A major challenge with go-json-experiment is that the API is still unstable. We have confidence in the correctness and performance of the logic and are using it internally at @tailscale in various production services. However, the API changes occasionally, which would wreak havoc on the module ecosystem if it were used widely in open-source code.

@apelisse
Member Author

apelisse commented Sep 8, 2022

Posting with memory allocations since that's also super critical in k8s:

> benchstat bench_v1.txt bench_v2.txt
name                       old time/op    new time/op    delta
UnmarshalJSON/Swagger-8       538ms ± 8%      38ms ±25%  -92.93%  (p=0.000 n=10+10)
UnmarshalJSON/Interface-8    33.5ms ±13%    25.3ms ± 9%  -24.37%  (p=0.000 n=10+8)

name                       old alloc/op   new alloc/op   delta
UnmarshalJSON/Swagger-8      88.5MB ± 0%    21.7MB ± 0%  -75.53%  (p=0.000 n=9+10)
UnmarshalJSON/Interface-8    10.6MB ± 0%    10.0MB ± 0%   -5.72%  (p=0.000 n=10+10)

name                       old allocs/op  new allocs/op  delta
UnmarshalJSON/Swagger-8       1.25M ± 0%     0.11M ± 0%  -91.49%  (p=0.000 n=10+10)
UnmarshalJSON/Interface-8      164k ± 0%      140k ± 0%  -14.38%  (p=0.000 n=10+10)

@dsnet

dsnet commented Sep 8, 2022

It'd be interesting to figure out how many allocations are needed at minimum to construct the entire Swagger value. This way we can tell how many of the allocations being done by JSON could possibly be eliminated.
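One rough way to measure that floor is testing.AllocsPerRun, which reports the average heap allocations per call: constructing a value directly (no JSON involved) gives a lower bound to compare against the benchmarks' allocs/op. The miniSpec type below is a hypothetical stand-in, not the real spec.Swagger.

```go
package main

import (
	"fmt"
	"testing"
)

// miniSpec stands in for a small slice of the Swagger structure.
type miniSpec struct {
	Paths map[string]string
}

// Global sink keeps the result live so escape analysis cannot
// optimize the allocations away.
var sink *miniSpec

// buildDirectly constructs the value with no JSON in the loop, so its
// allocation count is the floor any unmarshaler must pay.
func buildDirectly() *miniSpec {
	return &miniSpec{Paths: map[string]string{"/pods": "get"}}
}

func main() {
	n := testing.AllocsPerRun(100, func() {
		sink = buildDirectly()
	})
	fmt.Println(n > 0) // true: at least the struct and map allocations
}
```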

@apelisse
Member Author

I think we can close this now?

@apelisse
Member Author

/close

@k8s-ci-robot
Contributor

@apelisse: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
