Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

json.Marshal allocates a lot of unnecessary memory #324

Closed
Tracked by #118136
apelisse opened this issue Oct 27, 2022 · 10 comments · Fixed by #325
Closed
Tracked by #118136

json.Marshal allocates a lot of unnecessary memory #324

apelisse opened this issue Oct 27, 2022 · 10 comments · Fixed by #325

Comments

@apelisse
Copy link
Member

This is a follow-up of #315/#319. Now that Unmarshaling has been fixed, we need to worry about the next big thing, Marshaling.

As we can see from the pprof heap profile (alloc_space) below, json.Marshal is allocating a lot of memory, certainly because it serializes in buffers that are quickly discarded as these buffers are "concatenated" (e.g. https://github.com/kubernetes/kube-openapi/blob/master/pkg/validation/spec/header.go#L44-L63, https://github.com/kubernetes/kube-openapi/blob/master/pkg/validation/spec/schema.go#L435-L466)
profile002

There's clearly no great way to do that with json.Marshal v1, since when you have multiple embedded types (which we have everywhere in the spec), only one of the custom marshaller is called. That's also true for json.Marshal v2, I don't see how it could be different anyways.

I've been looking at go-json-experiment, but I can't find a good way to "embed" multiple values, all the methods require that you serialize exactly one json "value", but in most cases, we want to serialize a list of fields for each of these embedded sub-structures. I think we could give each of these types a "MarshalKeysAndValues", and construct the object itself, but then we will have to rewrite a lot of logic to extract the keys and values.

Anyways, I'm stuck. Suggestions welcome @dsnet @alexzielenski @liggitt

@dsnet
Copy link

dsnet commented Oct 27, 2022

Couldn't you do the opposite as UnmarshalNextJSON?

func (h *Header) MarshalNextJSON(opts jsonv2.MarshalOptions, enc *jsonv2.Encoder) error {
	// TODO: This mutates h.Extensions, which may be unacceptable.
	// Might need to clone the map if there are unsanitized keys.
	// or maybe we don't care for marshaling purposes?
	h.Extensions.sanitize()
	if len(h.Extensions) == 0 {
		h.Extensions = nil
	}

	var x struct {
		CommonValidations
		SimpleSchema
		Extensions
		HeaderProps
	} {
		CommonValidations: h.CommonValidations,
		SimpleSchema:      h.SimpleSchema,
		Extensions:        h.Extensions,
		HeaderProps:       h.HeaderProps,
	}
	return opts.MarshalNext(enc, &x)
}

@apelisse
Copy link
Member Author

For Header it works because none of the types in X have a custom MarshalJSON/MarshalNextJSON function, but if they do, then only one will get called AFAICT.

@dsnet
Copy link

dsnet commented Oct 27, 2022

If I recall most of them were fine. Which type wasn't?

@alexzielenski
Copy link
Contributor

@dsnet Operation embeds OperationProps which needs custom marshaling (so omit empty distinguishes nil from 0-length)

we did not see a way to get the inlining to work without somehow duplicating OperationProps' marshaling logic into Operation itself which is undesired.

@dsnet
Copy link

dsnet commented Oct 27, 2022

That's unfortunate. It seems that the only reason why OperationProps.MarshalJSON exists is because v1 doesn't support a omitzero tag that omits empty only nil slices.

How about?

func (o *Operation) MarshalNextJSON(opts jsonv2.MarshalOptions, enc *jsonv2.Encoder) error {
	// TODO: Should this be done?
	x.Extensions.sanitize()
	if len(x.Extensions) == 0 {
		x.Extensions = nil
	}

	// OperationPropsAlt is identical to OperationProps,
	// but lacks a MarshalJSON method and
	// uses `omitzero` instead of `omitempty` on the
	// Security and Deprecated fields.
	type OperationPropsAlt struct {
		Description  string                 `json:"description,omitempty"`
		Consumes     []string               `json:"consumes,omitempty"`
		Produces     []string               `json:"produces,omitempty"`
		Schemes      []string               `json:"schemes,omitempty"`
		Tags         []string               `json:"tags,omitempty"`
		Summary      string                 `json:"summary,omitempty"`
		ExternalDocs *ExternalDocumentation `json:"externalDocs,omitempty"`
		ID           string                 `json:"operationId,omitempty"`
		Deprecated   bool                   `json:"deprecated,omitzero"`
		Security     []map[string][]string  `json:"security,omitzero"`
		Parameters   []Parameter            `json:"parameters,omitempty"`
		Responses    *Responses             `json:"responses,omitempty"`
	}
	x := struct {
		Extensions
		*OperationPropsAlt
	}{
		Extension:         o.Extensions,
		OperationPropsAlt: (*OperationPropsAlt)(&o.OperationProps), // see https://go.dev/issue/16085
	}
	return opts.MarshalNext(enc, &x)
}

See https://pkg.go.dev/github.com/go-json-experiment/json#hdr-JSON_Representation_of_Go_structs for the semantics of omitzero vs omitempty.

@dsnet
Copy link

dsnet commented Oct 27, 2022

It's unfortunate that we need to declare the OperationPropsAlt type, but fortunately it's not a significant hit to maintainability since the language semantics of golang/go#16085 requires that OperationProps and OperationPropsAlt have the same field layout, so added/removed fields in OperationProps will result in a build failure until the equivalent change is made in OperationPropsAlt.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 25, 2023
@apelisse
Copy link
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 25, 2023
@dsnet
Copy link

dsnet commented Sep 6, 2023

FYI, there's been some upstream activity that may help you.

We added jsonv1.DefaultOptionsV1 which can be passed to jsonv2.Marshal to marshal with v1 "encoding/json" semantics. The implementation is not 100% identical to v1 under this option as it is still a work-in-progress, but we will gradually get it as close to 100% as possible.

That said, there's been a few breaking changes at HEAD:

Summary of breaking changes:

  • We split the v2 "json" package apart into "json" and "jsontext", where the syntactic handling of JSON was moved over to the "jsontext" pacakge. The purpose of this change is to provide a package that does purely syntactic operations on JSON with a minimal dependency tree.
  • We removed MarshalOptions, UnmarshalOptions, EncodeOptions, and DecodeOptions struct types and switched over to a variadic options API. The purpose of this to pave the way for implementing v1 "encoding/json" entirely in terms of v2.

This will remove the need for gross hacks like this:

type schemaPropsOmitZero struct {
ID string `json:"id,omitempty"`
Ref Ref `json:"-"`
Schema SchemaURL `json:"-"`
Description string `json:"description,omitempty"`
Type StringOrArray `json:"type,omitzero"`
Nullable bool `json:"nullable,omitzero"`
Format string `json:"format,omitempty"`
Title string `json:"title,omitempty"`
Default interface{} `json:"default,omitzero"`
Maximum *float64 `json:"maximum,omitempty"`
ExclusiveMaximum bool `json:"exclusiveMaximum,omitzero"`
Minimum *float64 `json:"minimum,omitempty"`
ExclusiveMinimum bool `json:"exclusiveMinimum,omitzero"`
MaxLength *int64 `json:"maxLength,omitempty"`
MinLength *int64 `json:"minLength,omitempty"`
Pattern string `json:"pattern,omitempty"`
MaxItems *int64 `json:"maxItems,omitempty"`
MinItems *int64 `json:"minItems,omitempty"`
UniqueItems bool `json:"uniqueItems,omitzero"`
MultipleOf *float64 `json:"multipleOf,omitempty"`
Enum []interface{} `json:"enum,omitempty"`
MaxProperties *int64 `json:"maxProperties,omitempty"`
MinProperties *int64 `json:"minProperties,omitempty"`
Required []string `json:"required,omitempty"`
Items *SchemaOrArray `json:"items,omitzero"`
AllOf []Schema `json:"allOf,omitempty"`
OneOf []Schema `json:"oneOf,omitempty"`
AnyOf []Schema `json:"anyOf,omitempty"`
Not *Schema `json:"not,omitzero"`
Properties map[string]Schema `json:"properties,omitempty"`
AdditionalProperties *SchemaOrBool `json:"additionalProperties,omitzero"`
PatternProperties map[string]Schema `json:"patternProperties,omitempty"`
Dependencies Dependencies `json:"dependencies,omitempty"`
AdditionalItems *SchemaOrBool `json:"additionalItems,omitzero"`
Definitions Definitions `json:"definitions,omitempty"`
}

@apelisse
Copy link
Member Author

apelisse commented Sep 6, 2023

That's awesome, thanks Joe for the update! I'm pretty excited about removing all these ugly type declaration :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants