
Memory improvements first pass #1293

Merged

Conversation

@jaronoff97 jaronoff97 commented Nov 30, 2022

Closes #1257

This PR is the first in a series of two. It adds new fields to the allocators and servers that cache the values they need to return, preventing excessive allocations on HTTP calls. It also removes the targetGroupJSON intermediary and instead uses the target item directly with JSON hints, removes the allocations needed in map diffing, and fixes a potential bug in the map that would have caused targets not to update when processed. Finally, the server now keeps track of when it needs to recompute the scrape-configs response it returns.

NOTES!

  • When this PR is merged, we will need to open two issues:
    • The first is to change how the diffs are generated, producing lists instead of maps.
    • The second is to look into a better way to fix the potential (but unlikely) race introduced by saving the extra allocation.

```go
	return tgs
}

func GetAllTargetsByCollectorAndJob(allocator Allocator, collector string, job string) []*target.Item {
```
@jaronoff97 (Contributor, Author):
In a future PR I'm going to remove this file altogether; for now I'm trying to keep changes to a minimum.

```go
// if the hashes are different, we need to recompute the scrape config
jsonConfig, err := yaml.YAMLToJSON(configBytes)
if err != nil {
	s.errorHandler(w, err)
```
@jaronoff97 (Contributor, Author):
This logic is similar to what the collector does on the receiving end, i.e. it should only update this byte array when there's an updated scrape config list.
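The recompute-only-on-change pattern described here can be sketched roughly as follows; the `server` fields, the `updateScrapeConfig` helper, and the use of FNV hashing are assumptions for illustration, not the operator's actual code:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// server caches the marshaled scrape-config response and only
// rebuilds it when the config hash changes.
type server struct {
	configHash           uint64
	scrapeConfigResponse []byte
}

func hashOf(config []byte) uint64 {
	h := fnv.New64a()
	h.Write(config)
	return h.Sum64()
}

// updateScrapeConfig recomputes the cached response only when the
// incoming config differs from what was last seen.
func (s *server) updateScrapeConfig(config []byte) bool {
	newHash := hashOf(config)
	if newHash == s.configHash {
		return false // unchanged: keep serving the cached bytes
	}
	s.configHash = newHash
	s.scrapeConfigResponse = append([]byte(nil), config...)
	return true
}

func main() {
	s := &server{}
	fmt.Println(s.updateScrapeConfig([]byte(`{"job":"a"}`))) // true: first compute
	fmt.Println(s.updateScrapeConfig([]byte(`{"job":"a"}`))) // false: cache hit
}
```

With this shape, HTTP handlers can return `s.scrapeConfigResponse` directly instead of re-marshaling the config on every request.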

```diff
 func (t Item) Hash() string {
-	return t.JobName + t.TargetURL + t.Label.Fingerprint().String()
+	return t.hash
 }
```
@jaronoff97 (Contributor, Author):
By precomputing the hash, we're saving a lot of allocations.
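The precomputation amounts to building the hash string once at construction time, so every subsequent `Hash()` call is a cheap field read rather than a fresh string concatenation. A rough sketch with a hypothetical `NewItem` constructor (the real Item has more fields, and the fingerprint comes from the label set):

```go
package main

import "fmt"

type Item struct {
	JobName   string
	TargetURL string
	hash      string // computed once, read many times
}

// NewItem computes the hash up front so Hash() allocates nothing.
func NewItem(jobName, targetURL, labelFingerprint string) *Item {
	return &Item{
		JobName:   jobName,
		TargetURL: targetURL,
		hash:      jobName + targetURL + labelFingerprint,
	}
}

func (t *Item) Hash() string { return t.hash }

func main() {
	item := NewItem("job-a", "10.0.0.1:9090", "fp123")
	fmt.Println(item.Hash()) // job-a10.0.0.1:9090fp123
}
```

Since Hash() is called on every diff and every allocation pass, moving the concatenation out of the hot path is where the bulk of the savings comes from.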

@jaronoff97 jaronoff97 marked this pull request as ready for review November 30, 2022 16:57
@jaronoff97 jaronoff97 requested a review from a team November 30, 2022 16:57
@secustor (Member) left a comment:

Looks good to me, only one possible improvement.

How has this been tested?
Have you compared the performance?

cmd/otel-allocator/allocation/http.go (review thread, resolved)
@jaronoff97 (Contributor, Author):
@secustor I've tested this in-cluster and it has been running successfully for over a week now. I did some benchmarking before and after for the number of allocations made, and the reduction was over 99.9%. I haven't tested this with a cluster at scale, but I can do that before we release this if it would give us more confidence.

@kristinapathak (Contributor) left a comment:
These look like great improvements!

cmd/otel-allocator/allocation/consistent_hashing.go (review thread, outdated, resolved)
cmd/otel-allocator/allocation/consistent_hashing.go (review thread, outdated, resolved)
cmd/otel-allocator/target/target.go (review thread, resolved)
cmd/otel-allocator/target/target.go (review thread, resolved)
```go
additions := map[string]T{}
removals := map[string]T{}
// Used as a set to check for removed items
newMembership := map[string]bool{}
```
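The snippet above is the setup for a map diff that uses a boolean set for membership checks instead of allocating intermediate structures. A self-contained sketch of how such a diff could work (the function name `diffMaps` and its exact shape are assumptions for illustration):

```go
package main

import "fmt"

// diffMaps returns the items present only in current (additions)
// and only in previous (removals). A boolean set records which keys
// exist in current so removals need no second lookup structure.
func diffMaps[T any](previous, current map[string]T) (additions, removals map[string]T) {
	additions = map[string]T{}
	removals = map[string]T{}
	// Used as a set to check for removed items
	newMembership := map[string]bool{}
	for key, value := range current {
		newMembership[key] = true
		if _, found := previous[key]; !found {
			additions[key] = value
		}
	}
	for key, value := range previous {
		if !newMembership[key] {
			removals[key] = value
		}
	}
	return additions, removals
}

func main() {
	prev := map[string]int{"a": 1, "b": 2}
	curr := map[string]int{"b": 2, "c": 3}
	add, rem := diffMaps(prev, curr)
	fmt.Println(len(add), len(rem)) // 1 1
}
```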
@kristinapathak (Contributor):
nice 😎

@secustor (Member) left a comment:
No, I think it is fine since you have been running these changes for a while. I'm looking forward to seeing the second PR.

@pavolloffay pavolloffay merged commit 0a30663 into open-telemetry:main Dec 6, 2022
@jaronoff97 jaronoff97 deleted the 1257-memory-improvements-first branch December 6, 2022 17:02
ihalaij1 pushed a commit to ihalaij1/opentelemetry-operator that referenced this pull request Dec 8, 2022
* Memory improvements first pass

* Comments, store hash

* Fix linting and tests

* Update, more tests and benchmarks, notes

* linting
ItielOlenick pushed a commit to ItielOlenick/opentelemetry-operator that referenced this pull request May 1, 2024
* Memory improvements first pass

* Comments, store hash

* Fix linting and tests

* Update, more tests and benchmarks, notes

* linting
Successfully merging this pull request may close these issues.

[target-allocator] Investigate Target Allocator memory usage
5 participants