@seankhliao reported in the otel-operator channel in Slack that the TA was using around 200 GB of memory for 4k+ pods. This is far more than expected and suggests we have a memory leak and/or inefficient memory usage. I'm planning to go through the TA code, run some benchmarks and profiling, and do some refactoring to improve the performance of the service.
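For anyone reproducing this, here is a minimal sketch of how heap profiles could be captured from the TA during this kind of investigation (assuming the binary doesn't already expose a pprof endpoint; the port and wiring here are illustrative, not the TA's actual setup):

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	// Serve pprof on a side port; heap snapshots can then be compared over time with:
	//   go tool pprof http://localhost:6060/debug/pprof/heap
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// ... target allocator startup would go here ...
	select {}
}
```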
I think most of what I experienced is in that Slack thread.
Some additional info:
- I think it's mostly from the map copies generated by TargetItems() in allocation; running the allocator without clients doesn't produce the increasing memory-usage profile (see the sketch after this list).
- I made some attempts at using sync.Pool and/or rate limiting on TargetItems(), but that only slowed the runaway memory usage rather than preventing it.
- I ran with 10 collector instances at a 10s refresh interval (later reduced to 30s).
- Setting GOMEMLIMIT did not prevent the process from exceeding the soft memory limit (see the note after this list).
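To illustrate the map-copy point: the snippet below is a simplified sketch (type and field names are stand-ins, not the actual allocator code) of how returning a fresh map from TargetItems() on every client poll allocates a full copy of the target set per request, which with 4k+ targets and 10 collectors polling every 10-30s adds up quickly:

```go
package allocation

import "sync"

// TargetItem is a stand-in for the allocator's per-target record.
type TargetItem struct {
	JobName   string
	TargetURL string
}

type Allocator struct {
	mu      sync.RWMutex
	targets map[string]*TargetItem
}

// TargetItems returns a copy of the internal map so callers cannot mutate
// shared state. Each call allocates a new map sized to the full target set,
// so frequent polling by many collectors produces a steady stream of large,
// short-lived allocations that the GC has to keep chasing.
func (a *Allocator) TargetItems() map[string]*TargetItem {
	a.mu.RLock()
	defer a.mu.RUnlock()
	out := make(map[string]*TargetItem, len(a.targets))
	for k, v := range a.targets {
		out[k] = v
	}
	return out
}
```

This would also be consistent with sync.Pool and rate limiting only slowing things down: a pooled map still has to hold the full target set while a response is being served, and rate limiting just spaces out the same allocations.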
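On the GOMEMLIMIT point: it is a soft limit, so once the live heap itself outgrows it the GC runs harder but cannot shrink the process below it, which would match the behavior above. For reference, it can be set via the environment or at runtime (the 2 GiB value here is just an example):

```go
package main

import "runtime/debug"

func main() {
	// Equivalent to running with GOMEMLIMIT=2GiB in the environment.
	// The GC works harder as usage approaches the limit, but a live heap
	// that genuinely needs more memory will still push past it.
	debug.SetMemoryLimit(2 << 30)
}
```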