Fix for Target Allocator not saving targets when collector instances take time to come up #2351

rashmichandrashekar · 2023-11-14T01:12:43Z

Description:
This PR fixes a bug in the Target Allocator. In a Kubernetes cluster with scrape jobs configured through the TargetAllocator, a timing issue (very common in Kubernetes clusters, where no startup ordering can be guaranteed) can cause the collector instances to be discovered only after the first set of targets has been discovered. The target configuration applied before the instances are discovered is lost.
This PR fixes that issue by saving the targets until the collector instances come up.

Link to tracking Issue:
#2350

Testing:
Tested that in such scenarios the targets are not lost and are assigned to the collector instances when they come up.
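
The behavior described above can be illustrated with a minimal sketch. This is not the operator's actual code: the `allocator` type, its method names, and the round-robin assignment are all hypothetical, chosen only to show targets being retained while no collectors are known and assigned once they appear.

```go
package main

import (
	"fmt"
	"sort"
)

// allocator is a hypothetical, simplified stand-in for an allocation
// strategy. It is not the Target Allocator's real type.
type allocator struct {
	collectors []string          // known collector instances, sorted
	targets    map[string]string // target -> collector ("" means unassigned)
}

func newAllocator() *allocator {
	return &allocator{targets: map[string]string{}}
}

// SetTargets records the discovered targets. When no collectors are known
// yet, the targets are kept unassigned instead of being dropped.
func (a *allocator) SetTargets(targets []string) {
	next := make(map[string]string, len(targets))
	for _, t := range targets {
		next[t] = a.targets[t] // keep an existing assignment, if any
	}
	a.targets = next
	a.assignPending()
}

// SetCollectors replaces the collector set, unassigns targets whose
// collector went away, and then assigns everything still unassigned.
func (a *allocator) SetCollectors(collectors []string) {
	a.collectors = append([]string(nil), collectors...)
	sort.Strings(a.collectors)
	alive := make(map[string]bool, len(collectors))
	for _, c := range collectors {
		alive[c] = true
	}
	for t, c := range a.targets {
		if !alive[c] {
			a.targets[t] = ""
		}
	}
	a.assignPending()
}

// assignPending spreads unassigned targets over the collectors round-robin.
// It is a no-op while there are no collectors.
func (a *allocator) assignPending() {
	if len(a.collectors) == 0 {
		return
	}
	pending := make([]string, 0, len(a.targets))
	for t, c := range a.targets {
		if c == "" {
			pending = append(pending, t)
		}
	}
	sort.Strings(pending) // deterministic order for the sketch
	for i, t := range pending {
		a.targets[t] = a.collectors[i%len(a.collectors)]
	}
}

// Unassigned returns how many targets are waiting for a collector.
func (a *allocator) Unassigned() int {
	n := 0
	for _, c := range a.targets {
		if c == "" {
			n++
		}
	}
	return n
}

func main() {
	a := newAllocator()
	// Targets are discovered first, before any collector is up.
	a.SetTargets([]string{"job-a/10.0.0.1:8080", "job-a/10.0.0.2:8080"})
	fmt.Println("unassigned before collectors:", a.Unassigned())
	// Collectors come up later and pick up the saved targets.
	a.SetCollectors([]string{"collector-0", "collector-1"})
	fmt.Println("unassigned after collectors:", a.Unassigned())
}
```

The key point in the sketch is that `SetTargets` stores targets unconditionally, so the order in which targets and collectors arrive does not matter.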

swiatekm · 2023-11-14T10:13:41Z

Can you add a unit test for this case?

Also, does the same problem not apply to the other allocation strategy as well?

rashmichandrashekar · 2023-11-15T02:33:07Z

> Can you add a unit test for this case?
>
> Also, does the same problem not apply to the other allocation strategy as well?

Thanks @swiatekm-sumo. I have made the change for the other hashing algo as well and added a couple of unit tests. Please lmk if anything else is needed.

swiatekm · 2023-11-15T17:43:53Z

Can you also check the case where you add targets to a strategy with 0 collectors, and later add collectors to see if targets get allocated to them correctly?

There's also a lint error you need to fix.
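
The scenario requested here (targets added to a strategy with 0 collectors, collectors added later) can be sketched for a least-weighted style strategy as follows. Everything in this block is an assumption made for illustration: the `leastWeighted` type, its fields, and the tie-breaking by collector name are hypothetical and do not reproduce the operator's implementation or its tests.

```go
package main

import "fmt"

// leastWeighted is a hypothetical sketch of a least-weighted strategy:
// each new target goes to the collector that currently has the fewest targets.
type leastWeighted struct {
	load    map[string]int    // collector -> number of assigned targets
	owner   map[string]string // target -> collector ("" while pending)
	pending []string          // targets waiting for a collector, in arrival order
}

func newLeastWeighted() *leastWeighted {
	return &leastWeighted{load: map[string]int{}, owner: map[string]string{}}
}

// lightest returns the collector with the fewest targets, breaking ties by
// name. The second result is false when no collectors are known.
func (s *leastWeighted) lightest() (string, bool) {
	best, found := "", false
	for c, n := range s.load {
		if !found || n < s.load[best] || (n == s.load[best] && c < best) {
			best, found = c, true
		}
	}
	return best, found
}

// AddTarget assigns the target if a collector exists, otherwise it parks
// the target in the pending list so it is not lost.
func (s *leastWeighted) AddTarget(t string) {
	if _, seen := s.owner[t]; seen {
		return
	}
	c, ok := s.lightest()
	if !ok {
		s.owner[t] = ""
		s.pending = append(s.pending, t)
		return
	}
	s.owner[t] = c
	s.load[c]++
}

// SetCollectors registers any new collectors and then drains the pending
// list. Collector removal is deliberately left out of this sketch.
func (s *leastWeighted) SetCollectors(names []string) {
	for _, c := range names {
		if _, ok := s.load[c]; !ok {
			s.load[c] = 0
		}
	}
	if len(s.load) == 0 {
		return
	}
	for _, t := range s.pending {
		c, _ := s.lightest()
		s.owner[t] = c
		s.load[c]++
	}
	s.pending = nil
}

func main() {
	s := newLeastWeighted()
	for _, t := range []string{"t0", "t1", "t2", "t3"} {
		s.AddTarget(t) // no collectors yet: all four are parked
	}
	fmt.Println("pending with 0 collectors:", len(s.pending))
	s.SetCollectors([]string{"c0", "c1"})
	fmt.Println("pending after collectors:", len(s.pending))
	fmt.Println("load:", s.load["c0"], s.load["c1"])
}
```

Registering all new collectors before draining the pending list lets the parked targets spread across the whole set rather than piling onto the first collector to appear.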

rashmichandrashekar · 2023-11-16T00:14:39Z

> Can you also check the case where you add targets to a strategy with 0 collectors, and later add collectors to see if targets get allocated to them correctly?
>
> There's also a lint error you need to fix.

Thanks @swiatekm-sumo. That's a good test case; I have added it. Please take a look.

rashmichandrashekar · 2023-11-16T18:37:08Z

@swiatekm-sumo - I see an e2e test failure due to timeouts during CR creation. Could you please rerun the tests? Thanks!

rashmichandrashekar · 2023-11-16T20:29:57Z

@swiatekm-sumo - All the checks seem to have passed, could you pls help get this merged? Thanks!

swiatekm

I feel like there should be a way to do this without all the special cases, but we can refactor it later now that we have tests for them. I don't think there's any point in keeping this PR held up on DRY issues, so I'm approving it.

Thank you for the contribution!

rashmichandrashekar · 2023-11-17T18:04:35Z

Thanks @swiatekm-sumo! I don't have permissions to merge, could you pls merge this?
or maybe @jaronoff97 can help merge this?

…take time to come up (open-telemetry#2351)
* change to account for targets discoverd before instances
* adding change log
* adding instance changed to leat weighted as well
* adding tests
* adding unit tests
* updaintg tests
* fixing lint error and updating tests and fixing bug in least weighted algo
* fix indent

rashmichandrashekar added 2 commits November 13, 2023 16:48

change to account for targets discoverd before instances

0bbe830

adding change log

b0d72ae

rashmichandrashekar requested review from a team November 14, 2023 01:12

rashmichandrashekar mentioned this pull request Nov 14, 2023

Target Allocator loses targets discovered when collector instances take time to come up #2350

Closed

Merge branch 'main' into rashmi/collector-instances

40b987c

rashmichandrashekar added 5 commits November 14, 2023 14:18

Merge branch 'main' into rashmi/collector-instances

09eaf53

adding instance changed to leat weighted as well

505a1ed

Merge branch 'rashmi/collector-instances' of https://github.com/rashmichandrashekar/opentelemetry-operator into rashmi/collector-instances

ed5564e

adding tests

40b9c71

adding unit tests

645b018

updaintg tests

e917cc8

rashmichandrashekar added 3 commits November 15, 2023 12:49

Merge branch 'main' into rashmi/collector-instances

5ebec60

fixing lint error and updating tests and fixing bug in least weighted algo

2866fb1

Merge branch 'rashmi/collector-instances' of https://github.com/rashmichandrashekar/opentelemetry-operator into rashmi/collector-instances

563755f

rashmichandrashekar added 2 commits November 15, 2023 16:29

fix indent

38aafdb

Merge branch 'main' into rashmi/collector-instances

264e81b

swiatekm approved these changes Nov 17, 2023

Merge branch 'main' into rashmi/collector-instances

ba3724f

jaronoff97 approved these changes Nov 17, 2023

jaronoff97 merged commit 181fefa into open-telemetry:main Nov 17, 2023
27 checks passed
