Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add TG garbage collector #580

Merged
merged 3 commits into from
Jan 5, 2024
Merged

Add TG garbage collector #580

merged 3 commits into from
Jan 5, 2024

Conversation

mikhail-aws
Copy link
Contributor

@mikhail-aws mikhail-aws commented Dec 29, 2023

Note:
Address performance issue for unused target group cleanup. Implemented background garbage collector, that runs every X seconds. Might need to tune timers after.

e2e tests pass:

Ran 58 of 58 Specs in 2358.172 seconds
SUCCESS! -- 58 Passed | 0 Failed | 0 Pending | 0 Skipped
--- PASS: TestIntegration (2358.17s)

close #552

@mikhail-aws mikhail-aws marked this pull request as draft January 2, 2024 18:53
@mikhail-aws mikhail-aws marked this pull request as ready for review January 2, 2024 23:30
Copy link
Contributor

@zijun726911 zijun726911 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed this PR and overall LGTM

defer func() {
tgGc.lock.Unlock()
}()
tgGc.lock.Lock()
Copy link
Contributor

@zijun726911 zijun726911 Jan 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some comments here for the reason we need to do Lock and Unlock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

tgGcFn := NewTgGcFn(tgGcSynth)
tgGc = &TgGc{
lock: sync.Mutex{},
log: log.Named("gc"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log.Named("tgGc") ?

Copy link
Contributor Author

@mikhail-aws mikhail-aws Jan 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usually log names are lower case, can do "tg-gc" or "tg.gc"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we ever garbage collect services? I think this would be the only GC

Copy link
Contributor Author

@mikhail-aws mikhail-aws Jan 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not use GC at all. It will be a problem in future - global lock will cause contention and need carefully lock/unlock otherwise we will deadlock route reconciliations. This solution is a little pain relief from doing full TG clean up on each reconciliation loop.

We should have a better mechanism to identify TGs when we create brand new TG or when we replace old one and deletion is required. In that case we dont need locks, since problem will be properly partitioned by single-threaded owners - reconcilers.

}
succ := 0
for _, res := range results {
if res.Err == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to print res.Err or do errors.Join(res.Err) and return it with TgGcResult{}?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, mostly errors happen when TG is in state that cannot be deleted yet. It pollutes logs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think critical errors needs to be logged in SynthesizeUnusedDelete, or need to wrap them into some "NonOk" types and do type check here.

svcBuilder := gateway.NewLatticeServiceBuilder(log, k8sClient, brTgBuilder)

tgGcOnce.Do(func() {
// TODO: need to refactor TG synthesizer. Remove stack from constructor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this TODO a future issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should revisit synthesizers, the concept of abstract stack that we build and how it's actually implemented is questionable.

@xWink
Copy link
Member

xWink commented Jan 4, 2024

Overall LGTM, love the use of goroutines here.

Mikhail Berezovskiy added 2 commits January 5, 2024 00:41
@mikhail-aws mikhail-aws merged commit 3c3a2e6 into aws:main Jan 5, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize unused target group cleanup
3 participants