Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix leaking goroutine in memorylimiter #9100

Merged

Conversation

jskiba
Copy link
Contributor

@jskiba jskiba commented Dec 13, 2023

Description:
Fixing a bug - described here #9099

Link to tracking Issue:
#9099

Testing:
No tests added.
Unit tests were run

Documentation:
None

@jskiba jskiba requested review from a team and mx-psi December 13, 2023 08:19
Comment on lines 148 to 150
ml.closeOnce.Do(func() {
close(ml.closed)
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need this, we guarantee to close only once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without this

func TestCreateProcessor(t *testing.T) {
that test fails

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally think that this test doesn't make sense. I would like @open-telemetry/collector-approvers to say their opinion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove the second Shutdown call from the test. We don't have guarantees that it's allowed.

Copy link
Member

@dmitryax dmitryax Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, it's Start->Shutdown->Start-> causing the failure... This change breaks that workflow even without ml.closeOnce. I don't think we have other components supporting this workflow. So should be fine to not test it.

Copy link
Member

@dmitryax dmitryax Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so maybe let's move closed and closeOnce initialization to Start method, the behavior will stay the same

We can move closed channel creation to Start. closeOnce will not be needed anymore.

@dmitryax is calling a processor's Start after calling Shutdown an unsupported action? Would OPAMP ever need to restart a shutdown component?

I'm not sure

If that pattern is not allowed I agree that the test is invalid and we can treat the code in shutdown as only ever being invoked once.

I'm not saying it's not allowed. We just don't explicitly support it. Likely some components will be failing.

Probably better not not break this pattern here with this change and just move closed channel creation to Start.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would OPAMP ever need to restart a shutdown component?

If we ever want a hot restart we will add a "Reload" capability. Once the component is shutdown, I see no reason to re-start it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I moved closed channel to Start and removed closeOnce

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it to startMonitoring, so the closed channel is created when the goroutine is created. When there are metrics, logs and traces in the pipeline it would be called 3 times and closed would be created 3 times too resulting in data race

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there are metrics, logs and traces in the pipeline it would be called 3 times and closed would be created 3 times too resulting in data race

Why? They are synchronized via refCounterLock

Copy link

codecov bot commented Dec 13, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (6b12cc3) 91.50% compared to head (8a31f6d) 91.52%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9100      +/-   ##
==========================================
+ Coverage   91.50%   91.52%   +0.02%     
==========================================
  Files         320      320              
  Lines       17189    17198       +9     
==========================================
+ Hits        15728    15740      +12     
+ Misses       1163     1161       -2     
+ Partials      298      297       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jskiba
Copy link
Contributor Author

jskiba commented Dec 13, 2023

@bogdandrutu I forgot about the changelog

@yurishkuro
Copy link
Member

Has anyone considered enforcing https://github.com/uber-go/goleak in all tests?

@dmitryax
Copy link
Member

Has anyone considered enforcing https://github.com/uber-go/goleak in all tests?

It would be nice to have. I think I saw some leaking goroutines with the open census instrumentation. So we probably need to wait until we migrate from it. I'll create an issue anyway

@bogdandrutu bogdandrutu merged commit ce44516 into open-telemetry:main Dec 21, 2023
31 checks passed
@github-actions github-actions bot added this to the next release milestone Dec 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants