-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[kafka-consumer] Use wait group to ensure goroutine is finished before returning from Close #4582
Conversation
Signed-off-by: kennyaz <[email protected]>
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #4582 +/- ##
==========================================
- Coverage 97.08% 97.04% -0.05%
==========================================
Files 300 301 +1
Lines 17813 17839 +26
==========================================
+ Hits 17293 17311 +18
- Misses 417 423 +6
- Partials 103 105 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's indeed a good idea to keep track of all goroutines that begin in Start, and reusing the existing doneWg
for this makes sense. However, you're still not tracking the deadlock detector goroutine started in L77, which also performs logging, so I don't think this change will fully address your issue.
c.logger.Info("Starting main loop") | ||
c.doneWg.Done() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c.logger.Info("Starting main loop") | |
c.doneWg.Done() | |
defer c.doneWg.Done() | |
c.logger.Info("Starting main loop") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing, that cause test to timeout because this channel is not closed at all for some reason. I investigate it further and it seems like this method is not closing the partition channel properly at least. The test that causes the code to timeout is https://github.com/jaegertracing/jaeger/blob/main/cmd/ingester/app/consumer/consumer_test.go#L122. It seems like the bug is happening on the "github.com/Shopify/sarama/mocks" side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I take it back. It is happening on the Jaeger Side. I will submit a correct push shortly.
Signed-off-by: kennyaz <[email protected]>
Signed-off-by: kennyaz <[email protected]>
Signed-off-by: kennyaz <[email protected]>
Signed-off-by: kennyaz <[email protected]>
Which problem is this PR solving?
Resolves # 4576
Short description of the changes
Once we call start and call close again with fxtest module, there is a chance that the application will close the goroutine before the logline is executed causing panic Log in routine after .. has completed.
Fix:
Add a wait group that will wait for the goroutine to finish before closing the channel.