Fix more Kafka committing errors #471
Pushing this with the plan to add more unit tests shortly Signed-off-by: Greg Schohn <[email protected]>
Codecov Report
@@ Coverage Diff @@
## main #471 +/- ##
============================================
+ Coverage 72.88% 73.59% +0.71%
- Complexity 1165 1182 +17
============================================
Files 124 124
Lines 4846 4890 +44
Branches 436 439 +3
============================================
+ Hits 3532 3599 +67
+ Misses 1021 998 -23
Partials 293 293
Uncovering a lot of obscure race conditions along the way. Signed-off-by: Greg Schohn <[email protected]>
… moving all the documents over... BUT - I'm still getting NPEs that indicate that the OffsetLifecycleTracker's pQueue is empty when I'm trying to pull a value from it. Signed-off-by: Greg Schohn <[email protected]>
…code, & the introduction of a failing test for double-commits. Signed-off-by: Greg Schohn <[email protected]>
One change through the merge (deviating from both of the prior commits) - trafficStreamKeysBeingHeld becomes lazily allocated. Some patterns may have long runs of transactions within single traffic streams and therefore won't be holding any keys at all. Signed-off-by: Greg Schohn <[email protected]> # Conflicts: # TrafficCapture/trafficReplayer/src/main/java/org/opensearch/migrations/replay/RequestResponsePacketPair.java
… processed Signed-off-by: Greg Schohn <[email protected]>
metricsLogger.atSuccess(MetricsEvent.PARSED_TRAFFIC_STREAM_FROM_KAFKA)
    .setAttribute(MetricsAttributeKey.CONNECTION_ID, ts.getConnectionId())
    .setAttribute(MetricsAttributeKey.TOPIC_NAME, trackingKafkaConsumer.topic)
    .setAttribute(MetricsAttributeKey.SIZE_IN_BYTES, ts.getSerializedSize()).emit();
Don't we already do this in the next four lines?
Shoot, yes we do. Bad merge. I've removed one of them.
@@ -68,6 +71,7 @@ public class KafkaTrafficCaptureSource implements ISimpleTrafficCaptureSource {

    final TrackingKafkaConsumer trackingKafkaConsumer;
    private final ExecutorService kafkaExecutor;
Remind me again what prompted the need for a dedicated kafkaExecutor?
Kafka consumers aren't thread-safe. The polling and touch calls are already built with an async pattern, so callers may already be coming in from any number of threads (for commit/poll, not touch). Since the Kafka thread does blocking IO, that thread is going to be pretty busy. Because the call stack will be remarkably similar for all Kafka interactions, some thread affinity seemed like a good idea: 1) it makes it easier to know where the Kafka thread is (we should name that thread), 2) it's probably a tiny performance boost, and 3) it might create deadlock (just as the other way could have), but we shouldn't ever get a race condition. Plus, everything is more deterministic.
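For readers following along, here is a minimal sketch of the thread-affinity idea described in this comment. It is not the code from this PR: `FakeConsumer`, `KafkaThreadAffinitySketch`, `pollAsync`, `threadNamesFor`, and the thread name `kafkaConsumerThread` are all hypothetical names chosen for illustration, and the stand-in consumer avoids any dependency on the Kafka client library. It assumes the approach is a single-threaded `ExecutorService` that owns every call into the non-thread-safe client.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class KafkaThreadAffinitySketch {

    /** Hypothetical stand-in for a non-thread-safe client such as a Kafka consumer. */
    static class FakeConsumer {
        private int pollCount = 0; // deliberately unsynchronized

        String poll() {
            pollCount++;
            // Report which thread actually touched the consumer.
            return Thread.currentThread().getName();
        }
    }

    private final FakeConsumer consumer = new FakeConsumer();

    // One named thread owns every interaction with the consumer, so the
    // consumer's state is never touched concurrently.
    private final ExecutorService kafkaExecutor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "kafkaConsumerThread");
        t.setDaemon(true);
        return t;
    });

    /** Callers may invoke this from any thread; the work runs on kafkaExecutor. */
    CompletableFuture<String> pollAsync() {
        // The two-argument overload pins the work to the given executor. The
        // one-argument overload would use ForkJoinPool.commonPool() instead.
        return CompletableFuture.supplyAsync(consumer::poll, kafkaExecutor);
    }

    void close() {
        kafkaExecutor.shutdown();
    }

    /** Issues `calls` polls and collects the names of the threads that ran them. */
    static Set<String> threadNamesFor(int calls) {
        KafkaThreadAffinitySketch sketch = new KafkaThreadAffinitySketch();
        Set<String> names = ConcurrentHashMap.newKeySet();
        CompletableFuture<?>[] futures = new CompletableFuture<?>[calls];
        for (int i = 0; i < calls; i++) {
            futures[i] = sketch.pollAsync().thenApply(names::add);
        }
        CompletableFuture.allOf(futures).join();
        sketch.close();
        return names;
    }

    public static void main(String[] args) {
        System.out.println(threadNamesFor(50));
    }
}
```

Because every poll is funnelled through the same executor, the returned set contains only the one thread name, which is also what makes the Kafka thread easy to spot in a thread dump.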
Signed-off-by: Greg Schohn <[email protected]>
Bugfixes include:
Running the touch() and readNextTrafficStreamChunk() activities for the Kafka client (except for close, as per this issue) on a dedicated thread, rather than using CompletableFuture.supplyAsync(), which could change the actual thread being used.
Refactorings include
Description
Issues Resolved
Continuation of https://opensearch.atlassian.net/browse/MIGRATIONS-1379
Is this a backport? If so, please add backport PR # and/or commits #
Testing
Only gradle.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.