Kafkaproducer: "buffer is full" messages after upgrading to v1.0.0-beta1 with round-robin partitioner for Kafka #800

Closed
plazmorez opened this issue Sep 9, 2024 · 5 comments
Labels: bug (Something isn't working), waiting feedback

@plazmorez commented Sep 9, 2024

Many thanks for your time and effort with go-dnscollector!

After upgrading go-dnscollector to v1.0.0-beta1 and enabling the round-robin partitioner for kafkaproducer, we started seeing "buffer is full" messages. However, when a specific partition number is set, the buffer does not overflow and the issue does not occur.

The average DNS load is around 1400 RPS.

Config:

```yaml
global:
  trace:
    verbose: true

  server-identity: "dns-collector"
  text-format: "timestamp-rfc3339ns localtime identity operation rcode queryip protocol length qname qtype"

multiplexer:
  collectors:
    - name: tap
      dnstap:
        sock-path: /var/run/dnstap.sock
      transforms:
        normalize:
          qname-lowercase: true

  loggers:
    - name: prometheus
      prometheus:
        ...
        listen-port: 9998
        prometheus-prefix: "dnscollector"
        top-n: 10

    - name: kafkaproducer
      kafkaproducer:
        remote-address: host.fqdn
        remote-port: 9999
        connect-timeout: 5
        retry-interval: 10
        flush-interval: 30
        tls-support: false
        tls-insecure: false
        tls-min-version: 1.2
        sasl-support: true
        sasl-mechanism: SCRAM-SHA-512
        sasl-username: ""
        sasl-password: ""
        mode: json
        buffer-size: 1000
        topic: "topic"
        partition: null
        chan-buffer-size: 800000
        compression: "snappy"

  routes:
    - from: [ tap ]
      to: [ prometheus, kafkaproducer ]
```

Logs (newest first):

```
2024/09/09 17:58:47.544421 worker - [tap] (conn #1) dnstap processor - worker[kafkaproducer] buffer is full, 10279 dnsmessage(s) dropped
2024/09/09 17:58:37.543725 worker - [tap] (conn #1) dnstap processor - worker[kafkaproducer] buffer is full, 9123 dnsmessage(s) dropped
2024/09/09 17:58:27.543366 worker - [tap] (conn #1) dnstap processor - worker[kafkaproducer] buffer is full, 5072 dnsmessage(s) dropped
2024/09/09 17:31:27.456113 worker - [tap] dnstap - conn #1 - receiver framestream initialized
2024/09/09 17:31:27.455869 worker - [tap] (conn #1) transformers applied: [normalize:qname-lowercase]
2024/09/09 17:31:27.455773 worker - [tap] (conn #1) dnstap processor - starting data collection
2024/09/09 17:31:27.455736 worker - [tap] (conn #1) dnstap processor - starting monitoring - refresh every 10s
2024/09/09 17:31:27.455545 worker - [tap] (conn #1) dnstap processor - enabled
2024/09/09 17:31:27.455527 worker - [tap] dnstap - conn #1 - new connection from @ (@)
2024/09/09 17:31:27.445308 worker - [kafkaproducer] kafka - connected with success
2024/09/09 17:31:27.397325 worker - [prometheus] prometheus - is listening on :9998
2024/09/09 17:31:27.397284 worker - [prometheus] prometheus - starting http server...
2024/09/09 17:31:27.397272 worker - [prometheus] prometheus - logging has started
2024/09/09 17:31:27.397199 worker - [kafkaproducer] kafka - connecting to kafka=host.fqdn:9999 partition=all topic=topic.>
2024/09/09 17:31:27.397176 worker - [prometheus] prometheus - starting data collection
2024/09/09 17:31:27.397159 worker - [kafkaproducer] kafka - logging has started
2024/09/09 17:31:27.397122 worker - [tap] dnstap - listening on /var/run/dnstap.sock
2024/09/09 17:31:27.397059 worker - [kafkaproducer] kafka - starting data collection
2024/09/09 17:31:27.397042 worker - [tap] dnstap - starting data collection
2024/09/09 17:31:27.396841 worker - [tap] dnstap - starting monitoring - refresh every 10s
2024/09/09 17:31:27.396818 worker - [kafkaproducer] kafka - starting monitoring - refresh every 10s
2024/09/09 17:31:27.395413 main - routing: collector[tap] send to logger[kafkaproducer]
2024/09/09 17:31:27.395382 main - routing: collector[tap] send to logger[prometheus]
2024/09/09 17:31:27.380180 worker - [tap] dnstap - enabled
2024/09/09 17:31:27.379516 main - loading collectors...
2024/09/09 17:31:27.333355 worker - [kafkaproducer] kafka - enabled
2024/09/09 17:31:27.332822 worker - [prometheus] prometheus - starting monitoring - refresh every 10s
2024/09/09 17:31:27.331285 worker - [prometheus] prometheus - enabled
2024/09/09 17:31:27.330633 main - loading loggers...
2024/09/09 17:31:27.330630 main - The multiplexer mode is deprecated. Please switch to the pipelines mode.
2024/09/09 17:31:27.330627 main - running in multiplexer mode
2024/09/09 17:31:27.330566 main - telemetry enabled on local address: :9997
2024/09/09 17:31:27.330507 main - version=1.0.0-beta1 revision=f961d61d17e5207bf1755ae62e7a7d202baa2473
```

@dmachard (Owner) commented Sep 9, 2024

Thanks for reporting that.
I will try to reproduce it on my side.

@uralmetal commented

@dmachard, as an alternative I suggest using the solution in the kafka-go library. It already provides a ready-made RoundRobin balancer:
https://github.com/segmentio/kafka-go/blob/main/balancer.go#L41
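
A minimal sketch of how that balancer is wired into a kafka-go Writer (the broker address, topic, and payload are placeholders taken from the config above, not dnscollector code):

```go
package main

import (
	"context"

	kafka "github.com/segmentio/kafka-go"
)

func main() {
	// Writer with the built-in RoundRobin balancer: kafka-go chooses the
	// partition per message, cycling through all partitions of the topic.
	w := &kafka.Writer{
		Addr:     kafka.TCP("host.fqdn:9999"), // placeholder broker address
		Topic:    "topic",
		Balancer: &kafka.RoundRobin{},
	}
	defer w.Close()

	// Example payload; dnscollector would send the JSON-encoded DNS message.
	err := w.WriteMessages(context.Background(),
		kafka.Message{Value: []byte(`{"qname":"example.com."}`)},
	)
	if err != nil {
		panic(err)
	}
}
```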

@dmachard (Owner) commented

The current implementation logic is not good because the round robin is performed on each packet to be sent.
I will take a look at using the implementation from the kafka-go library.
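
For contrast, a hedged sketch of the two patterns (illustrative only, not the actual dnscollector implementation; both helper names are hypothetical): writing per packet pays the balance/flush overhead on every message, while flushing a buffered batch in a single call lets the balancer distribute the whole batch in memory.

```go
package kafkasketch

import (
	"context"

	kafka "github.com/segmentio/kafka-go"
)

// writePerPacket (hypothetical) is the costly pattern: one WriteMessages
// call per DNS message, so a synchronous writer blocks until each single
// message's batch is flushed.
func writePerPacket(ctx context.Context, w *kafka.Writer, msgs []kafka.Message) error {
	for _, m := range msgs {
		if err := w.WriteMessages(ctx, m); err != nil {
			return err
		}
	}
	return nil
}

// flushBatch (hypothetical) is the cheaper pattern: accumulate up to
// buffer-size messages and hand the whole batch to the writer at once.
func flushBatch(ctx context.Context, w *kafka.Writer, batch []kafka.Message) error {
	return w.WriteMessages(ctx, batch...)
}
```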

@dmachard (Owner) commented

A new beta release is available; any feedback would be appreciated.

@dmachard (Owner) commented

Fixed since 1.0.0. If you encounter any further issues, please feel free to open a new ticket. Thank you for your feedback!
