Telegraf kafka consumer failed to authenticate with SASL Plain mechanism #9058

Closed
ghost opened this issue Mar 26, 2021 · 6 comments
Labels
area/kafka bug unexpected problem or unintended behavior

Comments

@ghost
ghost commented Mar 26, 2021

Telegraf was installed with the Helm chart and configured with the kafka_consumer input using the configuration below:
https://github.com/influxdata/helm-charts/tree/master/charts/telegraf

However, the pod failed to start with the Kafka authentication errors shown below. It would seem Telegraf does not respect the configured sasl_mechanism "PLAIN"; instead it uses "sarama", which may be causing the handshake failure.

Relevant telegraf.conf:

    - kafka_consumer:
        brokers: 
          - "(kafka-broker).azure.confluent.cloud:9092"
        topics: 
          - "telegraf"
        version: "2.0.0"
        sasl_username: "(username)"
        sasl_password: "(password)"
        sasl_mechanism: "PLAIN"
        consumer_group: "telegraf"
        offset: "newest"
        data_format: "influx"

[[inputs.kafka_consumer]]
brokers = ["(kafka-server).confluent.cloud:9092"]
topics = ["telegraf"]
version = "2.0.0"
sasl_username = "(username)"
sasl_password = "(password)"
consumer_group = "test"
offset = "oldest"
max_message_len = 1000000
data_format = "influx"
insecure_skip_verify = true

System info:

Docker

Steps to reproduce:

  1. ...
  2. ...

Expected behavior:

Actual behavior:

2021-03-26T16:47:46Z I! Starting Telegraf 1.17.3
2021-03-26T16:47:46Z I! Using config file: /etc/telegraf/telegraf.conf
2021-03-26T16:47:46Z I! Loaded inputs: internal kafka_consumer statsd
2021-03-26T16:47:46Z I! Loaded aggregators:
2021-03-26T16:47:46Z I! Loaded processors: enum
2021-03-26T16:47:46Z I! Loaded outputs: influxdb_v2
2021-03-26T16:47:46Z I! Tags enabled: host=telegraf-polling-service
2021-03-26T16:47:46Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"telegraf-polling-service", Flush Interval:10s
2021-03-26T16:47:46Z D! [agent] Initializing plugins
2021-03-26T16:47:46Z D! [agent] Connecting outputs
2021-03-26T16:47:46Z D! [agent] Attempting connection to [outputs.influxdb_v2]
2021-03-26T16:47:46Z D! [agent] Successfully connected to outputs.influxdb_v2
2021-03-26T16:47:46Z D! [agent] Starting service inputs
2021-03-26T16:47:46Z I! [inputs.statsd] UDP listening on "[::]:8125"
2021-03-26T16:47:46Z I! [inputs.statsd] Started the statsd service on ":8125"
2021-03-26T16:47:46Z D! [sarama] Initializing new client
2021-03-26T16:47:46Z D! [sarama] client/metadata fetching metadata for all topics from broker pkc-epwny.eastus.azure.confluent.cloud:9092
2021-03-26T16:47:46Z D! [sarama] Failed to read SASL handshake header : unexpected EOF
2021-03-26T16:47:46Z D! [sarama] Closed connection to broker pkc-epwny.eastus.azure.confluent.cloud:9092
2021-03-26T16:47:46Z D! [sarama] client/metadata got error from broker -1 while fetching metadata: unexpected EOF
2021-03-26T16:47:46Z D! [sarama] client/metadata no available broker to send metadata request to
2021-03-26T16:47:46Z D! [sarama] client/brokers resurrecting 1 dead seed brokers
2021-03-26T16:47:46Z D! [sarama] client/metadata retrying after 250ms... (3 attempts remaining)
2021-03-26T16:47:46Z D! [sarama] client/metadata fetching metadata for all topics from broker pkc-epwny.eastus.azure.confluent.cloud:9092
2021-03-26T16:47:46Z D! [sarama] Failed to read SASL handshake header : unexpected EOF
2021-03-26T16:47:46Z D! [sarama] Closed connection to broker pkc-epwny.eastus.azure.confluent.cloud:9092
2021-03-26T16:47:46Z D! [sarama] client/metadata got error from broker -1 while fetching metadata: unexpected EOF
2021-03-26T16:47:46Z D! [sarama] client/metadata no available broker to send metadata request to
2021-03-26T16:47:46Z D! [sarama] client/brokers resurrecting 1 dead seed brokers
2021-03-26T16:47:46Z D! [sarama] client/metadata retrying after 250ms... (2 attempts remaining)
2021-03-26T16:47:46Z D! [sarama] client/metadata fetching metadata for all topics from broker pkc-epwny.eastus.azure.confluent.cloud:9092
2021-03-26T16:47:46Z D! [sarama] Failed to read SASL handshake header : unexpected EOF
2021-03-26T16:47:46Z D! [sarama] Closed connection to broker pkc-epwny.eastus.azure.confluent.cloud:9092
2021-03-26T16:47:46Z D! [sarama] client/metadata got error from broker -1 while fetching metadata: unexpected EOF
2021-03-26T16:47:46Z D! [sarama] client/metadata no available broker to send metadata request to
2021-03-26T16:47:46Z D! [sarama] client/brokers resurrecting 1 dead seed brokers
2021-03-26T16:47:46Z D! [sarama] client/metadata retrying after 250ms... (1 attempts remaining)
2021-03-26T16:47:47Z D! [sarama] client/metadata fetching metadata for all topics from broker pkc-epwny.eastus.azure.confluent.cloud:9092
2021-03-26T16:47:47Z D! [sarama] Failed to read SASL handshake header : unexpected EOF
2021-03-26T16:47:47Z D! [sarama] Closed connection to broker pkc-epwny.eastus.azure.confluent.cloud:9092
2021-03-26T16:47:47Z D! [sarama] client/metadata got error from broker -1 while fetching metadata: unexpected EOF
2021-03-26T16:47:47Z D! [sarama] client/metadata no available broker to send metadata request to
2021-03-26T16:47:47Z D! [sarama] client/brokers resurrecting 1 dead seed brokers
2021-03-26T16:47:47Z D! [sarama] Closing Client
2021-03-26T16:47:47Z I! [inputs.statsd] Stopping the statsd service
2021-03-26T16:47:47Z I! [inputs.statsd] Stopped listener service on ":8125"
2021-03-26T16:47:47Z E! [telegraf] Error running agent: starting input inputs.kafka_consumer: kafka: client has run out of available brokers to talk to (Is your cluster reachable?)

Additional info:

@ghost ghost added the bug unexpected problem or unintended behavior label Mar 26, 2021
@sjwang90
Contributor

This looks to be an issue with kafka_consumer's connection to Azure Event Hubs, see #6342. You must set sasl_version = 0 and enable_tls = true in the plugin configuration.

Here is an example for Event Hub:

[[inputs.kafka_consumer]]
  brokers = ["myeventhub.servicebus.windows.net:9093"]
  version = "1.0.0"
  topics = ["mytopic"]
  enable_tls = true
  sasl_username = "$ConnectionString"
  sasl_password = "Endpoint=sb://myeventhub.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=00000000000000000000000000/11111111111111111"
  sasl_version = 0

@ghost
Author

ghost commented Mar 29, 2021

Thanks for the suggestions.
Regarding the config:
enable_tls = true
Could you please advise which setting this maps to in the Helm chart configuration, as described in the documentation?
https://github.com/influxdata/telegraf/tree/master/plugins/inputs/kafka_consumer
Also, we are actually using a Confluent Cloud Kafka broker as the data source for Telegraf, not Azure Event Hubs.

@ghost
Author

ghost commented Mar 30, 2021

I have tried the following Telegraf Helm chart config:

    - kafka_consumer:
        brokers:
          - ".eastus.azure.confluent.cloud:9092"
        topics:
          - "telegraf"
        version: "1.0.0"
        sasl_username: ""
        sasl_password: ""
        consumer_group: "telegraf_metrics_consumers"
        offset: "newest"
        data_format: "influx"
        sasl_version: 0
        enable_tls: true

But I am still getting the same error, and the pod is in a crash loop:

2021-03-30T14:33:53Z I! Starting Telegraf 1.17.3
2021-03-30T14:33:53Z I! Using config file: /etc/telegraf/telegraf.conf
2021-03-30T14:33:53Z I! Loaded inputs: internal kafka_consumer statsd
2021-03-30T14:33:53Z I! Loaded aggregators:
2021-03-30T14:33:53Z I! Loaded processors: enum
2021-03-30T14:33:53Z I! Loaded outputs: influxdb_v2
2021-03-30T14:33:53Z I! Tags enabled: host=telegraf-polling-service
2021-03-30T14:33:53Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"telegraf-polling-service", Flush Interval:10s
2021-03-30T14:33:53Z D! [agent] Initializing plugins
2021-03-30T14:33:53Z W! [kafka] enable_tls is deprecated, and the setting does nothing, you can safely remove it from the config
2021-03-30T14:33:53Z D! [agent] Connecting outputs
2021-03-30T14:33:53Z D! [agent] Attempting connection to [outputs.influxdb_v2]
2021-03-30T14:33:53Z D! [agent] Successfully connected to outputs.influxdb_v2
2021-03-30T14:33:53Z D! [agent] Starting service inputs
2021-03-30T14:33:53Z I! [inputs.statsd] UDP listening on "[::]:8125"
2021-03-30T14:33:53Z I! [inputs.statsd] Started the statsd service on ":8125"
2021-03-30T14:33:53Z D! [sarama] Initializing new client
2021-03-30T14:33:53Z D! [sarama] client/metadata fetching metadata for all topics from broker (kafka).eastus.azure.confluent.cloud:9092
2021-03-30T14:33:53Z D! [sarama] Failed to read SASL handshake header : unexpected EOF
2021-03-30T14:33:53Z D! [sarama] Error while performing SASL handshake (kafka).eastus.azure.confluent.cloud:9092
2021-03-30T14:33:53Z D! [sarama] Closed connection to broker (kafka).eastus.azure.confluent.cloud:9092
2021-03-30T14:33:53Z D! [sarama] client/metadata got error from broker -1 while fetching metadata: unexpected EOF
2021-03-30T14:33:53Z D! [sarama] client/metadata no available broker to send metadata request to
2021-03-30T14:33:53Z D! [sarama] client/brokers resurrecting 1 dead seed brokers
2021-03-30T14:33:53Z D! [sarama] client/metadata retrying after 250ms... (3 attempts remaining)
2021-03-30T14:33:53Z D! [sarama] client/metadata fetching metadata for all topics from broker (kafka).eastus.azure.confluent.cloud:9092
2021-03-30T14:33:53Z D! [sarama] Failed to read SASL handshake header : unexpected EOF
2021-03-30T14:33:53Z D! [sarama] Error while performing SASL handshake (kafka).eastus.azure.confluent.cloud:9092
2021-03-30T14:33:53Z D! [sarama] Closed connection to broker (kafka).eastus.azure.confluent.cloud:9092
2021-03-30T14:33:53Z D! [sarama] client/metadata got error from broker -1 while fetching metadata: unexpected EOF
2021-03-30T14:33:53Z D! [sarama] client/metadata no available broker to send metadata request to
2021-03-30T14:33:53Z D! [sarama] client/brokers resurrecting 1 dead seed brokers
2021-03-30T14:33:53Z D! [sarama] client/metadata retrying after 250ms... (2 attempts remaining)
2021-03-30T14:33:53Z D! [sarama] client/metadata fetching metadata for all topics from broker (kafka).eastus.azure.confluent.cloud:9092
2021-03-30T14:33:53Z D! [sarama] Failed to read SASL handshake header : unexpected EOF
2021-03-30T14:33:53Z D! [sarama] Error while performing SASL handshake (kafka).eastus.azure.confluent.cloud:9092
2021-03-30T14:33:53Z D! [sarama] Closed connection to broker (kafka).eastus.azure.confluent.cloud:9092
2021-03-30T14:33:53Z D! [sarama] client/metadata got error from broker -1 while fetching metadata: unexpected EOF
2021-03-30T14:33:53Z D! [sarama] client/metadata no available broker to send metadata request to
2021-03-30T14:33:53Z D! [sarama] client/brokers resurrecting 1 dead seed brokers
2021-03-30T14:33:53Z D! [sarama] client/metadata retrying after 250ms... (1 attempts remaining)
2021-03-30T14:33:53Z D! [sarama] client/metadata fetching metadata for all topics from broker (kafka).eastus.azure.confluent.cloud:9092
2021-03-30T14:33:54Z D! [sarama] Failed to read SASL handshake header : unexpected EOF
2021-03-30T14:33:54Z D! [sarama] Error while performing SASL handshake (kafka).eastus.azure.confluent.cloud:9092
2021-03-30T14:33:54Z D! [sarama] Closed connection to broker (kafka).eastus.azure.confluent.cloud:9092
2021-03-30T14:33:54Z D! [sarama] client/metadata got error from broker -1 while fetching metadata: unexpected EOF
2021-03-30T14:33:54Z D! [sarama] client/metadata no available broker to send metadata request to
2021-03-30T14:33:54Z D! [sarama] client/brokers resurrecting 1 dead seed brokers
2021-03-30T14:33:54Z D! [sarama] Closing Client
2021-03-30T14:33:54Z I! [inputs.statsd] Stopping the statsd service
2021-03-30T14:33:54Z I! [inputs.statsd] Stopped listener service on ":8125"
2021-03-30T14:33:54Z E! [telegraf] Error running agent: starting input inputs.kafka_consumer: kafka: client has run out of available brokers to talk to (Is your cluster reachable?)

Could you please reopen this issue for advice? Thanks.

@sjwang90
Contributor

Didn't realize you were using Confluent Cloud. Your errors look similar to #7985. Let's follow up on that thread to determine a potential solution.

@sjwang90 sjwang90 reopened this Mar 30, 2021
@wrossmorrow

@sjwang90 I posted a comment on that thread yesterday; I may be having a similar issue.

@sjwang90
Contributor

Workaround for Confluent Cloud/Azure EventHub using inputs.kafka_consumer:

In a previous issue that also applies to Confluent Cloud, Azure Service Bus presented TLS certs that don't
The workaround is to find the name that the Confluent TLS cert presents and then configure Telegraf to accept that name with the tls_server_name setting. While debugging, it can also help to set insecure_skip_verify to ignore the cert temporarily. insecure_skip_verify reduces security, so it shouldn't be used outside of debugging.
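
As a rough illustration of the workaround described above, a telegraf.conf fragment might look like the sketch below. It has not been verified against Confluent Cloud. The broker address, credentials, and the tls_server_name value are placeholders, and it assumes the Telegraf version in use supports tls_server_name on this plugin.

```toml
# Sketch only: broker address, credentials, and tls_server_name are placeholders.
[[inputs.kafka_consumer]]
  brokers = ["(kafka-server).confluent.cloud:9092"]
  topics = ["telegraf"]
  version = "2.0.0"
  consumer_group = "telegraf"
  offset = "newest"
  data_format = "influx"

  sasl_username = "(username)"
  sasl_password = "(password)"
  sasl_mechanism = "PLAIN"

  ## Name presented by the broker's TLS certificate.
  ## Placeholder value: look up the actual name for your cluster.
  tls_server_name = "(name-from-broker-certificate)"

  ## Debugging only: skips certificate verification, which reduces security.
  # insecure_skip_verify = true
```

If the handshake succeeds only with insecure_skip_verify enabled, that suggests the certificate name is the problem, and tls_server_name is the setting to adjust before turning verification back on.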
