Error when I try to consume from confluent cloud kafka broker #7985

Closed
AlberTadrous opened this issue Aug 14, 2020 · 11 comments
Labels
area/kafka, bug (unexpected problem or unintended behavior), cloud (issues or requests around cloud environments)

Comments

@AlberTadrous

AlberTadrous commented Aug 14, 2020

Relevant telegraf.conf:

[[inputs.kafka_consumer]]
brokers = ["cluster.eastus2.azure.confluent.cloud:9092"]
topics = ["test-topic"]
version = "1.1.0"

sasl_username = "Key"
sasl_password = "Secret"
sasl_version = 1
consumer_group = "telegraf"
offset = "oldest"

max_message_len = 1000000

System info:

Telegraf version: 1.15
Operating system: CentOS 8

Docker

Steps to reproduce:

  1. Run Telegraf with the command "telegraf --config telegraf.conf"

Expected behavior:

Telegraf connects to the Kafka cluster.

Actual behavior:

Telegraf exits with the following error:
[agent] Config: Interval:1s, Quiet:false, Hostname:"", Flush Interval:10s
2020-08-14T14:57:11Z D! [agent] Initializing plugins
2020-08-14T14:57:11Z D! [agent] Connecting outputs
2020-08-14T14:57:11Z D! [agent] Attempting connection to [outputs.file]
2020-08-14T14:57:11Z D! [agent] Successfully connected to outputs.file
2020-08-14T14:57:11Z D! [agent] Starting service inputs
2020-08-14T14:57:11Z D! [sarama]
2020-08-14T14:57:11Z D! [sarama] client/metadata fetching metadata for all topics from broker cluster.eastus2.azure.confluent.cloud:9092
2020-08-14T14:57:11Z D! [sarama] Failed to read SASL handshake header : unexpected EOF
2020-08-14T14:57:11Z D! [sarama] Error while performing SASL handshake cluster.eastus2.azure.confluent.cloud:9092
2020-08-14T14:57:11Z D! [sarama] Closed connection to broker cluster.eastus2.azure.confluent.cloud:9092
2020-08-14T14:57:11Z D! [sarama] client/metadata got error from broker -1 while fetching metadata: unexpected EOF
2020-08-14T14:57:11Z D! [sarama]
2020-08-14T14:57:11Z D! [sarama] client/brokers resurrecting 1 dead seed brokers
2020-08-14T14:57:11Z D! [sarama] client/metadata retrying after 250ms... (3 attempts remaining)
2020-08-14T14:57:11Z D! [sarama] client/metadata fetching metadata for all topics from broker cluster.eastus2.azure.confluent.cloud:9092
2020-08-14T14:57:11Z D! [sarama] Failed to read SASL handshake header : unexpected EOF
2020-08-14T14:57:11Z D! [sarama] Error while performing SASL handshake cluster.eastus2.azure.confluent.cloud:9092
2020-08-14T14:57:11Z D! [sarama] Closed connection to broker cluster.eastus2.azure.confluent.cloud:9092
2020-08-14T14:57:11Z D! [sarama] client/metadata got error from broker -1 while fetching metadata: unexpected EOF
2020-08-14T14:57:11Z D! [sarama]
2020-08-14T14:57:11Z D! [sarama] client/brokers resurrecting 1 dead seed brokers
2020-08-14T14:57:11Z D! [sarama] client/metadata retrying after 250ms... (2 attempts remaining)
2020-08-14T14:57:11Z D! [sarama] client/metadata fetching metadata for all topics from broker cluster.eastus2.azure.confluent.cloud:9092
2020-08-14T14:57:11Z D! [sarama] Failed to read SASL handshake header : unexpected EOF
2020-08-14T14:57:11Z D! [sarama] Error while performing SASL handshake cluster.eastus2.azure.confluent.cloud:9092
2020-08-14T14:57:11Z D! [sarama] Closed connection to broker cluster.eastus2.azure.confluent.cloud:9092
2020-08-14T14:57:11Z D! [sarama] client/metadata got error from broker -1 while fetching metadata: unexpected EOF
2020-08-14T14:57:11Z D! [sarama]
2020-08-14T14:57:11Z D! [sarama] client/brokers resurrecting 1 dead seed brokers
2020-08-14T14:57:11Z D! [sarama] client/metadata retrying after 250ms... (1 attempts remaining)
2020-08-14T14:57:11Z D! [sarama] client/metadata fetching metadata for all topics from broker cluster.eastus2.azure.confluent.cloud:9092
2020-08-14T14:57:11Z D! [sarama] Failed to read SASL handshake header : unexpected EOF
2020-08-14T14:57:11Z D! [sarama] Error while performing SASL handshake cluster.eastus2.azure.confluent.cloud:9092
2020-08-14T14:57:11Z D! [sarama] Closed connection to broker cluster.eastus2.azure.confluent.cloud:9092
2020-08-14T14:57:11Z D! [sarama] client/metadata got error from broker -1 while fetching metadata: unexpected EOF
2020-08-14T14:57:11Z D! [sarama]
2020-08-14T14:57:11Z D! [sarama] client/brokers resurrecting 1 dead seed brokers
2020-08-14T14:57:11Z D! [sarama]
2020-08-14T14:57:11Z E! [telegraf] Error running agent: starting input inputs.kafka_consumer: kafka: client has run out of available brokers to talk to (Is your cluster reachable?)

Additional info:

AlberTadrous added the bug label Aug 14, 2020
ssoroka added the cloud label Oct 30, 2020
@wtkwsk

wtkwsk commented Nov 26, 2020

I'm still experiencing the same issue. Apparently the sarama library has some issues with Confluent Cloud.

@ssoroka
Contributor

ssoroka commented Mar 12, 2021

Is this still an issue? Looks like it sat for a bit, sorry about that. If you know of a related Sarama issue that is causing your problem, feel free to link it here. If it's no longer an issue we'd love to hear what resolved it for you.

@sjwang90
Contributor

@wtkwsk @AlberTadrous Closing this issue. Feel free to re-open it if the issue persists and hasn't been resolved by sarama.

@wtkwsk

wtkwsk commented Mar 29, 2021

Thanks. For my part, I was able to resolve it after tinkering with the sarama settings for a bit. Getting the connection to work with sarama directly helped me understand the Telegraf config options better :)

@sjwang90
Contributor

sjwang90 commented Mar 30, 2021

@wtkwsk Would you mind sharing what you changed in the sarama settings, for others who may run into the same problem? We'd love to get anything useful into the documentation to help others.

FYI @gqianse

@wtkwsk

wtkwsk commented Apr 6, 2021

@sjwang90 I might be able to find my working telegraf config later, but here is my working sarama client config:

	// Sarama client config for Confluent Cloud: SASL/PLAIN over TLS.
	// Needs the "crypto/tls" and "os" imports plus the sarama package.
	config := sarama.NewConfig()
	config.ClientID = "sarama-cc-test"
	config.Producer.RequiredAcks = sarama.WaitForAll // Wait for all in-sync replicas to ack the message
	config.Producer.Retry.Max = 10                   // Retry up to 10 times to produce the message
	config.Producer.Return.Successes = true
	config.Net.SASL.Enable = true
	config.Net.SASL.Handshake = true
	config.Net.SASL.Mechanism = "PLAIN"
	config.Net.SASL.User = os.Getenv("SASL_USER")   // Confluent Cloud API key
	config.Net.SASL.Password = os.Getenv("SASL_PW") // Confluent Cloud API secret
	config.Net.TLS.Enable = true
	tlsConfig := &tls.Config{
		InsecureSkipVerify: true, // Skips certificate verification; insecure
		ClientAuth:         0,    // Zero value (tls.NoClientCert); only used by TLS servers
	}
	config.Net.TLS.Config = tlsConfig

This works with the most recent paid Confluent Cloud version.

@wtkwsk

wtkwsk commented Apr 7, 2021

@sjwang90 @AlberTadrous
This is the telegraf config that got it working for me on telegraf 1.18 and the most recent Confluent Cloud:

[[inputs.kafka_consumer]]
  ## Kafka brokers
  brokers = ["$KAFKA_BROKER"]
  ## Topic(s) to consume
  topics = ["$KAFKA_TOPIC"]

  ## Optional client ID
  client_id = "cc-telegraf-influx"

  ## Various other settings
  enable_tls = true
  insecure_skip_verify = true
  # sasl_version = 0
  # version = "2.0.0"

  ## Optional SASL config
  sasl_username = "$CONFLUENT_API_KEY"
  sasl_password = "$CONFLUENT_API_SECRET"
  sasl_mechanism = "PLAIN"

@sjwang90
Contributor

sjwang90 commented Apr 7, 2021

Thank you @wtkwsk!

@wrossmorrow

wrossmorrow commented Apr 16, 2021

I'm also having this issue in GCP with ccloud, running

$ telegraf --config /path/to/telegraf.conf

with

...
[[inputs.kafka_consumer]]
  brokers = ["***.gcp.confluent.cloud:9092"]
  topics = ["test"]
  sasl_mechanism = "PLAIN"
  sasl_username = "***"
  sasl_password = "***"
  client_id = "telegraf"
  consumer_group = "test-influxcloud-ingest"
  ...

results in

2021-04-16T02:21:28Z I! Starting Telegraf 1.18.1
2021-04-16T02:21:28Z I! Loaded inputs: kafka_consumer
2021-04-16T02:21:28Z I! Loaded aggregators: 
2021-04-16T02:21:28Z I! Loaded processors: 
2021-04-16T02:21:28Z I! Loaded outputs: influxdb_v2
2021-04-16T02:21:28Z I! Tags enabled: host=C02F32KSMD6Q
2021-04-16T02:21:28Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"C02F32KSMD6Q", Flush Interval:10s
2021-04-16T02:21:29Z E! [telegraf] Error running agent: starting input inputs.kafka_consumer: kafka: client has run out of available brokers to talk to (Is your cluster reachable?)

It's definitely reachable, I can ping it, run ccloud consumers, and run code with SASL_SSL configurations.

@sjwang90
Contributor

@wrossmorrow In a previous issue that also applies to Confluent Cloud, Azure Service Bus presented TLS certificates that don't match the domain name.

The workaround is to find the name the Confluent TLS certificate presents and then configure Telegraf to accept that name with the tls_server_name setting. While debugging, it can also help to set insecure_skip_verify to ignore the certificate temporarily. Because insecure_skip_verify reduces security, it shouldn't be used outside of debugging.

Your errors don't look identical to the existing error messages for these Confluent Cloud issues. If these workarounds don't work, please let us know and open a new issue with the pertinent Telegraf configuration and error messages.

Thanks!

@kamtoeddy

Quoting @wtkwsk's comment from Apr 6, 2021 with the working sarama client config (posted earlier in this thread).

Thanks, this just saved my day.
