Kafka Exporter does not show all Consumer Groups #5341
-
If you properly format the YAML, it will be readable and someone can verify whether you have it correct. As posted, nobody can read it. You should also check whether the metrics are actually in Prometheus.
-
Hi scholzj, I just attached the cluster.yaml in a .zip file for anyone to review. The metrics must be reaching Prometheus, because otherwise I would not even be seeing Strimzi's default consumer group __strimzi-topic-operator-kstreams. If that is not the case, how could I validate this?
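One way to validate this (an illustrative sketch, not taken from the thread): open Status → Targets in the Prometheus UI and check that the cluster's kafka-exporter pod is being scraped (the Strimzi Kafka Exporter serves metrics on port 9404). When Prometheus is deployed via the Prometheus Operator, a PodMonitor along these lines is needed; the namespace and the tcp-prometheus port name below are assumptions borrowed from Strimzi's example PodMonitor and should be checked against the actual pod spec:

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kafka-resources-metrics
  labels:
    app: strimzi
spec:
  selector:
    matchLabels:
      strimzi.io/kind: Kafka      # matches the pods created for the Kafka CR, including the kafka-exporter pod
  namespaceSelector:
    matchNames:
      - kafka                     # assumed namespace; adjust to where the Kafka CR is deployed
  podMetricsEndpoints:
    - path: /metrics
      port: tcp-prometheus        # metrics port name; verify against the kafka-exporter pod spec

If the target is present and healthy, querying kafka_consumergroup_current_offset in the Prometheus UI shows which consumer groups the exporter is actually reporting.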
-
Good morning scholzj, I have access to Prometheus, and if I query the metric kafka_consumergroup_current_offset that is used in the Grafana dashboards, I only see this one result. In many cases the consumer groups are created automatically, because Databricks is used to connect and subscribe to Kafka topics. I am attaching a view from AKHQ of some automatically created consumer groups that have already consumed messages and are in the Empty state, because after a while they are deleted automatically. I am also attaching another screenshot showing an active consumer group that does not appear in the Prometheus metrics. These consumer groups stay active only for a short time, but even if I connect with kafka-console-consumer and consume a topic, leaving the consumer group active, I do not see the metrics.
-
Hello scholzj, it is as you indicate: if there are no members, the state is "Empty", so the group is not active. As I said, if I connect from a server with kafka-avro-consumer and leave it consuming a topic in the console, I can see that the consumer group is active, but I do not see it in the Kafka Exporter metrics. This screenshot shows a consumer group called "console-consumer-88980" that was created with kafka-avro-consumer from a server, and it is active with 1 member. I can leave this consumer active all the time, but I cannot see it in the Grafana dashboard, and I do not see it in the kafka_consumergroup_current_offset metric in Prometheus:
-
Could you tell me how you are validating this in Prometheus and how you are seeing it in the Grafana dashboard?
-
In Prometheus I see the following if I run the query kafka_consumergroup_current_offset. So you are saying that if you run this query in Prometheus you see all the consumer groups? I do not know how the Kafka Exporter is collecting the consumer groups; it could be that the Kafka Exporter only reports data for groups that are returned by the Kafka AdminClient. If I look at another metric in Prometheus, for example kafka_consumergroup_members, I do see other consumer groups.
-
I am having the same issue: consumer groups from a … Do you know of any metadata that needs to be included for a consumer group to show up in the metrics, @scholzj? I will also open tickets in the related connectors and see if someone else has the issue; if I find a solution I will post it here as well.
-
Hello, I have a problem: the Kafka Exporter does not show all consumer groups. It only shows the __strimzi-topic-operator-kstreams consumer group in the Grafana dashboard (examples/metrics/grafana-dashboards/strimzi-kafka-exporter.json).
This is the infrastructure, installed on Azure AKS:
This is my cluster.yaml file, used to deploy the Strimzi cluster with the Strimzi operator.
The kafkaExporter config, with groupRegex: ".*" and topicRegex: ".*" (the full cluster.yaml follows below):
kafkaExporter:
  groupRegex: ".*"
  topicRegex: ".*"
  resources:
    requests:
      cpu: 300m
      memory: 128Mi
    limits:
      cpu: 600m
      memory: 256Mi
  logging: debug
  enableSaramaLogging: true
  readinessProbe:
    initialDelaySeconds: 15
    timeoutSeconds: 5
  livenessProbe:
    initialDelaySeconds: 15
    timeoutSeconds: 5
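As a hypothetical narrowing test (not part of the original post), pointing groupRegex at a single known-active group can rule the filter in or out: if an explicitly matched group still does not show up, the regex is not the cause. A minimal sketch, with a made-up console-consumer pattern:

kafkaExporter:
  # hypothetical test filter: match only console consumer groups
  groupRegex: "console-consumer-.*"
  topicRegex: ".*"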
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: kafka-cluster
spec:
  kafka:
    version: 2.8.0
    replicas: 3
    resources:
      requests:
        memory: 4G
        cpu: "1"
      limits:
        memory: 10G
        cpu: "2"
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: external
        port: 9094
        type: loadbalancer
        tls: false
        configuration:
          bootstrap:
            annotations:
              service.beta.kubernetes.io/azure-load-balancer-internal: "true"
          brokers:
            - broker: 0
              annotations:
                service.beta.kubernetes.io/azure-load-balancer-internal: "true"
            - broker: 1
              annotations:
                service.beta.kubernetes.io/azure-load-balancer-internal: "true"
            - broker: 2
              annotations:
                service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    readinessProbe:
      initialDelaySeconds: 15
      timeoutSeconds: 5
    livenessProbe:
      initialDelaySeconds: 15
      timeoutSeconds: 5
    config:
      auto.create.topics.enable: "true"
      delete.topic.enable: "true"
      offsets.topic.replication.factor: 3
      transaction.state.log.replication.factor: 3
      transaction.state.log.min.isr: 2
      log.message.format.version: "2.8"
      inter.broker.protocol.version: "2.8"
      default.replication.factor: 3
      min.insync.replicas: 2
      num.partitions: 3
    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 250Gi
          class: managed-standardssd-retain-sc
          deleteClaim: false
        - id: 2
          type: persistent-claim
          size: 250Gi
          class: managed-standardssd-retain-sc
          deleteClaim: false
        - id: 3
          type: persistent-claim
          size: 250Gi
          class: managed-standardssd-retain-sc
          deleteClaim: false
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          name: kafka-metrics
          key: kafka-metrics-config.yml
    logging:
      type: inline
      loggers:
        kafka.root.logger.level: "INFO"
  zookeeper:
    replicas: 3
    resources:
      requests:
        memory: 500M
      limits:
        memory: 1G
        cpu: "1"
    storage:
      type: persistent-claim
      size: 100G
      class: managed-standardssd-retain-sc
      deleteClaim: false
    metricsConfig:
      type: jmxPrometheusExporter
      valueFrom:
        configMapKeyRef:
          name: kafka-metrics
          key: zookeeper-metrics-config.yml
    logging:
      type: inline
      loggers:
        zookeeper.root.logger: "INFO"
  kafkaExporter:
    groupRegex: ".*"
    topicRegex: ".*"
    resources:
      requests:
        cpu: 300m
        memory: 128Mi
      limits:
        cpu: 600m
        memory: 256Mi
    logging: debug
    enableSaramaLogging: true
    readinessProbe:
      initialDelaySeconds: 15
      timeoutSeconds: 5
    livenessProbe:
      initialDelaySeconds: 15
      timeoutSeconds: 5
  entityOperator:
    userOperator: {}
    topicOperator: {}
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kafka-metrics
  labels:
    app: strimzi
data:
  kafka-metrics-config.yml: |
    # See https://github.com/prometheus/jmx_exporter for more info about JMX Prometheus Exporter metrics
    lowercaseOutputName: true
    rules:
    # Special cases and very specific rules
    - pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), topic=(.+), partition=(.*)><>Value
      name: kafka_server_$1_$2
      type: GAUGE
      labels:
        clientId: "$3"
        topic: "$4"
        partition: "$5"
    - pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), brokerHost=(.+), brokerPort=(.+)><>Value
      name: kafka_server_$1_$2
      type: GAUGE
      labels:
        clientId: "$3"
        broker: "$4:$5"
    - pattern: kafka.server<type=(.+), cipher=(.+), protocol=(.+), listener=(.+), networkProcessor=(.+)><>connections
      name: kafka_server_$1_connections_tls_info
      type: GAUGE
      labels:
        cipher: "$2"
        protocol: "$3"
        listener: "$4"
        networkProcessor: "$5"
    - pattern: kafka.server<type=(.+), clientSoftwareName=(.+), clientSoftwareVersion=(.+), listener=(.+), networkProcessor=(.+)><>connections
      name: kafka_server_$1_connections_software
      type: GAUGE
      labels:
        clientSoftwareName: "$2"
        clientSoftwareVersion: "$3"
        listener: "$4"
        networkProcessor: "$5"
    - pattern: "kafka.server<type=(.+), listener=(.+), networkProcessor=(.+)><>(.+):"
      name: kafka_server_$1_$4
      type: GAUGE
      labels:
        listener: "$2"
        networkProcessor: "$3"
    - pattern: kafka.server<type=(.+), listener=(.+), networkProcessor=(.+)><>(.+)
      name: kafka_server_$1_$4
      type: GAUGE
      labels:
        listener: "$2"
        networkProcessor: "$3"
    # Some percent metrics use MeanRate attribute
    # Ex) kafka.server<type=(KafkaRequestHandlerPool), name=(RequestHandlerAvgIdlePercent)><>MeanRate
    - pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*><>MeanRate
      name: kafka_$1_$2_$3_percent
      type: GAUGE
    # Generic gauges for percents
    - pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*><>Value
      name: kafka_$1_$2_$3_percent
      type: GAUGE
    - pattern: kafka.(\w+)<type=(.+), name=(.+)Percent\w*, (.+)=(.+)><>Value
      name: kafka_$1_$2_$3_percent
      type: GAUGE
      labels:
        "$4": "$5"
    # Generic per-second counters with 0-2 key/value pairs
    - pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+), (.+)=(.+)><>Count
      name: kafka_$1_$2_$3_total
      type: COUNTER
      labels:
        "$4": "$5"
        "$6": "$7"
    - pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+)><>Count
      name: kafka_$1_$2_$3_total
      type: COUNTER
      labels:
        "$4": "$5"
    - pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*><>Count
      name: kafka_$1_$2_$3_total
      type: COUNTER
    # Generic gauges with 0-2 key/value pairs
    - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>Value
      name: kafka_$1_$2_$3
      type: GAUGE
      labels:
        "$4": "$5"
        "$6": "$7"
    - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>Value
      name: kafka_$1_$2_$3
      type: GAUGE
      labels:
        "$4": "$5"
    - pattern: kafka.(\w+)<type=(.+), name=(.+)><>Value
      name: kafka_$1_$2_$3
      type: GAUGE
    # Emulate Prometheus 'Summary' metrics for the exported 'Histogram's.
    # Note that these are missing the 'sum' metric!
    - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>Count
      name: kafka_$1_$2_$3_count
      type: COUNTER
      labels:
        "$4": "$5"
        "$6": "$7"
    - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+), (.+)=(.+)><>(\d+)thPercentile
      name: kafka_$1_$2_$3
      type: GAUGE
      labels:
        "$4": "$5"
        "$6": "$7"
        quantile: "0.$8"
    - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>Count
      name: kafka_$1_$2_$3_count
      type: COUNTER
      labels:
        "$4": "$5"
    - pattern: kafka.(\w+)<type=(.+), name=(.+), (.+)=(.+)><>(\d+)thPercentile
      name: kafka_$1_$2_$3
      type: GAUGE
      labels:
        "$4": "$5"
        quantile: "0.$6"
    - pattern: kafka.(\w+)<type=(.+), name=(.+)><>Count
      name: kafka_$1_$2_$3_count
      type: COUNTER
    - pattern: kafka.(\w+)<type=(.+), name=(.+)><>(\d+)thPercentile
      name: kafka_$1_$2_$3
      type: GAUGE
      labels:
        quantile: "0.$4"
  zookeeper-metrics-config.yml: |
    # See https://github.com/prometheus/jmx_exporter for more info about JMX Prometheus Exporter metrics
    lowercaseOutputName: true
    rules:
    # replicated Zookeeper
    - pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+)><>(\\w+)"
      name: "zookeeper_$2"
      type: GAUGE
    - pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+)><>(\\w+)"
      name: "zookeeper_$3"
      type: GAUGE
      labels:
        replicaId: "$2"
    - pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+), name2=(\\w+)><>(Packets\\w+)"
      name: "zookeeper_$4"
      type: COUNTER
      labels:
        replicaId: "$2"
        memberType: "$3"
    - pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+), name2=(\\w+)><>(\\w+)"
      name: "zookeeper_$4"
      type: GAUGE
      labels:
        replicaId: "$2"
        memberType: "$3"
    - pattern: "org.apache.ZooKeeperService<name0=ReplicatedServer_id(\\d+), name1=replica.(\\d+), name2=(\\w+), name3=(\\w+)><>(\\w+)"
      name: "zookeeper_$4_$5"
      type: GAUGE
      labels:
        replicaId: "$2"
        memberType: "$3"
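For reference, the kafka_consumergroup_* metrics discussed in this thread come from the Kafka Exporter component rather than from these JMX Prometheus Exporter rules; the rules above only translate broker and ZooKeeper MBeans into Prometheus metrics. As an illustration of how one of the rules works (an annotated copy, using a hypothetical topic name, not an addition to the ConfigMap):

# Illustration only: how the generic per-second counter rule above maps a broker MBean.
#
#   JMX MBean / attribute (hypothetical topic name "my-topic"):
#     kafka.server<type=BrokerTopicMetrics, name=MessagesInPerSec, topic=my-topic><>Count
#
#   matched by:
- pattern: kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, (.+)=(.+)><>Count
  name: kafka_$1_$2_$3_total
  type: COUNTER
  labels:
    "$4": "$5"
#
#   exported (with lowercaseOutputName: true) as:
#     kafka_server_brokertopicmetrics_messagesin_total{topic="my-topic"}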