istio-router-metrics is a Go program that acts as a Kafka consumer for Istio access logs. It reads Istio access log messages from a Kafka topic, processes them, and sends the relevant information to an InfluxDB instance.
How the main function works:
- Reads configuration parameters from environment variables.
- Sets up a TCP connection to InfluxDB.
- Creates a Kafka consumer group using the Sarama package.
- Launches a goroutine to consume Kafka messages and process them.
- Waits for termination signals (SIGINT or SIGTERM) and gracefully shuts down the program.
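The configuration and shutdown steps in this list can be sketched with the standard library alone. This is a minimal illustration, not the program's actual code: the KAFKA_BROKERS variable name and the getenv and splitBrokers helpers are hypothetical, and the Sarama consumer loop is replaced by a placeholder goroutine that only waits for cancellation.

```go
package main

import (
	"context"
	"log"
	"os"
	"os/signal"
	"strings"
	"sync"
	"syscall"
)

// getenv returns the value of key, or fallback when the variable is unset or empty.
func getenv(key, fallback string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return fallback
}

// splitBrokers turns a comma-separated broker list into a slice, dropping blanks.
func splitBrokers(s string) []string {
	var out []string
	for _, b := range strings.Split(s, ",") {
		if b = strings.TrimSpace(b); b != "" {
			out = append(out, b)
		}
	}
	return out
}

func main() {
	// KAFKA_BROKERS is a hypothetical variable name used for illustration.
	brokers := splitBrokers(getenv("KAFKA_BROKERS", "localhost:9092"))
	log.Println(brokers)

	// ctx is cancelled when SIGINT or SIGTERM arrives.
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer stop()

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		// Placeholder for the consume loop; the real program would drive
		// the Sarama consumer group here until ctx is cancelled.
		<-ctx.Done()
	}()

	<-ctx.Done()
	log.Println("terminating: signal received")
	wg.Wait() // let the consumer goroutine finish before exiting
}
```

Waiting on the WaitGroup after cancellation is what makes the shutdown graceful: the process exits only after the consumer goroutine has returned.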
Once the container image has been pushed to your preferred container registry and deployed to the cluster, check the akkeris-system namespace to confirm that it is running correctly.
The main function in the Go code logs a message once the consumer is ready:
<-consumer.ready // Await till the consumer has been set up
log.Println("Sarama consumer up and running!...")
If the container is working correctly, the logs of a healthy pod in Kubernetes look like this:
kubectl logs istio-router-metrics-<pod>-<id>
Space Blacklist: [istio-system]
istio-router-metrics-1121-1-2023-04-04T13:13:41.295167-06:00
[akkeris-kafka-cluster-1.example.com:9092 akkeris-kafka-cluster-2.example.com:9092 akkeris-kafka-cluster-3.example.com:9092]
2023/04/04 13:13:42 Sarama consumer up and running!...
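The last log line is printed only after the wait on consumer.ready returns. A minimal sketch of that readiness pattern follows, using only the standard library. The Consumer type, its Setup method, and the waitReady helper are assumptions made for illustration; the actual program may define them differently, and the Sarama session setup is simulated here with a short sleep.

```go
package main

import (
	"log"
	"time"
)

// Consumer mirrors the shape implied by consumer.ready in main;
// the field type is an assumption.
type Consumer struct {
	ready chan struct{}
}

// Setup stands in for the hook that runs once a consumer session starts.
// Closing the channel releases every goroutine blocked on <-c.ready.
func (c *Consumer) Setup() {
	close(c.ready)
}

// waitReady reports whether the consumer became ready before the timeout.
func waitReady(c *Consumer, timeout time.Duration) bool {
	select {
	case <-c.ready:
		return true
	case <-time.After(timeout):
		return false
	}
}

func main() {
	c := &Consumer{ready: make(chan struct{})}
	go func() {
		time.Sleep(50 * time.Millisecond) // simulated session setup
		c.Setup()
	}()
	if waitReady(c, 2*time.Second) {
		log.Println("Sarama consumer up and running!...")
	} else {
		log.Println("consumer not ready in time")
	}
}
```

If the "up and running" line never appears in the pod logs, the consumer most likely never finished joining the group, which usually points at broker connectivity or configuration.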
These curl commands query the metric collection system (presumably the InfluxDB instance, reached here through an OpenTSDB-style HTTP API on port 4242) for the metrics generated by the Go application. Use them to verify that the Sarama consumer is working as expected and producing the metrics needed for monitoring and observability.
Router Service Time Metrics:
curl -G -d "start=3h-ago" -d "m=sum:5m-avg:router.service.ms{fqdn=event-perf-prd.alamoapp.octanner.io}" http://10.84.25.51:4242/api/query | jq '.'
This command queries the router.service.ms metric for the specified FQDN (fully qualified domain name) over the last 3 hours. The data is downsampled to 5-minute averages (5m-avg) and the matching series are aggregated with sum. If the Sarama consumer is working correctly, the response contains data points for this metric.
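The same query URL can be assembled in Go when the check needs to run from code rather than from a shell. The sketch below is illustrative only: buildQueryURL is a hypothetical helper, and the host and FQDN in main are placeholders. Unlike curl -G -d, which appends the parameters as given, url.Values percent-encodes them, so the resulting URL looks different but carries the same parameter values.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildQueryURL assembles a /api/query URL from a base address, a relative
// start time such as "3h-ago", and a metric expression such as
// "sum:5m-avg:router.service.ms{fqdn=example.test}".
func buildQueryURL(base, start, metric string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/query"
	q := url.Values{}
	q.Set("start", start)
	q.Set("m", metric)
	u.RawQuery = q.Encode() // percent-encodes braces, colons, and equals signs
	return u.String(), nil
}

func main() {
	// Placeholder host and FQDN; substitute the values for your cluster.
	u, err := buildQueryURL("http://localhost:4242", "3h-ago",
		"sum:5m-avg:router.service.ms{fqdn=example.test}")
	if err != nil {
		fmt.Println("invalid base URL:", err)
		return
	}
	fmt.Println(u)
}
```

The returned string can be passed to http.Get, and the JSON response decoded in the same way jq displays it.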
Router HTTP Status Code Metrics:
curl -G -d "start=3h-ago" -d "m=sum:1m-sum:router.status.200{fqdn=event-perf-prd.alamoapp.octanner.io}" http://10.84.25.51:4242/api/query | jq '.'
This command queries the router.status.200 metric for the specified FQDN over the last 3 hours, summing the count of HTTP 200 responses into 1-minute buckets (1m-sum). It verifies that successful requests are being recorded. If the consumer is working, the response contains data points for this metric.
Lookup Metric Names for a Specific FQDN:
curl -G -d "m=*{fqdn=event-perf-prd.alamoapp.octanner.io}" http://10.84.25.51:4242/api/search/lookup | jq '.'
This command retrieves a list of metric names for the specified FQDN. It's a useful way to explore available metrics. If your consumer is producing metrics, you should see them listed in the response.