How to install:
- Add the following lines to HAProxy (or make sure they are already present) to enable statistics on the socket:
vi /etc/haproxy/haproxy.cfg
stats socket /var/lib/haproxy/stats mode 666 level admin
stats timeout 30s
Then reload HAProxy so the socket is created (see the sketch below).
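A minimal sketch of that reload step, assuming a systemd-managed service (adjust to your init system):

```
# Apply the new stats socket configuration (assumes systemd;
# use "service haproxy reload" or similar on non-systemd hosts)
sudo systemctl reload haproxy

# The socket defined above should now exist
ls -l /var/lib/haproxy/stats
```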
- Install socat and nc: yum install -y nc socat
- Make sure that the haproxy user can read from the socket:
echo "show info;show stat" | sudo -u haproxy socat stdio unix-connect:/var/lib/haproxy/stats
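The `show stat` command returns CSV, and individual fields can be pulled out of it with awk. This is only an illustration of the principle (not the shipped haproxy_stats.sh); the backend and server names are placeholders, and field 5 (scur, current sessions) is taken from the CSV header that `show stat` prints:

```
# Print the current session count (field 5 = scur) for one backend server.
# "my_backend" and "web01" are placeholders -- use names from your haproxy.cfg.
echo "show stat" | sudo -u haproxy socat stdio unix-connect:/var/lib/haproxy/stats \
  | awk -F, -v px="my_backend" -v sv="web01" '$1 == px && $2 == sv { print $5 }'
```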
- Copy the files:
a) userparameter_haproxy.conf to /etc/zabbix/zabbix_agentd.d/ (a sketch of what this file ties together follows below)
b) haproxy_discovery.sh to /etc/zabbix/scripts/
c) haproxy_stats.sh to /etc/zabbix/scripts/
Make the b) and c) scripts executable with chmod +x script_name
Note: make sure that /etc/zabbix/scripts/ exists; if not, create it: mkdir -p /etc/zabbix/scripts/
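The userparameter file is what maps the template's item keys to the two scripts. Use the shipped file; the sketch below is only a plausible layout, and the argument order passed to the scripts is an assumption:

```
# Hypothetical sketch of /etc/zabbix/zabbix_agentd.d/userparameter_haproxy.conf --
# check the file shipped with the template for the real argument order.

# $1 = stats socket, $2 = FRONT|BACK|SERVER selector for discovery
UserParameter=haproxy.list.discovery[*],/etc/zabbix/scripts/haproxy_discovery.sh $1 $2

# $1 = stats socket, $2 = proxy name, $3 = server name, $4 = stat field (e.g. scur, rtime)
UserParameter=haproxy.stats[*],/etc/zabbix/scripts/haproxy_stats.sh $1 $2 $3 $4
```

After copying the files, restart the Zabbix agent (e.g. systemctl restart zabbix-agent) so it picks up the new user parameters.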
- Add a host for HAProxy in Zabbix, link the template, and wait a while for data to come in (a quick manual check is sketched below).
(You can shorten the LLD discovery interval to get data faster, but change it back to the initial value afterwards.)
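Before waiting on discovery, you can test the keys by hand. A quick check, with placeholder host/backend/server names:

```
# On the HAProxy box, ask the local agent to evaluate a key directly
zabbix_agentd -t 'haproxy.list.discovery[/var/lib/haproxy/stats,FRONT]'
zabbix_agentd -t 'haproxy.stats[/var/lib/haproxy/stats,my_backend,web01,status]'

# From the Zabbix server/proxy (covers the passive discovery items)
zabbix_get -s haproxy-host.example.com -k 'haproxy.list.discovery[/var/lib/haproxy/stats,BACK]'
```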
This template is based on:
a) Solution by Anastas Dancha - https://github.com/anapsix/zabbix-haproxy
b) Official HAProxy template from Zabbix for Zabbix >= 4.4 - https://www.zabbix.com/integrations/haproxy
The reason I created this template was to have the official Zabbix template logic available on Zabbix versions below 4.4.
Files are available here:
a) https://cloud.mail.ru/public/D2M5%2F7ZEamjnVF
b) https://drive.google.com/open?id=16xoJyWut9R_EudcRyAf2Ui8WuPyTxw6D
Write to [email protected] if something is not clear
Have a nice day
Tudor Ticau
Name | Description | Default | Type |
---|---|---|---|
{$HAPROXY_CONFIG} | - | /etc/haproxy/haproxy.cfg | Text macro |
{$HAPROXY_SOCK} | - | /var/lib/haproxy/stats | Text macro |
There are no template links in this template.
Name | Description | Type | Key and additional info |
---|---|---|---|
HAProxy server discovery | - | Zabbix agent | haproxy.list.discovery[{$HAPROXY_SOCK},SERVER] Update: 1h |
HAProxy backend discovery | - | Zabbix agent | haproxy.list.discovery[{$HAPROXY_SOCK},BACK] Update: 1h |
HAProxy frontend discovery | - | Zabbix agent | haproxy.list.discovery[{$HAPROXY_SOCK},FRONT] Update: 1d |
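The SERVER/BACK/FRONT parameter selects what the discovery script enumerates. If you want to see what Zabbix will receive, you can run the script by hand; the invocation below assumes the socket and the selector are passed as positional arguments (check the script itself), and the JSON shape in the comment is the standard pre-4.4 LLD format using the macro names this template relies on:

```
# Run the discovery script manually (argument order is an assumption -- see the script).
# It should print Zabbix LLD JSON roughly like:
#   {"data":[{"{#FRONTEND_NAME}":"http-in"},{"{#FRONTEND_NAME}":"stats"}]}
/etc/zabbix/scripts/haproxy_discovery.sh /var/lib/haproxy/stats FRONT
```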
Name | Description | Type | Key and additional info |
---|---|---|---|
HAProxy memory used | - | Zabbix agent | proc.mem[haproxy] Update: 300 |
HAProxy config file checksum ($1) | - | Zabbix agent | vfs.file.cksum[{$HAPROXY_CONFIG}] Update: 600 |
HAProxy number of running processes | - | Zabbix agent | proc.num[haproxy] Update: 60 |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Responses denied per second | Responses denied due to security concerns (ACL-restricted). In most cases denials will originate in the frontend (e.g., a user is attempting to access an unauthorized URL). However, sometimes a request may be benign, yet the corresponding response contains sensitive information. In that case, you would want to set up an ACL to deny the offending response. Backend responses that are denied due to ACL restrictions will emit a 502 error code. With properly configured access controls on the frontend, this metric should stay at or near zero. Denied responses and an increase in 5xx responses go hand in hand. If you are seeing a large number of 5xx responses, you should check your denied responses to shed some light on the increase in error codes. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},{#SERVER_NAME},dresp] Update: 60 LLD |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Connection errors per second | Number of requests that encountered an error attempting to connect to a backend server. Backend connection failures should be acted upon immediately. Unfortunately, the econ metric not only includes failed backend requests but additionally includes general backend errors, like a backend without an active frontend. Thankfully, correlating this metric with eresp and response codes from both frontend and backend servers will give a better idea of the causes of an increase in backend connection errors. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},{#SERVER_NAME},econ] Update: 60 LLD |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Response errors per second | Number of requests whose responses yielded an error. This represents the number of response errors generated by your backends. This includes errors caused by data transfers aborted by the servers as well as write errors on the client socket and failures due to ACLs. Combined with other error metrics, the backend error response rate helps diagnose the root cause of response errors. For example, an increase in both the backend error response rate and denied responses could indicate that clients are repeatedly attempting to access ACL-ed resources. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},{#SERVER_NAME},eresp] Update: 60 LLD |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Number of responses with codes 4xx per second | Number of HTTP client errors per second. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},{#SERVER_NAME},hrsp_4xx] Update: 60 LLD |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Number of responses with codes 5xx per second | Number of HTTP server errors per second. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},{#SERVER_NAME},hrsp_5xx] Update: 60 LLD |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Unassigned requests | Current number of requests unassigned in queue. The qcur metric tracks the current number of connections awaiting assignment to a backend server. If you have enabled cookies and the listed server is unavailable, connections will be queued until the queue timeout is reached. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},{#SERVER_NAME},qcur] Update: 60 LLD |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Time in queue | Average time spent in queue (in ms) for the last 1,024 requests. Minimizing time spent in the queue results in lower latency and an overall better client experience. Each use case can tolerate a certain amount of queue time, but in general you should aim to keep this value as low as possible. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},{#SERVER_NAME},qtime] Update: 60 LLD |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Response time | Average backend response time (in ms) for the last 1,024 requests. Tracking average response times is an effective way to measure the latency of your load-balancing setup. Generally speaking, response times in excess of 500 ms will lead to degradation of application performance and customer experience. Monitoring the average response time can give you the upper hand to respond to latency issues before your customers are substantially impacted. Keep in mind that this metric will be zero if you are not using HTTP (see #60). | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},{#SERVER_NAME},rtime] Update: 60 LLD |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Status | HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}] status: UP = 1, DOWN = 0 | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},{#SERVER_NAME},status] Update: 60 LLD |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Redispatched requests per second | Number of times a request was redispatched to a different backend. The redispatch rate metric tracks the number of times a client connection was unable to reach its original target and was subsequently sent to a different server. If a client holds a cookie referencing a backend server that is down, the default action is to respond to the client with a 502 status code. However, if option redispatch is enabled in haproxy.cfg, the request will be sent to any available backend server and the cookie will be ignored. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},{#SERVER_NAME},wredis] Update: 60 LLD |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Retried connections per second | Number of times a connection was retried. Some dropped or timed-out connections are to be expected when connecting to a backend server. The retry rate represents the number of times a connection to a backend server was retried. This metric is usually non-zero under normal operating conditions. Should you begin to see more retries than usual, it is likely that other metrics will also change, including econ and eresp. Tracking the retry rate in addition to the above two error metrics can shine some light on the true cause of an increase in errors. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},{#SERVER_NAME},wretr] Update: 60 LLD |
HAProxy Backend [{#BACKEND_NAME}] bytes in | HAProxy Backend bytes in | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},BACKEND,bin] Update: 60 LLD |
HAProxy Backend [{#BACKEND_NAME}] bytes out | HAProxy Backend bytes out | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},BACKEND,bout] Update: 60 LLD |
HAProxy Backend [{#BACKEND_NAME}]: Responses denied per second | Responses denied due to security concerns (ACL-restricted). In most cases denials will originate in the frontend (e.g., a user is attempting to access an unauthorized URL). However, sometimes a request may be benign, yet the corresponding response contains sensitive information. In that case, you would want to set up an ACL to deny the offending response. Backend responses that are denied due to ACL restrictions will emit a 502 error code. With properly configured access controls on the frontend, this metric should stay at or near zero. Denied responses and an increase in 5xx responses go hand in hand. If you are seeing a large number of 5xx responses, you should check your denied responses to shed some light on the increase in error codes. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},BACKEND,dresp] Update: 60 LLD |
HAProxy Backend [{#BACKEND_NAME}]: Connection errors per second | Number of requests that encountered an error attempting to connect to a backend server. Backend connection failures should be acted upon immediately. Unfortunately, the econ metric not only includes failed backend requests but additionally includes general backend errors, like a backend without an active frontend. Thankfully, correlating this metric with eresp and response codes from both frontend and backend servers will give a better idea of the causes of an increase in backend connection errors. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},BACKEND,econ] Update: 60 LLD |
HAProxy Backend [{#BACKEND_NAME}]: Response errors per second | Number of requests whose responses yielded an error. This represents the number of response errors generated by your backends. This includes errors caused by data transfers aborted by the servers as well as write errors on the client socket and failures due to ACLs. Combined with other error metrics, the backend error response rate helps diagnose the root cause of response errors. For example, an increase in both the backend error response rate and denied responses could indicate that clients are repeatedly attempting to access ACL-ed resources. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},BACKEND,eresp] Update: 60 LLD |
HAProxy Backend [{#BACKEND_NAME}]: Unassigned requests | Current number of requests unassigned in queue. The qcur metric tracks the current number of connections awaiting assignment to a backend server. If you have enabled cookies and the listed server is unavailable, connections will be queued until the queue timeout is reached. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},BACKEND,qcur] Update: 60 LLD |
HAProxy Backend [{#BACKEND_NAME}]: Time in queue | Average time spent in queue (in ms) for the last 1,024 requests. Minimizing time spent in the queue results in lower latency and an overall better client experience. Each use case can tolerate a certain amount of queue time, but in general you should aim to keep this value as low as possible. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},BACKEND,qtime] Update: 60 LLD |
HAProxy Backend [{#BACKEND_NAME}]: Response time | Average backend response time (in ms) for the last 1,024 requests. Tracking average response times is an effective way to measure the latency of your load-balancing setup. Generally speaking, response times in excess of 500 ms will lead to degradation of application performance and customer experience. Monitoring the average response time can give you the upper hand to respond to latency issues before your customers are substantially impacted. Keep in mind that this metric will be zero if you are not using HTTP (see #60). | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},BACKEND,rtime] Update: 60 LLD |
HAProxy Backend [{#BACKEND_NAME}]: Status | HAProxy Backend [{#BACKEND_NAME}] status: UP = 1, DOWN = 0 | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},BACKEND,status] Update: 60 LLD |
HAProxy Backend [{#BACKEND_NAME}]: Redispatched requests per second | Number of times a request was redispatched to a different backend. The redispatch rate metric tracks the number of times a client connection was unable to reach its original target and was subsequently sent to a different server. If a client holds a cookie referencing a backend server that is down, the default action is to respond to the client with a 502 status code. However, if option redispatch is enabled in haproxy.cfg, the request will be sent to any available backend server and the cookie will be ignored. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},BACKEND,wredis] Update: 60 LLD |
HAProxy Backend [{#BACKEND_NAME}]: Retried connections per second | Number of times a connection was retried. Some dropped or timed-out connections are to be expected when connecting to a backend server. The retry rate represents the number of times a connection to a backend server was retried. This metric is usually non-zero under normal operating conditions. Should you begin to see more retries than usual, it is likely that other metrics will also change, including econ and eresp. Tracking the retry rate in addition to the above two error metrics can shine some light on the true cause of an increase in errors. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#BACKEND_NAME},BACKEND,wretr] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Incoming traffic | Number of bits received by the frontend. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,bin] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Outgoing traffic | Number of bits sent by the frontend. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,bout] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Denied requests per second | Requests denied due to security concerns (ACL-restricted) per second. An increase in denied requests will subsequently cause an increase in 403 Forbidden codes. For tcp this is because of a matched tcp-request content rule; for http it is because of a matched http-request or tarpit rule. Correlating the two can help to discern the root cause of an increase in 4xx responses. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,dreq] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Request errors per second | HTTP request errors per second. Keeping an eye on peaks and drops is essential to ensure continuous service availability. In the event of a traffic spike, clients could see increases in latency or even denied connections. Some of the possible causes are: early termination from the client, before the request has been sent; read error from the client; client timeout; client closed connection; various bad requests from the client; request was tarpitted. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,ereq] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Number of responses with codes 1xx per second | Number of informational (1xx) HTTP responses per second. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,hrsp_1xx] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Number of responses with codes 2xx per second | Number of successful HTTP responses (with 2xx code) per second. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,hrsp_2xx] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Number of responses with codes 3xx per second | Number of HTTP redirections (with 3xx code) per second. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,hrsp_3xx] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Number of responses with codes 4xx per second | Number of HTTP client errors (with 4xx code) per second. Ideally, all responses forwarded by HAProxy would be class 2xx codes, so an unexpected surge in the number of other code classes could be a sign of trouble. Correlating the denial metrics with the response code data can shed light on the cause of an increase in error codes. No change in denials coupled with an increase in the number of 404 responses could point to a misconfigured application or unruly client. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,hrsp_4xx] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Number of responses with codes 5xx per second | Number of HTTP server errors (with 5xx code) per second. Ideally, all responses forwarded by HAProxy would be class 2xx codes, so an unexpected surge in the number of other code classes could be a sign of trouble. Correlating the denial metrics with the response code data can shed light on the cause of an increase in error codes. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,hrsp_5xx] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Number of responses with other codes per second | Number of HTTP responses with other codes per second (all codes outside 1xx-5xx). | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,hrsp_other] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Sessions rate | Number of sessions created per second. A significant spike in the number of sessions over a short period could cripple server operations and bring servers down. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,rate] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Requests rate | HTTP requests per second. The frontend request rate measures the number of requests received over the last second. Keeping an eye on peaks and drops is essential to ensure continuous service availability. In the event of a traffic spike, clients could see increases in latency or even denied connections. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,req_rate] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Established sessions | The current number of established sessions. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,scur] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Session limits | The most simultaneous sessions that are allowed, as defined by the maxconn setting in the frontend. | Zabbix agent (active) | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,slim] Update: 60 LLD |
HAProxy Frontend [{#FRONTEND_NAME}]: Session utilization | Percentage of sessions used (scur / slim * 100). For every HAProxy session, two connections are consumed: one for the client to HAProxy, and the other for HAProxy to your backend. Alerting on this metric is essential to ensure your server has sufficient capacity to handle all concurrent sessions. Unlike requests, upon reaching the session limit HAProxy will deny additional clients until resource consumption drops. | Calculated | haproxy.stats[{$HAPROXY_SOCK},{#FRONTEND_NAME},FRONTEND,sutil] Update: 60 LLD |
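The Session utilization item is the only calculated one; it simply divides scur by slim. As a sanity check you can reproduce the same number straight from the stats socket. A rough sketch, with "http-in" standing in for a real frontend name (fields 5 and 7 of the `show stat` CSV are scur and slim):

```
# Session utilization (%) for one frontend, computed from the raw CSV
echo "show stat" | sudo -u haproxy socat stdio unix-connect:/var/lib/haproxy/stats \
  | awk -F, '$1 == "http-in" && $2 == "FRONTEND" && $7 > 0 { printf "%.1f\n", $5 / $7 * 100 }'
```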
Name | Description | Expression | Priority |
---|---|---|---|
HAProxy Backend [{#BACKEND_NAME}]: Average response time is more than 10 sec for 5m | Average backend response time (in ms) for the last 1,024 requests is more than 10 seconds. Tracking average response times is an effective way to measure the latency of your HAProxy load-balancing setup. Generally speaking, response times in excess of 500 ms will lead to degradation of application performance and customer experience. Monitoring the average response time can give you the upper hand to respond to latency issues before your customers are substantially impacted. Keep in mind that this metric will be zero if you are not using HTTP. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},BACKEND,rtime].min(5m)}>10s Recovery expression: | average |
HAProxy Backend [{#BACKEND_NAME}]: Average time spent in queue is more than 10 sec for 5m | Average time spent in queue (in ms) for the last 1,024 requests is more than 10 s. Obviously, minimizing time spent in the queue results in lower latency and an overall better client experience. Each use case can tolerate a certain amount of queue time, but in general you should aim to keep this value as low as possible. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},BACKEND,qtime].min(5m)}>10s Recovery expression: | average |
HAProxy Backend [{#BACKEND_NAME}]: Current number of requests unassigned in queue is more than 10 for 5m | Current number of requests on backend unassigned in queue is more than 10. If your backend is bombarded with connections to the point you have reached your global maxconn limit, HAProxy will seamlessly queue new connections in the system kernel's socket queue until a backend server becomes available. Keeping connections out of the queue is ideal, resulting in less latency and a better user experience. You should alert if the size of your queue exceeds the threshold. If you find that connections are consistently enqueueing, configuration changes may be in order, such as increasing the global maxconn limit or changing the connection limits on individual backend servers. Keep in mind: empty queue = happy client. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},BACKEND,qcur].min(5m)}>10 Recovery expression: | average |
HAProxy Backend [{#BACKEND_NAME}]: Number of responses with error is more than 10 for 5m | Number of requests on backend, whose responses yielded an error, is more than 10. The backend error response rate represents the number of response errors generated by your backends. This includes errors caused by data transfers aborted by the servers as well as write errors on the client socket and failures due to ACLs. Combined with other error metrics, the backend error response rate helps diagnose the root cause of response errors. For example, an increase in both the backend error response rate and denied responses could indicate that clients are repeatedly attempting to access ACL-ed resources. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},BACKEND,eresp].min(5m)}>10 Recovery expression: | average |
HAProxy Backend [{#BACKEND_NAME}]: Server is DOWN | HAProxy Backend [{#BACKEND_NAME}] is not available. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},BACKEND,status].max(#5)}=0 Recovery expression: | disaster |
HAProxy Frontend [{#FRONTEND_NAME}]: Number of request errors is more than 10 for 5m | Number of request errors in the last 5 minutes is more than 10. Client-side request errors could have a number of causes: the client terminates before sending the request; read error from the client; client timeout; client terminated the connection; request was tarpitted/subject to an ACL. Under normal conditions, it is acceptable to (infrequently) receive invalid requests from clients. However, a significant increase in the number of invalid requests received could be a sign of larger, looming issues. For example, an abnormal number of terminations or timeouts by numerous clients could mean that your application is experiencing excessive latency, causing clients to manually close their connections. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#FRONTEND_NAME},FRONTEND,ereq].min(5m)}>10 Recovery expression: | average |
HAProxy Frontend [{#FRONTEND_NAME}]: Number of requests denied is more than 10 for 5m | Number of requests denied due to security concerns (ACL-restricted) is more than 10 in the last 5 minutes. In the event of a significant increase in denials, a malicious attacker or a misconfigured application could be to blame. An increase in denied requests will subsequently cause an increase in 403 Forbidden codes. Correlating the two can help you discern the root cause of an increase in 4xx responses. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#FRONTEND_NAME},FRONTEND,dreq].min(5m)}>10 Recovery expression: | average |
HAProxy Frontend [{#FRONTEND_NAME}]: Session utilization is more than 80% for 5m | For every HAProxy session, two connections are consumed: one for the client to HAProxy, and the other for HAProxy to your backend. Alerting on this metric is essential to ensure your server has sufficient capacity to handle all concurrent sessions. Unlike requests, upon reaching the session limit HAProxy will deny additional clients until resource consumption drops. Furthermore, if you find your session usage percentage to be hovering above 80%, it could be time to either modify HAProxy's configuration to allow more sessions, or migrate your HAProxy server to a bigger box. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#FRONTEND_NAME},FRONTEND,sutil].min(5m)}>80 Recovery expression: | average |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Average response time is more than 10s for 5m | Average server response time (in ms) for the last 1,024 requests is more than 10s. Tracking average response times is an effective way to measure the latency of your HAProxy load-balancing setup. Generally speaking, response times in excess of 500 ms will lead to degradation of application performance and customer experience. Monitoring the average response time can give you the upper hand to respond to latency issues before your customers are substantially impacted. Keep in mind that this metric will be zero if you are not using HTTP. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},{#SERVER_NAME},rtime].min(5m)}>10s Recovery expression: | average |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Average time spent in queue is more than 10s for 5m | Average time spent in queue (in ms) for the last 1,024 requests is more than 10s. Obviously, minimizing time spent in the queue results in lower latency and an overall better client experience. Each use case can tolerate a certain amount of queue time, but in general you should aim to keep this value as low as possible. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},{#SERVER_NAME},qtime].min(5m)}>10s Recovery expression: | average |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Current number of requests unassigned in queue is more than 10 for 5m | Current number of requests unassigned in queue is more than 10. If your server is bombarded with connections to the point you have reached your global maxconn limit, HAProxy will seamlessly queue new connections in the system kernel's socket queue until the server becomes available. Keeping connections out of the queue is ideal, resulting in less latency and a better user experience. You should alert if the size of your queue exceeds the threshold. If you find that connections are consistently enqueueing, configuration changes may be in order, such as increasing the global maxconn limit or changing the connection limits on individual backend servers. Keep in mind: empty queue = happy client. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},{#SERVER_NAME},qcur].min(5m)}>10 Recovery expression: | average |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Number of responses with error is more than 10 for 5m | Number of requests on server, whose responses yielded an error, is more than 10. The server error response rate represents the number of response errors generated by your servers. This includes errors caused by data transfers aborted by the servers as well as write errors on the client socket and failures due to ACLs. Combined with other error metrics, the server error response rate helps diagnose the root cause of response errors. For example, an increase in both the server error response rate and denied responses could indicate that clients are repeatedly attempting to access ACL-ed resources. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},{#SERVER_NAME},eresp].min(5m)}>10 Recovery expression: | average |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Server is DOWN | Server is not available. The check directive must be enabled in the HAProxy server configuration. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},{#SERVER_NAME},status].max(#5)}=0 Recovery expression: | disaster |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Average response time is more than 10s for 5m (LLD) | Average server response time (in ms) for the last 1,024 requests is more than 10s. Tracking average response times is an effective way to measure the latency of your HAProxy load-balancing setup. Generally speaking, response times in excess of 500 ms will lead to degradation of application performance and customer experience. Monitoring the average response time can give you the upper hand to respond to latency issues before your customers are substantially impacted. Keep in mind that this metric will be zero if you are not using HTTP. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},{#SERVER_NAME},rtime].min(5m)}>10s Recovery expression: | average |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Average time spent in queue is more than 10s for 5m (LLD) | Average time spent in queue (in ms) for the last 1,024 requests is more than 10s. Obviously, minimizing time spent in the queue results in lower latency and an overall better client experience. Each use case can tolerate a certain amount of queue time, but in general you should aim to keep this value as low as possible. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},{#SERVER_NAME},qtime].min(5m)}>10s Recovery expression: | average |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Current number of requests unassigned in queue is more than 10 for 5m (LLD) | Current number of requests unassigned in queue is more than 10. If your server is bombarded with connections to the point you have reached your global maxconn limit, HAProxy will seamlessly queue new connections in the system kernel's socket queue until the server becomes available. Keeping connections out of the queue is ideal, resulting in less latency and a better user experience. You should alert if the size of your queue exceeds the threshold. If you find that connections are consistently enqueueing, configuration changes may be in order, such as increasing the global maxconn limit or changing the connection limits on individual backend servers. Keep in mind: empty queue = happy client. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},{#SERVER_NAME},qcur].min(5m)}>10 Recovery expression: | average |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Number of responses with error is more than 10 for 5m (LLD) | Number of requests on server, whose responses yielded an error, is more than 10. The server error response rate represents the number of response errors generated by your servers. This includes errors caused by data transfers aborted by the servers as well as write errors on the client socket and failures due to ACLs. Combined with other error metrics, the server error response rate helps diagnose the root cause of response errors. For example, an increase in both the server error response rate and denied responses could indicate that clients are repeatedly attempting to access ACL-ed resources. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},{#SERVER_NAME},eresp].min(5m)}>10 Recovery expression: | average |
HAProxy Server [{#BACKEND_NAME}/{#SERVER_NAME}]: Server is DOWN (LLD) | Server is not available. The check directive must be enabled in the HAProxy server configuration. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},{#SERVER_NAME},status].max(#5)}=0 Recovery expression: | disaster |
HAProxy Backend [{#BACKEND_NAME}]: Average response time is more than 10 sec for 5m (LLD) | Average backend response time (in ms) for the last 1,024 requests is more than 10 seconds. Tracking average response times is an effective way to measure the latency of your HAProxy load-balancing setup. Generally speaking, response times in excess of 500 ms will lead to degradation of application performance and customer experience. Monitoring the average response time can give you the upper hand to respond to latency issues before your customers are substantially impacted. Keep in mind that this metric will be zero if you are not using HTTP. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},BACKEND,rtime].min(5m)}>10s Recovery expression: | average |
HAProxy Backend [{#BACKEND_NAME}]: Average time spent in queue is more than 10 sec for 5m (LLD) | Average time spent in queue (in ms) for the last 1,024 requests is more than 10 s. Obviously, minimizing time spent in the queue results in lower latency and an overall better client experience. Each use case can tolerate a certain amount of queue time, but in general you should aim to keep this value as low as possible. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},BACKEND,qtime].min(5m)}>10s Recovery expression: | average |
HAProxy Backend [{#BACKEND_NAME}]: Current number of requests unassigned in queue is more than 10 for 5m (LLD) | Current number of requests on backend unassigned in queue is more than 10. If your backend is bombarded with connections to the point you have reached your global maxconn limit, HAProxy will seamlessly queue new connections in the system kernel's socket queue until a backend server becomes available. Keeping connections out of the queue is ideal, resulting in less latency and a better user experience. You should alert if the size of your queue exceeds the threshold. If you find that connections are consistently enqueueing, configuration changes may be in order, such as increasing the global maxconn limit or changing the connection limits on individual backend servers. Keep in mind: empty queue = happy client. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},BACKEND,qcur].min(5m)}>10 Recovery expression: | average |
HAProxy Backend [{#BACKEND_NAME}]: Number of responses with error is more than 10 for 5m (LLD) | Number of requests on backend, whose responses yielded an error, is more than 10. The backend error response rate represents the number of response errors generated by your backends. This includes errors caused by data transfers aborted by the servers as well as write errors on the client socket and failures due to ACLs. Combined with other error metrics, the backend error response rate helps diagnose the root cause of response errors. For example, an increase in both the backend error response rate and denied responses could indicate that clients are repeatedly attempting to access ACL-ed resources. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},BACKEND,eresp].min(5m)}>10 Recovery expression: | average |
HAProxy Backend [{#BACKEND_NAME}]: Server is DOWN (LLD) | HAProxy Backend [{#BACKEND_NAME}] is not available. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#BACKEND_NAME},BACKEND,status].max(#5)}=0 Recovery expression: | disaster |
HAProxy Frontend [{#FRONTEND_NAME}]: Number of request errors is more than 10 for 5m (LLD) | Number of request errors in the last 5 minutes is more than 10. Client-side request errors could have a number of causes: the client terminates before sending the request; read error from the client; client timeout; client terminated the connection; request was tarpitted/subject to an ACL. Under normal conditions, it is acceptable to (infrequently) receive invalid requests from clients. However, a significant increase in the number of invalid requests received could be a sign of larger, looming issues. For example, an abnormal number of terminations or timeouts by numerous clients could mean that your application is experiencing excessive latency, causing clients to manually close their connections. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#FRONTEND_NAME},FRONTEND,ereq].min(5m)}>10 Recovery expression: | average |
HAProxy Frontend [{#FRONTEND_NAME}]: Number of requests denied is more than 10 for 5m (LLD) | Number of requests denied due to security concerns (ACL-restricted) is more than 10 in the last 5 minutes. In the event of a significant increase in denials, a malicious attacker or a misconfigured application could be to blame. An increase in denied requests will subsequently cause an increase in 403 Forbidden codes. Correlating the two can help you discern the root cause of an increase in 4xx responses. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#FRONTEND_NAME},FRONTEND,dreq].min(5m)}>10 Recovery expression: | average |
HAProxy Frontend [{#FRONTEND_NAME}]: Session utilization is more than 80% for 5m (LLD) | For every HAProxy session, two connections are consumed: one for the client to HAProxy, and the other for HAProxy to your backend. Alerting on this metric is essential to ensure your server has sufficient capacity to handle all concurrent sessions. Unlike requests, upon reaching the session limit HAProxy will deny additional clients until resource consumption drops. Furthermore, if you find your session usage percentage to be hovering above 80%, it could be time to either modify HAProxy's configuration to allow more sessions, or migrate your HAProxy server to a bigger box. | Expression: {HAProxy:haproxy.stats[/var/lib/haproxy/stats,{#FRONTEND_NAME},FRONTEND,sutil].min(5m)}>80 Recovery expression: | average |