GoAccess with haproxy #2763

Closed
eddietorial opened this issue Nov 26, 2024 · 7 comments

@eddietorial

I'm requesting guidance with parsing haproxy logs with goaccess.

Here are a couple of formatting attempts; however, every attempt returns errors.

goaccess -f haproxy.log --log-format='%^ %^ %^:%^:%^ %^ %^[%^]: %h:%^ [%d:%t.%^] %^ %^ %^/%^/%^/%^/%L %s %b %^ %^ %^ %^/%^/%^/%^/%^ %^/%^ "%r"' --date-format='%d/%b/%Y' --time-format='%H:%M:%S' -q

goaccess -f haproxy.log --log-format='%^]%^ %h:%^ [%d:%t.%^] %^/%^/%^/%^/%L/%^ %s %b %^"%r"' --date-format='%d/%b/%Y' --time-format='%H:%M:%S' -q

Here are a few sample lines from the logs:

Nov 18 16:55:38 localhost haproxy[42886]: 4.22.141.40:27088 [18/Nov/2024:16:55:37.482] proxy0~ cache0 cache0 0/0/657 4243 GET https://www.example.com/ HTTP/2.0 200/0/0/657 4243 cache0 
Nov 18 16:55:38 localhost haproxy[42886]: 4.22.141.40:27088 [18/Nov/2024:16:55:37.482] proxy0~ cache0 cache0 0/0/657 4243 GET https://www.example.com/ HTTP/2.0 200/0/0/657 4243 cache0 
Nov 18 16:55:38 localhost haproxy[42886]: 4.22.141.40:27088 [18/Nov/2024:16:55:38.111] proxy0~ cache0 cache0 0/0/229 8931 GET https://www.example.com/css/main.min.css HTTP/2.0 200/0/0/229 8931 cache0 
Nov 18 16:55:38 localhost haproxy[42886]: 4.22.141.40:27088 [18/Nov/2024:16:55:38.111] proxy0~ cache0 cache0 0/0/229 8931 GET https://www.example.com/css/main.min.css HTTP/2.0 200/0/0/229 8931 cache0 
Nov 18 16:55:38 localhost haproxy[42886]: 4.22.141.40:27088 [18/Nov/2024:16:55:38.315] proxy0~ cache0 cache0 0/0/154 2251 GET https://www.example.com/css/dark.min.css HTTP/2.0 200/0/0/154 2251 cache0 
Nov 18 16:55:38 localhost haproxy[42886]: 4.22.141.40:27088 [18/Nov/2024:16:55:38.315] proxy0~ cache0 cache0 0/0/154 2251 GET https://www.example.com/css/dark.min.css HTTP/2.0 200/0/0/154 2251 cache0 
Nov 18 16:55:38 localhost haproxy[42886]: 4.22.141.40:27088 [18/Nov/2024:16:55:38.315] proxy0~ cache0 cache0 0/0/283 83575 GET https://www.example.com/js/bundle.min.9a920d7dabdbad8363b6a0a94e29a9dfebdb7ee64cfcb193a0145e512ef2bdab.js HTTP/2.0 200/0/0/283 83575 cache0 

And here's the log format breakdown which I hope I've got right:

IP Address and Port:
    4.22.141.40:27088 represents the client IP and source port.

Timestamp:
    [18/Nov/2024:16:55:37.482] indicates the time the request was received.

Frontend and Backend Details:
    proxy0~ represents the frontend.
    cache0/cache0 indicates that the request was served via the cache0 backend.

Timers:
    0/0/0/10/10 shows timings for various stages:
        Tq (Queue): Time waiting in the queue.
        Tw (Waiting for Connection): Time waiting for a connection.
        Tc (Connect): Time to establish a TCP connection.
        Tr (Response): Time to receive the full response.
        Tt (Total): Total request time.

HTTP Response Code:
    200 signifies the request was successful.

Response Size:
    4243 bytes were sent in response.

Request Details:
    "GET https://www.example.com/about/ HTTP/2.0" shows the method, resource, and protocol version.
@allinurl
Owner

Give this a shot:

goaccess access.log --log-format='%^]: %h:%^ [%d:%t.%^] %e %v %^ %^/%^/%L %b %m %U %H %s/%^' --date-format=%d/%b/%Y --time-format=%T

[Screenshot: 2024-11-26-160043_733x415_scrot]
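To see which part of a sample line each specifier in that format picks up, the line can be split by hand. This is only a sketch: `label_fields` is a hypothetical helper that imitates the format with plain shell and awk, it does not call GoAccess, and the specifier meanings in the comments are my reading of the GoAccess manual.

```shell
# Sample line copied from the issue description.
line='Nov 18 16:55:38 localhost haproxy[42886]: 4.22.141.40:27088 [18/Nov/2024:16:55:37.482] proxy0~ cache0 cache0 0/0/657 4243 GET https://www.example.com/ HTTP/2.0 200/0/0/657 4243 cache0'

# Hypothetical helper (not part of GoAccess): prints "specifier = value" pairs.
label_fields() {
    # '%^]: ' skips everything up to and including the first "]: ".
    rest=${1#*']: '}
    printf '%s\n' "$rest" | awk '{
        split($1, client, ":")                    # %h:%^     host kept, port ignored
        ts = substr($2, 2, length($2) - 2)        # drop the literal brackets
        colon = index(ts, ":")                    # first colon separates %d from %t
        split(substr(ts, colon + 1), clock, ".")  # %t.%^     milliseconds ignored
        split($6, timer, "/")                     # %^/%^/%L  last timer kept
        split($11, code, "/")                     # %s/%^     status kept, rest ignored
        printf "%s = %s\n", "%h", client[1]
        printf "%s = %s\n", "%d", substr(ts, 1, colon - 1)
        printf "%s = %s\n", "%t", clock[1]
        printf "%s = %s\n", "%e", $3              # frontend name, stored as user ID
        printf "%s = %s\n", "%v", $4              # backend name, stored as virtual host
        printf "%s = %s\n", "%L", timer[3]        # time served, in milliseconds
        printf "%s = %s\n", "%b", $7
        printf "%s = %s\n", "%m", $8
        printf "%s = %s\n", "%U", $9
        printf "%s = %s\n", "%H", $10
        printf "%s = %s\n", "%s", code[1]
    }'
}

label_fields "$line"
```

Anything the format does not name, such as the second `cache0` and the trailing size and server fields, corresponds to a `%^` or is left unparsed.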

@eddietorial
Author

Perfect. Thank you!

@eddietorial
Author

I'm playing around with goaccess and haproxy. I've not had any luck with GeoIP yet. I also discovered that haproxy can be tweaked to include the referrer and user agent in the logs (https://serverfault.com/questions/764098/haproxy-log-custom-format-for-goaccess).

Here are some more sample logs...
Nov 28 16:17:07 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:17:06.861] GET /js/bundle.min.9a920d7dabdbad8363b6a0a94e29a9dfebdb7ee64cfcb193a0145e512ef2bdab.js HTTP/1.1 200 83599 {https://www.example.com/|Mozilla/5.0 (X11; FreeBSD amd64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15 Midori/6}
Nov 28 16:17:07 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:17:06.861] GET /js/bundle.min.9a920d7dabdbad8363b6a0a94e29a9dfebdb7ee64cfcb193a0145e512ef2bdab.js HTTP/1.1 200 83599 {https://www.example.com/|Mozilla/5.0 (X11; FreeBSD amd64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15 Midori/6}
Nov 28 16:17:07 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:17:07.297] GET /fonts/Lato-Regular.ttf HTTP/1.1 200 75878 {https://www.example.com/|Mozilla/5.0 (X11; FreeBSD amd64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15 Midori/6}
Nov 28 16:17:07 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:17:07.297] GET /fonts/Lato-Regular.ttf HTTP/1.1 200 75878 {https://www.example.com/|Mozilla/5.0 (X11; FreeBSD amd64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15 Midori/6}
Nov 28 16:17:08 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:17:07.164] GET /fonts/Lato-Bold.ttf HTTP/1.1 200 74058 {https://www.example.com/|Mozilla/5.0 (X11; FreeBSD amd64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15 Midori/6}
Nov 28 16:17:08 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:17:07.164] GET /fonts/Lato-Bold.ttf HTTP/1.1 200 74058 {https://www.example.com/|Mozilla/5.0 (X11; FreeBSD amd64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15 Midori/6}
Nov 28 16:30:45 localhost haproxy[1937]: -:- 4.22.141.40 [28/Nov/2024:16:30:43.751] GET https://example.com/ HTTP/2.0 301 104 {|Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0}
Nov 28 16:30:45 localhost haproxy[1937]: -:- 4.22.141.40 [28/Nov/2024:16:30:43.751] GET https://example.com/ HTTP/2.0 301 104 {|Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0}
Nov 28 16:30:45 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:30:45.139] GET https://www.example.com/ HTTP/2.0 200 4391 {|Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0}
Nov 28 16:30:45 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:30:45.139] GET https://www.example.com/ HTTP/2.0 200 4391 {|Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0}
Nov 28 16:30:46 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:30:45.894] GET https://www.example.com/css/main.min.css HTTP/2.0 200 8955 {https://www.example.com/|Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0}
Nov 28 16:30:46 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:30:45.894] GET https://www.example.com/css/main.min.css HTTP/2.0 200 8955 {https://www.example.com/|Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0}
Nov 28 16:30:46 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:30:46.176] GET https://www.example.com/css/dark.min.css HTTP/2.0 200 2275 {https://www.example.com/|Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0}
Nov 28 16:30:46 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:30:46.176] GET https://www.example.com/css/dark.min.css HTTP/2.0 200 2275 {https://www.example.com/|Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0}
Nov 28 16:30:46 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:30:46.176] GET https://www.example.com/js/bundle.min.9a920d7dabdbad8363b6a0a94e29a9dfebdb7ee64cfcb193a0145e512ef2bdab.js HTTP/2.0 200 83599 {https://www.example.com/|Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0}
Nov 28 16:30:46 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:30:46.176] GET https://www.example.com/js/bundle.min.9a920d7dabdbad8363b6a0a94e29a9dfebdb7ee64cfcb193a0145e512ef2bdab.js HTTP/2.0 200 83599 {https://www.example.com/|Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0}
Nov 28 16:30:46 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:30:46.176] GET https://www.example.com/logo.webp HTTP/2.0 200 4261 {https://www.example.com/|Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0}
Nov 28 16:30:46 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:30:46.176] GET https://www.example.com/logo.webp HTTP/2.0 200 4261 {https://www.example.com/|Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0}
Nov 28 16:30:46 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:30:46.176] GET https://www.example.com/fonts/Lato-Regular.ttf HTTP/2.0 200 75878 {https://www.example.com/css/main.min.css|Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0}
Nov 28 16:30:46 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:30:46.176] GET https://www.example.com/fonts/Lato-Regular.ttf HTTP/2.0 200 75878 {https://www.example.com/css/main.min.css|Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0}

I attempted to build the goaccess query starting simple, but am struggling and could not get it right.

Can you help me with a logical method to build a custom format query? I would like to be able to do this myself. Thanks!

@mariusnorheim

I am also struggling to get goaccess to generate a report from an HAProxy log, and I'm really lost as to why.

Example log outputs:
172.20.0.1 [04/Dec/2024 10:49:27.875] 200 +8006 "GET / HTTP/1.1" "User-Agent"
172.20.0.1 [04/Dec/2024 10:49:27.878] 200 +8003 "GET / HTTP/1.1" "User-Agent"
172.20.0.1 [04/Dec/2024 10:49:27.894] 200 +7988 "GET / HTTP/1.1" "User-Agent"

Goaccess config:
date-format %d/%b/%Y
time-format %T
log-format %h [%d/%b/%Y %T.%^] %s %T "%r" "%u"

Error:
[PARSING /logs/sanitized.log] {0} @ {0/s}
Cleaning up resources...
==18== GoAccess - version 1.8.1 - Apr 14 2024 22:34:39
==18== Config file: goaccess.conf
==18== https://goaccess.io - [email protected]
==18== Released under the MIT License.
==18==
==18== FILE: /logs/sanitized.log
==18== Parsed 10 lines producing the following errors:
==18==
==18== Token '04' doesn't match specifier '%d'
==18== Token '04' doesn't match specifier '%d'
==18== Token '04' doesn't match specifier '%d'
==18==
==18== Format Errors - Verify your log/date/time format

I've made sure that the containers for syslog, haproxy and goaccess all use the en_US.UTF-8 locale, and I've also tried adding the --tz=UTC parameter. I'm really confused as to why it won't accept 04 as a %d specifier.

@heyainsleymae
Contributor

I'm really confused as to why it won't accept 04 as a %d specifier
@mariusnorheim

This is because the specifiers used by the --log-format option aren't the same as those used by strftime: GoAccess needs to extract more semantic information, such as IP addresses and status codes, from your log entries. The --date-format and --time-format options use strftime specifiers, while the --log-format option uses the specifiers listed in the "Custom Log/Date Format" section of the manual. In --log-format, %d stands for the whole date and %t for the whole time, each parsed according to --date-format and --time-format respectively.

I was able to parse your example output with the following configuration:

date-format %d/%b/%Y
time-format %T
log-format %h [%d %t.%^] %s %T "%r" "%u"
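The difference between the failing and the working log format can be sketched with plain shell string operations. This assumes GoAccess hands the date parser everything up to the next literal character that follows `%d` in the log format, which is my reading of the error message above and not something confirmed in this thread. The variable names are made up for illustration.

```shell
# Sample line from the report above.
line='172.20.0.1 [04/Dec/2024 10:49:27.875] 200 +8006 "GET / HTTP/1.1" "User-Agent"'

# Everything after the literal "[" in the log format.
after_bracket=${line#*'['}

# Failing format '[%d/%b/%Y ...': the character after %d is "/",
# so the date token ends at the first "/".
failing_date_token=${after_bracket%%/*}

# Working format '[%d %t.%^]': the character after %d is a space,
# so the date token ends at the first space.
working_date_token=${after_bracket%% *}

# %t is followed by ".", so the time token ends at the first "." after the date.
after_date=${after_bracket#* }
time_token=${after_date%%.*}

printf '%s\n' "failing date token: $failing_date_token"
printf '%s\n' "working date token: $working_date_token"
printf '%s\n' "time token:         $time_token"
```

Under that assumption, only the working date token is something `date-format %d/%b/%Y` can parse in full, and the time token lines up with `time-format %T`.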

@heyainsleymae
Contributor

heyainsleymae commented Dec 5, 2024

Can you help me with a logical method to build a custom format query? I would like to be able to do this myself. Thanks!
— @geoffmx

In the future, please try to provide a failing log format that you tried, as you did in the initial issue description, since it gives those who want to assist a solid place to start and helps them understand what each part of your log may represent. In this case, I modified the log format provided in the accepted answer of the linked Server Fault question.

Solution

Given the following log entry:

Nov 28 16:17:07 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:17:06.861] GET /js/bundle.min.9a920d7dabdbad8363b6a0a94e29a9dfebdb7ee64cfcb193a0145e512ef2bdab.js HTTP/1.1 200 83599 {https://www.example.com/|Mozilla/5.0 (X11; FreeBSD amd64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15 Midori/6}

We can add the following lines to our GoAccess configuration file:

time-format %H:%M:%S
date-format %d/%b/%Y
log-format %^ %^ %^ %^ %^ %^ %h [%d:%t.%^] %m %U %H %s %b "{%R|%u}"

Or pass those same options as arguments to the goaccess command:

goaccess example.log --time-format=%H:%M:%S --date-format=%d/%b/%Y --log-format='%^ %^ %^ %^ %^ %^ %h [%d:%t.%^] %m %U %H %s %b "{%R|%u}"'

See my other comment on this issue for an overview of the different log specifiers, and how they differ slightly depending on the option being set.
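As for a repeatable method, one possible starting point is to number the whitespace-separated tokens of a single entry and then assign a specifier, or `%^` for anything to skip, to each one. The sketch below uses only awk; `number_tokens` is a hypothetical helper name, not a GoAccess feature.

```shell
# The log entry from the solution above.
entry='Nov 28 16:17:07 localhost haproxy[1937]: 192.168.4.4:8080 4.22.141.40 [28/Nov/2024:16:17:06.861] GET /js/bundle.min.9a920d7dabdbad8363b6a0a94e29a9dfebdb7ee64cfcb193a0145e512ef2bdab.js HTTP/1.1 200 83599 {https://www.example.com/|Mozilla/5.0 (X11; FreeBSD amd64) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.0 Safari/605.1.15 Midori/6}'

# Hypothetical helper: print each whitespace-separated token with its position.
number_tokens() {
    printf '%s\n' "$1" | awk '{ for (i = 1; i <= NF; i++) printf "%d %s\n", i, $i }'
}

number_tokens "$entry"
```

Tokens 1 to 6 are the syslog prefix and the backend address, which is why the format starts with six `%^`; token 7 is the client address (`%h`) and token 8 is the bracketed timestamp. The user agent contains spaces, so it spans several tokens at the end and has to be matched by its surrounding braces and the `|` separator instead of by position.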

@eddietorial
Author

Thank you, that helped. I now have a better understanding of constructing a query on my own.
