-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Haproxy filebeat module tcp and default formats #8637
Haproxy filebeat module tcp and default formats #8637
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added comments with the discussions on the previous PRs so we don't forget.
filebeat/docs/fields.asciidoc
Outdated
Default HAProxy log format | ||
|
||
|
||
*`haproxy.default.facility`*:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use default
on these fields, as they can be common to other log formats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, this is taken from our Grok list https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns#L82
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm, but are they specific of the default log format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, I have spent some time digging in the doc without really understanding what it is. I was supported on the idea that if it's correct in the Grok patterns in that way, it is for us. I really don't care about placing it outside on inside of the default format.
So if you can clarify this, just tell me where you'd like that I place it and I'll do it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't put them under default
, specially if they are common to other formats.
Also, if something looks quite general, look for them in ECS or existing filebeat fields, for example log.syslog.facility
and log.syslog.priority
already exist in filebeat, you could use them to store facility and priority.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are better going to do the ECS port in a different PR so we can decide between original and ECS naming conventions
- name: time_duration | ||
description: time_duration is the time the request remained active in haproxy, which is the total time in milliseconds elapsed between the first byte of the request was received and the last byte of response was sent. | ||
- name: time_backend_connect | ||
description: time_backend_connect is the total time in milliseconds spent waiting for the connection to establish to the final server, including retries. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to rewrite these descriptions on this PR do it, if not we can do it in another PR, but we should do it for consistency at some moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 884ef4f772540389e5c3e96ee8a1a91ef9db36c1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry, I think I didn't explain well, I meant the other way around 🙁 If you look at other modules they don't start with the name of the field.
metricbeat/my_metricbeat.yml
Outdated
@@ -0,0 +1,148 @@ | |||
###################### Metricbeat Configuration Example ####################### |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably not be in this PR.
@sayden Could you add more details to the PR description on what this PR is about? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing the format of the descriptions!
- name: process_name | ||
description: Name of the process | ||
|
||
- name: pid | ||
description: Process ID | ||
description: pid of the process |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use PID
, or just left the previous description, it was fine. Or add a description about the process, something like PID of the process that executed the request
"haproxy.connections.frontend": 1, | ||
"haproxy.connections.retries": 0, | ||
"haproxy.connections.server": 0, | ||
"haproxy.default.logsource": "127.0.0.1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding fields under default
, this logsource looks common to default and TCP, it shouldn't be under default
. If this is always going to be an IP, it could be haproxy.source.ip
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it seems it could be an IP or a hostname. I'll place this field in the root (haproxy.logsource
) anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ruflin what would we do ECS-wise for a source that can be an ip or a hostname?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have a solution here as one is ip type and other other is keyword type in ES. Probably best it to store it under keyword for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sayden umm, in haproxy documentation it says that this is an IP, also I think that source IP in default format and client IP in tcp format are the same, I'd store both in haproxy.source.ip
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, they can also be a unix socket, tricky... @ruflin opinions for these source fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haproxy.source
then? I wouldn't use haproxy.logsource
because it seems to be the source of the log, not the source of the logged connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haproxy.source sounds like a good tradeoff for me at the moment. We should make it a keyword.
88524da
to
3f0acdb
Compare
Seems like the failing tests are related. The generated output files for the tests must be updated. Please ping me if you need assistance on how to generate them. |
3f0acdb
to
0b72a09
Compare
0b72a09
to
37511b1
Compare
Failed test is related. Checking |
fields: | ||
- name: mode | ||
type: text | ||
description: mode that the frontend is operating (TCP or HTTP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can also be common to multiple log formats, remove it too from the default namespace, and it can be set to tcp
on the tcp log format.
A possible name for this could be haproxy.protocol
, or network.protocol
as of ECS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not use network.protocol
yet from ECS as it still might change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can read in the HAProxy documentation,mode
is only specific for Default format https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#8.2.1 In TCP format there is no "mode" at all (mode is always TCP).
About the naming, I thought that we already agreed that we were going to leave ECS apart in this PR and focus on maintain naming of the service, which is the most familiar naming for an HAProxy user who could use this module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, mode
seems to be only specified in the default format, it wouldn't make sense in protocol-specific formats, but it can still be useful in events so users can differentiate logs from tcp and http connections.
We can set haproxy.protocol
to the value of mode for the default formats, and for protocol-specific formats set it to the specific protocol depending on the matching pattern. I'm not sure how to set a field depending on the matching pattern, we can leave it for a future change, but I'd use a common name like haproxy.protocol
instead of haproxy.default.mode
in any case, thinking in the future uses of this field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are spending too much time and energy in a field that only appears (and gets parsed) in a deprecated log format.
I feel like writing complex conditional parsing for non-straightforward (invented) fields which doesn't usually appear in HAProxy and to get something that we haven't measure and that it has not come from a user requirements is over-engineering things a bit
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, as I said we can leave adding this field in other formats for future changes. I was just asking about using a generic name like haproxy.protocol
(instead of haproxy.default.mode
), no overengineering, just using a more future-proof name, wdyt about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or haproxy.mode
if you want to keep haproxy naming, but without the default
namespacing.
Oh, and could you please add a changelog entry? |
Changelog added! :) |
CHANGELOG.asciidoc
Outdated
@@ -134,6 +134,7 @@ https://github.com/elastic/beats/compare/v6.4.0...master[Check the HEAD diff] | |||
- Release `docker` input as GA. {pull}8328[8328] | |||
- Keep unparsed user agent information in user_agent.original. {pull}8537[8537] | |||
- Better tracking of number of open file descriptors. {pull}7986[7986] | |||
- Added default and TCP parsing formats {issue}8311[8311] {pull}8637[8637] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...to haproxy module
? 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, you're totally right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI failures are not related.
…mats (#8690) Haproxy filebeat module tcp and default formats
Adds TCP and Default format parsing in parallel to HTTP format as described here https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#8.2 So the three types of logs can be parsed even if they are mixed in the file
Merges #8428 and #8526 into a single PR because now #8526 depends on #8428