-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migration of apache2 error fileset to ECS #8963
Conversation
"log.offset": 229, | ||
"message": "File does not exist: /usr/local/apache2/htdocs/favicon.ico", | ||
"process.pid": "35708", | ||
"process.tid": "4328636416", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why this shows up as a string instead of a int here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just hasn't been cast, I assume.
However I think the transient state of the JSON payload is not that critical:
- The ECS template will define them as
long
, and ES will index them as such. - Re-reading them with Kibana will correctly indicate they are numeric
- The only problem is that the
_source
will return this as a string, not a numeric.
Because of 3, we may want to fix this eventually. But I think it's out of scope for the mapping to ECS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@webmat Totally missed your comment here. I think you are actually spot on. I was kind of thinking if the type is defined as long it would also have an affect on _source but obviously it will not.
@@ -5,27 +5,15 @@ | |||
"grok": { | |||
"field": "message", | |||
"patterns": [ | |||
"\\[%{APACHE_TIME:apache2.error.timestamp}\\] \\[%{LOGLEVEL:apache2.error.level}\\]( \\[client %{IPORHOST:apache2.error.client}\\])? %{GREEDYDATA:apache2.error.message}", | |||
"\\[%{APACHE_TIME:apache2.error.timestamp}\\] \\[%{DATA:apache2.error.module}:%{LOGLEVEL:apache2.error.level}\\] \\[pid %{NUMBER:apache2.error.pid}(:tid %{NUMBER:apache2.error.tid})?\\]( \\[client %{IPORHOST:apache2.error.client}\\])? %{GREEDYDATA:apache2.error.message1}" | |||
"\\[%{APACHE_TIME:apache2.error.timestamp}\\] \\[%{LOGLEVEL:log.level}\\]( \\[client %{IPORHOST:source.ip}\\])? %{GREEDYDATA:message}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is apache2.error.client
always going to be an IP?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Apache error log doesn't seem to support a "host" (meaning hostname or IP), according to their doc. The value here seems to be %a
, which would be IP:port
(kind of like an "authority", but in this case always an IP before the port).
This makes sense, because in an error logging scenario, you want to minimize the chances of fancy features breaking before an error is logged. Therefore the reverse DNS lookup needed to produce %h
and try to resolve to a name would be skipped here.
Only thing I'm not sure of is %a
seems to always include a port. Although I'm not sure the original parsing ever supported that (where are Beats grok patterns defined?), and the test logs don't include it either...
I did a bit of spelunking, and our test log may be incorrect. My default Apache httpd install includes the port, even for local access. The current grok is correct in making this whole section optional, as some error logs have nothing to do with clients (e.g. logging restarts). I wonder if this may have hidden problems with the client trying to parse an IP, when it was in fact IP:port
.
In any case, I think we could be pragmatic and optionally parse the port here:
Recap:
- yes it's always an IP
- There should also be a port. Let's parse it if it's present.
- See below for example logs that include the port
- The whole client section being optional is a good thing. Keep this too.
For reference here's the logs I just generated on my local homebrew httpd, trying to access localhost:8080/meh:
[Wed Nov 07 13:51:42.196822 2018] [mpm_prefork:notice] [pid 60830] AH00163: Apache/2.4.37 (Unix) configured -- resuming normal operations
[Wed Nov 07 13:51:42.197371 2018] [core:notice] [pid 60830] AH00094: Command line: '/usr/local/opt/httpd/bin/httpd'
[Wed Nov 07 14:05:19.882541 2018] [mpm_prefork:notice] [pid 60830] AH00171: Graceful restart requested, doing restart
[Wed Nov 07 14:05:19.898329 2018] [mpm_prefork:notice] [pid 60830] AH00163: Apache/2.4.37 (Unix) configured -- resuming normal operations
[Wed Nov 07 14:05:19.898366 2018] [mpm_prefork:info] [pid 60830] AH00164: Server built: Oct 24 2018 19:50:11
[Wed Nov 07 14:05:19.898388 2018] [core:notice] [pid 60830] AH00094: Command line: '/usr/local/opt/httpd/bin/httpd'
[Wed Nov 07 14:05:19.898451 2018] [core:debug] [pid 60830] log.c(1568): AH02639: Using SO_REUSEPORT: no (1)
[Wed Nov 07 14:05:19.898527 2018] [mpm_prefork:debug] [pid 60830] prefork.c(918): AH00165: Accept mutex: none (default: sysvsem)
[Wed Nov 07 14:05:32.324848 2018] [authz_core:debug] [pid 62310] mod_authz_core.c(817): [client ::1:50002] AH01626: authorization result of Require all granted: granted
[Wed Nov 07 14:05:32.327138 2018] [authz_core:debug] [pid 62310] mod_authz_core.c(817): [client ::1:50002] AH01626: authorization result of <RequireAny>: granted
[Wed Nov 07 14:05:32.327240 2018] [core:info] [pid 62310] [client ::1:50002] AH00128: File does not exist: /usr/local/var/www/meh
[Wed Nov 07 14:05:52.563539 2018] [authz_core:debug] [pid 62312] mod_authz_core.c(817): [client ::1:50093] AH01626: authorization result of Require all granted: granted
[Wed Nov 07 14:05:52.564189 2018] [authz_core:debug] [pid 62312] mod_authz_core.c(817): [client ::1:50093] AH01626: authorization result of <RequireAny>: granted
[Wed Nov 07 14:05:52.564320 2018] [core:info] [pid 62312] [client ::1:50093] AH00128: File does not exist: /usr/local/var/www/meh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like in the first step we can feed it into our "magic" hostname field and then follow up also with the port separation and feed it properly into the fields. Should we open a separate issue for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for these migrations we should fix little things like this right in the migration PR. We have too many modules to convert. If each of them requires 2-3 PRs each to do in full, that's going to mean a whole lot of back & forth.
If we unearth deeper problems with a given module, there may be a case for more than one PR, though.
a79b3f6
to
415bb35
Compare
@webmat Ready for review. |
415bb35
to
25d98c1
Compare
dev-tools/ecs-migration.yml
Outdated
copy_to: false | ||
|
||
- from: apache2.error.tid | ||
to: process.tid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Threads can also have a name, a priority (nice level), and perhaps other interesting properties. I would name this field process.thread.tid
, to make space for other attributes we may want to eventually support about threads...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember we discussed it in the past that we should also have a thread object. Was the conclusion to have it under process or global? @andrewkroh do you remember?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider this:
process.thread.tid
process.thread.name
process.thread.priority
I don't yet have a strong opinion on nesting under process vs top level.
But I think it's really important we have a "thread" field set, so we can eventually track other things about threads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a thread always belongs to a thread (is that correct 🤔 ) I think it makes sense to nest it under process. Should we open an issue / PR in ECS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. That's what I've been dying to do. Open this PR and about 15 others ;-)
@ruflin I assume I should take over this one as well? |
@webmat Will not say no ;-) But also do not want to drop the work on you. |
edd00bb
to
57bd9e0
Compare
@webmat Pushed this forward a bit on my end. |
57bd9e0
to
cf99ef9
Compare
@@ -1740,7 +1740,17 @@ | |||
packet/event. Source fields are usually populated in conjunction with destination fields. | |||
type: group | |||
fields: | |||
|
|||
- name: address |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
@ruflin Good stuff. I want to do my next big push on modules next week. If you don't have time to finish this one, I'll pick it up. |
"\\[%{APACHE_TIME:apache2.error.timestamp}\\] \\[%{LOGLEVEL:apache2.error.level}\\]( \\[client %{IPORHOST:apache2.error.client}\\])? %{GREEDYDATA:apache2.error.message}", | ||
"\\[%{APACHE_TIME:apache2.error.timestamp}\\] \\[%{DATA:apache2.error.module}:%{LOGLEVEL:apache2.error.level}\\] \\[pid %{NUMBER:apache2.error.pid}(:tid %{NUMBER:apache2.error.tid})?\\]( \\[client %{IPORHOST:apache2.error.client}\\])? %{GREEDYDATA:apache2.error.message1}" | ||
"\\[%{APACHE_TIME:apache2.error.timestamp}\\] \\[%{LOGLEVEL:log.level}\\]( \\[client %{IPORHOST:source.address}\\])? %{GREEDYDATA:message}", | ||
"\\[%{APACHE_TIME:apache2.error.timestamp}\\] \\[%{DATA:apache2.error.module}:%{LOGLEVEL:log.level}\\] \\[pid %{NUMBER:process.pid:int}(:tid %{NUMBER:process.thread.id:long})?\\]( \\[client %{IPORHOST:source.address}\\])? %{GREEDYDATA:message}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@webmat To my surprise I had to use :long
here as it seems for :int
the value was too big. Probably good to know if you hit issue with other values too and we should be careful as we now know there is a limit.
@jakelandis Is there any disadvantage here to always use :long
instead of :int
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow I'm glad we caught the :int
case with our test data! I've been using only "int" for all of the coercions I've fixed so far. Although all of those I did fit in the bounds, AFAICT (ports, PIDs, mostly).
I think always using long
would actually be fine. It'll instantiate a long, but then shove that into a JSON object, where the distinction disappears anyway.
I'm now curious to know if the int
used by grok is signed or not. Both possibilities can actually cause problems. If it's signed, it's currently not enough to properly capture the high ports or PIDs, over 32k. If it's unsigned, it wouldn't work to capture negative integers...
I created #9598 to tackle this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ruflin - no disadvantage other then you may also need to update the mapping.
@webmat - It is signed. Specifically we use https://docs.oracle.com/javase/8/docs/api/java/lang/Long.html#parseLong-java.lang.String- or https://docs.oracle.com/javase/8/docs/api/java/lang/Integer.html#parseInt-java.lang.String- to cast the value found from the grok expression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks for confirming, @jakelandis.
Most of our fields actually are long
in ElasticSearch, so this won't be a problem. I'll fix our coercions soon.
a433dbb
to
108717c
Compare
Fields migrated: * apache.error.level -> log.level * apache.error.message -> message * apache.error.pid -> process.pid * apache.error.tid -> process.tid * apache.error.client -> source.ip Other changes * Changelog updated * Migration file updated * Field alias added for the old fields * Test outputs updated * Fix source.ip to be IP only. Before had a HOSTORIP pattern. Note: The ECS fields used were on purpose only put on the module level and are put into filebeat or libbeat when they are used in more then 1 place.
Co-Authored-By: ruflin <[email protected]>
108717c
to
52ade7e
Compare
As discussed, here's my post-merge review ;-)
I'm not sure we'll be able to get started necessarily on the ideas we're accumulating in elastic/ecs#181 for 7.0. So I'd rather see this small pipeline processor added in the module for now. |
I'll take care of it likely today, btw. |
* Migration of apache2 error fileset to ECS Fields migrated: * apache.error.level -> log.level * apache.error.message -> message * apache.error.pid -> process.pid * apache.error.tid -> process.tid * apache.error.client -> source.ip Other changes * Changelog updated * Migration file updated * Field alias added for the old fields * Test outputs updated * Fix source.ip to be IP only. Before had a HOSTORIP pattern. Note: The ECS fields used were on purpose only put on the module level and are put into filebeat or libbeat when they are used in more then 1 place.
Other changes
Note: The ECS fields used were on purpose only put on the module level and are put into filebeat or libbeat when they are used in more then 1 place.