-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change text fields to keyword for Metricbeat #10318
Conversation
@@ -40,7 +40,7 @@ | |||
description: consumer offset into partition being read | |||
|
|||
- name: meta | |||
type: text | |||
type: keyword |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@urso objections?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If text
indexing is needed, it should be added as a multi-field at .meta.text
@@ -9,7 +9,7 @@ | |||
description: > | |||
Kibana instance UUID | |||
- name: name | |||
type: text | |||
type: keyword |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ycombinator Ok? Don't see why this should be text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, makes sense as keyword
to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I screwed up. This should remain text
IMHO. See my more-enlightened comment here: https://github.com/elastic/beats/pull/10318/files#r250673640
@@ -34,7 +34,7 @@ | |||
description: > | |||
The request method (GET, POST, etc) (of the current request) | |||
- name: request_uri | |||
type: text | |||
type: keyword |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@webmat should we map this to url.original?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, absolutely
c10eac2
to
58936eb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM if Ceph's children
is made into a keyword as well.
Is this search limited to Metricbeat? Or did you search everywhere, and only Metricbeat had text
field left?
@@ -9,7 +9,7 @@ | |||
description: > | |||
Kibana instance UUID | |||
- name: name | |||
type: text | |||
type: keyword |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
@@ -34,7 +34,7 @@ | |||
description: > | |||
The request method (GET, POST, etc) (of the current request) | |||
- name: request_uri | |||
type: text | |||
type: keyword |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, absolutely
@@ -40,7 +40,7 @@ | |||
description: consumer offset into partition being read | |||
|
|||
- name: meta | |||
type: text | |||
type: keyword |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If text
indexing is needed, it should be added as a multi-field at .meta.text
Got an answer to my question via the tag. We should do the same for all beats. Adding a note to #8655, in the section "Fields changes" |
@@ -10640,7 +10640,7 @@ Kibana instance UUID | |||
*`kibana.stats.name`*:: | |||
+ | |||
-- | |||
type: text | |||
type: keyword |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to be text
. This value comes from the server.name
setting in kibana.yml
. The setting is meant for display purposes, according to it's documentation:
# The Kibana server's name. This is used for display purposes.
#server.name: "your-hostname"
So it could be a string like "Marketing's Fantastic Analytics UI". In that case I can see the benefit of letting ES analyze the string. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If searching for specific words (e.g. "marketing", to follow your example), then we simply need server.name.text
for these searches.
I agree it's really useful in situations where these resources are not always well tagged & so on.
But the canonical field -- server.name
-- should really be keyword
. Defaulting to keyword
all the time, then only adding text
as a multi-field when needed ensures this progression over time (adding text
) is never a breaking change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good to me but I have a tangential question:
IIRC if the server.name
field is mapped as keyword
in ES, it wouldn't also get a server.name.text
multi-field automatically. Wouldn't we need to explicitly specify that somehow in the fields.yml
so ES knows to create the multi-field? Or do all keyword
fields in fields.yml
automatically get a multi-field mapping added to them in the template via dynamic templates or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another point: having server.name as keyword will enable users to aggregate per kibana server. It's the reason in ECS, we've decided to push for keyword
across the board, by default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha I missed your answer when posting my previous message.
The template will have to create the multi-field explicitly for the .text
.
Going for keyword
across the board is also a small performance gain, so the default is kw only. Then we add text
strategically, where it makes sense.
I'm not sure the exact incantation to make the multi-field in Beats fields.yml.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, I understand now. Thanks. I'm good with changing this to keyword
for now and adding a text
multi-field later if we need.
@webmat For the |
metricbeat/docs/fields.asciidoc
Outdated
@@ -18573,7 +18573,7 @@ The request method (GET, POST, etc) (of the current request) | |||
*`php_fpm.process.request_uri`*:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@webmat Sounds like a field that should be mapped to ECS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, agreed. url.original
?
I noticed a few more things in this field defs:
user
=>user.name
script
=>file.path
content_length
=>http.request.body.size
request_method
=>http.request.method
request_duration
=>event.duration
, with an adjustment to nanospid
=>process.pid
No longer sure if any of those were discussed in #10218
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we should review php_fpm in detail again. Can you comment the above on #10218 so we can track it there?
e0b2842
to
a0cbe5e
Compare
This should be ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ceph children
field still looks like it's text
. Other than that, looking good :-)
Note that I trust your grepping skill. I haven't double checked whether you had missed any.
@@ -86,7 +86,7 @@ func eventsMapping(content []byte) ([]common.MapStr, error) { | |||
nodeInfo := common.MapStr{} | |||
if node.ID < 0 { | |||
//bucket node | |||
nodeInfo["children"] = childrenMap[node.Name] | |||
nodeInfo["children"] = strings.Split(childrenMap[node.Name], ",") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
c6e6fca
to
a5cf580
Compare
8f4dd45
to
0f84273
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Searching for `type: text` revealed a few fields which were text but I think should be keyword.
0f6356f
to
d9a7c5d
Compare
Searching for
type: text
revealed a few fields which were text but I think should be keyword.