Add geo fields to add_host_metadata processor. #9392

Merged Dec 14, 2018 (2 commits)

@andrewvc (Contributor) commented Dec 5, 2018

**EDIT**: I've left the original text below the break, but after discussion we added geo fields to the add_host_metadata processor instead of creating a new one. The original is below.


This carries over from the discussion in #8620.

This adds a new processor that lets users easily add geo fields associated with the host that created the event. You would use it like so:

processors:
  - add_host_geo:
      name: MN HQ
      location: "44.977753, -93.265015"

It's debatable whether ECS should instead let you put these under something like agent.geo. That's something we should discuss here.

One other question: should we just fold this functionality into add_host_metadata? I believe that probably makes more sense. We agreed in #8620 to make this a separate processor, but with the data nested under host, that makes less sense IMHO.

@andrewvc added the enhancement, discuss, Heartbeat, and Team:obs-ds-hosted-services labels Dec 5, 2018
@elasticmachine
Collaborator

Pinging @elastic/uptime

@andrew-moldovan

This comment has been minimized.

@andrew-moldovan removed their request for review December 5, 2018 04:59
}

func (ag addGeo) asMap() common.MapStr {
	return common.MapStr(ag).Clone()

nit: +1 on cloning, but from the name alone I would expect asMap to be a cast operation only. It caught me a little off guard when reading the argument for PutValue.

Contributor Author

Good point, I'll rename to mapClone().
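
The naming concern can be made concrete: a cast shares storage with the receiver, while a clone does not. Below is a minimal sketch, assuming a stand-in `mapStr` type in place of libbeat's `common.MapStr`; the `clone` helper and the sample values are hypothetical and only illustrate the difference.

```go
package main

import "fmt"

// mapStr is a stand-in for libbeat's common.MapStr (assumption for this sketch).
type mapStr map[string]interface{}

// clone copies the map, recursing into nested mapStr values.
func (m mapStr) clone() mapStr {
	out := make(mapStr, len(m))
	for k, v := range m {
		if nested, ok := v.(mapStr); ok {
			out[k] = nested.clone()
			continue
		}
		out[k] = v
	}
	return out
}

type addGeo mapStr

// mapClone returns an independent copy, so callers cannot mutate the
// processor's own state through the returned map.
func (ag addGeo) mapClone() mapStr {
	return mapStr(ag).clone()
}

func main() {
	ag := addGeo{"name": "MN HQ"}

	copied := ag.mapClone()
	copied["name"] = "changed via clone" // does not touch ag

	cast := mapStr(ag) // a plain conversion aliases the same underlying map
	cast["name"] = "changed via cast"

	fmt.Println(ag["name"], "|", copied["name"])
}
```

A name such as `mapClone` signals the copy at the call site, which is the point the reviewer raised.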

	CountryISOCode string `config:"country_iso_code"`
	RegionName     string `config:"region_name"`
	CityName       string `config:"city_name"`
}

I'd prefer to have validation for at least the fields that are stored as geo point in Elasticsearch here.

Contributor Author

Ah, so we'd check the format to be a lat lon pair? I think that makes sense. Anything beyond that?


Yes, at least the lat-lon pair.

The others are kind of free-form text, and validating them might require a location database, right? We even store the others as keyword, so it's not that important. Validating them might be too much effort here (it would be nice if we could do some simple, regex-like validation, but it's no hard requirement for me).

What matters most is that indexing won't fail due to misconfiguration of the processor, and that maps in Kibana dashboards work correctly because the lat-lon pairs are correct and make sense.

Contributor Author

Validation added :)

libbeat/_meta/fields.ecs.yml (outdated, resolved)

ag := addGeo{
	"name":     config.Name,
	"location": config.Location,
Contributor

Should we / can we do any validation for the location field to make sure we don't have ingestion errors?

	return ag, nil
}

type Config struct {
Contributor

Nit, but I'm used to having configs at the top.

type addGeo common.MapStr

func (ag addGeo) Run(event *beat.Event) (*beat.Event, error) {
	_, err := event.PutValue("host.geo", ag.asMap())
Contributor

Does this really belong under host? We are interested in where the agent is running, so shouldn't it be agent.geo?

Contributor Author

It's an interesting question. I agree that agent is slightly better, but I wonder if that is outweighed by the simplicity of not having yet another geo field. A location is tied to a host as much as it is tied to an agent.

There's an argument that's not true for the geo.name field we use, which is free-form, but perhaps we don't need to be so dogmatic there.

Contributor

One point that speaks for host is that in the case of apm-server, apm-server is not the agent but the observer. There it would then be observer.geo.*. Having it under host would then not change it. I'm good with going with host.* for now, and if someone is not happy with that, they can use the rename processor.

+1 for not having agent.geo.*
As @ruflin says, in ECS, the agent is defined as running on a host or an observer.

Both host.geo.* and observer.geo.* are defined by ECS, so we are covered in either case.
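
For readers unfamiliar with dotted keys, the effect of writing the geo data at "host.geo" can be sketched as follows. This is an illustration only: `putValue` is a hypothetical helper over plain nested maps, not the libbeat `event.PutValue` implementation, and the host name is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// putValue is a hypothetical helper that stores value under a dotted key,
// creating intermediate objects as needed. It only mimics the general idea
// of a dotted-path write; it is not libbeat's implementation.
func putValue(root map[string]interface{}, dotted string, value interface{}) error {
	keys := strings.Split(dotted, ".")
	cur := root
	for _, k := range keys[:len(keys)-1] {
		next, exists := cur[k]
		if !exists {
			child := map[string]interface{}{}
			cur[k] = child
			cur = child
			continue
		}
		child, ok := next.(map[string]interface{})
		if !ok {
			return fmt.Errorf("key %q in %q is not an object", k, dotted)
		}
		cur = child
	}
	cur[keys[len(keys)-1]] = value
	return nil
}

func main() {
	// Existing host fields are kept; geo is added next to them.
	event := map[string]interface{}{
		"host": map[string]interface{}{"name": "web-1"},
	}
	geo := map[string]interface{}{
		"name":     "MN HQ",
		"location": "44.977753, -93.265015",
	}
	if err := putValue(event, "host.geo", geo); err != nil {
		panic(err)
	}
	fmt.Println(event)
}
```

Under this layout, moving the data to observer.geo or another prefix is a matter of changing the key, which is why the rename processor is a workable escape hatch.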

@andrewvc (Contributor Author) commented Dec 7, 2018

After a discussion with @ruflin we decided to move this to the add_host_metadata processor

@andrewvc (Contributor Author)

OK, redid the whole thing under the add_host_metadata processor.

The only thing not yet done is the changes to fields.yml. Since this is now in ECS, is there a process for updating the ECS fields, or should I just manually add the ECS name field?

@andrewvc changed the title from "Add new processor for host geo" to "Add geo fields to add_host_metadata processor." Dec 12, 2018
@ruflin (Contributor) left a comment

LGTM but we should add a CHANGELOG entry.


if config.Geo != nil {
	if len(config.Geo.Location) > 0 {
		if m, _ := regexp.MatchString("^\\-?\\d+(\\.\\d+)?\\s*\\,\\s*\\-?\\d+(\\.\\d+)?$", config.Geo.Location); !m {
Contributor

I'm undecided whether I like this one or not. It's good that we check the user input, but it looks like a really complex regexp, and I must confess I could not easily review it to make sure it's correct.

Contributor Author

I can break the regexp up into component strings to make it more readable.

Contributor Author

FYI, the regexp isn't 100% accurate (it doesn't check the numeric bounds), but it's a lot better than nothing
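
If the numeric bounds mattered, one option would be to parse the two numbers and range-check them. The sketch below is not what the PR does; `parseLatLon` is a hypothetical helper, and the bounds used are the conventional [-90, 90] for latitude and [-180, 180] for longitude. Note that `strconv.ParseFloat` accepts forms the regexp rejects (for example exponent notation), so it would complement the format check rather than replace it.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseLatLon is a hypothetical helper that parses "lat, lon" and checks
// that both values fall inside the conventional coordinate ranges.
func parseLatLon(s string) (lat, lon float64, err error) {
	parts := strings.Split(s, ",")
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("expected \"lat, lon\", got %q", s)
	}
	lat, err = strconv.ParseFloat(strings.TrimSpace(parts[0]), 64)
	if err != nil {
		return 0, 0, fmt.Errorf("invalid latitude in %q: %v", s, err)
	}
	lon, err = strconv.ParseFloat(strings.TrimSpace(parts[1]), 64)
	if err != nil {
		return 0, 0, fmt.Errorf("invalid longitude in %q: %v", s, err)
	}
	// Negated range checks so that NaN is rejected as well.
	if !(lat >= -90 && lat <= 90) {
		return 0, 0, fmt.Errorf("latitude %v out of range [-90, 90]", lat)
	}
	if !(lon >= -180 && lon <= 180) {
		return 0, 0, fmt.Errorf("longitude %v out of range [-180, 180]", lon)
	}
	return lat, lon, nil
}

func main() {
	for _, in := range []string{"44.977753, -93.265015", "91, 0", "44.9"} {
		lat, lon, err := parseLatLon(in)
		fmt.Println(in, "->", lat, lon, err)
	}
}
```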

Contributor Author

In b4feb7e I've made the regex more clear

@ruflin (Contributor) commented Dec 12, 2018

@andrewvc Feel free to edit the ecs or the libbeat common file for the field; both work.

@andrewvc force-pushed the new-geo-fields branch 3 times, most recently from 36f8a04 to 16524a6, December 12, 2018 22:53
@andrewvc (Contributor Author) commented Dec 12, 2018

Looks like rebasing on master gives us the geo fields, thanks to @andrewkroh :) in #9121

@ruflin (Contributor) left a comment

LGTM except a missing CHANGELOG entry.

Ignore the failing Mac builds; I stopped them.

if len(config.Geo.Location) > 0 {
	// Regexp matching a number with an optional decimal component
	// Valid numbers: '123', '123.23', etc.
	latOrLon := `\-?\d+(\.\d+)?`
Contributor

Nice. Thanks for all the details here.
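
For context, here is a sketch of how a named `latOrLon` fragment like the one above could be combined into a full pair check. Only the `latOrLon` pattern comes from the snippet under review; the surrounding pattern, `geoPair`, and `isLatLonPair` are assumptions made for illustration. As noted earlier in the thread, this checks the format only, not the numeric bounds.

```go
package main

import (
	"fmt"
	"regexp"
)

// latOrLon comes from the reviewed snippet: a number with an optional
// sign and an optional decimal component, e.g. '123', '-123.23'.
const latOrLon = `\-?\d+(\.\d+)?`

// geoPair is an assumed composition: two such numbers separated by a comma,
// with optional whitespace around each part.
var geoPair = regexp.MustCompile(fmt.Sprintf(`^\s*%s\s*,\s*%s\s*$`, latOrLon, latOrLon))

// isLatLonPair reports whether s looks like "lat, lon". It does not check
// that the values are within valid coordinate ranges.
func isLatLonPair(s string) bool {
	return geoPair.MatchString(s)
}

func main() {
	fmt.Println(isLatLonPair("44.977753, -93.265015"))
	fmt.Println(isLatLonPair("44.977753"))
	fmt.Println(isLatLonPair("999, 999")) // format only, bounds are not checked
}
```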

@webmat (Contributor) left a comment

Looking great, love it.

Perhaps one more test for empty values being purged correctly, but other than this, LGTM

}
})
}
}
Contributor

Nice tests.

Perhaps one more to test "Delete any empty values"? Make sure your empty values include whitespace-only strings too, e.g. {"city_name": " "}

Contributor Author

@webmat I've added handling + tests for blank strings. Mind taking a look?
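
The blank-string handling discussed here can be sketched as below. This is an assumption about the general approach, not the code in the PR: `dropBlank` is a hypothetical helper that treats whitespace-only values as empty.

```go
package main

import (
	"fmt"
	"strings"
)

// dropBlank is a hypothetical helper that returns a copy of fields without
// entries whose value is empty or consists only of whitespace.
func dropBlank(fields map[string]string) map[string]string {
	out := make(map[string]string, len(fields))
	for k, v := range fields {
		if strings.TrimSpace(v) == "" {
			continue // skip "" as well as " ", "\t", etc.
		}
		out[k] = v
	}
	return out
}

func main() {
	geo := map[string]string{
		"name":        "MN HQ",
		"city_name":   " ",
		"region_name": "",
	}
	fmt.Println(dropBlank(geo))
}
```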

Contributor

Yeah good stuff.

Seems like I noticed the "din expect" typo too late for this PR ;-)

This lets users add geo data if they have it to the host
@ruflin (Contributor) left a comment

LGTM. Please ignore the failed test in Travis. Thought I fixed it but seems it still happens. Not related to this PR.

@andrewvc merged commit c6c4a30 into elastic:master Dec 14, 2018
andrewvc added a commit to andrewvc/beats that referenced this pull request Dec 14, 2018
This lets users add geo data if they have it to the host


(cherry picked from commit c6c4a30)
andrewvc added a commit that referenced this pull request Dec 17, 2018
This lets users add geo data if they have it to the host


(cherry picked from commit c6c4a30)
Labels: discuss, enhancement, Heartbeat, Team:obs-ds-hosted-services, v6.6.0

8 participants