[Maps] tracks source #33320

nreese · 2019-03-15T15:04:46Z

WIP

This PR adds a new source called Tracks that uses a terms aggregation with a nested top_hits aggregation to create tracks from documents. For each terms bucket, the geo_point values of the returned documents are joined into a LineString.
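For readers who want to see the shape of the approach, here is a minimal sketch of the idea, not the code in this PR. It assumes the geo_point is stored in object form (`{"lat": ..., "lon": ...}`), and the names `build_tracks_request`, `buckets_to_features`, `tracks`, and `points` are hypothetical. The defaults of 30 and 100 mirror the limits mentioned later in this thread.

```python
from typing import Any, Dict, List


def build_tracks_request(split_field: str, geo_field: str, date_field: str,
                         max_tracks: int = 30, max_points: int = 100) -> Dict[str, Any]:
    """Build a search body: one terms bucket per entity, newest docs per bucket."""
    return {
        "size": 0,  # only the aggregation results are needed
        "aggs": {
            "tracks": {  # hypothetical aggregation name
                "terms": {"field": split_field, "size": max_tracks},
                "aggs": {
                    "points": {  # hypothetical aggregation name
                        "top_hits": {
                            "size": max_points,
                            "sort": [{date_field: {"order": "desc"}}],
                            "_source": {"includes": [geo_field, date_field]},
                        }
                    }
                },
            }
        },
    }


def buckets_to_features(response: Dict[str, Any], geo_field: str) -> List[Dict[str, Any]]:
    """Turn each terms bucket into a GeoJSON LineString feature."""
    features = []
    for bucket in response["aggregations"]["tracks"]["buckets"]:
        hits = bucket["points"]["hits"]["hits"]
        # Hits arrive newest first; reverse them so the line runs oldest to newest.
        coords = []
        for hit in reversed(hits):
            point = hit["_source"][geo_field]  # assumed {"lat": ..., "lon": ...}
            coords.append([point["lon"], point["lat"]])  # GeoJSON order is [lon, lat]
        if len(coords) < 2:
            continue  # a LineString needs at least two positions
        features.append({
            "type": "Feature",
            "geometry": {"type": "LineString", "coordinates": coords},
            "properties": {"entity": bucket["key"], "doc_count": bucket["doc_count"]},
        })
    return features
```

Buckets with a single document are skipped here because they cannot form a line; a real source would have to decide whether to drop them or draw them as points.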

The screenshot is a nonsense demo using the sample web logs and making tracks on machine.os.keyword, just because none of the sample data sets have tracks in them.

cc @alexfrancoeur @AlonaNadler

elasticmachine · 2019-03-15T15:05:10Z

Pinging @elastic/kibana-gis

elasticmachine · 2019-03-15T15:49:09Z

💔 Build Failed

continuous-integration/kibana-ci/pull-request

alexfrancoeur · 2019-03-15T16:00:28Z

@nreese this is awesome!! I'll try it out with my flight data today

alexfrancoeur · 2019-03-15T19:28:57Z

@nreese I love this! I think I'm running into issues with the dynamic mapping here, likely because of the limitation we have for the number of individual documents we can show on a map.

In the above map, the dark red points are the most up-to-date flight records. The lines are built from history. Both are generated by this script (live_flights.py.zip). I would expect each point to be at the front of its line.

If I zoom in enough, this is the case.

But as I zoom out, or simply refresh at a zoomed-out map (as you can see in the gif above), lines tend to come and go.

I imagine this is because we're doing some sort of filtering client side so we don't crash the browser. However, it isn't obvious that this is happening, and I'm still not sure that it is the case. It's possible that I'm just checking out this PR too early (sorry, I was excited). Side note - I didn't see an option for filtering on the visible map area for tracks. Not sure if this was intentional or not.

Also, after running for a bit I got this.

Unrelated, but I'm not convinced dynamic filtering works for geo_point, geo_shape or tracks

nreese · 2019-03-15T19:33:08Z

@alexfrancoeur The terms aggregation is limited to 30, so only the top 30 terms based on count will be displayed. The top_hits aggregation is limited to 100, so only the 100 most recent documents per term are getting added to the line. You can mess around with those values if you want to see where performance problems start to show up.

kibana/x-pack/plugins/maps/public/shared/layers/sources/es_tracks_source/es_tracks_source.js

Line 146 in 74bc917

size: 30

Maybe these values should be configurable via advanced settings config or something?

nreese · 2019-03-15T19:35:15Z

@alexfrancoeur Can you provide the full error message the next time you see that error? I'm not sure how to reproduce it.

alexfrancoeur · 2019-03-15T20:00:01Z

saved_objects.zip

I can't reproduce either, it just popped up. But the attached saved objects file with the python script I put in my previous comment is what I had been using locally.

alexfrancoeur · 2019-03-15T20:07:21Z

> @alexfrancoeur The terms aggregation is limited to 30, so only the top 30 terms based on count will be displayed. The top_hits aggregation is limited to 100, so only the 100 most recent documents per term are getting added to the line. You can mess around with those values if you want to see where performance problems start to show up.
>
> kibana/x-pack/plugins/maps/public/shared/layers/sources/es_tracks_source/es_tracks_source.js
> Line 146 in 74bc917
> size: 30
>
> Maybe these values should be configurable via advanced settings config or something?

Yeah, I figured that was the case. An advanced setting would be great to make it configurable; I think it makes sense to add that. We'll need a warning of some kind so that it's obvious there are limitations, and we should probably recommend defining zoom levels appropriately. What do you think?

thomasneirynck

This is really great! Really neat too that this essentially boiled down to just introducing a new source, and all the other functionality (tooltips, filtering, bounds-fitting, ...) remains.

I think the concept is especially powerful because it will greatly reduce the complexity of data-modeling and ingestion of this type of track data. Users will no longer have to maintain two indices for the same data (last known index vs index with all historical updates).

After using this for a bit, I do think there might be a few things that will get complex.

This PR introduces a number of new concepts:

a split field, which is an identifier of a feature over multiple documents, similar to a foreign key. Let's call it "entity" (as this seems to be something we want to expand on in the stack).
creation of a "phantom" geometry for this entity, in this case a line connecting the points.
new "virtual" properties: in this case, pretty much what we get back from ES plus some metadata, so this new entity can have some tooltip contents.

There are some restrictions:

we only do top-hits for geo_points (otherwise the "connecting with line" does not make sense).
we sort by date for "last known" functionality.

All this combined seems to introduce some edge cases that might be tough to handle naively:

Dynamic styling is suddenly complicated. We can no longer use the fields from the index-pattern. I think some sort of rolled-up values would be useful (e.g. average of foobar, where foobar is a field in the documents), but then we have to introduce multiple metrics on this terms aggregation, a second pass to ES. fwiw - I'm not sure we should develop a client-side "agg-engine" to do these kinds of roll-ups; we should push this to ES as much as we can.
For the dynamic styling, from what I understand, users would like to symbolize each constituent segment of the line. This approach does not really help with this. In general, what kind of "geometry" to construct based on a set of documents could be a much larger topic. E.g. we might want to do the same for any kind of data.
tooltips lose power: tooltips only work on the line, but are no longer relevant. Technically, they can just show some totals.
bounds filtering stops working. For a bounds that does not contain any of the vertices of the line, but is intersected by it, ES will not return the results. This may cause a flicker when zooming in on these lines.
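To make the bounds-filtering point concrete, here is a small sketch under simplifying assumptions: planar coordinates, no antimeridian handling, and hypothetical helper names. It shows a track whose segment crosses the viewport while none of its vertices fall inside it, which is the situation where a per-document bounding-box filter would return nothing. The segment test is a standard Liang-Barsky clip, used here only for illustration.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]               # (lon, lat), treated as planar x/y
Box = Tuple[float, float, float, float]   # (min_x, min_y, max_x, max_y)


def point_in_box(p: Point, box: Box) -> bool:
    """What a per-document bounding-box filter effectively checks."""
    min_x, min_y, max_x, max_y = box
    return min_x <= p[0] <= max_x and min_y <= p[1] <= max_y


def segment_intersects_box(a: Point, b: Point, box: Box) -> bool:
    """Liang-Barsky clipping: does segment a-b touch the box at all?"""
    min_x, min_y, max_x, max_y = box
    dx, dy = b[0] - a[0], b[1] - a[1]
    t0, t1 = 0.0, 1.0
    for p, q in ((-dx, a[0] - min_x), (dx, max_x - a[0]),
                 (-dy, a[1] - min_y), (dy, max_y - a[1])):
        if p == 0:
            if q < 0:
                return False  # parallel to this edge and outside it
        else:
            t = q / p
            if p < 0:
                t0 = max(t0, t)  # segment enters the slab
            else:
                t1 = min(t1, t)  # segment leaves the slab
            if t0 > t1:
                return False
    return True


def track_visible(track: Sequence[Point], box: Box) -> bool:
    """True if any segment of the track crosses the box."""
    return any(segment_intersects_box(track[i], track[i + 1], box)
               for i in range(len(track) - 1))


viewport = (-1.0, -1.0, 1.0, 1.0)
track = [(-10.0, 0.0), (10.0, 0.0)]
vertices_in_view = [p for p in track if point_in_box(p, viewport)]
```

With this data `vertices_in_view` is empty even though `track_visible(track, viewport)` is true, so a line built only from documents inside the viewport would disappear when zooming in.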

I'm wondering if it would make sense to take a slightly different tack, at least for a first introduction of this concept:

Would it make sense to introduce this track-source, at least initially, without the creation of the new geometry, and rather, show the raw underlying documents instead? Basically, it would be very close to an es document source, and behave completely identical after configuration, except, there's a reduction-step based on top-hits.

I think restricting to the above would already be really useful (basically, it would remove the need for this contrived data-modeling of maintaining a last-known index and an index with all docs), and it would allow us to postpone solving some of the more difficult questions to a later iteration:

no need to "collapse" documents into a new phantom geometry and the issues that could introduce (e.g. flickering due to spatial filtering)
no need to find a way to aggregate constituent fields into some meaningful metrics for this new phantom geometry.
no need to limit this to just geo_point (although, that might be a good initial restriction anyway).
no need to limit this to index-patterns with a date-field (although, that might be a good initial restriction anyway, since then we don't have to worry about defining any custom sorting)
dynamic styling will just work and can be driven by the field-values of the underlying index-pattern, just like a regular es_document source.

In this new trimmed-down track-source, we could add:

making the size configurable. 1 would be equivalent to "last known location"
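As an illustration of the reduction step only, here is a hedged in-memory sketch of "top hits per entity". In practice this would be pushed to ES rather than done client side; the function name and document fields below are hypothetical, and timestamps are assumed to be ISO 8601 strings in a single format so that string comparison matches chronological order.

```python
from collections import defaultdict
from typing import Any, Dict, List


def top_hits_per_entity(docs: List[Dict[str, Any]], entity_field: str,
                        date_field: str, size: int) -> List[Dict[str, Any]]:
    """Keep the `size` most recent raw documents for each entity."""
    grouped = defaultdict(list)
    for doc in docs:
        grouped[doc[entity_field]].append(doc)
    kept = []
    for entity in sorted(grouped):
        newest_first = sorted(grouped[entity], key=lambda d: d[date_field], reverse=True)
        kept.extend(newest_first[:size])
    return kept


docs = [
    {"flight": "A1", "ts": "2019-03-15T10:00:00Z", "lat": 1.0, "lon": 1.0},
    {"flight": "A1", "ts": "2019-03-15T11:00:00Z", "lat": 2.0, "lon": 2.0},
    {"flight": "B2", "ts": "2019-03-15T09:00:00Z", "lat": 5.0, "lon": 5.0},
    {"flight": "B2", "ts": "2019-03-15T12:00:00Z", "lat": 6.0, "lon": 6.0},
]

# size=1 keeps one document per entity: the last known location.
last_known = top_hits_per_entity(docs, "flight", "ts", size=1)
```

Because the output is still a list of ordinary documents, field-driven styling and tooltips keep working on them exactly as they would for unreduced documents.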

As far as the limit on docs goes, perhaps we should think about this in the context of the entire app.

The following should essentially share the same upper bound as that of the es-document-sources:

limit on number of docs (2014 now)
limit on number of terms in es_join source (10000 now)
limit on term-splits in tracks (30 in this PR)

This minimum upper bound (or whatever this would be called) could indeed be an advanced setting.

alexfrancoeur · 2019-03-25T15:52:08Z

If we take a reduced approach here, it sounds like this should just be another option (show all locations for timeframe / show last known location) for the elasticsearch document source vs. introducing a new one. That's just my first opinion upon reading your comment @thomasneirynck. Maybe it makes sense to touch base as a group over zoom for 30 minutes this week?

elasticmachine · 2019-03-26T20:15:26Z

💚 Build Succeeded

continuous-integration/kibana-ci/pull-request

nreese · 2019-06-04T22:35:56Z

Closing.

Moving top hits into Documents source as suggested #38052

[Maps] tracks source

74bc917

nreese added the release_note:enhancement, [Deprecated-Use Team:Presentation]Team:Geo, v8.0.0, and v7.2.0 labels Mar 15, 2019

thomasneirynck self-requested a review March 22, 2019 16:56

thomasneirynck reviewed Mar 22, 2019

Merge branch 'master' of github.com:elastic/kibana into tracks

174f9b2

thomasneirynck mentioned this pull request Apr 19, 2019

[Maps] Support for timeslider and playback #27714

Closed

nreese removed the v7.2.0 label May 21, 2019

nreese mentioned this pull request Jun 4, 2019

[Maps] add support for Top Hits to Documents source #38052

Merged

nreese closed this Jun 4, 2019
