[Maps] tracks source #33320
Pinging @elastic/kibana-gis |
💔 Build Failed |
@nreese this is awesome!! I'll try it out with my flight data today |
@nreese I love this! I think I'm running into issues with the dynamic mapping here, likely because of the limit we have on the number of individual documents we can show on a map. In the above map, the dark red points are the most up-to-date flight records. The lines are built from history. Both use this script (live_flights.py.zip). I would expect each point to be at the front of every line. If I zoom in enough, this is the case. But as I zoom out, or simply refresh at a zoomed-out map (as you can see in the gif above), lines tend to come and go. I imagine this is because we're doing some sort of client-side filtering so we don't crash the browser. However, it isn't obvious that this is happening, and I'm still not sure if it's the case. It's possible that I'm just checking out this PR too early (sorry, I was excited). Side note - I didn't see an option for filtering on visible map layer for tracks. Not sure if this was intentional or not. Also, after running for a bit I got this. Unrelated, but I'm not convinced dynamic filtering works for |
@alexfrancoeur The kibana/x-pack/plugins/maps/public/shared/layers/sources/es_tracks_source/es_tracks_source.js Line 146 in 74bc917
Maybe these values should be configurable via advanced settings config or something? |
@alexfrancoeur Can you provide the full error message the next time you see that error? I'm not sure how to reproduce it. |
I can't reproduce either, it just popped up. But the attached saved objects file with the python script I put in my previous comment is what I had been using locally. |
Yeah, I figured that was the case. An advanced setting would be great to make it configurable; I think it makes sense to add that. We'll need a warning of some kind so that it's obvious there are limitations, and we should probably recommend defining zoom levels appropriately. What do you think? |
This is really great! Really neat too that this essentially boiled down to just introducing a new source, and all the other functionality (tooltips, filtering, bounds-fitting, ...) remains.
I think the concept is especially powerful because it will greatly reduce the complexity of data-modeling and ingestion of this type of track data. Users will no longer have to maintain two indices for the same data (last known index vs index with all historical updates).
After using this for a bit, I do think there might be a few things that will get complex.
This PR introduces a number of new concepts:
- a split field, which is an identifier of a feature over multiple documents, similar to a foreign key. Let's call it "entity" (as this seems to be something we want to expand on in the stack).
- creation of a "phantom" geometry for this entity, in this case a line, connecting the points.
- new "virtual" properties. In this case, pretty much what we get back from ES plus some metadata, so this new entity can have some tooltip contents.
There are some restrictions:
- we only do top-hits for geo_points (otherwise the "connecting with line" does not make sense).
- we sort by date for "last known" functionality.
All this combined seems to introduce some edge cases that might be tough to handle naively:
- Dynamic styling is suddenly complicated. We can no longer use the fields from the index-pattern. I think some sort of rolled-up values would be useful (e.g. average of foobar, where foobar is a field in the documents), but then we have to introduce multiple metrics on this terms aggregation, or a second pass to ES. fwiw - I'm not sure if we should develop a client-side "agg-engine" to do these kinds of roll-ups; we should push this to ES as much as we can.
- For the dynamic styling, from what I understand, users would like to symbolize each constituent segment of the line. This approach does not really help with this. In general, what kind of "geometry" to construct based on a set of documents could be a much larger topic. E.g. we might want to do the same for any kind of data.
- tooltips lose power: tooltips only work on the line, but are no longer relevant. Technically, they can just show some totals
- bounds filtering stops working. For a bounds that does not contain any of the vertices of the line, but is intersected by it, ES will not return the results. This may cause a flicker when zooming in on these lines.
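To make the bounds-filtering edge case concrete, here is a minimal sketch that is not taken from the PR. It uses hypothetical helper names and a simple planar bounding box (no antimeridian handling), and shows a track whose vertices all fall outside the viewport while one of its segments still crosses it. A filter that only matches documents inside the bounds would drop this track.

```javascript
// Illustration only: helper names and the bounds shape are assumptions.
// Coordinates are [lon, lat] pairs; bounds are treated as a flat rectangle.

function boundsContainsPoint(bounds, [lon, lat]) {
  return (
    lon >= bounds.minLon && lon <= bounds.maxLon &&
    lat >= bounds.minLat && lat <= bounds.maxLat
  );
}

// Liang-Barsky clipping test: does the segment touch the rectangle at all?
function segmentIntersectsBounds(bounds, [x0, y0], [x1, y1]) {
  const dx = x1 - x0;
  const dy = y1 - y0;
  let tEnter = 0;
  let tLeave = 1;
  const edges = [
    [-dx, x0 - bounds.minLon],
    [dx, bounds.maxLon - x0],
    [-dy, y0 - bounds.minLat],
    [dy, bounds.maxLat - y0],
  ];
  for (const [p, q] of edges) {
    if (p === 0) {
      if (q < 0) return false; // parallel to this edge and outside it
    } else {
      const t = q / p;
      if (p < 0) {
        if (t > tLeave) return false;
        if (t > tEnter) tEnter = t;
      } else {
        if (t < tEnter) return false;
        if (t < tLeave) tLeave = t;
      }
    }
  }
  return true;
}

// What a point-based bounds filter effectively checks.
function anyVertexInBounds(bounds, coordinates) {
  return coordinates.some(point => boundsContainsPoint(bounds, point));
}

// What would be needed for the line to stay visible.
function lineIntersectsBounds(bounds, coordinates) {
  for (let i = 0; i < coordinates.length - 1; i++) {
    if (segmentIntersectsBounds(bounds, coordinates[i], coordinates[i + 1])) {
      return true;
    }
  }
  return false;
}

const viewport = { minLon: 0, maxLon: 10, minLat: 0, maxLat: 10 };
const crossingTrack = [[-5, 5], [15, 5]]; // passes straight through the viewport

console.log(anyVertexInBounds(viewport, crossingTrack));
console.log(lineIntersectsBounds(viewport, crossingTrack));
```

The two checks disagree for this track, which is the situation where the line would disappear as the user zooms in.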
I'm wondering if it would make sense to take a slightly different tack, at least for a first introduction of this concept:
Would it make sense to introduce this track-source, at least initially, without the creation of the new geometry, and rather show the raw underlying documents instead? Basically, it would be very close to an es document source, and behave completely identically after configuration, except there's a reduction step based on top-hits.
I think restricting to the above would already be really useful (basically, it would remove the need for this contrived data-modeling of maintaining a last-known index and an index with all docs), and it would allow us to postpone solving some of the more difficult questions to a later iteration:
- no need to "collapse" documents into a new phantom geometry and the issues that could introduce (e.g. flickering due to spatial filtering)
- no need to find a way to aggregate constituent fields into some meaningful metrics for this new phantom geometry.
- no need to limit this to just geo_point (although, that might be a good initial restriction anyway).
- no need to limit this to index-patterns with a date-field (although, that might be a good initial restriction anyway, since then we don't have to worry about defining any custom sorting)
- dynamic styling will just work and can be driven by the field-values of the underlying index-pattern, just like a regular es_document source.
In this new trimmed-down track-source, we could add:
- making the size configurable. 1 would be equivalent to "last known location"
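As a rough illustration of the reduction step described above, the sketch below mimics on the client what a top-hits reduction per entity would return. In the proposal this work would happen in ES rather than in the browser; the function name, the option names, and the sample field names (`flight_id`, `@timestamp`) are assumptions made for this example.

```javascript
// Illustration only: keeps the `size` most recent documents per entity.
// With size = 1 this reduces to "last known location".
function topHitsPerEntity(docs, { splitField, dateField, size }) {
  const groups = new Map();
  for (const doc of docs) {
    const key = doc[splitField];
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(doc);
  }

  const reduced = [];
  for (const group of groups.values()) {
    // Newest first, matching a descending sort on the date field.
    group.sort((a, b) => Date.parse(b[dateField]) - Date.parse(a[dateField]));
    reduced.push(...group.slice(0, size));
  }
  return reduced;
}

const sampleDocs = [
  { flight_id: 'A1', '@timestamp': '2019-03-15T10:00:00Z', location: { lat: 40.0, lon: -74.0 } },
  { flight_id: 'A1', '@timestamp': '2019-03-15T10:05:00Z', location: { lat: 40.5, lon: -73.5 } },
  { flight_id: 'B2', '@timestamp': '2019-03-15T10:02:00Z', location: { lat: 51.0, lon: 0.1 } },
];

// One document per flight: the most recent one.
const lastKnown = topHitsPerEntity(sampleDocs, {
  splitField: 'flight_id',
  dateField: '@timestamp',
  size: 1,
});
console.log(lastKnown);
```

Because the output is still a list of plain documents, styling and tooltips could keep using the document fields directly, which is the point of the trimmed-down approach.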
As far as the limit on docs goes. Perhaps we should think about this in context of the entire app.
The following should essentially share the same upper bound as that of the es-document sources:
- limit on number of docs (2014 now)
- limit on number of terms in es_join source (10000 now)
- limit on term-splits in tracks (30 in this PR)
This minimum upper bound (or whatever this would be called) could indeed be an advanced setting.
If we take a reduced approach here, it sounds like this should just be another option (show all locations for timeframe / show last known location) for the elasticsearch document source vs. introducing a new one. That's just my first opinion upon reading your comment @thomasneirynck. Maybe it makes sense to touch base as a group over zoom for 30 minutes this week? |
💚 Build Succeeded |
Closing. Moving top hits into Documents source as suggested #38052 |
WIP
This PR adds a new source called Tracks that uses a terms aggregation and a top_hits aggregation to create tracks from documents. The document geo_point fields are converted into a LineString for the terms split bucket.
The screenshot is a nonsense demo using sample web logs and making tracks on machine.os.keyword, just because none of the sample data sets have tracks in them.
cc @alexfrancoeur @AlonaNadler
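For readers unfamiliar with the approach, the following is a minimal sketch of the idea in the description, not the code in this PR. It builds a search body with a terms aggregation and a nested top_hits aggregation, then turns each bucket into a GeoJSON LineString. The function names, aggregation names (`tracks`, `points`), and option names are hypothetical, and the sketch assumes the geo_point is a top-level field stored as a `{ lat, lon }` object (geo_point accepts other formats that are not handled here).

```javascript
// Illustration only: names and defaults are assumptions, not the PR's code.
function buildTracksRequest({ splitField, geoField, dateField, maxTracks, pointsPerTrack }) {
  return {
    size: 0, // only the aggregation results are needed
    aggs: {
      tracks: {
        terms: { field: splitField, size: maxTracks },
        aggs: {
          points: {
            top_hits: {
              size: pointsPerTrack,
              sort: [{ [dateField]: { order: 'desc' } }],
              _source: { includes: [geoField, dateField] },
            },
          },
        },
      },
    },
  };
}

// Converts the terms buckets of a response into GeoJSON LineString features.
function bucketsToFeatures(buckets, geoField) {
  return buckets
    .map(bucket => {
      const coordinates = bucket.points.hits.hits
        .map(hit => hit._source[geoField])
        .map(({ lat, lon }) => [lon, lat]) // GeoJSON order is [lon, lat]
        .reverse(); // hits arrive newest first; draw oldest to newest
      return {
        type: 'Feature',
        geometry: { type: 'LineString', coordinates },
        properties: { entity: bucket.key, doc_count: bucket.doc_count },
      };
    })
    // A LineString needs at least two positions.
    .filter(feature => feature.geometry.coordinates.length >= 2);
}

const requestBody = buildTracksRequest({
  splitField: 'machine.os.keyword',
  geoField: 'location',
  dateField: '@timestamp',
  maxTracks: 30,
  pointsPerTrack: 100,
});
console.log(JSON.stringify(requestBody, null, 2));
```

Buckets with a single hit are dropped in this sketch because one point cannot form a line; a real source would need to decide how to show those entities.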