Querying what's 'current' in a time series index is hard, there may be a better way #61349
Labels: :Analytics/Aggregations, >enhancement, :Search/Search, Team:Analytics, Team:Search
I work on Uptime / Heartbeat here at Elastic, and a frequent source of query complexity is querying over the subset of our time series data that represents the current state of the system. For instance, "Tell me how many monitors are up vs down". This sounds simple, but it's not, because with time series data every monitor has many documents and only the newest one reflects its current state. To accomplish this we must:
1. Find the most recent document for each monitor by @timestamp
2. Check that document's monitor.status field for a value of up vs. down

This gets even more complex when you add querying on top. What if you also want your query to match only against the most recent documents? If the query matches a value that was present only in a past document, you may display an old 'current' status rather than the new one.
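The steps above, and the stale-status pitfall that querying introduces, can be sketched in a few lines of TypeScript. This is a minimal in-memory illustration, not the Uptime implementation: the `Ping` shape and its `monitorId` and `tag` fields are assumptions made up for the example; only @timestamp and monitor.status come from the description above.

```typescript
// Hypothetical, flattened document shape used only for this illustration.
interface Ping {
  monitorId: string;      // assumed identifier for a monitor
  timestamp: number;      // stands in for @timestamp (epoch millis)
  status: "up" | "down";  // stands in for monitor.status
  tag: string;            // assumed extra field that a user might query on
}

// Step 1: keep only the newest document per monitor.
function latestPerMonitor(pings: Ping[]): Ping[] {
  const latest: Record<string, Ping> = {};
  for (const ping of pings) {
    const seen = latest[ping.monitorId];
    if (seen === undefined || ping.timestamp > seen.timestamp) {
      latest[ping.monitorId] = ping;
    }
  }
  return Object.keys(latest).map((id) => latest[id]);
}

// Step 2: tally the status of the documents that were kept.
function countByStatus(pings: Ping[]): { up: number; down: number } {
  const counts = { up: 0, down: 0 };
  for (const ping of pings) counts[ping.status] += 1;
  return counts;
}

const pings: Ping[] = [
  { monitorId: "a", timestamp: 1, status: "down", tag: "eu" },
  { monitorId: "a", timestamp: 2, status: "up", tag: "us" },
  { monitorId: "b", timestamp: 1, status: "up", tag: "eu" },
];

const matchesQuery = (p: Ping): boolean => p.tag === "eu";

// Query first, then take the latest: monitor "a" matches only through an
// old document, so its stale "down" status is reported as current.
const filterThenLatest = countByStatus(latestPerMonitor(pings.filter(matchesQuery)));

// Take the latest first, then query: monitor "a" no longer matches at all.
const latestThenFilter = countByStatus(latestPerMonitor(pings).filter(matchesQuery));
```

Here `filterThenLatest` counts one monitor up and one down, while `latestThenFilter` counts a single monitor up. The difference is exactly the old 'current' status described above.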
We handle this today by using composite aggs and lots of post-processing in JS. We aim for our UI to handle large numbers of monitors, ~100k today. This essentially pushes the complexity of solving a difficult problem onto developers, which isn't ideal.
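For readers unfamiliar with the pattern, the composite-aggregation approach looks roughly like the sketch below: page through monitors with a `composite` aggregation, take the newest document per bucket with a `top_hits` sub-aggregation, and tally statuses client-side. This is not the actual Uptime code. The `monitor.id` field, the `SearchFn` type, the aggregation names, and the page size are assumptions for illustration; `search` stands in for whatever client call executes the request body and returns the `monitors` aggregation result.

```typescript
type Status = "up" | "down";
type AfterKey = Record<string, string>;

// Shape of the "monitors" aggregation result that this sketch relies on.
interface CompositePage {
  after_key?: AfterKey;
  buckets: Array<{
    key: AfterKey;
    latest: { hits: { hits: Array<{ _source: { monitor: { status: Status } } }> } };
  }>;
}

// Hypothetical transport: runs a request body, resolves to the aggregation result.
type SearchFn = (body: object) => Promise<CompositePage>;

// Build one page of the request; `after` resumes from the previous page.
function buildRequest(after?: AfterKey, pageSize = 1000): object {
  return {
    size: 0,
    aggs: {
      monitors: {
        composite: {
          size: pageSize,
          sources: [{ monitor_id: { terms: { field: "monitor.id" } } }],
          ...(after ? { after } : {}),
        },
        aggs: {
          // Newest document in each monitor's bucket.
          latest: {
            top_hits: {
              size: 1,
              sort: [{ "@timestamp": { order: "desc" } }],
              _source: ["monitor.status"],
            },
          },
        },
      },
    },
  };
}

// Page until no buckets come back, counting each monitor's latest status.
async function countCurrentStatuses(search: SearchFn): Promise<Record<Status, number>> {
  const counts: Record<Status, number> = { up: 0, down: 0 };
  let after: AfterKey | undefined;
  do {
    const page = await search(buildRequest(after));
    for (const bucket of page.buckets) {
      const hit = bucket.latest.hits.hits[0];
      if (hit) counts[hit._source.monitor.status] += 1;
    }
    after = page.buckets.length > 0 ? page.after_key : undefined;
  } while (after !== undefined);
  return counts;
}
```

Every page is a separate round trip, so with the assumed page size of 1000, roughly 100k monitors means on the order of 100 requests before a single up/down count can be shown.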
I've discussed this a bit with @polyfractal and we've covered a few different solutions.