Create a default mapping template for metrics #72536
Pinging @elastic/es-search (Team:Search)
Pinging @elastic/es-core-features (Team:Core/Features)
@exekias I can imagine the number of dynamic templates growing very quickly using this approach so I'd like to take some time to look into whether Elasticsearch can make it easier. One idea I'm considering is giving clients the ability to override metadata as part of bulk requests similarly to how we allow selecting a dynamic template, something like this:
cc @csoulios given the ongoing discussion about leveraging this for time-series metrics.
Because other stack components rely on this metadata, I'd feel better if the definition of those dynamic templates had some way to indicate that they expect parameterized metadata, for example via a new flag.
I really like this idea. I imagine a request would look something like that, with the dynamic template staying largely the same (but probably with some conditional rendering of the parameterized metadata).
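A sketch of what such a request might look like (hypothetical syntax: the per-field `meta` override in the bulk action line does not exist today, and the index, field, and template names are illustrative only):

```json
{ "index": { "_index": "metrics-example", "dynamic_templates": { "system.load.1": "gauge_double" }, "meta": { "system.load.1": { "unit": "percent" } } } }
{ "@timestamp": "2024-05-01T00:00:00Z", "system.load.1": 0.5 }
```

The idea is that, just as the existing `dynamic_templates` bulk parameter selects which template maps a new field, a parallel `meta` parameter would let the client fill in metadata such as the unit at ingest time.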
I'm removing the label.
Pinging @elastic/es-data-management (Team:Data Management)
When working on #104037, this issue came up again. It is particularly relevant for dynamically mapping the unit of a metric. In addition to @axw's proposal (#72536 (comment)), I'd like to propose another alternative that is based around encoding the unit in the field path and then reading the unit out of the field name into the `meta.unit` field of the metric itself. We could store metrics in a structure like `metrics.<metric_type>.<unit>.<metric_name>`.

With a dynamic template like this, we could then read out the unit from the parent field name:

```json
{
  "mappings": {
    "dynamic_templates": [
      {
        "counter": {
          "path_match": "metrics.counter.*.*",
          "mapping": {
            "type": "{dynamic_type}",
            "time_series_metric": "counter",
            "meta": {
              "unit": "{parent.name}"
            }
          }
        }
      }
    ]
  }
}
```

This introduces a new `{parent.name}` template variable. To make sure that you can still just do aggregations on the metric name without knowing the unit, the intermediate unit objects could be mapped as `passthrough` fields. The dynamic mapping for the unit field could look like the following:

```json
{
  "mappings": {
    "dynamic_templates": [
      {
        "units": {
          "path_match": "metrics.*.*",
          "path_unmatch": "metrics.*.*.*",
          "match_mapping_type": "object",
          "mapping": {
            "subobjects": false,
            "type": "passthrough"
          }
        }
      }
    ]
  }
}
```
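To illustrate the path-based encoding, a document under this scheme might look like the following (the metric names and values are hypothetical examples, not part of the proposal; the `gauge` branch assumes an analogous dynamic template to the `counter` one):

```json
{
  "@timestamp": "2024-05-01T00:00:00Z",
  "metrics": {
    "counter": {
      "byte": {
        "network_in": 1024
      }
    },
    "gauge": {
      "percent": {
        "cpu_usage": 0.75
      }
    }
  }
}
```

Here `network_in` would be mapped as a `counter` with `meta.unit: byte` taken from its parent object's name, while the `passthrough` mapping on the unit level would keep the metric addressable without spelling out the unit in the path.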
I had a chat about the two proposals with the @elastic/es-storage-engine team. @kkrik-es felt that we should maybe defer support for dynamically specifying the unit for now and rather build a dedicated metrics intake API in Elasticsearch first, possibly based on OTLP. This is not the only reason why a dedicated metrics API would make sense, but it would probably also make it easier to support units compared to building something that relies on dynamic templates.

We've also discussed whether changing the internal structure of how we store metrics may be considered a breaking change. For example, if we later change how metrics are laid out in documents, would that break users?

Maybe we shouldn't allow any features for OTel metrics that rely on the physical structure of the ES documents (such as ingest processing, and exposing individual data points), so that we keep the freedom to change the internal storage format.

If we consider structural changes to be a breaking change, there's some urgency to get the mapping right soon.
++ If I were to choose, I'd just implement OTLP support. Otherwise we'll still need to adapt OTel metrics to whatever we build -- so unless there's some additional metric features that don't exist in OTel, I don't see the point.
++ A big shift indeed, and potentially far-reaching. E.g. it could also cover removing the need for some of the existing machinery.
I like that, but it seems it would be hard to migrate to because it breaks a lot of assumptions.
Why can't we do that now, and why would not exposing individual data points enable us to do that?
It comes down to user expectations: if you expect to be able to get out of ES exactly what you put in, then it precludes the kind of automatic transformations I'm thinking of. If on the other hand you are happy to record a data point in a time series, and only ever get back aggregations with some bounded granularity, then it's fine. It wouldn't really matter if the storage is cumulative, delta, rolled up at ingest time, etc., as it would be hidden from the user.
I agree that we should be able to do that, but I don't think we're blocked from doing it now. I'd argue that with rollups, we've kind of broken the glass already. Configuring rollups is currently a conscious choice made by the user. However, I don't think we'd be blocked from enabling rollups by default in integrations, for example.
Another idea that came to mind is that we could allow accessing field values of the document via template variables. We could then store the unit in a top-level `unit` field of the document and reference it from the dynamic template:

```json
{
  "counter": {
    "mapping": {
      "type": "{dynamic_type}",
      "time_series_metric": "counter",
      "meta": {
        "unit": "{doc[unit]}"
      }
    }
  }
}
```
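Under this idea, an ingest request might carry the unit as an ordinary field that the template variable then picks up (a hypothetical sketch: `{doc[unit]}` is a proposed feature, not an existing Elasticsearch capability, and the index and field names are illustrative):

```json
{ "index": { "_index": "metrics-example", "dynamic_templates": { "network_in": "counter" } } }
{ "@timestamp": "2024-05-01T00:00:00Z", "unit": "byte", "network_in": 1024 }
```

The appeal is that the unit travels inside the document itself rather than in the bulk action line or the field path, so no new bulk-API surface would be needed beyond the existing `dynamic_templates` parameter.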
Pinging @elastic/es-search-foundations (Team:Search Foundations)
With dynamic templates in bulk requests (#69948) merged, we are considering the option of creating a default template to allow for dynamic metrics ingestion. This would be leveraged by many Elastic Agent integrations, and is especially interesting for the ones reporting dynamic metrics (where we only get to know the metric/field names at runtime, and hence cannot define a mapping before ingestion).
The overall idea is to map all possible combinations of <metric_type> and <unit>, so each combination gets a unique predictable name that can be referenced at ingest time. For example:
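For instance (hypothetical template names; the exact naming scheme is what this issue proposes), the default template could define one entry per combination, which a client then selects at ingest time via the bulk API's `dynamic_templates` parameter (#69948):

```json
{
  "dynamic_templates": [
    {
      "counter_long_byte": {
        "mapping": {
          "type": "long",
          "time_series_metric": "counter",
          "meta": { "unit": "byte" }
        }
      }
    },
    {
      "gauge_double_percent": {
        "mapping": {
          "type": "double",
          "time_series_metric": "gauge",
          "meta": { "unit": "percent" }
        }
      }
    }
  ]
}
```

A bulk action line would then reference the appropriate entry by name, e.g. `"dynamic_templates": { "system.load.1": "gauge_double_percent" }`, without any per-integration mapping being installed up front.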
The obvious concern is that we would be creating a dynamic mapping entry for every combination of <metric_type> and <unit>, which will explode into many entries:

- Units: `byte`, `percent`, `d`, `h`, `m`, `s`, `ms`, `micros`, `nanos`, with more units to come as we onboard other metrics.
- Types: `long`, `double`, `integer`, `byte`, `float`, `scaled_float`.

So the main question is: Would this be considered a good practice? Can it cause any issues because of a too-large mapping template? Perhaps we should do this in a different way?