New mapping parameters to annotate dimensions and metrics in timeseries data #74014
Comments
Pinging @elastic/es-analytics-geo (Team:Analytics)
Pinging @elastic/es-search (Team:Search)
I feel like
|
I like |
As long as it's only used for downsampling/rollups, and doesn't bleed into restricting storage or aggregations generally. There's an issue with summary metrics I'd like to raise. In Elastic APM (maybe eventually Metricbeat's Prometheus module? CC @exekias), we would like to store pre-aggregated "summary" metrics: always sum/count, and optionally min/max and possibly other quantiles. These metrics would always support value_count, sum, and avg; possibly min/max. Ideally we would use aggregate_metric_double, but you have to explicitly state up front which metric sub-fields to store. We don't necessarily know this up front. |
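To make the summary-metric case above concrete, here is a minimal sketch in plain Python (not Elasticsearch code). It assumes a hypothetical `Summary` record in which `sum` and `value_count` are always present while the min/max fields are optional, and shows that `avg` stays derivable after merging even when the optional sub-fields are missing. The class name, field names, and `merge` helper are illustrative assumptions, not part of any Elastic API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Summary:
    """Hypothetical pre-aggregated metric: sum/value_count required, min/max optional."""
    sum: float
    value_count: int
    min_value: Optional[float] = None
    max_value: Optional[float] = None


def merge(a: Summary, b: Summary) -> Summary:
    """Combine two summaries; min/max survive only if both inputs carry them."""
    lo = None
    hi = None
    if a.min_value is not None and b.min_value is not None:
        lo = min(a.min_value, b.min_value)
    if a.max_value is not None and b.max_value is not None:
        hi = max(a.max_value, b.max_value)
    return Summary(a.sum + b.sum, a.value_count + b.value_count, lo, hi)


def avg(s: Summary) -> Optional[float]:
    """avg is always derivable from sum and value_count (None for an empty summary)."""
    return s.sum / s.value_count if s.value_count else None


if __name__ == "__main__":
    with_bounds = Summary(sum=10.0, value_count=4, min_value=1.0, max_value=4.0)
    without_bounds = Summary(sum=6.0, value_count=2)
    merged = merge(with_bounds, without_bounds)
    print(merged, avg(merged))
```

The point of the sketch is only that the set of available sub-fields can differ from document to document, which is what makes declaring them up front awkward.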
@imotov I am fine with renaming However, I think that renaming Finally, in my view the most important part is that |
@csoulios had a chat about the aggregation field. The |
@axw I have created a separate issue #74145 to discuss relaxing the constraints in |
A related problem has been brewing around histograms. I had been putting it off, but @benwtrent has just come across the same problem (see #74213). The histogram field docs say:
The problem with this is that a consumer doesn't necessarily know how the data was stored. This certainly applies to custom application metrics, e.g. so Lens could automatically decide the most appropriate algorithm to use for calculating percentiles. I imagine this would also apply when rolling up histogram metrics. Moreover, for rollups you might want to reduce the histogram resolution to reduce storage cost. Would it make sense to extend the |
I wouldn't mix these two. The issue in #74213 is quite specific to histogram fields. In a sense these two concerns are somewhat orthogonal; I would rather add an |
Added the dimension parameter to the following field types: keyword, ip, and numeric field types (integer, long, byte, short). The dimension parameter is of type boolean (default: false) and is used to mark that a field is a time series dimension field. Relates to #74014
…tic#78012) This PR renames dimension mapping parameter to time_series_dimension to make it consistent with time_series_metric parameter (elastic#76766) Relates to elastic#74450 and elastic#74014
…#78204) Added the time_series_metric mapping parameter to the unsigned_long and scaled_float field types. Added the time_series_dimension mapping parameter to the unsigned_long field type. Fixes elastic#78100. Relates to elastic#76766, elastic#74450 and elastic#74014
…rameters (#78265)
Backports the following PRs:
* Add dimension mapping parameter (#74450)
  Added the dimension parameter to the following field types: keyword, ip, and numeric field types (integer, long, byte, short). The dimension parameter is of type boolean (default: false) and is used to mark that a field is a time series dimension field. Relates to #74014
* Add constraints to dimension fields (#74939)
  This PR adds the following constraints to dimension fields:
  - It must be an indexed field and must have doc values
  - It cannot be multi-valued
  - The number of dimension fields in the index mapping must not be more than 16. This should be configurable through an index property (index.mapping.dimension_fields.limit)
  - keyword fields cannot be more than 1024 bytes long
  - keyword fields must not use a normalizer
  Based on the code added in PR #74450. Relates to #74660
* Expand DocumentMapperTests (#76368)
  Adds a test for setting the maximum number of dimensions setting and tests the names and types of the metadata fields in the index. Previously we just asserted the count of metadata fields. That made it hard to read failures.
* Fix broken test for dimension keywords (#75408)
  Test was failing because it was testing 1024 bytes long keyword and assertion was failing. Closes #75225
* Checkstyle
* Add time_series_metric parameter (#76766)
  This PR adds the time_series_metric parameter to the following field types: numeric field types, histogram, aggregate_metric_double
* Rename `dimension` mapping parameter to `time_series_dimension` (#78012)
  This PR renames the dimension mapping parameter to time_series_dimension to make it consistent with the time_series_metric parameter (#76766). Relates to #74450 and #74014
* Add time series params to `unsigned_long` and `scaled_float` (#78204)
  Added the time_series_metric mapping parameter to the unsigned_long and scaled_float field types. Added the time_series_dimension mapping parameter to the unsigned_long field type. Fixes #78100. Relates to #76766, #74450 and #74014
Co-authored-by: Nik Everett <[email protected]>
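The dimension-field constraints listed in the commit message above can be sketched as a small standalone checker. This is an illustration in Python, not the Elasticsearch implementation: the function names, the error strings, and the choice to treat exactly 1024 bytes as valid are assumptions; only the limits (16 dimensions, 1024 bytes) and the rules themselves come from the commit text.

```python
from typing import Any, Dict, List

DEFAULT_DIMENSION_LIMIT = 16  # default quoted in the commit message
MAX_KEYWORD_BYTES = 1024      # keyword dimension length cap quoted in the commit message


def check_dimension_mapping(
    properties: Dict[str, Dict[str, Any]],
    limit: int = DEFAULT_DIMENSION_LIMIT,
) -> List[str]:
    """Return mapping-level violations for fields marked time_series_dimension."""
    errors: List[str] = []
    dims = [name for name, cfg in properties.items() if cfg.get("time_series_dimension")]
    if len(dims) > limit:
        errors.append(f"{len(dims)} dimension fields exceed the limit of {limit}")
    for name in dims:
        cfg = properties[name]
        # Assumption: index and doc_values default to true when omitted.
        if cfg.get("index", True) is False:
            errors.append(f"[{name}] must be indexed")
        if cfg.get("doc_values", True) is False:
            errors.append(f"[{name}] must have doc values")
        if cfg.get("type") == "keyword" and "normalizer" in cfg:
            errors.append(f"[{name}] keyword dimension must not use a normalizer")
    return errors


def check_dimension_value(field_type: str, value: Any) -> List[str]:
    """Return document-level violations for a single dimension value."""
    if isinstance(value, (list, tuple)):
        return ["dimension fields cannot be multi-valued"]
    if field_type == "keyword" and len(str(value).encode("utf-8")) > MAX_KEYWORD_BYTES:
        return [f"keyword dimension longer than {MAX_KEYWORD_BYTES} bytes"]
    return []


if __name__ == "__main__":
    props = {
        "host": {"type": "keyword", "time_series_dimension": True, "normalizer": "lowercase"},
        "ip": {"type": "ip", "time_series_dimension": True, "doc_values": False},
    }
    for problem in check_dimension_mapping(props):
        print(problem)
```

Splitting the check into a mapping-level and a value-level function mirrors the fact that some constraints (normalizer, doc values, field count) are visible in the mapping, while others (multi-valued, byte length) only show up per document.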
As Elasticsearch embraces time series data, we must make sure that fundamental time series concepts become first-class citizens. We propose implementing two new mapping parameters for annotating dimension and metric fields in the index mapping.
Mapping dimensions
To mark a field as a dimension we will create a mapping parameter named `time_series_dimension` that can take boolean values.
Mapping metrics
To mark a field as a metric we must create a mapping parameter named `time_series_metric`. Its value will be a string that can take one of the following values: `gauge`, `counter`, `histogram` and `summary`.
For each metric type there should be a set of supported downsampling aggregations. However, there are cases where users want to override the downsampling aggregations that are supported by default. To allow them to override the default aggregations, we can allow `time_series_metric` to be an object containing a `type` and an `aggregations` field.
An example to illustrate the index mapping can be found below:
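As a rough sketch of the two proposed forms, the mapping below is written as a Python dict and serialized to JSON. The field names (`host`, `cpu_usage`, `requests_total`) and the contents of the `aggregations` list are hypothetical, and the object form of `time_series_metric` follows the proposal text only; it should not be read as a released Elasticsearch API.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical index mapping following the proposal's wording.
mapping: Dict[str, Any] = {
    "mappings": {
        "properties": {
            "@timestamp": {"type": "date"},
            # Dimension: boolean parameter.
            "host": {"type": "keyword", "time_series_dimension": True},
            # Metric, string form: one of gauge, counter, histogram, summary.
            "cpu_usage": {"type": "double", "time_series_metric": "gauge"},
            # Metric, proposed object form overriding the default aggregations.
            "requests_total": {
                "type": "long",
                "time_series_metric": {"type": "counter", "aggregations": ["max"]},
            },
        }
    }
}


def metric_type(field_cfg: Dict[str, Any]) -> Optional[str]:
    """Read the metric type from either the string or the object form."""
    value = field_cfg.get("time_series_metric")
    if isinstance(value, dict):
        return value.get("type")
    return value


if __name__ == "__main__":
    print(json.dumps(mapping, indent=2))
```

A reader of the mapping, such as a downsampling job, would need to accept both forms, which is what the small `metric_type` helper illustrates.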
This issue deprecates the support for the `metric_type` key in the field mapping `meta` (#72536).
Also, we should expose those parameters through the field capabilities API so that Kibana can access this information. However, this feature will be described in a separate issue.