Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[processor/transform] Add ability to change metric data type #10130

Closed
TylerHelmuth opened this issue May 18, 2022 · 10 comments
Closed

[processor/transform] Add ability to change metric data type #10130

TylerHelmuth opened this issue May 18, 2022 · 10 comments
Assignees
Labels
help wanted Extra attention is needed priority:p1 High

Comments

@TylerHelmuth
Copy link
Member

TylerHelmuth commented May 18, 2022

Is your feature request related to a problem? Please describe.
The transform processor is being wired up for metrics. As a result, it is possible to do a query like

`set("metric.type", "Gauge") where "metric.type" == "Sum"

But the metrics implementation does not know how to convert from one data type to another.

Describe the solution you'd like

The transformprocessor should be updated to be able to convert between all appropriate data types.

** Alternative Solutions **
set is never allowed to change the data type and specific functions should be used instead.

@codeboten codeboten added the priority:p1 High label May 18, 2022
@bogdandrutu
Copy link
Member

bogdandrutu commented May 18, 2022

This should be only for "scalar/number" (a.k.a only between sum -> gauge, or between monotonic/non-monotonic sum) metrics.

@BinaryFissionGames
Copy link
Contributor

I'm for a separate function; I think it'll be more transparent that more happens than just setting a single field on the metric, something like convert_scalar_metric_type?

@djaglowski
Copy link
Member

djaglowski commented May 19, 2022

When setting the data type to a sum, I think it is also appropriate to specify the monotonicity and aggregation temporality.

@TylerHelmuth
Copy link
Member Author

When setting the data type to a sum, I think it is also appropriate to specify the monotonicity and aggregation temporality.

Agreed

@dmitryax dmitryax added the help wanted Extra attention is needed label May 20, 2022
@BinaryFissionGames
Copy link
Contributor

I'd be happy to take a stab at this.
Here's what I propose:

Two functions:

  • convert_scalar_metric_to_sum(aggregation_temporality, monotonic)
    • Where aggregation_temporality is one of "delta" or "cumulative", and monotonic is true or false
    • If the metric is already a sum, aggregation_temporality and monotonic will be set based on the parameters.
    • If the metric is a gauge, it will be converted to a sum with the given aggregation temporality and monotonicity.
    • This function is a no op on non-scalar metrics
    • This function is only valid for metric pipelines
  • convert_scalar_metric_to_gauge()
    • No input parameters
    • If the metric is already a gauge, nothing is done.
    • If the metric is a sum, it is converted to a gauge
    • This function is a no-op for non-scalar metrics
    • This function is only valid for metric pipelines

The functions could be used like this:

transform:
   metrics:
      queries:
         # Convert all sum metrics to gauges
         - convert_scalar_metric_to_gauge() where metric.type == "Sum"
         # convert the "my.metric.name.count" metric to a monotonic sum with "cumulative" as its aggregation temporality.
         - convert_scalar_metric_to_sum("cumulative", true) where metric.name == "my.metric.name.count"

@TylerHelmuth @bogdandrutu @djaglowski Thoughts on the above design?

@TylerHelmuth
Copy link
Member Author

TylerHelmuth commented May 20, 2022

If the metric is already a sum, aggregation_temporality and monotonic will be set based on the parameters.

The proposal will almost be able to replace the need for the cumulativetodelta processor because once multiple conditions are allowed in the where clause you'll be able to do:

convert_scalar_metric_to_sum("delta", "true") where metric.aggregation_temporality == "Cumulative" and metric.type == "Sum" and metric.is_monotonic == "True"

which is awesome, but also feels like the the function is doing something unintended (changing only temporality for Sum -> Sum). Also, histograms will need to be able to switch temporality, so we'll likely need a convert_aggregation_temporality function in the future no matter what.

Should these functions be restricted to only going from Sum -> Gauge and from Gauge -> Sum, with Sum -> Sum conversions of temporality and monotonicity being handled in different function(s)? Going from Gauge -> Sum you would still be allowed to specify which temporality and monotonicity be used during the conversion.

@BinaryFissionGames
Copy link
Contributor

Should these functions be restricted to only going from Sum -> Gauge and from Gauge -> Sum, with Sum -> Sum conversions of temporality and monotonicity being handled in different function(s)? Going from Gauge -> Sum you would still be allowed to specify which temporality and monotonicity be used during the conversion.

I like the idea of reducing the scope a bit here, for the sake of orthogonality.

One thing that this brings up is that converting Sum -> Sum in terms of aggregation temporality is a little overloaded. I was imagining that this "conversion" would just be changing the descriptor, while keeping the original datapoint values. This would be the case for gauge -> sum, regardless of the aggregation temporality.

Maybe "convert" is the wrong term to use here and the function names should be more like gauge_to_sum and sum_to_gauge or similar, without a convert_ prefix.

@TylerHelmuth
Copy link
Member Author

One thing that this brings up is that converting Sum -> Sum in terms of aggregation temporality is a little overloaded. I was imagining that this "conversion" would just be changing the descriptor, while keeping the original datapoint values.

If Sum -> Sum aggregation temporality was only going to update the value in the data model but not convert the data itself then I definitely think we should not allow Sum -> Sum.

The function syntax standards require the function to start with a verb, so I think the names convert_gauge_to_sum and convert_sum_to_gauge are good.

@BinaryFissionGames
Copy link
Contributor

Just to sum up what was discussed here:

  • convert_gauge_to_sum(aggregation_temporality, monotonic)
    • Where aggregation_temporality is one of "delta" or "cumulative", and monotonic is true or false
    • If the metric is a gauge, it will be converted to a sum with the given aggregation temporality and monotonicity.
    • This function is a no op on anything other than a gauge
    • This function is only valid for metric pipelines
  • convert_sum_to_gauge()
    • No input parameters
    • If the metric is a sum, it is converted to a gauge
    • This function is a no-op for anything other than sum metrics
    • This function is only valid for metric pipelines

I'm going to start on this, let me know if there are any concerns with the above.

@TylerHelmuth
Copy link
Member Author

Looks good to me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed priority:p1 High
Projects
None yet
Development

No branches or pull requests

6 participants