[Mitigation] Breaking change in Istio 1.5 telemetry #478

Closed
stefanprodan opened this issue Mar 7, 2020 · 0 comments

Istio 1.5 comes with a breaking change for Flagger users. In Istio telemetry v2, the metric istio_request_duration_seconds_bucket has been removed and replaced with istio_request_duration_milliseconds_bucket. This change breaks Flagger's request-duration metric check, which queries Prometheus for istio_request_duration_seconds_bucket.

Mitigation

With Flagger v1.0 it is possible to define custom metric checks.

Create a metric template object for istio_request_duration_milliseconds_bucket:

apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: latency
  namespace: istio-system
spec:
  provider:
    type: prometheus
    address: http://prometheus.istio-system:9090
  query: |
    histogram_quantile(
        0.99,
        sum(
            rate(
                istio_request_duration_milliseconds_bucket{
                    reporter="destination",
                    destination_workload_namespace="{{ namespace }}",
                    destination_workload=~"{{ target }}"
                }[{{ interval }}]
            )
        ) by (le)
    )

In the canary manifests, replace the request-duration metric with:

  analysis:
    metrics:
    - name: latency
      templateRef:
        name: latency
        namespace: istio-system
      thresholdRange:
        max: 500
      interval: 1m
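
For context, here is a minimal sketch of where that analysis block sits in a complete Canary object. The target name (podinfo), namespace (test), service port, and the interval/threshold/weight values are hypothetical placeholders, not taken from this issue; only the metrics entry mirrors the snippet above.

```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo            # hypothetical workload name
  namespace: test          # hypothetical namespace
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo          # hypothetical; must match your Deployment
  service:
    port: 9898             # hypothetical container port
  analysis:
    interval: 1m           # illustrative values, tune for your rollout
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
    - name: latency
      templateRef:
        name: latency
        namespace: istio-system
      thresholdRange:
        max: 500           # milliseconds: the template reads the *_milliseconds_bucket histogram
      interval: 1m
```

The template's query returns the P99 latency taken from the milliseconds histogram, so max: 500 here corresponds to a 500 ms ceiling.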

Note that you need to upgrade Flagger to v1.0 to use metric templates. An upgrade guide is available here.

@stefanprodan stefanprodan changed the title Breaking change in Istio 1.5 telemetry [Mitigation] Breaking change in Istio 1.5 telemetry Mar 7, 2020
funkypenguin added a commit to funkypenguin/flagger that referenced this issue Sep 19, 2021
A minor issue I stumbled across while learning how to drive Flagger, is that the docs still use `istio_request_duration_seconds_bucket` to illustrate the query behind the `request-duration` metric. I understand that this changed with Istio 1.5 (fluxcd#478), but it seems that in the current version of flagger, the correct metric must already be used, since I'm getting duration metrics out of Istio 1.10 :)

This change simply makes the docs clearer for those of us trying to understand exactly what `request-duration` entails!

Signed-off-by: David Young <[email protected]>