Dynatrace v2 ingestion reporting "inconsistent gauge fields" #3007

Closed
stevegaeke-kr opened this issue Feb 8, 2022 · 9 comments · Fixed by #3030
Labels
bug A general bug registry: dynatrace A Dynatrace Registry related issue

@stevegaeke-kr

Describe the bug
Dynatrace is reporting an inconsistent gauge error due to the measurement count being zero, but a maximum value is still being reported.

ERROR i.m.dynatrace.v2.DynatraceExporterV2 - Failed metric ingestion: Error Code=400, Response Body={"linesOk":122,"linesInvalid":1,"error":{"code":400,"message":"1 invalid lines","invalidLines":[{"line":90,"error":"inconsistent gauge fields: count == 0 => min, max, sum == 0: min: 0.000000, max: 1642.000000, sum: 0.000000, count: 0"}]},...

It seems like the DynatraceExporterV2 class can be enhanced to protect against this situation similar to the checks that are performed on min to keep it in bounds.

This is the routine I believe could be enhanced to prevent reporting max when count is zero.

    private Stream<String> toSummaryLine(Meter meter, HistogramSnapshot histogramSnapshot, TimeUnit timeUnit) {
        long count = histogramSnapshot.count();
        double total = timeUnit != null ? histogramSnapshot.total(timeUnit) : histogramSnapshot.total();
        double max = timeUnit != null ? histogramSnapshot.max(timeUnit) : histogramSnapshot.max();
        double min = count == 1L ? max : this.minFromHistogramSnapshot(histogramSnapshot, timeUnit);
        return this.createSummaryLine(meter, min, max, total, count);
    }
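
A minimal sketch of the kind of guard suggested above. It is not the change merged in #3030, and it works on plain numbers rather than on Micrometer's `HistogramSnapshot`. `SummaryGuard`, `Summary`, and `sanitize` are hypothetical names introduced only for illustration.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not part of Micrometer.
final class SummaryGuard {

    /** Plain value holder for one summary line (hypothetical type). */
    static final class Summary {
        final double min;
        final double max;
        final double total;
        final long count;

        Summary(double min, double max, double total, long count) {
            this.min = min;
            this.max = max;
            this.total = total;
            this.count = count;
        }
    }

    /** Returns the values to export, or empty if the line should be skipped. */
    static Optional<Summary> sanitize(double min, double max, double total, long count) {
        if (count <= 0) {
            // Nothing was recorded in this interval, so a leftover max would
            // trigger "count == 0 => min, max, sum == 0". Skip the line.
            return Optional.empty();
        }
        if (count == 1) {
            // With a single measurement, min, max and total are the same value.
            return Optional.of(new Summary(total, total, total, 1L));
        }
        return Optional.of(new Summary(min, max, total, count));
    }
}
```

With the values from the error above, `SummaryGuard.sanitize(0.0, 1642.0, 0.0, 0)` returns an empty `Optional`, so no line would be sent for that interval.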

I'm assuming the cause is related to the note found in this section: https://micrometer.io/docs/concepts#_timers

Is the period of the Dynatrace export job synchronized with any max value clearing?

Environment

  • Micrometer version 1.8.1
  • Micrometer registry Dynatrace v2
  • OS: Linux, macOS
  • Java version: 11.0.7

To Reproduce
How to reproduce the bug:

In Spring Boot, create a Timer using:

    messageProcessingDuration = Timer.builder(MESSAGE_PROCESSING_DURATION)
        .register(meterRegistry);

    // Later, call messageProcessingDuration.record(duration);
    // And then wait for 2 reporting cycles. First it reports the max with count = 1.
    // In the subsequent cycle, the count becomes 0 but the max continues to be reported.

Expected behavior

If the gauge count == 0, no statistics should be reported to avoid Dynatrace reporting inconsistent gauge errors.


@stevegaeke-kr stevegaeke-kr added the bug A general bug label Feb 8, 2022
@pirgeo
Contributor

pirgeo commented Feb 10, 2022

Hi, thanks for reporting this. We have a PR open to fix this behavior, and it should hopefully get merged soon. We're also working with the Micrometer maintainers to adapt the max buffering/decaying behavior in general (also for count>0) to match the input expected by Dynatrace.
@jonatan-ivanov @shakuzen Could you take another look at the PR (#2970)?

@stevegaeke-kr
Author

Thanks for the good news @pirgeo!

Will the max buffering/decaying behavior enhancement you described above address the statistics inconsistency like the example below?

{"line":53,"error":"inconsistent gauge fields: min <= avg <= max doesn't hold: min: 473.283030, max: 473.283030, sum: 104.426720, count: 1, avg: 104.426720, tolerance: 0.000001"}
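
Read together, the two error messages in this thread suggest consistency rules along the following lines. The sketch below is inferred from the message text alone, not from Dynatrace documentation. `GaugeFields` and `firstViolation` are hypothetical names, and treating the reported tolerance as an absolute tolerance is an assumption.

```java
import java.util.Optional;

// Hypothetical checker inferred from the error messages; not a Dynatrace or Micrometer API.
final class GaugeFields {

    // Value taken from the error message; applying it as an absolute tolerance is an assumption.
    static final double TOLERANCE = 0.000001;

    /** Returns a description of the first rule that is violated, or empty if none is. */
    static Optional<String> firstViolation(double min, double max, double sum, long count) {
        if (count == 0) {
            // Rule 1: an empty interval must not carry any values.
            if (min != 0.0 || max != 0.0 || sum != 0.0) {
                return Optional.of("count == 0 => min, max, sum == 0");
            }
            return Optional.empty();
        }
        // Rule 2: the average must lie between min and max.
        double avg = sum / count;
        if (min > avg + TOLERANCE || avg > max + TOLERANCE) {
            return Optional.of("min <= avg <= max doesn't hold");
        }
        return Optional.empty();
    }
}
```

For the line quoted above (min and max of 473.283030, sum of 104.426720, count of 1), the average is 104.426720, which is below min, so the second rule fails.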

@jonatan-ivanov
Member

Fixed in #3030

@jonatan-ivanov jonatan-ivanov added the registry: dynatrace A Dynatrace Registry related issue label Feb 15, 2022
@pirgeo
Contributor

pirgeo commented Feb 16, 2022

Hey @stevegaeke-kr, yes, we are currently looking into this to see if and how we can improve this behavior as well!

@sathish256

Thanks for the good news @pirgeo!

Will the max buffering/decaying behavior enhancement you described above address the statistics inconsistency like the example below?

{"line":53,"error":"inconsistent gauge fields: min <= avg <= max doesn't hold: min: 473.283030, max: 473.283030, sum: 104.426720, count: 1, avg: 104.426720, tolerance: 0.000001"}

@stevegaeke-kr We are facing the same error, but are unable to find the metric that is causing the issue. Could you please help us narrow down which metric is causing it?

@shakuzen
Member

@sathish256 are you using the latest version of Micrometer? If you confirm the issue still happens with the latest version, please open a new issue with details for us to investigate.

@sathish256

@shakuzen Thank you for the response. Yes, we are on the latest version of Micrometer. Sure, I will create a new issue with full details.

@ImpulseKomal

@sathish256, do you know the cause of the issue? I am facing the same problem. I don't see any new issue that you have created for this error.

@pirgeo
Contributor

pirgeo commented Feb 13, 2024

Hi @ImpulseKomal,
With the upgrade to 1.12.3 (released yesterday) you should be able to see which metrics are the offending ones (the metric name will be logged). Additionally, I have opened this PR: #4724 to fix LongTaskTimer metrics, which we found to be the cause of this error in many cases. I hope to get this change into 1.12.4 (I assume ~March 11, cc @jonatan-ivanov, @shakuzen).

6 participants