[APM] Replace manual rate calculation with rate agg #114814

Closed
sorenlouv opened this issue Oct 13, 2021 · 2 comments · Fixed by #115651
Labels
blocked discuss Team:APM All issues that need APM UI Team support v8.0.0

Comments

sorenlouv commented Oct 13, 2021

For calculating throughput we should use the rate agg.

There is currently a bug causing the rate agg to produce incorrect results for metric documents with a custom _doc_count.

We can work around this issue by using the rate agg with "mode": "value_count", which sidesteps the doc count entirely. See the examples below.

Transaction-based

{
  "timeseries": {
    "date_histogram": {
      "field": "@timestamp",
      // ...
    },
    "aggs": {
      "throughput": {
        "rate": {
          "unit": "minute",
          "field": "transaction.duration.us",
          "mode": "value_count"
        }
      }
    }
  }
}

Metrics-based

{
  "timeseries": {
    "date_histogram": {
      "field": "@timestamp",
      // ...
    },
    "aggs": {
      "throughput": {
        "rate": {
          "unit": "minute",
          "field": "transaction.duration.histogram",
          "mode": "value_count"
        }
      }
    }
  }
}
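
The two fragments above differ only in the field passed to the rate agg. As a minimal sketch, a small builder could produce either variant. The helper name `throughput_aggs` and the `fixed_interval` default are assumptions for illustration (the original examples elide the histogram interval), not existing Kibana or Elasticsearch code.

```python
from typing import Any, Dict


def throughput_aggs(field: str, fixed_interval: str = "60s") -> Dict[str, Any]:
    """Build a date_histogram with a per-minute rate sub-aggregation.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    return {
        "timeseries": {
            "date_histogram": {
                "field": "@timestamp",
                # Assumed interval; the original examples elide this setting.
                "fixed_interval": fixed_interval,
            },
            "aggs": {
                "throughput": {
                    "rate": {
                        "unit": "minute",
                        "field": field,
                        # value_count counts field values instead of relying on _doc_count.
                        "mode": "value_count",
                    }
                }
            },
        }
    }


# Transaction-based and metrics-based variants share everything but the field.
transaction_based = throughput_aggs("transaction.duration.us")
metrics_based = throughput_aggs("transaction.duration.histogram")
```

Keeping the field as the only parameter makes it harder for the two code paths to drift apart.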

Considerations

There could be performance implications of using "mode": "value_count" instead of the custom doc count. If transaction.duration.histogram contains many buckets and we don't use _doc_count, the aggregation will be more expensive.
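
To make the cost difference concrete, here is a rough model of the arithmetic only, not of how Elasticsearch implements the rate agg. It assumes a histogram field stores parallel `values` and `counts` arrays, and that `_doc_count` equals the sum of `counts`. The function names and sample documents are made up.

```python
from typing import Any, Dict, List

UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def rate_from_doc_count(docs: List[Dict[str, Any]], bucket_seconds: int, unit: str = "minute") -> float:
    # One number is read per document, regardless of histogram size.
    total = sum(doc.get("_doc_count", 1) for doc in docs)
    return total * UNIT_SECONDS[unit] / bucket_seconds


def rate_from_value_count(docs: List[Dict[str, Any]], bucket_seconds: int, unit: str = "minute") -> float:
    # Every histogram bucket of every document is visited, so the work
    # grows with the number of buckets per document.
    total = sum(sum(doc["transaction.duration.histogram"]["counts"]) for doc in docs)
    return total * UNIT_SECONDS[unit] / bucket_seconds


# Made-up sample documents falling into one 60-second date_histogram bucket.
docs = [
    {"_doc_count": 5, "transaction.duration.histogram": {"values": [100.0, 200.0], "counts": [2, 3]}},
    {"_doc_count": 3, "transaction.duration.histogram": {"values": [50.0, 75.0, 90.0], "counts": [1, 1, 1]}},
]

per_minute_doc_count = rate_from_doc_count(docs, bucket_seconds=60)
per_minute_value_count = rate_from_value_count(docs, bucket_seconds=60)
```

Both functions return the same throughput for well-formed documents; only the amount of data touched per document differs.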

@sorenlouv sorenlouv added [zube]: Inbox discuss Team:APM All issues that need APM UI Team support labels Oct 13, 2021
@elasticmachine

Pinging @elastic/apm-ui (Team:apm)

axw commented Oct 14, 2021

> There could be performance implications of using "mode": "value_count" instead of the custom doc count. If transaction.duration.histogram contains many buckets and we don't use _doc_count, the aggregation will be more expensive.

It is substantially slower. I just ran an experiment, indexing 10000 transaction metric documents with 1000 buckets each. I then compared the aggregation using doc_count vs. value_count modes:

GET /metrics-apm.internal-*/_search?request_cache=false
{
  "size": 0,
  "aggs": {
    "timeseries": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "60s"
      },
      "aggs": {
        "throughput": {
          "rate": {
            "unit": "minute"
          }
        }
      }
    }
  }
}

GET /metrics-apm.internal-*/_search?request_cache=false
{
  "size": 0,
  "aggs": {
    "timeseries": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "60s"
      },
      "aggs": {
        "throughput": {
          "rate": {
            "unit": "minute",
            "mode": "value_count",
            "field": "transaction.duration.histogram"
          }
        }
      }
    }
  }
}

The first aggregation (doc_count mode) takes single-digit milliseconds, while the second takes approximately 300-500ms. If elastic/elasticsearch#74359 were implemented, I would expect the two times to be of the same order of magnitude.
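
For anyone wanting to repeat a comparison like this, the following sketch generates synthetic metric documents and renders them as a bulk request body. It is not the script used for the experiment above. The document shape, the one-count-per-bucket histograms, the timestamps, and the index name are all assumptions.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Iterable, Iterator


def synthetic_metric_docs(num_docs: int, buckets_per_doc: int, start: datetime) -> Iterator[Dict[str, Any]]:
    """Yield made-up transaction metric documents with a fixed number of histogram buckets."""
    for i in range(num_docs):
        counts = [1] * buckets_per_doc
        yield {
            "@timestamp": (start + timedelta(seconds=i)).isoformat(),
            # Assumption: _doc_count mirrors the total number of recorded values.
            "_doc_count": sum(counts),
            "transaction": {
                "duration": {
                    "histogram": {
                        # Histogram values are ascending doubles with parallel counts.
                        "values": [float(v) for v in range(1, buckets_per_doc + 1)],
                        "counts": counts,
                    }
                }
            },
        }


def to_bulk_ndjson(docs: Iterable[Dict[str, Any]], index: str) -> str:
    """Render documents as an NDJSON body for the _bulk API using create actions."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"create": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


# Small sample; the experiment above used 10000 documents with 1000 buckets each.
sample_docs = list(synthetic_metric_docs(3, 4, datetime(2021, 10, 14, tzinfo=timezone.utc)))
# "metrics-apm.internal-test" is a hypothetical index name.
bulk_body = to_bulk_ndjson(sample_docs, "metrics-apm.internal-test")
```

The resulting string could be sent to `_bulk`, after which the two queries above can be timed against the same data.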

6 participants