[APM] Use status_code field to calculate error rate #71109
Pinging @elastic/apm-ui (Team:apm)
@@ -17,6 +17,9 @@ import {
} from '../helpers/setup_request';
import { rangeFilter } from '../../../common/utils/range_filter';

// Regex for 5xx and 4xx
const errorStatusCodeRegex = /5\d{2}|4\d{2}/;

export async function getErrorRate({
Can you add an API test for `GET /api/apm/services/{serviceName}/errors/rate`
to cover this?
`http.response.status_code` is a long, so you can use a range filter
(gte 400) (you can also use it on strings btw).
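A minimal sketch of the range-filter idea, assuming the status code is indexed as a number. The helper name `errorStatusCodeFilter` and its `minStatus` parameter are hypothetical and not part of the PR; only the field name comes from the discussion.

```typescript
// Field name as quoted in the review comment above.
const HTTP_RESPONSE_STATUS_CODE = 'http.response.status_code';

// Shape of an Elasticsearch range clause, reduced to what this sketch needs.
interface RangeFilter {
  range: Record<string, { gte: number; lt?: number }>;
}

// Hypothetical helper (an assumption, not from the PR): matches every
// document whose status code is at or above `minStatus`, with no upper bound.
function errorStatusCodeFilter(minStatus: number = 400): RangeFilter {
  return { range: { [HTTP_RESPONSE_STATUS_CODE]: { gte: minStatus } } };
}

// The clause can sit in a bool filter next to any other filter clauses.
const query = {
  bool: {
    filter: [errorStatusCodeFilter()],
  },
};

console.log(JSON.stringify(query, null, 2));
```

Compared with a `terms` filter, this avoids listing every possible status code by hand.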
…On Wed, Jul 8, 2020 at 7:59 PM Cauê Marcondes ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In x-pack/plugins/apm/server/lib/errors/get_error_rate.ts
<#71109 (comment)>:
> - size: 0,
- query: {
- bool: {
- filter: [
- ...filter,
- ...groupIdFilter,
- { term: { [PROCESSOR_EVENT]: ProcessorEvent.error } },
- ],
+ aggs: {
+ histogram: {
+ date_histogram: getMetricsDateHistogramParams(start, end),
+ aggs: {
+ statusAggregation: {
+ terms: {
+ field: HTTP_RESPONSE_STATUS_CODE,
+ size: 10,
@sqren <https://github.com/sqren> do you have an idea of how to fetch only
the erroneous transactions? I can do a terms filter, but I'd have to
manually pass all possible status codes in the array.
{
"terms": {
"http.response.status_code": ["500", "400"...]
}
}
So I could do:
@cauemarcondes yes (I'm not sure if it's useful to provide the upper bounds though)
Why? But I'm good to use only
@cauemarcondes status codes from 600 and above are not valid, and if they do exist in the data, I'd expect that they should be treated as errors.
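To illustrate the upper-bound question, here is a small sketch comparing an open-ended check (`>= 400`) with a bounded one (`400` to `599`). The sample codes, the helper names, and the definition of error rate as erroneous transactions divided by all transactions are assumptions for illustration, not code from the PR.

```typescript
// Hypothetical sample data; 600 stands in for an invalid status code.
const statusCodes: number[] = [200, 302, 404, 500, 600];

// Open-ended: everything from 400 upward counts as an error,
// including invalid codes such as 600.
const isErrorOpenEnded = (code: number): boolean => code >= 400;

// Bounded: only 4xx and 5xx count as errors.
const isErrorBounded = (code: number): boolean => code >= 400 && code < 600;

// Assumed definition: share of transactions classified as errors.
// Returning 0 for an empty input is also an assumption of this sketch.
function errorRate(
  codes: number[],
  isError: (code: number) => boolean
): number {
  if (codes.length === 0) {
    return 0;
  }
  return codes.filter(isError).length / codes.length;
}

const openEndedRate = errorRate(statusCodes, isErrorOpenEnded); // 3 of 5 codes
const boundedRate = errorRate(statusCodes, isErrorBounded); // 2 of 5 codes

console.log({ openEndedRate, boundedRate });
```

With the bounded check, the invalid `600` response would silently count as a success, which is the behavior the comment above argues against.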
lgtm 👍
Thanks, but I still have to add the error chart to the transaction details page. 😅
Can you double check with @nehaduggal first?
Let's call it the Transaction error rate chart and add it to the transactions page.
Have we considered how this would work with pre-aggregated transactions?
We will use the
I do have some concerns about the things being added that are not compatible with pre-aggregated transactions today (the RUM UI is another example). APM Server's release cycle is not tied to Kibana's, so there is a risk that the UI expects fields in the pre-aggregated transaction documents that are missing because the user is running an older version of APM Server. That said, we haven't shipped pre-aggregated transactions yet, so maybe I'm thinking too far ahead.
@elasticmachine merge upstream
💚 Build Succeeded
* calculating error rate based on status code
* fixing unit test
* addressing pr comments
* adding erroneous transactions rate
* adding erroneous transactions rate
* adding error rate to detail page
* fixing i18n

Co-authored-by: Elastic Machine <[email protected]>
💚 Backport successful
The PR was backported to the following branches:
* calculating error rate based on status code
* fixing unit test
* addressing pr comments
* adding erroneous transactions rate
* adding erroneous transactions rate
* adding error rate to detail page
* fixing i18n

Co-authored-by: Elastic Machine <[email protected]>
Co-authored-by: Cauê Marcondes <[email protected]>
Co-authored-by: Elastic Machine <[email protected]>
Some conclusions from testing:
closes #70223