Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[APM] Check if metric fields exist #145348

Merged
merged 25 commits into from
Apr 27, 2023
Merged

Conversation

kpatticha
Copy link
Contributor

@kpatticha kpatticha commented Nov 16, 2022

Fixes the error

"script_stack": [
            "[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues.throwIfEmpty(ScriptDocValues.java:92)",
            "[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues$Longs.get(ScriptDocValues.java:110)",
            "[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues$Longs.getValue(ScriptDocValues.java:105)",
            """total = useCgroupLimit ? doc[limitKey].value : doc['system.memory.total'].value;

    double """,
            "                                                                         ^---- HERE"


connected https://github.com/elastic/sdh-apm/issues/765 (internal)

This number represents the max possible value for the limit field.
*/
double CGROUP_LIMIT_MAX_VALUE = 9223372036854771712L;
if(doc.containsKey('${METRIC_CGROUP_MEMORY_LIMIT_BYTES}') && doc.containsKey('${METRIC_CGROUP_MEMORY_USAGE_BYTES}')){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about doc['${METRIC_SYSTEM_TOTAL_MEMORY}'] ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I was trying different things and forgot to add the one that was causing issue 🙈
good catch!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kpatticha can you reach out to the agents or server devs to figure out which fields are available in what scenarios? I'm hoping we can figure out something like "if METRIC_CGROUP_MEMORY_LIMIT_BYTES is available, then METRIC_CGROUP_MEMORY_USAGE_BYTES must be too, and if it's not available, we should use METRIC_SYSTEM_TOTAL_MEMORY, which is always available". Doesn't necessarily have to happen in this PR but I feel like with each change we understand how memory usage is recorded less.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure I can try that, it will be really useful 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started gathering information on how system memory work here but it should not block this PR

@kpatticha kpatticha marked this pull request as ready for review November 17, 2022 19:42
@kpatticha kpatticha requested a review from a team as a code owner November 17, 2022 19:42
@botelastic botelastic bot added the Team:APM All issues that need APM UI Team support label Nov 17, 2022
@elasticmachine
Copy link
Contributor

Pinging @elastic/apm-ui (Team:APM)

@kpatticha
Copy link
Contributor Author

@elasticmachine merge upstream

@kpatticha kpatticha added release_note:skip Skip the PR/issue when compiling release notes v8.7.0 labels Nov 17, 2022
@kpatticha
Copy link
Contributor Author

@elasticmachine merge upstream

@sorenlouv sorenlouv added v8.8.0 release_note:fix v8.7.1 and removed v8.7.0 release_note:skip Skip the PR/issue when compiling release notes labels Mar 22, 2023
@kpatticha
Copy link
Contributor Author

@elasticmachine merge upstream

2 similar comments
@kpatticha
Copy link
Contributor Author

@elasticmachine merge upstream

@kpatticha
Copy link
Contributor Author

@elasticmachine merge upstream

@kpatticha kpatticha force-pushed the painless-script branch 2 times, most recently from a84fcc0 to 87fa356 Compare March 29, 2023 12:01
This number represents the max possible value for the limit field.
*/
double CGROUP_LIMIT_MAX_VALUE = 9223372036854771712L;
if(doc.containsKey('${METRIC_SYSTEM_TOTAL_MEMORY}') && doc.containsKey('${METRIC_CGROUP_MEMORY_USAGE_BYTES}')){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is the way to go about it... will this not unnecessarily return null in some cases? In all honesty it's a little hard to wrap my head around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm putting the PR back to draft. I've noticed recent changes in memory I need to take into consideration. https://github.com/elastic/kibana/blob/main/x-pack/plugins/apm/server/routes/metrics/by_agent/shared/memory/index.ts#L134-L179

@kpatticha kpatticha marked this pull request as draft April 3, 2023 08:04
@kpatticha kpatticha self-assigned this Apr 3, 2023
@kpatticha
Copy link
Contributor Author

note for me

Another case was reported when using APM dotnet agent. 8.7.2 is in 2 weeks

@sorenlouv
Copy link
Member

@kpatticha Did you talk to the server/agent folks about this one?

@kpatticha
Copy link
Contributor Author

@sqren yes.

so generally there is no guarantee that any of the system metrics will be available

Co-authored-by: Søren Louv-Jansen <[email protected]>
@kpatticha kpatticha enabled auto-merge (squash) April 27, 2023 09:10
@kpatticha kpatticha added the backport:prev-major Backport to (8.x, 8.17, 8.16) the previous major branch and other branches in development label Apr 27, 2023
@kpatticha kpatticha added bug Fixes for quality problems that affect the customer experience v8.6.0 backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) and removed backport:prev-major Backport to (8.x, 8.17, 8.16) the previous major branch and other branches in development v8.6.0 labels Apr 27, 2023
@kpatticha kpatticha merged commit 98165d2 into elastic:main Apr 27, 2023
@kibana-ci
Copy link
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

Metrics [docs]

Unknown metric groups

ESLint disabled line counts

id before after diff
enterpriseSearch 17 19 +2
securitySolution 399 402 +3
total +5

Total ESLint disabled count

id before after diff
enterpriseSearch 18 20 +2
securitySolution 479 482 +3
total +5

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @kpatticha

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Apr 27, 2023
Fixes the error
```
"script_stack": [
            "[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues.throwIfEmpty(ScriptDocValues.java:92)",
            "[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues$Longs.get(ScriptDocValues.java:110)",
            "[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues$Longs.getValue(ScriptDocValues.java:105)",
            """total = useCgroupLimit ? doc[limitKey].value : doc['system.memory.total'].value;

    double """,
            "                                                                         ^---- HERE"

```

connected elastic/sdh-apm#765 (internal)

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Søren Louv-Jansen <[email protected]>
(cherry picked from commit 98165d2)
@kibanamachine
Copy link
Contributor

💔 Some backports could not be created

Status Branch Result
8.7 Backport failed because of merge conflicts
8.8

Note: Successful backport PRs will be merged automatically after passing CI.

Manual backport

To create the backport manually run:

node scripts/backport --pr 145348

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request May 3, 2023
# Backport

This will backport the following commits from `main` to `8.8`:
- [[APM] Check if metric fields exist
(#145348)](#145348)

<!--- Backport version: 8.9.7 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Katerina
Patticha","email":"[email protected]"},"sourceCommit":{"committedDate":"2023-04-27T12:55:14Z","message":"[APM]
Check if metric fields exist (#145348)\n\nFixes the error
\r\n```\r\n\"script_stack\": [\r\n
\"[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues.throwIfEmpty(ScriptDocValues.java:92)\",\r\n
\"[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues$Longs.get(ScriptDocValues.java:110)\",\r\n
\"[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues$Longs.getValue(ScriptDocValues.java:105)\",\r\n
\"\"\"total = useCgroupLimit ? doc[limitKey].value :
doc['system.memory.total'].value;\r\n\r\n double \"\"\",\r\n \" ^----
HERE\"\r\n\r\n\r\n```\r\n\r\nconnected
elastic/sdh-apm#765
(internal)\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine
<[email protected]>\r\nCo-authored-by:
Søren Louv-Jansen
<[email protected]>","sha":"98165d26846367b98a259f76b664545a43d2434b","branchLabelMapping":{"^v8.9.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["bug","release_note:fix","Team:APM","backport:prev-minor","v8.8.0","v8.7.1","v8.9.0"],"number":145348,"url":"https://github.com/elastic/kibana/pull/145348","mergeCommit":{"message":"[APM]
Check if metric fields exist (#145348)\n\nFixes the error
\r\n```\r\n\"script_stack\": [\r\n
\"[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues.throwIfEmpty(ScriptDocValues.java:92)\",\r\n
\"[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues$Longs.get(ScriptDocValues.java:110)\",\r\n
\"[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues$Longs.getValue(ScriptDocValues.java:105)\",\r\n
\"\"\"total = useCgroupLimit ? doc[limitKey].value :
doc['system.memory.total'].value;\r\n\r\n double \"\"\",\r\n \" ^----
HERE\"\r\n\r\n\r\n```\r\n\r\nconnected
elastic/sdh-apm#765
(internal)\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine
<[email protected]>\r\nCo-authored-by:
Søren Louv-Jansen
<[email protected]>","sha":"98165d26846367b98a259f76b664545a43d2434b"}},"sourceBranch":"main","suggestedTargetBranches":["8.8","8.7"],"targetPullRequestStates":[{"branch":"8.8","label":"v8.8.0","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"8.7","label":"v8.7.1","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.9.0","labelRegex":"^v8.9.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/145348","number":145348,"mergeCommit":{"message":"[APM]
Check if metric fields exist (#145348)\n\nFixes the error
\r\n```\r\n\"script_stack\": [\r\n
\"[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues.throwIfEmpty(ScriptDocValues.java:92)\",\r\n
\"[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues$Longs.get(ScriptDocValues.java:110)\",\r\n
\"[email protected]/org.elasticsearch.index.fielddata.ScriptDocValues$Longs.getValue(ScriptDocValues.java:105)\",\r\n
\"\"\"total = useCgroupLimit ? doc[limitKey].value :
doc['system.memory.total'].value;\r\n\r\n double \"\"\",\r\n \" ^----
HERE\"\r\n\r\n\r\n```\r\n\r\nconnected
elastic/sdh-apm#765
(internal)\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine
<[email protected]>\r\nCo-authored-by:
Søren Louv-Jansen
<[email protected]>","sha":"98165d26846367b98a259f76b664545a43d2434b"}}]}]
BACKPORT-->

Co-authored-by: Katerina Patticha <[email protected]>
Comment on lines +50 to 59
export const systemMemoryFilter = {
bool: {
filter: [
{ exists: { field: METRIC_SYSTEM_FREE_MEMORY } },
{ exists: { field: METRIC_SYSTEM_TOTAL_MEMORY } },
],
},
};

export const percentSystemMemoryUsedScript = {
Copy link
Member

@sorenlouv sorenlouv May 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kpatticha any reason for the difference in naming "systemMemoryFilter" and "percentSystemMemoryUsedScript"? What about "systemMemoryFilter" and "systemMemoryScript"?

Ideally I think they should be co-located in an object:

const systemMemory = {
  filter: {...},
  script: {...}
}

This will provide a strong signal that they belong together, and will make it less likely that somebody changing the code will make changes to the filter and not the script, and vice-versa.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sqren I can't remember any particular reason for the naming 🤔 .

As for the object, that's a really good point and I think we initially discussed doing it like that but somehow I missed it.

Let me fix it in a follow-up PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +72 to 83
export const cgroupMemoryFilter = {
bool: {
filter: [{ exists: { field: METRIC_CGROUP_MEMORY_USAGE_BYTES } }],
should: [
{ exists: { field: METRIC_CGROUP_MEMORY_LIMIT_BYTES } },
{ exists: { field: METRIC_SYSTEM_TOTAL_MEMORY } },
],
minimum_should_match: 1,
},
};

export const percentCgroupMemoryUsedScript = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above (naming)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) bug Fixes for quality problems that affect the customer experience release_note:fix Team:APM All issues that need APM UI Team support v8.7.1 v8.8.0 v8.9.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants