Add some guidelines about the use of metrics metadata and dimensions #5270

Merged · jsoriano merged 1 commit into elastic:main on Feb 23, 2023

Conversation

@jsoriano (Member) commented Feb 14, 2023

Relates to #4124.

@elasticmachine

💔 Tests Failed


Build stats

  • Start Time: 2023-02-14T15:52:59.677+0000

  • Duration: 117 min 32 sec

Test stats 🧪

Test Results
  • Failed: 1
  • Passed: 4646
  • Skipped: 11
  • Total: 4658

Test errors 1

Check integrations / kubernetes / kubernetes: check / system test: default – kubernetes.state_statefulset

Error: one or more errors found in documents stored in metrics-kubernetes.state_statefulset-ep data stream: [0] found error.message in event: error making http request: Get "http://kube-state-metrics:8080/metrics": dial tcp 10.244.0.59:8080: i/o timeout

Steps errors 2

Test integration: kubernetes
  • Took 15 min 56 sec
  • Description: eval "$(../../build/elastic-package stack shellinit)" ../../build/elastic-package test -v --report-format xUnit --report-output file --test-coverage
Google Storage Download
  • Took 0 min 0 sec

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine

🌐 Coverage report

Name          Metrics % (covered/total)    Diff
Packages      100.0%  (376/376)            💚
Files         96.759% (627/648)            👎 -3.241
Classes       96.759% (627/648)            👎 -3.241
Methods       91.065% (6146/6749)          👍 66.065
Lines         92.033% (130471/141766)      👎 -7.967
Conditionals  100.0%  (0/0)                💚

@ruflin (Contributor) commented Feb 15, 2023

I added @agithomas as a reviewer, as he is thinking about this topic a lot at the moment.


It is important to choose the set of fields wisely: they should be the minimal set
of dimensions required to properly identify any time series included in the data stream.
Too few dimensions can mix data of multiple time series into a single one, too many can
Collaborator

Can we provide a specific guideline on the lower and upper bounds for the dimensions count? Otherwise, this question will come up in the future, in multiple forums.

I am also trying to understand why the Elastic TSDB enforces a lower bound on dimensions (too few dimensions is an issue), whereas in the Prometheus TSDB metrics can be ingested without labels (dimensions) and nothing breaks there.

Member Author

> Can we provide a specific guideline on the lower and upper bounds for the dimensions count? Otherwise, this question will come up in the future, in multiple forums.

This is quite dependent on the context, and even for the same object it may vary depending on the available data. For example, if you are monitoring containers in Kubernetes pods and you have the container ID, this may be the only dimension needed; but if you don't have the container ID, you may need to use at least the container name, the pod name, and the pod namespace as dimensions (see the sketch below).
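For illustration, a minimal sketch of how such dimensions could be declared in a package's field definitions (the file path and field names are assumptions following common ECS/Elastic conventions, not taken from this PR):

```yaml
# data_stream/<name>/fields/fields.yml (illustrative sketch)
# With a container ID available, a single dimension can be enough:
- name: container.id
  type: keyword
  dimension: true

# Without it, combine fields until any time series is uniquely identified:
- name: kubernetes.container.name
  type: keyword
  dimension: true
- name: kubernetes.pod.name
  type: keyword
  dimension: true
- name: kubernetes.namespace
  type: keyword
  dimension: true
```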

There are things like HTTP requests where we may want to store lots of dimensions, such as the method, source IP, URL path, query parameters, important headers, the backend handling the request, and so on. For these cases there is an upper bound of 16 dimensions, enforced by Elasticsearch itself. But this limit can be raised per data stream if needed, using the index.mapping.dimension_fields.limit option, as sketched after this paragraph.
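As a sketch of raising that limit from a package, assuming the data stream manifest's index template settings are used (the setting name is the Elasticsearch option mentioned above; the path and value are illustrative):

```yaml
# data_stream/<name>/manifest.yml (illustrative sketch)
elasticsearch:
  index_template:
    settings:
      # Elasticsearch enforces a default limit of 16 dimension fields;
      # raise it per data stream only when genuinely needed.
      index.mapping.dimension_fields.limit: 32
```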

> I am also trying to understand why the Elastic TSDB enforces a lower bound on dimensions (too few dimensions is an issue), whereas in the Prometheus TSDB metrics can be ingested without labels (dimensions) and nothing breaks there.

It is actually similar with Prometheus: it is not that things "break", but there can be misbehaviours.
For example, if you ingest Kubernetes pod metrics with only the name label, and you have pods with the same name in different namespaces, you are going to have the same kind of problems in Prometheus too (only one metric is stored for the same timestamp, and metrics of different pods are mixed into the same time series).
Thus you need to use both the name and the namespace as labels. If you add the namespace label later, you are creating new time series, disconnected from the previous ones, even for pods whose name was unique.
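To make the collision concrete, a hypothetical sketch (field names and values are made up for illustration):

```yaml
# Two events with the same timestamp. If kubernetes.pod.name is the
# only dimension, they belong to the *same* time series and collide:
- "@timestamp": "2023-02-14T15:52:59Z"
  kubernetes.pod.name: web-0
  kubernetes.namespace: team-a   # present, but not declared as a dimension
  memory.usage: 1024
- "@timestamp": "2023-02-14T15:52:59Z"
  kubernetes.pod.name: web-0
  kubernetes.namespace: team-b   # present, but not declared as a dimension
  memory.usage: 2048
# Declaring kubernetes.namespace as a dimension as well disambiguates
# them, but adding it later starts new, disconnected time series.
```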

Collaborator

Thanks @jsoriano for the clarifications. I am good with the document being merged.

There is a separate discussion happening about dimensions in other threads, to clarify/close some of these aspects.

@lalit-satapathy (Collaborator)

LGTM for the first version. We can start building on this to add more content/reference for TSDB going forward.

@jsoriano merged commit c125f96 into elastic:main on Feb 23, 2023
agithomas pushed a commit to agithomas/integrations that referenced this pull request Mar 20, 2023
agithomas pushed a commit to agithomas/integrations that referenced this pull request Mar 21, 2023