Add elasticsearch/ingest_pipeline metricset #34012

Merged
merged 26 commits into from
Jan 27, 2023

Contributor

@joshdover joshdover commented Dec 10, 2022

What does this PR do?

Depends on:

Ports elastic/integrations#4597 to Metricbeat. This adds a new ingest_pipeline metricset to the elasticsearch module of Metricbeat. The metricset does the following:

  • Fetches the Nodes Stats API's ingest metrics
  • Ingests two types of documents:
    • Top-level pipeline metrics are ingested on every interval, including the counters for total time, documents processed, and failures. We also calculate a "self time" metric, which subtracts the time spent in processor calls to other pipelines.
    • Processor-level metrics are ingested at a sampling rate (25%, i.e. every 4th interval, by default; configurable). For these metrics, a separate document is created for every processor in every pipeline on every ES node, so sampling is used to limit the amount of data produced.
  • Supports local and cluster scopes for better performance
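
The sampling behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `sampler` type, its fields, and the choice to sample on the very first fetch are all assumptions made for the example.

```go
package main

import "fmt"

// sampler decides on which fetch intervals processor-level documents are
// emitted. The type and field names are hypothetical, not taken from the PR.
type sampler struct {
	every int // emit processor docs on every Nth fetch; 4 corresponds to 25%
	calls int // number of fetches seen so far
}

// shouldSample reports whether the current fetch should include
// processor-level documents, then advances the fetch counter.
// Assumption: the first fetch is always sampled.
func (s *sampler) shouldSample() bool {
	if s.every <= 1 {
		// A rate of 1 (or an invalid value) means "sample every fetch".
		s.calls++
		return true
	}
	sample := s.calls%s.every == 0
	s.calls++
	return sample
}

func main() {
	s := &sampler{every: 4}
	for i := 1; i <= 8; i++ {
		// Pipeline-level documents are produced on every fetch;
		// processor-level documents only when the sampler says so.
		fmt.Printf("fetch %d: pipeline docs=true processor docs=%v\n", i, s.shouldSample())
	}
}
```

With `every: 4`, processor-level documents would be produced on fetches 1 and 5 of the eight shown, while pipeline-level documents are produced every time.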

The UI for visualizing this data will be included as a dashboard. Right now this is only being shipped in an Agent integration with support for Stack Monitoring and Metricbeat index patterns. In a follow-up PR to Kibana, a link will be added from the Stack Monitoring UI that either opens this dashboard or directs the user to install the package to get the dashboard.

Here's a summary of the new fields that are added:

- name: ingest_pipeline
  type: group
  release: beta
  description: Runtime metrics on ingest pipeline execution
  fields:
    - name: name
      type: wildcard
      description: Name / id of the ingest pipeline
    - name: total
      type: group
      description: Metrics on the total ingest pipeline execution, including all processors.
      fields:
        - name: count
          type: long
          description: Number of documents processed by this pipeline
        - name: failed
          type: long
          description: Number of documents that failed to be processed by this pipeline
        - name: time.total.ms
          type: long
          description: Total time spent processing documents through this pipeline, inclusive of other pipelines called
        - name: time.self.ms
          type: long
          description: Time spent processing documents through this pipeline, exclusive of other pipelines called
    - name: processor
      type: group
      fields:
        - name: type
          type: keyword
          description: The type of ingest processor
        - name: type_tag
          type: keyword
          description: The type and the tag for this processor in the format "<type>:<tag>"
        - name: order_index
          type: long
          description: The order this processor appears in the pipeline definition
        - name: count
          type: long
          description: Number of documents processed by this processor
        - name: failed
          type: long
          description: Number of documents that failed to be processed by this processor
        - name: time.total.ms
          type: long
          description: Total time spent processing documents through this processor
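
As a rough illustration of how `time.self.ms` relates to `time.total.ms`, the sketch below subtracts the time attributed to processors of type `pipeline` (calls into other pipelines) from the pipeline's total. The `processorStats` type and `selfTimeMillis` function are hypothetical names for this example, and the exact calculation in the metricset may differ.

```go
package main

import "fmt"

// processorStats is a simplified, hypothetical view of one processor's
// statistics within a pipeline.
type processorStats struct {
	Type       string // processor type, e.g. "set" or "pipeline"
	TimeMillis int64  // total time spent in this processor
}

// selfTimeMillis returns the pipeline's total time minus the time spent in
// "pipeline" processors, i.e. the time spent calling other pipelines.
func selfTimeMillis(totalMillis int64, processors []processorStats) int64 {
	self := totalMillis
	for _, p := range processors {
		if p.Type == "pipeline" {
			self -= p.TimeMillis
		}
	}
	return self
}

func main() {
	processors := []processorStats{
		{Type: "set", TimeMillis: 30},
		{Type: "pipeline", TimeMillis: 50}, // time spent in a nested pipeline
	}
	fmt.Println(selfTimeMillis(120, processors)) // prints 70
}
```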

Why is it important?

With the adoption of Agent integrations, ingest pipeline performance is very important to overall ingest performance. Users have very little insight into this today, and giving them dashboards and metrics from the existing ES APIs is a good first step toward improving the situation.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.


@botelastic botelastic bot added the needs_team label (indicates that the issue/PR needs a Team:* label) Dec 10, 2022
@mergify
Contributor

mergify bot commented Dec 10, 2022

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @joshdover? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@elasticmachine
Collaborator

elasticmachine commented Dec 10, 2022

💚 Build Succeeded


Build stats

  • Start Time: 2023-01-27T13:38:35.125+0000

  • Duration: 58 min 33 sec

Test stats 🧪

Test Results
Failed 0
Passed 4067
Skipped 887
Total 4954

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments


To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
3 Code Smells (rating A)

No Coverage information
No Duplication information

@joshdover joshdover changed the title from "Add elasticsearch.ingest metricset" to "Add elasticsearch/ingest metricset" Dec 12, 2022
@joshdover joshdover added the enhancement, Team:Elastic-Agent, backport-skip, Team:Infra Monitoring UI - DEPRECATED, and v8.7.0 labels Dec 12, 2022
@botelastic botelastic bot removed the needs_team label (indicates that the issue/PR needs a Team:* label) Dec 12, 2022
@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
1 Code Smell (rating A)

No Coverage information
No Duplication information

@joshdover joshdover marked this pull request as ready for review January 16, 2023 13:56
@joshdover joshdover requested review from a team as code owners January 16, 2023 13:56
@joshdover joshdover requested review from belimawr and rdner and removed request for a team January 16, 2023 13:56
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@joshdover joshdover requested a review from klacabane January 16, 2023 16:36
Contributor

@ruflin ruflin left a comment

Overall LGTM. Left some minor comments. +1 on moving forward with it.

Review threads (all outdated and resolved):

  • metricbeat/docs/modules/elasticsearch/ingest.asciidoc
  • metricbeat/helper/elastic/elastic.go
  • metricbeat/metricbeat.reference.yml
  • metricbeat/module/elasticsearch/ingest/_meta/data.json
  • metricbeat/module/elasticsearch/ingest/data_test.go
  • metricbeat/helper/elastic/elastic.go
@mergify
Contributor

mergify bot commented Jan 25, 2023

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b es-ingest upstream/es-ingest
git merge upstream/main
git push upstream es-ingest

// format. It publishes the event which is then forwarded to the output. In case
// of an error set the Error field of mb.Event or simply call report.Error().
func (m *IngestMetricSet) Fetch(report mb.ReporterV2) error {
	shouldSkip, err := m.ShouldSkipFetch()
Contributor

ShouldSkipFetch skips the fetch when the scope is ScopeNode and we're not talking to the master node. Most metricsets implement this because they hit an API that returns global state from the master, so there is no need to collect data from individual nodes, but that does not seem to apply here.
If I understand correctly, we want to fetch individual node data when the scope is ScopeNode and hit the global API when it is ScopeCluster, so we should remove that call and implement our own logic (similar to the node_stats metricset).

Contributor Author

Good catch, thank you. I think I can just remove this entirely as I'm already setting the URI correctly on the lines below this. Do you agree?

Contributor

Sounds good
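
For readers following this thread, the scope handling under discussion might look roughly like the sketch below: instead of skipping the fetch on non-master nodes, the request path is chosen from the configured scope. The `scope` type, its constants, and `ingestStatsPath` are hypothetical names for illustration; the paths use the Elasticsearch Nodes Stats API node filters `_local` and `_all` with the `ingest` metric, and are an assumption about what the metricset requests rather than a copy of its code.

```go
package main

import "fmt"

// scope is a hypothetical stand-in for the module's node/cluster scope setting.
type scope int

const (
	scopeNode    scope = iota // collect stats only from the node being queried
	scopeCluster              // collect stats for all nodes through one endpoint
)

// ingestStatsPath picks the Nodes Stats API path for the given scope.
// Assumption: node scope maps to the "_local" node filter and cluster scope
// to "_all"; the real metricset may build its URI differently.
func ingestStatsPath(s scope) string {
	if s == scopeCluster {
		return "/_nodes/_all/stats/ingest"
	}
	return "/_nodes/_local/stats/ingest"
}

func main() {
	fmt.Println("node scope:   ", ingestStatsPath(scopeNode))
	fmt.Println("cluster scope:", ingestStatsPath(scopeCluster))
}
```

Because the path already encodes which nodes are queried, no master-node check is needed before fetching in this sketch.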

@joshdover joshdover requested a review from a team as a code owner January 27, 2023 13:36
@joshdover joshdover requested a review from klacabane January 27, 2023 13:36
@joshdover joshdover removed the request for review from a team January 27, 2023 13:38
@joshdover joshdover merged commit e355974 into elastic:main Jan 27, 2023
@joshdover joshdover deleted the es-ingest branch January 27, 2023 15:37
@amitkanfer
Collaborator

🚢

Labels
backport-skip, enhancement, Team:Elastic-Agent, Team:Infra Monitoring UI - DEPRECATED, v8.7.0

7 participants