[ingest] Per processor metrics #33387

Closed
jakelandis opened this issue Sep 4, 2018 · 1 comment
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement

jakelandis commented Sep 4, 2018

Currently, when reviewing metrics for the ingest node, the finest granularity available is per pipeline [1]. Ideally, the metrics would also include per-processor information.

The current behavior of GET _nodes/stats/ingest:

"nodes": {
  ...
      "ingest": {
        "total": {
          "count": 25,
          "time_in_millis": 8,
          "current": 0,
          "failed": 0
        },
        "pipelines": {
          "mypipeline3": {
            "count": 0,
            "time_in_millis": 0,
            "current": 0,
            "failed": 0
          },
          "mypipeline2": {
            "count": 6,
            "time_in_millis": 0,
            "current": 0,
            "failed": 0
          },
          "mypipeline": {
            "count": 19,
            "time_in_millis": 8,
            "current": 0,
            "failed": 0
          }
        }
      }
    }

The proposal here is to add the same stats that the pipeline has to each processor as well. For example:

{
   "mypipeline":{
      "count":19,
      "time_in_millis":8,
      "current":0,
      "failed":0,
      "processors":[
         {
            "set":{
               "count":19,
               "time_in_millis":4,
               "current":0,
               "failed":0
            }
         },
         {
            "rename":{
               "count":19,
               "time_in_millis":4,
               "current":0,
               "failed":0
            }
         }
      ]
   }
}
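
To illustrate how a client might consume the proposed shape, here is a minimal Python sketch. It assumes the response looks exactly like the example above, which is only a proposal; the helper name `processor_times` is hypothetical and not part of any Elasticsearch client.

```python
# Sample shaped like the proposed response above (a proposal, not a released API).
proposed = {
    "mypipeline": {
        "count": 19,
        "time_in_millis": 8,
        "current": 0,
        "failed": 0,
        "processors": [
            {"set": {"count": 19, "time_in_millis": 4, "current": 0, "failed": 0}},
            {"rename": {"count": 19, "time_in_millis": 4, "current": 0, "failed": 0}},
        ],
    }
}


def processor_times(pipeline_stats):
    """Return (processor_name, time_in_millis) pairs in pipeline order."""
    pairs = []
    for entry in pipeline_stats.get("processors", []):
        # Each list entry is a single-key object: {processor_name: stats}.
        for name, stats in entry.items():
            pairs.append((name, stats["time_in_millis"]))
    return pairs


for pipeline_name, pipeline_stats in proposed.items():
    print(pipeline_name, processor_times(pipeline_stats))
```

Because "processors" is a list rather than an object in the example, the order of processors is preserved and the same processor type could appear more than once.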

Since each processor can have an on_failure processor, and that on_failure processor can itself have multiple processors with their own on_failure processors, the resulting JSON can (if pipelines are defined this way) become a quite verbose tree-like structure. However, I suspect that most pipelines don't nest on_failure handlers very deeply.

EDIT: After further discussion, the on_failure processors should be considered part of the parent processor.
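
One way to read the EDIT above is that a single set of counters covers a processor together with its on_failure handlers. The Python sketch below illustrates that idea only; it is not the Elasticsearch implementation. The `Stats` class and `run_with_metrics` function are hypothetical, and counting a handled failure as "failed" is an assumption.

```python
import time
from dataclasses import dataclass


@dataclass
class Stats:
    count: int = 0
    time_in_millis: int = 0
    current: int = 0
    failed: int = 0


def run_with_metrics(stats, processor, doc, on_failure=()):
    """Run processor on doc; time spent in on_failure handlers is charged to the same stats."""
    stats.count += 1
    stats.current += 1
    start = time.monotonic()
    try:
        try:
            processor(doc)
        except Exception:
            # Assumption: a failure is counted even when a handler recovers from it.
            stats.failed += 1
            if not on_failure:
                raise
            for handler in on_failure:
                handler(doc)
    finally:
        stats.current -= 1
        stats.time_in_millis += int((time.monotonic() - start) * 1000)
    return doc


def failing(doc):
    raise ValueError("boom")


def mark_error(doc):
    doc["error"] = "handled"


stats = Stats()
result = run_with_metrics(stats, failing, {}, on_failure=[mark_error])
print(stats, result)
```

With this shape the stats output stays flat: nested on_failure chains never appear as separate entries, which avoids the verbose tree described earlier.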

Some recent additions to the ingest node's capabilities should also be addressed here. (Open for discussion.)

  • conditional if (Ingest: Add conditional per processor #32398). Attempts should be made to hide the implementation details of the if conditional and to report the time taken and/or errors produced by the if conditional as part of the processor's own metrics.
  • calling other pipelines via the pipeline processor (INGEST: Add Pipeline Processor #32473). The pipeline processor should show up like any other processor; however, the metrics for both the calling pipeline and the called pipeline should increase.
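
The second bullet can be sketched with a small in-memory model in Python. Everything here (`pipelines`, `counts`, `run_pipeline`, `pipeline_processor`, `set_field`) is hypothetical and only illustrates the counting rule: a document that reaches an inner pipeline through a pipeline processor increments the count of both pipelines.

```python
from collections import defaultdict

# Hypothetical model: pipeline name -> list of processors (plain callables).
pipelines = {}
counts = defaultdict(int)  # pipeline name -> number of documents executed


def run_pipeline(name, doc):
    """Execute every processor of the named pipeline and count the document."""
    counts[name] += 1
    for processor in pipelines[name]:
        processor(doc)
    return doc


def pipeline_processor(target):
    """Build a processor that delegates the document to another pipeline."""
    def processor(doc):
        run_pipeline(target, doc)
    return processor


def set_field(key, value):
    """Build a processor that sets one field on the document."""
    def processor(doc):
        doc[key] = value
    return processor


pipelines["inner"] = [set_field("inner_ran", True)]
pipelines["outer"] = [set_field("outer_ran", True), pipeline_processor("inner")]

doc = run_pipeline("outer", {})
print(dict(counts), doc)
```

The sketch does not model the if conditional; under the first bullet, its evaluation time would simply be included in the owning processor's measurement.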

Also, tags should be supported in the naming, likely via the main name; for example, rename:my_tag if the tag is defined.
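
As a minimal sketch of that naming rule in Python (the function name `stats_key` is hypothetical, and the `type:tag` format is just the suggestion above, not a settled design):

```python
def stats_key(processor_type, tag=None):
    """Return the stats name for a processor: 'type:tag' if a tag is defined, else 'type'."""
    return f"{processor_type}:{tag}" if tag else processor_type


print(stats_key("rename", "my_tag"))
print(stats_key("rename"))
```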

[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html#ingest-stats

@jakelandis jakelandis added the :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP label Sep 4, 2018
@jakelandis jakelandis self-assigned this Sep 4, 2018
@elasticmachine

Pinging @elastic/es-core-infra
