Add a summary of IngestStats to ClusterStatsNodes #46146

Closed
talevy opened this issue Aug 29, 2019 · 4 comments · Fixed by #48485
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement

Comments

@talevy
Contributor

talevy commented Aug 29, 2019

It would be nice to see a unified view of which processors are being used by the cluster in the cluster-stats response. Right now, ingest stats are present in node stats but are left out of cluster stats. This issue is open to discuss and track whether and how it makes sense to expose these details in the ClusterStatsNodes part of the ClusterStatsResponse.

Some information that is currently missing and would be interesting to provide:

  • number of pipelines present
  • set of processor types used
@talevy talevy added >enhancement :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP labels Aug 29, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-core-features

@talevy
Contributor Author

talevy commented Aug 29, 2019

Come to think of it, this type of info may be better served in the Cluster Info Response.

For example:

    "ingest" : {
      "number_of_pipelines" : 2,
      "processor_types" : [
        "circle",
        "gsub",
        "script"
      ]
    }

Can you think of any other information you'd like to see in this @giladgal?
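As a rough illustration of how the proposed summary could be derived, here is a minimal Python sketch. It is not Elasticsearch code: the input shape (a mapping from pipeline id to the processor types it uses), the pipeline names, and the function name `summarize_ingest` are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical input shape: pipeline id -> list of processor types it uses.
Pipelines = Dict[str, List[str]]


def summarize_ingest(pipelines: Pipelines) -> dict:
    """Build an "ingest" summary shaped like the example in this comment."""
    # Collect each processor type once, then sort for stable output.
    processor_types = sorted(
        {ptype for processors in pipelines.values() for ptype in processors}
    )
    return {
        "ingest": {
            "number_of_pipelines": len(pipelines),
            "processor_types": processor_types,
        }
    }


# Made-up pipelines, for illustration only.
example_pipelines = {
    "geo-pipeline": ["circle", "script"],
    "cleanup-pipeline": ["gsub", "script"],
}
summary = summarize_ingest(example_pipelines)
```

With the made-up pipelines above, `summary` has two pipelines and the processor types `circle`, `gsub`, and `script`, matching the shape of the example.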

@giladgal
Contributor

To document our quick discussion about it, we want to collect:

  1. Which environments used the processor
  2. How many records were ingested by the processor

@martijnvg
Member

👍 I think we can easily include this information (all installed processor types with a node count and the cluster wide ingest count) in the cluster stats api when it gathers stats from all nodes that are part of the cluster.
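A minimal Python sketch of that idea, assuming each node reports the processor types it has installed and its own ingest count. The field names `processor_types` and `ingest_count`, the output keys, and the function name are hypothetical and do not reflect the actual cluster stats format.

```python
from collections import Counter
from typing import Any, Dict


def cluster_ingest_overview(per_node: Dict[str, Dict[str, Any]]) -> dict:
    """Count how many nodes have each processor type and sum ingest counts."""
    nodes_per_type: Counter = Counter()
    total_ingest_count = 0
    for node in per_node.values():
        # Use a set so a type listed twice on one node is counted once.
        nodes_per_type.update(set(node["processor_types"]))
        total_ingest_count += node["ingest_count"]
    return {
        "processor_node_counts": dict(sorted(nodes_per_type.items())),
        "ingest_count": total_ingest_count,
    }


# Made-up per-node data, for illustration only.
overview = cluster_ingest_overview(
    {
        "node-1": {"processor_types": ["gsub", "script"], "ingest_count": 10},
        "node-2": {"processor_types": ["script"], "ingest_count": 5},
    }
)
```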

@talevy talevy self-assigned this Sep 3, 2019
talevy added a commit that referenced this issue Oct 29, 2019
This commit enhances the ClusterStatsNodes response to include global 
processor usage stats on a per-processor basis.

example output:

```
...
    "processor_stats": {
      "gsub": {
        "count": 0,
        "failed": 0,
        "current": 0,
        "time_in_millis": 0
      },
      "script": {
        "count": 0,
        "failed": 0,
        "current": 0,
        "time_in_millis": 0
      }
    }
...
```

The purpose of this enhancement is to make it easier to collect stats on how specific processors are used across the cluster, beyond the per-node usage statistics that currently exist in node stats.

Closes #46146.
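To illustrate the kind of roll-up described in this commit message, the following Python sketch sums per-processor stats across nodes. It is only an approximation under stated assumptions: the per-node input shape mirrors the `processor_stats` example output, and `merge_processor_stats` is a hypothetical name, not the actual Java implementation.

```python
from typing import Dict, Iterable

# Stat fields taken from the example output in the commit message.
FIELDS = ("count", "failed", "current", "time_in_millis")

ProcessorStats = Dict[str, Dict[str, int]]


def merge_processor_stats(nodes: Iterable[ProcessorStats]) -> ProcessorStats:
    """Sum each stat field per processor type over all nodes."""
    merged: ProcessorStats = {}
    for node_stats in nodes:
        for processor_type, stats in node_stats.items():
            totals = merged.setdefault(processor_type, dict.fromkeys(FIELDS, 0))
            for field in FIELDS:
                # Treat a missing field as zero rather than failing.
                totals[field] += stats.get(field, 0)
    return merged


# Made-up per-node numbers, for illustration only.
node_a = {
    "gsub": {"count": 3, "failed": 1, "current": 0, "time_in_millis": 12},
    "script": {"count": 2, "failed": 0, "current": 1, "time_in_millis": 7},
}
node_b = {
    "gsub": {"count": 4, "failed": 0, "current": 0, "time_in_millis": 8},
}
cluster_stats = merge_processor_stats([node_a, node_b])
```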
talevy added a commit that referenced this issue Oct 31, 2019
* Add ingest info to Cluster Stats (#48485)

This commit enhances the ClusterStatsNodes response to include global
processor usage stats on a per-processor basis.

example output:

```
...
    "processor_stats": {
      "gsub": {
        "count": 0,
        "failed": 0,
        "current": 0,
        "time_in_millis": 0
      },
      "script": {
        "count": 0,
        "failed": 0,
        "current": 0,
        "time_in_millis": 0
      }
    }
...
```

The purpose of this enhancement is to make it easier to collect stats on how specific processors are used across the cluster, beyond the per-node usage statistics that currently exist in node stats.

Closes #46146.

* fix BWC of ingest stats

The introduction of processor types into IngestStats had a bug: the
type was set to `null` and then used as a map key, which would throw
an NPE. This commit resolves it by setting the processor type to
`_NOT_AVAILABLE` for stats from previous versions that do not
serialize it.
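
The placeholder idea can be sketched in Python as follows. This is only an analogue of the fix, not the actual Java change: the tuple input shape and the function names are hypothetical, and Python dicts accept `None` as a key, so the failure mode differs from the NPE described above.

```python
from typing import Dict, Iterable, Optional, Tuple

# Placeholder name taken from the commit message.
NOT_AVAILABLE = "_NOT_AVAILABLE"


def processor_type_or_placeholder(raw_type: Optional[str]) -> str:
    """Return the processor type, or the placeholder when it is missing."""
    return raw_type if raw_type is not None else NOT_AVAILABLE


def index_by_type(entries: Iterable[Tuple[Optional[str], int]]) -> Dict[str, int]:
    """Group counts by processor type; entries are (type or None, count)."""
    by_type: Dict[str, int] = {}
    for raw_type, count in entries:
        key = processor_type_or_placeholder(raw_type)
        by_type[key] = by_type.get(key, 0) + count
    return by_type


# Entries with None stand in for stats from nodes that do not send a type.
indexed = index_by_type([("gsub", 2), (None, 3), (None, 1)])
```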