[terafoundation] Prom metrics exporter doesn't reset metrics between updates #3747

Merged
merged 8 commits into from
Sep 19, 2024

Contributor

@busma13 busma13 commented Sep 10, 2024

This PR makes the following changes:

  • The PromMetrics class needs to reset its list of metrics on each scrape. Otherwise every execution is listed, not just the active ones. resetMetrics() functions were added to PromMetrics and Exporter to reset the prom-client register.
  • Add a prom_metrics_display_url field to terafoundation. This value is used as the url default label added to all prom metrics. It defaults to an empty string, which makes it more obvious when the field is missing from the config.
  • Include cluster analytics metrics (GET '/cluster/stats' endpoint results) in the cluster master
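
To illustrate the first bullet, here is a minimal sketch of why a reset on each scrape matters. SketchRegistry, scrape, and the ex_id label are hypothetical stand-ins written for this example; they are not the PromMetrics, Exporter, or prom-client code changed in this PR.

```typescript
// Hypothetical stand-in for a metrics registry; not the real PromMetrics or prom-client API.
type Labels = Record<string, string>;

class SketchRegistry {
    // One stored sample per unique label set, keyed by the serialized labels.
    private readonly samples = new Map<string, { labels: Labels; value: number }>();

    set(labels: Labels, value: number): void {
        const key = JSON.stringify(Object.entries(labels).sort());
        this.samples.set(key, { labels, value });
    }

    // Drop every stored sample so the next scrape starts from a clean slate.
    resetMetrics(): void {
        this.samples.clear();
    }

    // List the ex_id label of every stored sample (assumed label name).
    exIds(): string[] {
        return Array.from(this.samples.values())
            .map((sample) => sample.labels['ex_id'] ?? '')
            .sort();
    }
}

// Rebuild the metrics from whichever executions are active at scrape time.
function scrape(registry: SketchRegistry, activeExIds: string[], reset: boolean): string[] {
    if (reset) registry.resetMetrics();
    for (const exId of activeExIds) {
        registry.set({ ex_id: exId }, 1);
    }
    return registry.exIds();
}

// Without a reset, the finished execution ex-1 is still reported on the second scrape.
const withoutReset = new SketchRegistry();
scrape(withoutReset, ['ex-1', 'ex-2'], false);
const staleResult = scrape(withoutReset, ['ex-2'], false);

// With a reset, only the execution that is active at scrape time is reported.
const withReset = new SketchRegistry();
scrape(withReset, ['ex-1', 'ex-2'], true);
const freshResult = scrape(withReset, ['ex-2'], true);
```

In this sketch staleResult still contains ex-1 while freshResult contains only ex-2, which is the behavior difference the first bullet describes.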

ref: #3743

@busma13 busma13 changed the title from "Prom metrics exporter doesn't reset metrics between updates" to "[terafoundation] Prom metrics exporter doesn't reset metrics between updates" Sep 10, 2024
@godber
Member

godber commented Sep 11, 2024

Uh, I completely forgot about this ...

49d356c

curl -H "Accept: application/openmetrics-text;" -sS http://localhost:5678/cluster/stats
# TYPE teraslice_slices_processed counter
teraslice_slices_processed{cluster="teraslice-dev1"} 2
# TYPE teraslice_slices_failed counter
teraslice_slices_failed{cluster="teraslice-dev1"} 0
# TYPE teraslice_slices_queued counter
teraslice_slices_queued{cluster="teraslice-dev1"} 0
# TYPE teraslice_workers_joined counter
teraslice_workers_joined{cluster="teraslice-dev1"} 1
# TYPE teraslice_workers_disconnected counter
teraslice_workers_disconnected{cluster="teraslice-dev1"} 0
# TYPE teraslice_workers_reconnected counter
teraslice_workers_reconnected{cluster="teraslice-dev1"} 0

compared to:

curl -sS http://localhost:5678/cluster/stats
{
    "controllers": {
        "processed": 2,
        "failed": 0,
        "queued": 0,
        "job_duration": 3,
        "workers_joined": 1,
        "workers_disconnected": 0,
        "workers_reconnected": 0
    },
    "slicer": {
        "processed": 2,
        "failed": 0,
        "queued": 0,
        "job_duration": 3,
        "workers_joined": 1,
        "workers_disconnected": 0,
        "workers_reconnected": 0
    }
}

We should document this capability, and we should seriously consider either expanding the labels it emits or consolidating these metrics into the new main built-in exporter.
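
The two curl calls above differ only in the Accept header. A rough sketch of that kind of content negotiation follows; renderStats and the flat SliceStats shape are assumptions made for illustration and are not taken from the Teraslice source.

```typescript
// Assumed, simplified stats shape; the real endpoint returns more fields.
type SliceStats = { processed: number; failed: number; queued: number };

// Hypothetical helper: pick the response body format from the Accept header.
function renderStats(accept: string | undefined, cluster: string, stats: SliceStats): string {
    const wantsMetrics = (accept ?? '').includes('application/openmetrics-text');
    if (!wantsMetrics) {
        // Default: the JSON body returned when no metrics format is requested.
        return JSON.stringify(stats, null, 4);
    }

    const lines: string[] = [];
    for (const [field, value] of Object.entries(stats)) {
        const metric = `teraslice_slices_${field}`;
        lines.push(`# TYPE ${metric} counter`);
        lines.push(`${metric}{cluster="${cluster}"} ${value}`);
    }
    return lines.join('\n');
}

const sampleStats: SliceStats = { processed: 2, failed: 0, queued: 0 };
const asMetrics = renderStats('application/openmetrics-text;', 'teraslice-dev1', sampleStats);
const asJson = renderStats(undefined, 'teraslice-dev1', sampleStats);
```

The only label in this sketch is cluster, which is the limitation raised above: any extra labels would have to be added where the metric line is built.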

@busma13
Contributor Author

busma13 commented Sep 11, 2024

teraslice_slices_processed{cluster="teraslice-dev1"} 2

curl localhost:5678
{
    "arch": "arm64",
    "clustering_type": "kubernetes",
    "name": "ts-dev1",
    "node_version": "v18.20.4",
    "platform": "linux",
    "teraslice_version": "v2.3.0"
}

cluster === teraslice.name from the terafoundation, correct? Which is what is returned in the request above.

@godber
Member

godber commented Sep 11, 2024

cluster === teraslice.name from the terafoundation, correct? Which is what is returned in the request above.

Correct

@busma13 busma13 force-pushed the promMetrics-Exporter-Fix branch from 36abb5d to 138ce2b Compare September 18, 2024 17:47
@busma13 busma13 marked this pull request as ready for review September 18, 2024 22:43
@busma13
Contributor Author

busma13 commented Sep 18, 2024

We should document the existence of this capability

I added an example in the endpoints-json.md file

@busma13 busma13 requested review from godber and sotojn September 18, 2024 22:47
Comment on lines +699 to +710
this.context.apis.foundation.promMetrics.set(
    'master_info',
    {
        arch: this.context.arch,
        clustering_type: cluster_manager_type,
        name,
        node_version: process.version,
        platform: this.context.platform,
        teraslice_version: getPackageJSON().version
    },
    1
);
Member

Does url get added to master_info somewhere else?

Contributor Author

Yes. url, assignment, and name are default labels that are added to all metrics automatically within the PromMetrics class.

Contributor Author

The name label is set both as a default label and in the master_info metric. Within promMetrics.set(), the default labels override any labels passed in as parameters, so a label can never appear twice.

# HELP teraslice_master_info Information about Teraslice cluster master
# TYPE teraslice_master_info gauge
teraslice_master_info{arch="arm64",clustering_type="kubernetes",name="ts-dev1",node_version="v18.20.4",platform="linux",teraslice_version="2.3.1",assignment="master"} 1
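
The override described above can be sketched as a plain object merge. mergeLabels and the sample values are hypothetical; the actual logic lives inside promMetrics.set() and may differ in detail.

```typescript
type LabelSet = Record<string, string>;

// Hypothetical helper: defaults are spread last, so they win on any key collision.
function mergeLabels(passed: LabelSet, defaults: LabelSet): LabelSet {
    return { ...passed, ...defaults };
}

// Assumed default labels, mirroring the url, assignment, and name defaults mentioned above.
const defaultLabels: LabelSet = { name: 'ts-dev1', assignment: 'master', url: '' };

// The caller also passes a name label, which collides with the default.
const merged = mergeLabels({ arch: 'arm64', name: 'value-from-caller' }, defaultLabels);
```

Because an object holds each key only once, merged carries a single name label, and it takes the default value rather than the one the caller passed.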

@godber godber merged commit 5b37c44 into master Sep 19, 2024
67 checks passed
@godber godber deleted the promMetrics-Exporter-Fix branch September 19, 2024 16:24