feat: add network tag to metrics #12733

Open
wants to merge 1 commit into
base: master

virajbhartiya
Member

Closes #12715

Add network tag to metrics

@virajbhartiya
Member Author

@rvagg I am trying to set up a Grafana dashboard on my device to test this out. Could you let me know whether I am going in the right direction with this PR?

@rvagg
Member

rvagg commented Nov 28, 2024

Yep, I think this is the right direction. It's unfortunate that you need to add it to all of the views, but that looks like the trickiest bit here.

You should be able to register the value at startup, like in here:

lotus/cmd/lotus/daemon.go

Lines 211 to 216 in 07f2f69

ctx, _ := tag.New(context.Background(),
	tag.Insert(metrics.Version, build.NodeBuildVersion),
	tag.Insert(metrics.Commit, build.CurrentCommit),
	tag.Insert(metrics.NodeType, "chain"),
)
// Register all metric views

And in here:

ctx, _ := tag.New(lcli.DaemonContext(cctx),
	tag.Insert(metrics.Version, build.MinerBuildVersion),
	tag.Insert(metrics.Commit, build.CurrentCommit),
	tag.Insert(metrics.NodeType, "miner"),
)
// Register all metric views

Currently I have Grafana set up for my node and I'm scraping with Prometheus. In the scrape_configs section of my prometheus.yml I have:

  - job_name: lotus-mainnet
    scrape_interval: 10s
    metrics_path: '/debug/metrics'
    static_configs:
      - targets: ['localhost:1234']
        labels:
          network: 'mainnet'
  - job_name: lotus-calibnet
    scrape_interval: 10s
    metrics_path: '/debug/metrics'
    static_configs:
      - targets: ['localhost:1235']
        labels:
          network: 'calibnet'

You can see my mainnet and calibnet tags being inserted there, so I can do what you're doing here, but after the fact, at scrape time.

Then in Grafana, when I explore metrics, I can see the labels that get reported for each metric type. All of mine have network, but some have additional tags. In this one, network is there and Prometheus has also added instance and job, but the metric itself (block/failure) also has its own custom tag, from here:

lotus/chain/sub/incoming.go

Lines 466 to 473 in 07f2f69

func recordFailure(ctx context.Context, metric *stats.Int64Measure, failureType string) {
	ctx, _ = tag.New(
		ctx,
		tag.Upsert(metrics.FailureType, failureType),
	)
	stats.Record(ctx, metric.M(1))
}

[Screenshot, 2024-11-28: labels reported for the block/failure metric in Grafana]

Successfully merging this pull request may close these issues.

Tag metrics with a network name/network tag