roachtest: add cluster stats utility #80724

kvoli · 2022-04-28T18:15:00Z

This patch introduces cluster_stats, a utility wrapper for prometheus
collected metrics. The motivation for this patch is to enable additional
classes of roachtests to generate run statistics, on relevant cluster
metrics. These run statistics are exported into a roachperf parseable
stats.json.

The utilities provided are:

(1) collect and export metrics, per-node and in aggregate over a run;
parseable by roachperf.
(2) register and stream metrics for test processing at a fixed
interval.

(1) enables a arbitrary roachtests to declare and export relevant
roachperf outputs. Where there is otherwise a binary outcome, we may now
track performance over time.

(2) reduces friction for testing assertions on cluster metrics, e.g. qps
balance. This also makes dynamic workload testing easier, where workload
application changes based on predicates of cluster statistics, e.g.
switch key access distribution when qps becomes balanced.

The allocator related roachtests do not currently export benchmark
related statistics. This patch introduces stat collection on
replicate/up/1to3, replicate/rebalance/3to5, rebalance/by-load/leases and
rebalance/by-load/ranges.

The benchmark metric for the nightly run is the time taken to reach
balance, from starting the test. In addition, cluster statistics are
exported for detailed view into: cpu usage, io bandwidth, qps and range
counts.

related: https://github.com/cockroachdb/roachperf/pull/94

Release note: None

cockroach-teamcity · 2022-04-28T18:15:07Z

This change is

tbg

Reviewed 3 of 19 files at r4, 24 of 41 files at r6, 19 of 19 files at r9, 7 of 14 files at r10, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)

tbg

Great stuff! Still a few comments but I don't think this is too far off. Two main things are just making the setup compose a bit more amicably with custom prometheus setups, and a question about how arbitrary labels are handled.

Reviewed 7 of 14 files at r10, 4 of 4 files at r11.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli and @tbg)

pkg/cmd/roachtest/clusterstats/collector.go line 71 at r11 (raw file):

}

// CleanUp closes the prometheus client and collects a snapshot of the

Mhm, this strikes me as a bit muddled and I think would favorably separate with my suggestion above, to take the prometheus instantiation out of the collector. The collector would just close the prom client, and the snapshot saving would come from the prom instance.

pkg/cmd/roachtest/clusterstats/collector.go line 78 at r11 (raw file):

// NewStatsCollector returns an implementation of the StatCollector interface.
func NewStatsCollector(ctx context.Context, t test.Test, c cluster.Cluster) (StatCollector, error) {

This looks a little too opinionated and bound to combine poorly with #82656. Can we make it configurable who the stats collector will be and simply mandate that prometheus ought to be running there? If this becomes too onerous for many tests, I wouldn't be opposed to a helper that takes a PrometheusConfig instead (and spins up the prom instance itself) but the basic functionality to combine it together should be the foundation.

pkg/cmd/roachtest/clusterstats/collector.go line 110 at r11 (raw file):

// Exporter returns a StatExporter, using this StatCollector.
func (cs *clusterStatCollector) Exporter() StatExporter {

Curious why you even need this accessor, since cs itself implements the interface.

pkg/cmd/roachtest/clusterstats/collector.go line 117 at r11 (raw file):

func (cs *clusterStatCollector) CollectPoint(
	ctx context.Context, l *logger.Logger, at time.Time, q string,
) (map[string]map[string]StatPoint, error) {

(map[string /* label-key */]map[string /* label-value */]StatPoint, error)

I think it would also be helpful if you lifted the comment with the example below into the method comment, as seeing the example makes it much easier to understand what the map looks like.

pkg/cmd/roachtest/clusterstats/collector.go line 135 at r11 (raw file):

	// rebalancing_queriespersecond at time 110 and there were two stores (1,2)
	// with ip addresses 10.0.0.1 and 127.0.0.1, for store 2 and store 1
	// respectively.

Nice examples! Very helpful.

pkg/cmd/roachtest/clusterstats/collector.go line 167 at r11 (raw file):

func (cs *clusterStatCollector) CollectInterval(
	ctx context.Context, l *logger.Logger, interval Interval, q string,
) (map[string]map[string]StatSeries, error) {

ditto

pkg/cmd/roachtest/clusterstats/doc.go line 13 at r11 (raw file):

package clusterstats

/*

😍 😍 😍

pkg/cmd/roachtest/clusterstats/doc.go line 15 at r11 (raw file):

a cockroach cluster, its VMs, and running workloads.

pkg/cmd/roachtest/clusterstats/doc.go line 25 at r11 (raw file):

(3) StatExporter:  export a group of stats collected over a run to roachperf.

(1) StatCollector is the basic component used in (2) and (3) to query stats. It

Some of this description would change as the lifecycle management of the prom instance becomes external to the collector.

pkg/cmd/roachtest/clusterstats/doc.go line 52 at r11 (raw file):

CollectInterval requires a string, which is a query in the PromQL format, and
an time interval. It returns a map of label names to labeled timeseries values

nit: a time interval

pkg/cmd/roachtest/clusterstats/doc.go line 86 at r11 (raw file):

streamed values.

The two important methods are Register and Run .

.

pkg/cmd/roachtest/clusterstats/doc.go line 106 at r11 (raw file):

(3) StatExporter uses the StatCollector (1) in order to collect stats requested
for export to roachperf. The stats that are requested follow a hierarchical

the ProcessStatEventFn also exports stats to roachperf, right? I imagine that's what writes the stats.json files that has the per-second histograms, etc.; maybe this can be clarified a bit. 3) is to report "top-line numbers" to roachperf, i.e. it's sort of the non-streaming counterpart of 2). (If I got this right at least).

pkg/cmd/roachtest/clusterstats/doc.go line 114 at r11 (raw file):

e.g. The rebalancing_queriespersecond of each store. Would be defined as:

AggQuery.Stat = ClusterStat{Query: "rebalancing_queriespersecond", LabelName: "store"}

If we had a timeseries that has two labels that are not mutually exclusive, say

foo{a="0|1", b="0|1"} (so you have four timeseries), how will this work? Does it even make sense to specify the label here? After all, AggQuery.Query always has to produce a single number, so you would have to sum up the a and b labels anyway. It seems that the Label really just enables AggFn, so maybe an AggFn should "just" get the entire map (so it gets to see all labels) and can do whatever it wants?

pkg/cmd/roachtest/clusterstats/doc.go line 149 at r11 (raw file):

The StatExporter will collect, parse and then serialize the list of AggQuerys
provided. The serialization is specific to the roachperf format. The serialized
export is written to the perfartifacts directory.

nit: did you mean to say "perfartifacts" here?

pkg/cmd/roachtest/clusterstats/exporter.go line 31 at r11 (raw file):

type ClusterStat struct {
	Query     string
	LabelName string

not sure about this field, see doc.go

pkg/cmd/roachtest/clusterstats/exporter.go line 62 at r11 (raw file):

		to time.Time,
		queries []AggQuery,
		benchmarkFns ...func(map[string]StatSummary) (string, float64),

should explain benchmarkFns.

pkg/cmd/roachtest/clusterstats/exporter.go line 108 at r11 (raw file):

	}

	l.PrintfCtx(ctx, "roachtest export: summaries %+v, run-scalars: %+v", summaries, summaryValues)

nit: roachperf export?

pkg/cmd/roachtest/tests/allocator.go line 112 at r11 (raw file):

				t,
				startTime, endTime,
				joinSummaryQueries(actionsSummary, underReplicatedSummary, requestBalanceSummary, resourceBalanceSummary),

what's the purpose of including underReplicatedSummary here? It's basically going to be zero. Is it just interesting to see whether it isn't zero at any point?

pkg/cmd/roachtest/tests/allocator.go line 120 at r11 (raw file):

				},
				func(stats map[string]clusterstats.StatSummary) (string, float64) {
					return "t-uprepl", replicateTime.Sub(startTime).Seconds() - allocatorStableSeconds

That doesn't make sense, startTime is larger than replicateTime. I think you "just" want to construct a replicateDuration earlier and report it here.

pkg/cmd/roachtest/tests/rebalance_load.go line 152 at r11 (raw file):

				} else if done {
					t.Status("successfully achieved lease balance; waiting for kv to finish running")
					// TODO(kvoli): Support mixed version testing, currently it will attempt to init

is this sentence complete?

pkg/cmd/roachtest/tests/rebalance_stats.go line 21 at r11 (raw file):

var (
	qpsStat  = clusterstats.ClusterStat{LabelName: "instance", Query: "rebalancing_queriespersecond"}

these are intentionally on instance and not store? Mind explaining why? I would've expected store here.

kvoli

Updated. I agree with the prometheus setup. I'll wait for your patch to land and then update to pass in a prometheus.Client and look at making the setup of the client a little less cumbersome / shifted out of the TPCC tests pkg? I like the idea of a helper to take it the last step from a config and return a client in a helper somewhere.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli and @tbg)

pkg/cmd/roachtest/clusterstats/collector.go line 110 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Curious why you even need this accessor, since cs itself implements the interface.

It was mostly a nice to have, since the caller who holds a Collector interface, would need to typecast.

pkg/cmd/roachtest/clusterstats/collector.go line 117 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

(map[string /* label-key */]map[string /* label-value */]StatPoint, error)

I think it would also be helpful if you lifted the comment with the example below into the method comment, as seeing the example makes it much easier to understand what the map looks like.

updated

pkg/cmd/roachtest/clusterstats/collector.go line 167 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

ditto

updated

pkg/cmd/roachtest/clusterstats/doc.go line 15 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

a cockroach cluster, its VMs, and running workloads.

Updated

pkg/cmd/roachtest/clusterstats/doc.go line 25 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Some of this description would change as the lifecycle management of the prom instance becomes external to the collector.

Updated.

pkg/cmd/roachtest/clusterstats/doc.go line 52 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

nit: a time interval

Updated.

pkg/cmd/roachtest/clusterstats/doc.go line 86 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

.

Updated.

pkg/cmd/roachtest/clusterstats/doc.go line 106 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

the ProcessStatEventFn also exports stats to roachperf, right? I imagine that's what writes the stats.json files that has the per-second histograms, etc.; maybe this can be clarified a bit. 3) is to report "top-line numbers" to roachperf, i.e. it's sort of the non-streaming counterpart of 2). (If I got this right at least).

The ProcessStatEventFn is only for processing streamed stats. It's an alternative to using a channel for example, as a closure in the test. It isn't used in any test at the moment but the intent is for it to be used to assert on when the test is "done", or to trigger configuration/workload changes within a test run.

An example is a test which runs against a cluster with a changing workload, running them serially and changing workload once the cluster rebalancing has reached a steady state. This would be useful to determine the avg time taken to reach a steady state following a workload change.

Updated with an example and made this explicit.

pkg/cmd/roachtest/clusterstats/doc.go line 114 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

If we had a timeseries that has two labels that are not mutually exclusive, say

foo{a="0|1", b="0|1"} (so you have four timeseries), how will this work? Does it even make sense to specify the label here? After all, AggQuery.Query always has to produce a single number, so you would have to sum up the a and b labels anyway. It seems that the Label really just enables AggFn, so maybe an AggFn should "just" get the entire map (so it gets to see all labels) and can do whatever it wants?

AggQuery.Query has to produce a timeseries result - as the second tier or aggregation over the individual labels resulting from AggQuery.Stat. I think you're thinking of BenchmarkFns which do just return a scalar and they get the result of all the AggQuerys and can discard or do what they like so long as they return a string and float64.

Updated with an example here for clarity.

pkg/cmd/roachtest/clusterstats/doc.go line 149 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

nit: did you mean to say "perfartifacts" here?

Nope, updated.

pkg/cmd/roachtest/clusterstats/exporter.go line 31 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

not sure about this field, see doc.go

This is used for ensuring the aggregation is consistent with the timeseries results reported for a label name.

pkg/cmd/roachtest/clusterstats/exporter.go line 62 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

should explain benchmarkFns.

Added.

pkg/cmd/roachtest/clusterstats/exporter.go line 108 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

nit: roachperf export?

Updated.

pkg/cmd/roachtest/tests/allocator.go line 112 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

what's the purpose of including underReplicatedSummary here? It's basically going to be zero. Is it just interesting to see whether it isn't zero at any point?

Yeah it is interesting to track the under replication, however you're right it doesn't fit well here. As all stores will report the same number afaik. Removed.

pkg/cmd/roachtest/tests/allocator.go line 120 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

That doesn't make sense, startTime is larger than replicateTime. I think you "just" want to construct a replicateDuration earlier and report it here.

Yeah that seems off, updated and added a comment.

pkg/cmd/roachtest/tests/rebalance_load.go line 152 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

is this sentence complete?

Updated.

pkg/cmd/roachtest/tests/rebalance_stats.go line 21 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

these are intentionally on instance and not store? Mind explaining why? I would've expected store here.

In this case it was for consistency between graphs as the sys stats are only tagged by instance. Since there isn't really a mapping between store and instance that will present in roachperf.

I'm not sure which way to go on this one, doing any sort of mapping within roachtest seems like a special case.

Updated to store for the ones that are able, in the mean time.

tbg

Reviewed 40 of 40 files at r12, 32 of 32 files at r13, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli and @tbg)

pkg/cmd/roachtest/clusterstats/doc.go line 114 at r11 (raw file):
I'm probably just being thick, but my question above is what the label should be in the above example:

foo{a="0|1", b="0|1"}

The resulting timeseries has no label. But do I specify a or b? They're both aggregated away. What purpose does the label field have? Is it not possible to aggregate over foo since the system only allows aggregating one dimension away and here we have two? I just want to make sure we're not putting assumptions in that aren't necessary. For example, one might have a timeseries qps_per_store{tenantid, storeid} and you might have data for any (tenant,store) pair in the system, i.e. a quadratic number of label tuples.

pkg/cmd/roachtest/tests/rebalance_stats.go line 21 at r11 (raw file):

Previously, kvoli (Austen) wrote…

In this case it was for consistency between graphs as the sys stats are only tagged by instance. Since there isn't really a mapping between store and instance that will present in roachperf.

I'm not sure which way to go on this one, doing any sort of mapping within roachtest seems like a special case.

Updated to store for the ones that are able, in the mean time.

FWIW I think the final grafana stuff will also have the node label everywhere, so maybe standardizing on that is the way to go. Also, there seems to be a hidden assumption here that there is only one store per node, which is not true (though it happens to be true in most tests as it is the default).

tbg · 2022-06-16T17:06:36Z

PS master...msbutler:butler-roachprod-grafana is now the prereq, see https://cockroachlabs.slack.com/archives/C03JM5ES6DA/p1655261980634519?thread_ts=1654782726.816569&cid=C03JM5ES6DA, but shouldn't cause a big API change (just package movement and small tweaks I think)

kvoli

ack - I can't find a PR for that patch yet?

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)

pkg/cmd/roachtest/clusterstats/doc.go line 114 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I'm probably just being thick, but my question above is what the label should be in the above example:

foo{a="0|1", b="0|1"}

The resulting timeseries has no label. But do I specify a or b? They're both aggregated away. What purpose does the label field have? Is it not possible to aggregate over foo since the system only allows aggregating one dimension away and here we have two? I just want to make sure we're not putting assumptions in that aren't necessary. For example, one might have a timeseries qps_per_store{tenantid, storeid} and you might have data for any (tenant,store) pair in the system, i.e. a quadratic number of label tuples.

If this is for an AggQuery then it's fine, it just needs to return one time series - the label being missing isn't an issue it would just go under a "" label inside the returned map.

If you specify an AggQuery.Query as sum(X), in the above example it will still only produce a single time series result. So I don't think it's inflexible, the main purpose is just that the AggQuery.Query must be the summary over the multi-timeseries returned by querying AggQuery.Stat.

If this is for Stat, then this is invalid in the current API, as it expects tagged time series results.

pkg/cmd/roachtest/tests/rebalance_stats.go line 21 at r11 (raw file):

Previously, tbg (Tobias Grieger) wrote…

FWIW I think the final grafana stuff will also have the node label everywhere, so maybe standardizing on that is the way to go. Also, there seems to be a hidden assumption here that there is only one store per node, which is not true (though it happens to be true in most tests as it is the default).

It's unfortunate - probably could add a TODO for a mapping between instace and store, to convert instance tagged metrics into store tagged metrics.

tbg

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)

This patch introduces cluster_stats, a utility wrapper for prometheus collected metrics. The motivation for this patch is to enable additional classes of roachtests to generate run statistics, on relevant cluster metrics. These run statistics are exported into a roachperf parseable stats.json. The utilities provided are: (1) collect and export metrics, per-node and in aggregate over a run; parseable by roachperf. (2) register and stream metrics for test processing at a fixed interval. (1) enables a arbitrary roachtests to declare and export relevant roachperf outputs. Where there is otherwise a binary outcome, we may now track performance over time. (2) reduces friction for testing assertions on cluster metrics, e.g. qps balance. This also makes dynamic workload testing easier, where workload application changes based on predicates of cluster statistics, e.g. switch key access distribution when qps becomes balanced. Release note: None

The allocator related roachtests do not currently export benchmark related statistics. This patch introduces stat collection on replicate/up/1to3 and replicate/rebalance/3to5. The benchmark metric for the nightly run is the time taken to reach balance, from starting the test. In addition, cluster statistics are exported for detailed view into: cpu usage, io bandwidth, qps and range counts. Release note: None

kvoli · 2022-07-05T13:20:14Z

bors r+

craig · 2022-07-05T15:20:55Z

Build succeeded:

GitHub CI (Cockroach)

kvoli self-assigned this Apr 28, 2022

kvoli force-pushed the 220427.roachtest-clusterstats branch 22 times, most recently from 2e4ed08 to e3e5a55 Compare May 3, 2022 15:36

kvoli marked this pull request as ready for review May 3, 2022 17:12

kvoli requested a review from a team as a code owner May 3, 2022 17:12

kvoli requested review from otan and removed request for a team May 3, 2022 17:12

kvoli force-pushed the 220427.roachtest-clusterstats branch from e3e5a55 to cd96765 Compare May 3, 2022 23:15

kvoli requested a review from a team May 3, 2022 23:15

kvoli removed the request for review from a team June 8, 2022 18:43

tbg reviewed Jun 9, 2022

View reviewed changes

tbg self-requested a review June 9, 2022 12:25

tbg requested changes Jun 9, 2022

View reviewed changes

kvoli force-pushed the 220427.roachtest-clusterstats branch from 3bace94 to 522970f Compare June 9, 2022 15:32

kvoli commented Jun 9, 2022

View reviewed changes

kvoli force-pushed the 220427.roachtest-clusterstats branch 2 times, most recently from 6b482b5 to 1fcac56 Compare June 9, 2022 20:51

kvoli requested a review from tbg June 9, 2022 22:39

tbg reviewed Jun 16, 2022

View reviewed changes

kvoli commented Jun 20, 2022

View reviewed changes

tbg mentioned this pull request Jun 21, 2022

roachprod: add prometheus/grafana monitoring #83004

Merged

tbg approved these changes Jun 22, 2022

View reviewed changes

kvoli force-pushed the 220427.roachtest-clusterstats branch from 1fcac56 to 01abc32 Compare June 24, 2022 13:05

kvoli force-pushed the 220427.roachtest-clusterstats branch 6 times, most recently from 19f3ddb to 9e0ae24 Compare July 1, 2022 22:17

kvoli added 2 commits July 5, 2022 03:23

kvoli force-pushed the 220427.roachtest-clusterstats branch from 9e0ae24 to 4ce98a4 Compare July 5, 2022 03:28

craig bot merged commit f6d0d84 into cockroachdb:master Jul 5, 2022

tbg mentioned this pull request Jul 6, 2022

roachtest: replicate/up/1to3 failed #83881

Closed

renatolabs mentioned this pull request Jul 22, 2022

roachperf: regression around 2022-07-07 #84882

Closed

30 tasks

kvoli mentioned this pull request Feb 1, 2023

kvserver: benchmark upreplication #79940

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

roachtest: add cluster stats utility #80724

roachtest: add cluster stats utility #80724

kvoli commented Apr 28, 2022 •

edited

Loading

cockroach-teamcity commented Apr 28, 2022

tbg left a comment

tbg left a comment

kvoli left a comment

tbg left a comment

tbg commented Jun 16, 2022

kvoli left a comment

tbg left a comment

kvoli commented Jul 5, 2022

craig bot commented Jul 5, 2022

roachtest: add cluster stats utility #80724

roachtest: add cluster stats utility #80724

Conversation

kvoli commented Apr 28, 2022 • edited Loading

cockroach-teamcity commented Apr 28, 2022

tbg left a comment

Choose a reason for hiding this comment

tbg left a comment

Choose a reason for hiding this comment

kvoli left a comment

Choose a reason for hiding this comment

tbg left a comment

Choose a reason for hiding this comment

tbg commented Jun 16, 2022

kvoli left a comment

Choose a reason for hiding this comment

tbg left a comment

Choose a reason for hiding this comment

kvoli commented Jul 5, 2022

craig bot commented Jul 5, 2022

kvoli commented Apr 28, 2022 •

edited

Loading