Skip to content

Commit

Permalink
stats: reproduce redundant TYPE tags problem in Prometheus. (envoypro…
Browse files Browse the repository at this point in the history
…xy#27239)

Commit Message: Reproduces a scenario where it's difficult to use a streaming optimization for Prometheus stats based on scopes without introducing a bug. Context:

Issue that /stats/prometheus endpoint takes too much much memory: Prometheus stats handler used too much memory. envoyproxy#16139
Solution for non-Prometheus endpoints to use less memory for /stats and run faster: admin: Streaming /stats implementation envoyproxy#19693 which I believe is working well.
This solution mapped to Prometheus: Prom stats perf improvements envoyproxy#24998
A case where this solution has a duplicate# TYPE line due to two stats with the same tag-extracted-stat name from two different scopes: stats: prometheus scrape failed envoyproxy#27173
Note that the existing unit tests did not exercise that scenario so this PR adds a testcase.

Signed-off-by: Joshua Marantz <[email protected]>
Signed-off-by: Ryan Eskin <[email protected]>
  • Loading branch information
jmarantz authored and reskin89 committed Jul 11, 2023
1 parent 2da8e1a commit f262984
Show file tree
Hide file tree
Showing 2 changed files with 49 additions and 0 deletions.
2 changes: 2 additions & 0 deletions test/server/admin/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -168,6 +168,8 @@ envoy_cc_test(
name = "prometheus_stats_test",
srcs = envoy_select_admin_functionality(["prometheus_stats_test.cc"]),
deps = [
"//source/common/stats:tag_producer_lib",
"//source/common/stats:thread_local_store_lib",
"//source/server/admin:prometheus_stats_lib",
"//test/test_common:utility_lib",
],
Expand Down
47 changes: 47 additions & 0 deletions test/server/admin/prometheus_stats_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,8 @@
#include <vector>

#include "source/common/stats/custom_stat_namespaces_impl.h"
#include "source/common/stats/tag_producer_impl.h"
#include "source/common/stats/thread_local_store.h"
#include "source/server/admin/prometheus_stats.h"

#include "test/mocks/stats/mocks.h"
Expand Down Expand Up @@ -261,6 +263,51 @@ envoy_histogram1_count{} 0
EXPECT_EQ(expected_output, response.toString());
}

// Replicate bug https://github.com/envoyproxy/envoy/issues/27173 which fails to
// coalesce stats in different scopes with the same tag-extracted-name.
TEST_F(PrometheusStatsFormatterTest, DifferentNamedScopeSameStat) {
Stats::CustomStatNamespacesImpl custom_namespaces;
Stats::ThreadLocalStoreImpl store(alloc_);
envoy::config::metrics::v3::StatsConfig stats_config;
store.setTagProducer(std::make_unique<Stats::TagProducerImpl>(stats_config));
Stats::StatName name = pool_.add("default.total_match_count");

Stats::ScopeSharedPtr scope1 = store.rootScope()->createScope("cluster.a");
counters_.push_back(Stats::CounterSharedPtr(&scope1->counterFromStatName(name)));

// To reproduce the problem from we will render
// cluster.a.default.total_match_count before we discover the existence of
// cluster.x.default.total_match_count. That will happen because "d" in
// "default" comes before "x" with
// https://github.com/envoyproxy/envoy/pull/24998
Stats::ScopeSharedPtr scope2 = store.rootScope()->createScope("cluster.x");
counters_.push_back(Stats::CounterSharedPtr(&scope2->counterFromStatName(name)));

constexpr absl::string_view expected_output =
R"EOF(# TYPE envoy_cluster_default_total_match_count counter
envoy_cluster_default_total_match_count{envoy_cluster_name="a"} 0
envoy_cluster_default_total_match_count{envoy_cluster_name="x"} 0
)EOF";

// Note: in the version of prometheus_stats_test.cc that works with the
// streaming GroupedStatsRequest, the test code is slightly different;
// it's based on the local 'store' object rather than being based on
// the counters_ member variable.
// StatsParams params;
// params.type_ = StatsType::Counters;
// params.format_ = StatsFormat::Prometheus;
// auto request = std::make_unique<GroupedStatsRequest>(store, params, custom_namespaces_);
// EXPECT_EQ(expected_output, response(*request));
// This code is left here so that we can verify the bug is fixed if we decide to
// re-try the streaming Prometheus implementation.

Buffer::OwnedImpl response;
const uint64_t size = PrometheusStatsFormatter::statsAsPrometheus(
counters_, gauges_, histograms_, textReadouts_, response, StatsParams(), custom_namespaces);
EXPECT_EQ(1, size);
EXPECT_EQ(expected_output, response.toString());
}

TEST_F(PrometheusStatsFormatterTest, HistogramWithNonDefaultBuckets) {
Stats::CustomStatNamespacesImpl custom_namespaces;
HistogramWrapper h1_cumulative;
Expand Down

0 comments on commit f262984

Please sign in to comment.