-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stats: allow specifying tags when creating metric objects #9743
Conversation
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
@jmarantz Let me know if this looks like the right approach and I'll do the same for the other metric objects. One question: the |
@@ -113,6 +113,11 @@ class IsolatedStoreImpl : public StoreImpl { | |||
Histogram& histogram = histograms_.get(name, unit); | |||
return histogram; | |||
} | |||
Histogram& histogramFromStatName(StatName name, const std::vector<Tag>&, | |||
Histogram::Unit unit) override { | |||
Histogram& histogram = histograms_.get(name, unit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is obviously not right, might need to key the metrics off a pair of name + tag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
depends. is 'name' here the tag-extracted name? Or does it have all the tags in it?
If it's the former, then I would expect that I could have two histograms which were identical in tag-extracted name but differ in tag-values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the issue is that so far is that just using the original name of the histogram is sufficient to uniquely identify it: since the tag extracted name + tags is a function of original metric name + bootstrap config, tags can be ignored when looking up cached metric objects. It seems like the stat stores are using the full stat name as the hash key.
I'll play around with some ways of encoding the static tags into the hash key
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the easiest mechanism would be to synthesize a StatName the incorporates the tag names & values, e.g.
scope.subsystem.counter1 [tag1=value1, tag2=value2]
could be encoded to have a StatName of:
scope.subsystem.counter1.TAGS.tag1.value1.tag2.value2
this would be a memory-efficient encoding because the tags and values would all be references to the same symbols.
Then you don't have to change the hashing at all, and they can be stored in the same table as the existing stats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side comment: it would be cool if we could preserve the positions of the tags in the synthesized string. That way we can optimize away 1) the regex to extract tags back for prometheus, 2) linear number of regexes for each tag name.
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
@@ -113,6 +113,11 @@ class IsolatedStoreImpl : public StoreImpl { | |||
Histogram& histogram = histograms_.get(name, unit); | |||
return histogram; | |||
} | |||
Histogram& histogramFromStatName(StatName name, const std::vector<Tag>&, | |||
Histogram::Unit unit) override { | |||
Histogram& histogram = histograms_.get(name, unit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
depends. is 'name' here the tag-extracted name? Or does it have all the tags in it?
If it's the former, then I would expect that I could have two histograms which were identical in tag-extracted name but differ in tag-values.
RefcountPtr<ParentHistogramImpl> stat(new ParentHistogramImpl( | ||
final_stat_name, unit, parent_, *this, extraction.tagExtractedName(), extraction.tags())); | ||
final_stat_name, unit, parent_, *this, extraction.tagExtractedName(), tags)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need the central & TLS caches to look at the tags as part of their key, if they are not part of the passed-in StatName.
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
/wait @snowp lmk when you want me to pick this back up. |
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
Back from vacation now and will try to get this into a reviewable state |
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
@@ -457,7 +457,24 @@ Gauge& ThreadLocalStoreImpl::ScopeImpl::gaugeFromStatName(StatName name, | |||
return gauge; | |||
} | |||
|
|||
SymbolTable::StoragePtr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmarantz the way I'm constructing the full stat name here seems to work at the surface, but when I try to fetch the stats back from the store it fails to find the existing stat, and it looks like the stat names in the store are garbage (a symbolTable().toString ends up printing out of bounds). Is there anything obviously wrong that I'm doing here?
I kinda suspect that the storage isn't living long enough to be valid after I exit this scope, but not sure how else to create the symbol names.
|
||
auto stat_name_no_tags = StatName(symbolTable().join({prefix_.statName(), name}).get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
store the return StoragePtr in a temp that outlives stat_name_no_tags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend also not using 'auto' here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one more thing here, just for your intuition: StatName
is a lot like string_view
; you would probably not want to do:
absl::string_view view(std::string("xx"));
because the std::string would be destructed before you could use the view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, I think my confusion comes from the fact that it seems like it's okay to destruct the Storage
after the stat name is handed to the store/scope, but I'm guessing the store takes ownership over that in some sense, or perhaps copies it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copies it.
Going through and updating call sites made me realize that this won't be compatible with hot restart stats transfer in its current form due to the stats transfer just sending the name/value of the stats. Presumably we could pack the tag names into the name and parse them out on the receiving end, but that can probably wait until a follow up PR |
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
source/common/grpc/context_impl.cc
Outdated
@@ -49,7 +49,9 @@ void ContextImpl::chargeStat(const Upstream::ClusterInfo& cluster, Protocol prot | |||
symbol_table_.join({protocolStatName(protocol), request_names.service_, request_names.method_, | |||
status_stat_name}); | |||
|
|||
cluster.statsScope().counterFromStatName(Stats::StatName(stat_name_storage.get())).inc(); | |||
cluster.statsScope() | |||
.counterFromStatName(Stats::StatName(stat_name_storage.get()), absl::nullopt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this does make the PR a lot bigger (and was probably annoying to do; sorry about that). On the plus side, having only a single implementation in the various stores is a big win.
I think I agree with the rationale in the Google Style Guide pre-C++11.
https://google.github.io/styleguide/cppguide.html#Default_Arguments
However, with C++11 I feel that the argument against default-args on virtual functions misses the distinction between PURE interfaces and overrides, which are more explicit in C++11. i.e. if we define the default value only at the interface, and disallow changing the default for overrides, I don't see how what scenario doesn't behave correctly.
Moreover there is precedent in the Envoy code base for this pattern of default-args in interfaces. A simple grep finds:
./envoy/json/json_object.h:80: virtual ObjectSharedPtr getObject(const std::string& name, bool allow_empty = false) const PURE;
./envoy/json/json_object.h:107: bool allow_empty = false) const PURE;
./envoy/json/json_object.h:132: bool allow_empty = false) const PURE;
./envoy/common/scope_tracker.h:26: virtual void dumpState(std::ostream& os, int indent_level = 0) const PURE;
./envoy/common/pure.h:7:#define PURE = 0
./envoy/stream_info/filter_state.h:78: StateType state_type, LifeSpan life_span = LifeSpan::FilterChain) PURE;
./envoy/event/timer.h:43: const ScopeTrackedObject* object = nullptr) PURE;
./envoy/event/timer.h:53: const ScopeTrackedObject* object = nullptr) PURE;
./envoy/http/api_listener.h:27: bool is_internally_created = false) PURE;
./envoy/http/header_map.h:602: virtual void dumpState(std::ostream& os, int indent_level = 0) const PURE;
./envoy/http/codec.h:489: bool is_internally_created = false) PURE;
WDYT?
The other alternative is to just use a different interface name for the methods that take tag-vectors.
I'm also OK with what you have if you strongly prefer but I'd be interested in @mattklein123 's opinion on this. There are other cases where we deviate from the Google style guide, and document that fact in https://github.com/envoyproxy/envoy/blob/master/STYLE.md .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with allowing default values on interface functions. I honestly don't know the history of why they are generically blocked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found out why this is disallowed; it doesn't work as I thought it ought to:
struct Base {
virtual void f(int i = 42) = 0;
};
struct Derived : Base {
void f(int) override;
};
int main() {
Derived d;
Base& b = d; // Use the same object via a different name and static type.
b.f(); // OK, equivalent to d.Derived::f(42).
d.f(); // Error. Derived::f requires one argument.
}
I think we either have to explicitly give the nullopt arg at every counterFromStatName() call, or we'll have to use a non-virtual helper. I'm OK either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem I ran into with the non-virtual helper is that StatName
isn't a complete type at this point so I can't actually define the non-virtual helper on the top level Scope
class. We could define StatName
in include/envoy/common/stats/symbol_table.h, but that puts a fair bit of implementation logic into the include/ which seems undesirable. The other way to make this possible would be to use a virtual interface class for it, but I'm guessing we don't want that for the sake of keeping StatName
simple?
Using a default value on the top level definition would be helpful for most of the use cases since most things are invoking the functions from a Scope
reference, so it might still be useful even if the default value overload doesn't inherit? It might end up being confusing, but if we're just trying to solve for common case it does make the API a bit simpler.
That said I'm happy to keep the explicit value if we think that's the way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you define the non-virtual helper if it's passing through const StatName&
rather than StatName
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was able to get it working by using const StatName&
and StatNameTagVectorOptRef
everywhere, ptal
Signed-off-by: Snow Pettersen <[email protected]>
…t pass empty Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
include/envoy/stats/scope.h
Outdated
* @param tags optionally specified tags. | ||
* @return a counter within the scope's namespace. | ||
*/ | ||
virtual Counter& counterFromStatName(const StatName& name, StatNameTagVectorOptRef tags) PURE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we rename this method so we don't have a virtual and non-virtual method with the same name?
include/envoy/stats/stats.h
Outdated
@@ -16,6 +16,7 @@ namespace Stats { | |||
|
|||
class Allocator; | |||
struct Tag; | |||
using TagVector = std::vector<Tag>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we can we just include tag.h here rather than predeclaring Tag and repeating the using
?
@@ -136,15 +145,15 @@ class IsolatedStoreImpl : public StoreImpl { | |||
|
|||
Counter& counter(const std::string& name) override { | |||
StatNameManagedStorage storage(name, symbolTable()); | |||
return counterFromStatName(storage.statName()); | |||
return counterFromStatName(storage.statName(), absl::nullopt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can drop these absl::nullopt args now right?
@@ -45,7 +45,7 @@ class ThreadLocalStorePerf { | |||
|
|||
void accessCounters() { | |||
for (auto& stat_name_storage : stat_names_) { | |||
store_.counterFromStatName(stat_name_storage->statName()); | |||
store_.Stats::Store::counterFromStatName(stat_name_storage->statName()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this hack is due to having virtual/non-virtual overloads of the same name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, ill rename one of them to get rid of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is getting much closer; rename the methods that take the tag-vector (e.g. append withTags
) and a few other minor nits and we are good to go.
Thank you so much for sticking with this; this is great.
Signed-off-by: Snow Pettersen <[email protected]>
Signed-off-by: Snow Pettersen <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thank you!
@mattklein123 I assume you'll want to take the next pass.
Signed-off-by: Snow Pettersen [email protected]
Risk Level: Low
Testing: UTs
Docs Changes: n/a
Release Notes: n/a
#9194