Report CounterEvents as cumulative metrics to Stackdriver #162
structure is good, just nit stuff. With this change do we need to say only run a single instance of the nozzle (until consistent nozzling lands)?

Reviewed 9 of 9 files at r1.

src/stackdriver-nozzle/messages/metric.go, line 31 at r1 (raw file):
good catch since we're appending below.

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 101 at r1 (raw file):
suggest naming: UpdateCounter, Track, Update, ... Get isn't generally stateful. Curious why you went with passing/returning eventTime as a pointer?

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 114 at r1 (raw file):
suggest adding another return parameter,

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 117 at r1 (raw file):
is there a reason not to reset our counter as well in this scenario?

src/stackdriver-nozzle/nozzle/metric_sink_test.go, line 241 at r1 (raw file):

Comments from Reviewable
Review status: all files reviewed at latest revision, 10 unresolved discussions.

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 31 at r1 (raw file):
Naming nit: "total" instead of "count" for consistency? or just drop the "count" (it's cleaner!)

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 66 at r1 (raw file):
Similarly here time.Duration is really just an int64 so this should probably not be a pointer. I can't find any "pointer is nil" overloading for this particular field either.

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 79 at r1 (raw file):
Is this frequent enough? It means that the default TTL of 70s really means "between 70 and 105 seconds", which is a relatively large interval. I understand why basing it on the TTL makes sense -- to scale down to small TTLs effectively -- but maybe there needs to be an upper limit on the tick time to prevent large TTLs having a lot of variance?

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 101 at r1 (raw file):
Previously, johnsonj (Jeff Johnson) wrote…
+1 to time not being a pointer. https://golang.org/pkg/time/#Time says: "Programs using times should typically store and pass them as values, not pointers. That is, time variables and struct fields should be of type time.Time, not *time.Time."

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 114 at r1 (raw file):
Previously, johnsonj (Jeff Johnson) wrote…
I agree that a bool that indicates whether the metric should be published is a better approach than having the nil-ness of a time.Time pointer convey the same information.

src/stackdriver-nozzle/nozzle/counter_tracker_test.go, line 42 at r1 (raw file):
Isn't context.Background() OK to use in tests?

src/stackdriver-nozzle/nozzle/counter_tracker_test.go, line 69 at r1 (raw file):
I think with a little effort you could significantly reduce the amount of copy/paste code here...

src/stackdriver-nozzle/nozzle/metric_sink_test.go, line 208 at r1 (raw file):
Ugh, many integer types. I was going to say that you should probably prefer to use time.Second here, but the amount of typecasting required for this to work is awkward. Even so, eventTime.UnixNano() + int64(time.Second)*int64(idx) or similar might be better.

src/stackdriver-nozzle/nozzle/metric_sink_test.go, line 209 at r1 (raw file):
Is the cast required here? eventValues is [][]uint64 already.

Comments from Reviewable
Oh yes, good point: this requires running a single copy of the nozzle. Should I document that somewhere, or maybe keep the old behavior making the counterTracker path enabled by a configuration option? I have also discussed this with @fluffle this morning and made some significant changes:

Review status: 2 of 9 files reviewed at latest revision, 10 unresolved discussions.

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 31 at r1 (raw file):
Previously, fluffle (Alex Bee) wrote…
Done.

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 66 at r1 (raw file):
Previously, fluffle (Alex Bee) wrote…
Done.

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 79 at r1 (raw file):
Previously, fluffle (Alex Bee) wrote…
Good point, thanks. Capped at 10 seconds.

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 101 at r1 (raw file):
Previously, fluffle (Alex Bee) wrote…
Renamed to Update(). Switched time types to be passed as values. Instead of interpreting a nil pointer as indication of the first value seen for a counter, metric sink just uses the cumulative value of 0 (which seems more intuitive anyway).

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 114 at r1 (raw file):
Previously, fluffle (Alex Bee) wrote…
Currently that decision is made based on returned cumulative value being more than 0. This seems logical, since the returned value is in fact correct and it could be used if Stackdriver accepted points where event time == start time. So the decision of whether to publish that value or not should be made by the caller, I think. Please let me know if you disagree.

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 117 at r1 (raw file):
Previously, johnsonj (Jeff Johnson) wrote…
Well, I don't see a reason to reset it, given that we have all the state to continue adding to the existing total.

src/stackdriver-nozzle/nozzle/counter_tracker_test.go, line 42 at r1 (raw file):
Previously, fluffle (Alex Bee) wrote…
It does seem like it should be OK, however other tests are using context.TODO().

src/stackdriver-nozzle/nozzle/counter_tracker_test.go, line 69 at r1 (raw file):
Previously, fluffle (Alex Bee) wrote…
Done. Not sure if this is the right way to do such things in Ginkgo.

src/stackdriver-nozzle/nozzle/metric_sink_test.go, line 208 at r1 (raw file):
Previously, fluffle (Alex Bee) wrote…
Done.

src/stackdriver-nozzle/nozzle/metric_sink_test.go, line 209 at r1 (raw file):
Previously, fluffle (Alex Bee) wrote…
Done.

Comments from Reviewable
I think we should make it enabled by configuration. This is the ideal behavior but if we enforce single node deployments then we break scaling where it works fine today (for Stackdriver Logging).

Reviewed 7 of 7 files at r2.

src/stackdriver-nozzle/messages/metric.go, line 38 at r2 (raw file):
very helpful comment!

src/stackdriver-nozzle/messages/metric.go, line 70 at r2 (raw file):
is this TODO relevant or copy/pasted?

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 79 at r1 (raw file):
Previously, knyar (Anton Tolchanov) wrote…
I think a fixed period would be just fine and more predictable (1? 5? 10 seconds?). Agree with the TTL increase being larger than two metric periods.

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 101 at r1 (raw file):
Previously, knyar (Anton Tolchanov) wrote…
New header looks good. Resolving as the 'ok' discussion is below.

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 114 at r1 (raw file):
Previously, knyar (Anton Tolchanov) wrote…
0 seems like a potential real value to me. If a counter was 'outOfMemoryErrors' then it very well may emit a cumulative 0 for a long period of time. I think we want to record that value in Stackdriver Monitoring and not hide it.

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 165 at r2 (raw file):
suggest logging before resetting the value. is it necessary to reset the value? it looks like newCounterData will take care of that if it reappears.

src/stackdriver-nozzle/nozzle/counter_tracker_test.go, line 42 at r1 (raw file):
Previously, knyar (Anton Tolchanov) wrote…
agree Background is correct. Would appreciate that clean up in another commit.

src/stackdriver-nozzle/nozzle/counter_tracker_test.go, line 69 at r1 (raw file):
Previously, knyar (Anton Tolchanov) wrote…
It's totally reasonable and a good clean up. Ginkgo allows you to create custom matchers[1] but it may not be the right fit here.
[1] https://onsi.github.io/gomega/#adding-your-own-matchers

src/stackdriver-nozzle/stackdriver/metric_adapter.go, line 155 at r2 (raw file):
works for me, seems like a positive change

Comments from Reviewable
ea04f33 to 69ecbb1
Sounds good. I've added a new configuration option for this, which can be enabled in the manifest.

Review status: 1 of 11 files reviewed at latest revision, 12 unresolved discussions.

src/stackdriver-nozzle/messages/metric.go, line 70 at r2 (raw file):
Previously, johnsonj (Jeff Johnson) wrote…
It has been copy/pasted from metric_adapter.go. Should I remove it?

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 79 at r1 (raw file):
Previously, johnsonj (Jeff Johnson) wrote…
As Alex mentioned, there's value in having period decrease for short TTLs, mostly to make tests pass faster. In production it will probably always be capped by

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 114 at r1 (raw file):
Previously, johnsonj (Jeff Johnson) wrote…
Very good point, thanks. I've actually changed the condition in the metric sink to check for non-zero time interval, which is what Stackdriver actually expects.

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 165 at r2 (raw file):
Previously, johnsonj (Jeff Johnson) wrote…
Added logging. Resetting to -1 makes expired counters distinguishable from active ones on the

src/stackdriver-nozzle/nozzle/counter_tracker_test.go, line 69 at r1 (raw file):
Previously, johnsonj (Jeff Johnson) wrote…
Thanks!

src/stackdriver-nozzle/stackdriver/metric_adapter.go, line 155 at r2 (raw file):
Previously, johnsonj (Jeff Johnson) wrote…
Cool, thanks for the feedback.

Comments from Reviewable
Reviewed 10 of 10 files at r3.

jobs/stackdriver-nozzle/spec, line 71 at r3 (raw file):
Just a note- I think it's good to leave this out of the tile.yaml.erb for now. Likely we'll want to just enable this feature once the upstream work lands in PCF.

src/stackdriver-nozzle/messages/metric.go, line 70 at r2 (raw file):
Previously, knyar (Anton Tolchanov) wrote…
yes please! no clue what it's a TODO - to do :)

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 79 at r1 (raw file):
Previously, knyar (Anton Tolchanov) wrote…
Ah ok thanks!

src/stackdriver-nozzle/nozzle/counter_tracker.go, line 165 at r2 (raw file):
Previously, knyar (Anton Tolchanov) wrote…
thanks! that makes sense

Comments from Reviewable
Review status: 10 of 11 files reviewed at latest revision, 11 unresolved discussions.

Comments from Reviewable

Reviewed 1 of 1 files at r4.

Comments from Reviewable
Review status: all files reviewed at latest revision, 3 unresolved discussions.

jobs/stackdriver-nozzle/spec, line 71 at r3 (raw file):
Previously, johnsonj (Jeff Johnson) wrote…
Sounds good.

src/stackdriver-nozzle/messages/metric.go, line 70 at r2 (raw file):
Previously, johnsonj (Jeff Johnson) wrote…
Done.

Comments from Reviewable
@johnsonj, I believe this is now ready for you to merge. I don't seem to be able to merge myself because of some pending comments in Reviewable, even though I believe I've addressed all of them.
This adds a "counter tracker" which is used to maintain start time and a separate cumulative value for each counter metric, which need to be reported to Stackdriver.
As a result, each counter is properly exported as a cumulative Stackdriver metric instead of two gauge metrics (.total and .delta).