-
Notifications
You must be signed in to change notification settings - Fork 710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Report tracing_unbounded
channel size to prometheus
#1489
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -24,7 +24,10 @@ use prometheus::{ | |||||
Error as PrometheusError, Registry, | ||||||
}; | ||||||
|
||||||
use prometheus::{core::GenericCounterVec, Opts}; | ||||||
use prometheus::{ | ||||||
core::{GenericCounterVec, GenericGaugeVec}, | ||||||
Opts, | ||||||
}; | ||||||
|
||||||
lazy_static! { | ||||||
pub static ref TOKIO_THREADS_TOTAL: GenericCounter<AtomicU64> = | ||||||
|
@@ -36,18 +39,32 @@ lazy_static! { | |||||
} | ||||||
|
||||||
lazy_static! { | ||||||
pub static ref UNBOUNDED_CHANNELS_COUNTER : GenericCounterVec<AtomicU64> = GenericCounterVec::new( | ||||||
Opts::new("substrate_unbounded_channel_len", "Items in each mpsc::unbounded instance"), | ||||||
&["entity", "action"] // 'name of channel, send|received|dropped | ||||||
pub static ref UNBOUNDED_CHANNELS_COUNTER: GenericCounterVec<AtomicU64> = GenericCounterVec::new( | ||||||
Opts::new( | ||||||
"substrate_unbounded_channel_len", | ||||||
"Items sent/received/dropped on each mpsc::unbounded instance" | ||||||
), | ||||||
&["entity", "action"], // name of channel, send|received|dropped | ||||||
).expect("Creating of statics doesn't fail. qed"); | ||||||
pub static ref UNBOUNDED_CHANNELS_SIZE: GenericGaugeVec<AtomicU64> = GenericGaugeVec::new( | ||||||
Opts::new( | ||||||
"substrate_unbounded_channel_size", | ||||||
"Size (number of messages to be processed) of each mpsc::unbounded instance", | ||||||
), | ||||||
&["entity"], // name of channel | ||||||
).expect("Creating of statics doesn't fail. qed"); | ||||||
|
||||||
} | ||||||
|
||||||
pub static SENT_LABEL: &'static str = "send"; | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The original label name was "send", and I'm not sure if it's a good idea to break the compatibility — there are probably a lot of automation scripts using this. |
||||||
pub static RECEIVED_LABEL: &'static str = "received"; | ||||||
pub static DROPPED_LABEL: &'static str = "dropped"; | ||||||
|
||||||
/// Register the statics to report to registry | ||||||
pub fn register_globals(registry: &Registry) -> Result<(), PrometheusError> { | ||||||
registry.register(Box::new(TOKIO_THREADS_ALIVE.clone()))?; | ||||||
registry.register(Box::new(TOKIO_THREADS_TOTAL.clone()))?; | ||||||
registry.register(Box::new(UNBOUNDED_CHANNELS_COUNTER.clone()))?; | ||||||
registry.register(Box::new(UNBOUNDED_CHANNELS_SIZE.clone()))?; | ||||||
|
||||||
Ok(()) | ||||||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure that a single
gauge
will provide enough data here. Since a gauge is a simple counter cell, Prometheus will get just a single value on the next collection. However, we will not track any spikes of this value between the collection intervals. I would suggest to use histogram here, as it includes both Gauge functionality and buckets that will allow to track anomalities and peaks easily. On the other hand, it is more expensive.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll go with a simple gauge here, because it's hard to estimate the performance impact of using a histogram, considering we use channels extensively. So far this is merely a fix for the instant channel size calculation, which was done on the CI side previously using metrics not intended for this. But thanks for the suggestion anyway.