Stats endpoint #1384
monitored_pools() ->
    % TODO(jaym) 08-23-17: Move this out to configuration
    [sqerl, oc_chef_authz_http, chef_index_http, chef_depsolver].
I'm not 100% sure, but I suspect that keygen, chef_objects and data_collector may also have pools to monitor
Tried those, didn't work :(
You may have been thinking of this:
(oc_erchef@api)1> chef_keygen_cache:status().
[{keys,10},
{max,10},
{max_workers,2},
{cur_max_workers,2},
{inflight,[]},
{avail_workers,2},
{start_size,0}]
@@ -56,6 +56,12 @@
    {error_logger_hwm, <%= @log_rotation['max_messages_per_second'] %>}
  ]},

  {prometheus, [{collectors, [default,
<% if node['private_chef']['postgresql']['enable'] && !node['private_chef']['postgresql']['external'] -%>
Is there anything about the stats we collect that would make it hard to collect for external DBs?
Not hard, but I was more thinking about who owns the service. The idea was, if the service is not on the box, it's not our problem to monitor. I could be convinced otherwise, however there might be weirdness if say multiple frontends were to report a service that they didn't own.
Read too late that this was WIP -- So, for what it's worth, some in-progress-nitpicks...
-module(chef_wm_stats).

-ifdef(TEST).
-compile(export_all).
[nit] This doesn't export more than -export() already exports, does it?
It should export everything in the file (for any tests to use them).
     {"text/plain", to_text}
    ], Req, State}.

to_json(Req, State) ->
Is this meant for later inclusion? I wonder if this should provide application/json if it doesn't actually provide it.
Yep, I'll be working on this today. Prometheus is a secondary format and json will be default
-define(A2B(X), erlang:atom_to_binary(X, utf8)).

init(_Any) ->
    {ok, <<"{}">>}.
What's the use of passing <<"{}">> around as State? (Am I missing something?)
Not sure, I just copy pasted from elsewhere. Will take a closer look here
Cleaned up.
         to_text/2]).

-include_lib("webmachine/include/webmachine.hrl").
-define(A2B(X), erlang:atom_to_binary(X, utf8)).
[nit] Is this used?
Callback = fun (MF) ->
               Data = get(?PROCESS_DICT_STORAGE),
               put(?PROCESS_DICT_STORAGE, [mf_to_erl(MF) | Data])
           end,
It is a bit of a bummer that the API here doesn't let us pass around an accumulator instead.
yea :(
Would have been a lot cleaner
" SUM(heap_blks_read) as heap_blks_read, SUM(heap_blks_hit) as heap_blks_hit,"
" SUM(idx_blks_read) as idx_blks_read, SUM(idx_blks_hit) as idx_blks_hit, SUM(toast_blks_read) as toast_blks_read,"
" SUM(toast_blks_hit) as toast_blks_hit, SUM(tidx_blks_read) as tidx_blks_read,"
" SUM(tidx_blks_hit) as tidx_blks_hit FROM pg_stat_all_tables, pg_statio_all_tables">>}.
I think this is inaccurate. "," does a cartesian join and inflates the number of final rows. I'm still trying to figure out if there was a join here we wanted or if we just wanted to get a composite query and how to do that (because reading postgresql to me is like a Spaniard trying to read Brazilian Portuguese... so close but yet so far).
There's no join, just a composite of 2 queries. Perhaps I was trying to be too clever.
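To make the inflation concern above concrete, here is a minimal Ruby sketch that imitates a comma join with `Array#product`. The row values and the helper names `sum_column` and `sum_over_cross_join` are made up for illustration and are not part of this PR; the point is only that a SUM taken over `FROM a, b` counts each row of `a` once per row of `b`.

```ruby
# Made-up rows standing in for pg_stat_all_tables and pg_statio_all_tables.
STAT_ROWS   = [{ seq_scan: 10 }, { seq_scan: 5 }, { seq_scan: 1 }].freeze
STATIO_ROWS = [{ heap_blks_read: 7 }, { heap_blks_read: 3 }, { heap_blks_read: 2 }].freeze

# SUM(column) over a single table.
def sum_column(rows, column)
  rows.sum { |row| row[column] }
end

# SUM(column) over "FROM left, right": every left row is paired with every
# right row, so each left value is counted once per right row.
def sum_over_cross_join(left, right, column)
  left.product(right).sum { |left_row, _right_row| left_row[column] }
end

puts "separate query: #{sum_column(STAT_ROWS, :seq_scan)}"
puts "comma join:     #{sum_over_cross_join(STAT_ROWS, STATIO_ROWS, :seq_scan)}"
```

With three rows on each side, the separate query sums to 16 while the comma join sums to 48, so running the two aggregates as separate queries (or as two scalar subqueries) avoids the multiplication.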
c88ac97 to 6e84408
I've left some comments regarding the password generation here. Other than that it looks good to me.
I'm a bit nervous about the use of the process dictionary, but I don't have a lot of reasoning to back up that nervousness so we can probably roll with it.
  command "openssl passwd -apr1 '#{stats_api_passwd}' >> #{stats_passwd_file}"
  action :nothing
  sensitive true
end
We need to store this password back into the secrets store somewhere. Perhaps rather than doing this as an execute we can move this into a library and then generate the password when we generate the other passwords in the initial configuration parsing.
Wait I get this password from the secret store (unless I don't understand how this works). I do chef-server-ctl add-secret opscode_erchef stats_password "12345". This derived password is just a salted hash of that password.
I refactored this so that it autogenerates a password if not provided and also just generates the file in one go (instead of 2 steps). This way the nginx server can get restarted properly only when the password file actually changes.
    group OmnibusHelper.new(node).ownership['group']
    notifies :run, 'execute[stats_api_append_password]', :immediately
  end
end
As of the veil work, we try not to put any passwords outside of the veil store when the user has insecure_addon_compat false. We've never really talked about how strict we want to be about that moving forward since that work was done to address some particular customer concerns.
Unfortunately, from the docs it looks like it might be hard/impossible to use the basic_auth module without sticking the credential in that file. I suppose we could change over to using access_by_lua, but that seems a bit complex for this.
The password is salted and hashed from my understanding.
```
cat /var/opt/opscode/nginx/stats_htpasswd
foo:$apr1$S.a9NOEq$YiWDrTTJb68HjyFjm9CmH/
```
So it might be ok?
http://nginx.org/en/docs/http/ngx_http_auth_basic_module.html does seem to mention that the actual algorithms used might be... weak? I get suspicious when I see MD5 in there. But there seems to be no other nice way to do this.
@ksubrama Ah, yeah, somehow I overlooked the fact that this is in fact NOT a "plain text password" in that file. I think that means we are OK.
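For readers unfamiliar with the htpasswd layout discussed in this thread, the following Ruby sketch splits the example line quoted above into its parts. `HtpasswdEntry` and `parse_htpasswd_line` are hypothetical helper names, not code from this PR; the sketch only shows that the file stores a scheme marker, a salt, and a digest rather than the password itself.

```ruby
# Hypothetical helper: splits "user:$scheme$salt$digest" into its fields.
HtpasswdEntry = Struct.new(:user, :scheme, :salt, :digest)

def parse_htpasswd_line(line)
  user, hashed = line.strip.split(':', 2)
  # Only crypt-style entries ("$scheme$salt$digest") are handled here.
  return nil unless hashed && hashed.start_with?('$')

  _empty, scheme, salt, digest = hashed.split('$', 4)
  HtpasswdEntry.new(user, scheme, salt, digest)
end

entry = parse_htpasswd_line('foo:$apr1$S.a9NOEq$YiWDrTTJb68HjyFjm9CmH/')
puts "user=#{entry.user} scheme=#{entry.scheme} salt=#{entry.salt}"
```

The `apr1` scheme marker is the MD5-based variant that the nginx documentation mentions, which is the source of the "weak algorithm" concern raised above.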
-spec format(Registry :: prometheus_registry:registry()) -> binary().
format(Registry) ->
    put(?PROCESS_DICT_STORAGE, []),
My only concern with using the process dictionary here is out of ignorance. I don't know if putting what could be a lot of data in the process dictionary will cause us any problems down the road. The prometheus code uses the ram_file module for all of its formatters; however I have no idea if that is better or worse. Perhaps for now this is just something we live with but keep an eye on.
Yea, I wanted to use ram file, but I couldn't figure out how to do it without rolling my own json serializer. That being said, there shouldn't be a lot of data. The other option I could think of was creating a gen server to store the state.
If it makes you feel better, webmachine already makes fairly heavy use of the process dictionary: https://github.com/webmachine/webmachine/blob/82472528d8a735d28fa92a7c6e03fc97cb4d1913/src/webmachine_decision_core.erl#L28
I'm +1 on leaving this as is for now. Thanks for following up.
@@ -647,6 +648,14 @@ def gen_redundant(node_name, topology)
    end
  end

  def ensure_stats_password
The RFC for this feature says:
A username and password will be generated if it not provided.
So, I think we should probably be treating this like the other non-optional passwords where we generate the username and password if they are not provided.
Ah I missed that. Ok - I need to figure out how to save a generated password.
👍
{<<"seq_tup_read">>, {pg_stat_seq_tup_read, counter, "Number of live rows fetched by sequential scans", fun erlang:binary_to_integer/1}},
{<<"idx_scan">>, {pg_stat_idx_scan, counter, "Number of index scans initiated", fun erlang:binary_to_integer/1}},
{<<"idx_tup_fetch">>, {pg_stat_tup_fetch, counter, "Number of live rows fetched by index scans", fun erlang:binary_to_integer/1}},
{<<"n_tup_ins">>, {pg_stat_n_tup_ins, counter, "Number of rows inserted", fun erlang:binary_to_integer/1}},
We should make it clear in these descriptions that these are sums across all tables perhaps.
👍
@ksubrama you mentioned you had some pedant tests. If so, could you push them up?
9c33b3f to 9675732
stat["metrics"].each do |metric|
  expect(metric).to have_key("value")
  if type == "GAUGE" || type == "COUNTER"
    expect(Integer(metric["value"])).to be_a(Integer)
I think gauges can be floats as well
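One way to act on the gauge comment above is to parse gauge values with `Float()` while keeping counters strict. This is a sketch only: `numeric_metric_value` is a hypothetical helper, and treating counters as integers simply mirrors the original assertion rather than any rule stated in this PR.

```ruby
# Hypothetical helper: converts the string value of a metric to a number.
# Raises ArgumentError when the string is not numeric for its type.
def numeric_metric_value(type, raw)
  case type
  when 'GAUGE'   then Float(raw)        # accepts "4" as well as "0.75"
  when 'COUNTER' then Integer(raw, 10)  # base 10, so "010" is not read as octal
  else raise ArgumentError, "unsupported metric type: #{type}"
  end
end

puts numeric_metric_value('GAUGE', '0.75')
puts numeric_metric_value('COUNTER', '262144')
```

In the spec, the existing `Integer(metric["value"])` check could then be replaced by an expectation that this conversion does not raise.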
response_type_map.each do |name, type|
  stat = response.find { |s| s["name"] == name }
  expect(stat["metrics"]).not_to be_empty
  stat["metrics"].each do |metric|
Instead of iterating inside the it, is it possible to create an it for each so that if there is a failure, you know what is broken?
Something like https://coderwall.com/p/5kfxhg/dynamic-rspec-tests
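A rough sketch of the "one example per metric" idea follows. The two-entry `RESPONSE_TYPE_MAP`, the `stat_problems` helper, and the `response` value used inside the examples are assumptions for illustration (the real spec would supply the parsed JSON body); the RSpec part is guarded so the helper can also be loaded on its own.

```ruby
# Hypothetical subset of the name => type map used by the spec above.
RESPONSE_TYPE_MAP = {
  'erlang_vm_process_count' => 'GAUGE',
  'erlang_vm_statistics_reductions_total' => 'COUNTER'
}.freeze

# Returns a list of problems for one stat entry; empty means well formed.
def stat_problems(stat, type)
  return ['stat missing from response'] if stat.nil?

  problems = []
  problems << "expected type #{type}, got #{stat['type']}" unless stat['type'] == type
  metrics = Array(stat['metrics'])
  problems << 'no metrics' if metrics.empty?
  metrics.each_with_index do |metric, i|
    problems << "metric #{i} has no value" unless metric.key?('value')
  end
  problems
end

if defined?(RSpec)
  RSpec.describe '/_stats' do
    # `response` is assumed to be provided elsewhere (e.g. via let) as the
    # parsed JSON array returned by the endpoint.
    RESPONSE_TYPE_MAP.each do |name, type|
      it "returns well-formed #{type} metrics for #{name}" do
        stat = response.find { |s| s['name'] == name }
        expect(stat_problems(stat, type)).to eq([])
      end
    end
  end
end
```

Generating the examples in a loop at describe time means a failure report names the specific metric rather than the whole map.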
% I think there is a bug in webmachine where it wont allow us to use
% 'text/plain; version=0.0.4'.
% TODO: Understand https://github.com/basho/webmachine/blob/develop/src/webmachine_util.erl#L140-L158
{{"text/plain",[{"version","0.0.4"}]}, to_text},
So to follow the rfc, we're supposed to accept &format=.
Seeing as we don't do that, we should probably remove text/plain for now. Thoughts?
I'd rather us just quickly implement a format parameter, but I understand that there is probably some desire to get this PR wrapped up.
Done :)
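From the client side, the format parameter agreed on here could be exercised as in the sketch below. The `/_stats` path and the `json`/`text` values come from the PR description; the host name, the credentials, and the `stats_request` helper are placeholders, and the request is only built, not sent.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds (but does not send) a GET for the stats endpoint.
def stats_request(base_url, format:, user:, password:)
  uri = URI.join(base_url, '/_stats')
  uri.query = URI.encode_www_form(format: format)
  request = Net::HTTP::Get.new(uri)
  request.basic_auth(user, password)
  [uri, request]
end

uri, request = stats_request('https://chef.example.com',
                             format: 'json', user: 'statsuser', password: 'secret')
puts uri
puts request['authorization'].split(' ').first
```

Sending it would be a matter of passing `request` to `Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }` against a real server.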
@@ -17,6 +17,15 @@ def ownership
    {"owner" => owner, "group" => group}
  end

  def apr1_password(password)
    cmd = Mixlib::ShellOut.new("openssl passwd -apr1 '#{password}'")
Should we use a full path to our embedded openssl install here?
@jaym Do we need to do anything here for FIPS?
OPENSSL_FIPS=1
I highly doubt that will work with FIPS. It might be safer to use crypt() instead. Not sure you can have a working system where crypt() doesn't work.
I've left a couple of comments but am marking this as approved. Feel free to address the comments as you see fit.
We should err on the side of getting this in and then iterating on it since it isn't in the critical path of any requests.
{"text/plain", to_text}
], Req, State}.
{"text/plain", to_text}],
case wrq:get_qs_value("format", Req) of
cool
42290b5 to 3d0114d
# Run command once and cache the value.
STATS_PASSWORD =
  begin
    cmd = Mixlib::ShellOut.new("chef-server-ctl show-secret opscode_erchef stats_password")
Alternatively to this, we can assume the password is in the environment:
https://github.com/chef/chef-server/blob/master/oc-chef-pedant/lib/pedant.rb#L79-L80
And then do something like this to feed it into the env at runtime:
Perhaps also putting a guard around the tests to disable them if the secret doesn't exist.
You'd also need to add the stats password here: https://github.com/chef/chef-server/blob/master/dev/defaults.yml#L250-L255
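The guard suggested in this thread might be sketched as follows. This is not the snippet the reviewer had in mind; the variable name `PEDANT_STATS_PASSWORD` and both helper names are assumptions, chosen only to show tests being switched off when no secret is available.

```ruby
# Hypothetical: the environment variable name is an assumption, not pedant's.
STATS_PASSWORD_ENV_VAR = 'PEDANT_STATS_PASSWORD'.freeze

# Returns the password from the environment, or nil when unset or blank.
def stats_password_from_env(env = ENV)
  value = env[STATS_PASSWORD_ENV_VAR]
  value unless value.nil? || value.strip.empty?
end

# True when the stats tests have a password to authenticate with.
def stats_tests_enabled?(env = ENV)
  !stats_password_from_env(env).nil?
end

puts stats_tests_enabled?('PEDANT_STATS_PASSWORD' => 'secret')
puts stats_tests_enabled?({})
```

In an RSpec suite the second helper could feed a `skip` call or an `if:` filter so the stats examples are reported as skipped rather than failing on hosts without the secret.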
3d0114d to 9b8879f
👍
Currently returns some basic Erlang VM statistics in the Prometheus text format. This is the first step towards RFC #93.

Basic auth support is present out of the box. The user can edit the ['opscode-erchef']['stats_auth_enable'] flag to change this. The actual password is stored in opscode_erchef.stats_password in chef veil. A default password is generated if the user does not specify one before running chef-server-ctl reconfigure.

The endpoint is accessible at /_stats. You can use ?format=json or ?format=text to access the available metrics in JSON or Prometheus formats. The stats returned are documented at #1385.

```
# TYPE erlang_vm_memory_atom_bytes_total gauge
# HELP erlang_vm_memory_atom_bytes_total The total amount of memory currently allocated for atoms. This memory is part of the memory presented as system memory.
erlang_vm_memory_atom_bytes_total{usage="used"} 1455697
erlang_vm_memory_atom_bytes_total{usage="free"} 12848
# TYPE erlang_vm_memory_bytes_total gauge
# HELP erlang_vm_memory_bytes_total The total amount of memory currently allocated. This is the same as the sum of the memory size for processes and system.
erlang_vm_memory_bytes_total{kind="system"} 45984768
erlang_vm_memory_bytes_total{kind="processes"} 35098912
# TYPE erlang_vm_dets_tables gauge
# HELP erlang_vm_dets_tables Erlang VM DETS Tables count
erlang_vm_dets_tables 0
# TYPE erlang_vm_ets_tables gauge
# HELP erlang_vm_ets_tables Erlang VM ETS Tables count
erlang_vm_ets_tables 64
# TYPE erlang_vm_memory_processes_bytes_total gauge
# HELP erlang_vm_memory_processes_bytes_total The total amount of memory currently allocated for the Erlang processes.
erlang_vm_memory_processes_bytes_total{usage="used"} 35091312
erlang_vm_memory_processes_bytes_total{usage="free"} 7600
# TYPE erlang_vm_memory_system_bytes_total gauge
# HELP erlang_vm_memory_system_bytes_total The total amount of memory currently allocated for the emulator that is not directly related to any Erlang process. Memory presented as processes is not included in this memory.
erlang_vm_memory_system_bytes_total{usage="atom"} 1468545
erlang_vm_memory_system_bytes_total{usage="binary"} 803360
erlang_vm_memory_system_bytes_total{usage="code"} 33479913
erlang_vm_memory_system_bytes_total{usage="ets"} 2372400
erlang_vm_memory_system_bytes_total{usage="other"} 7860550
# TYPE erlang_vm_statistics_context_switches counter
# HELP erlang_vm_statistics_context_switches Total number of context switches since the system started
erlang_vm_statistics_context_switches 29549971
# TYPE erlang_vm_statistics_garbage_collection_number_of_gcs counter
# HELP erlang_vm_statistics_garbage_collection_number_of_gcs Garbage collection: number of GCs
erlang_vm_statistics_garbage_collection_number_of_gcs 1632986
# TYPE erlang_vm_statistics_garbage_collection_words_reclaimed counter
# HELP erlang_vm_statistics_garbage_collection_words_reclaimed Garbage collection: words reclaimed
erlang_vm_statistics_garbage_collection_words_reclaimed 12664754817
# TYPE erlang_vm_statistics_garbage_collection_bytes_reclaimed counter
# HELP erlang_vm_statistics_garbage_collection_bytes_reclaimed Garbage collection: bytes reclaimed
erlang_vm_statistics_garbage_collection_bytes_reclaimed 101318038536
# TYPE erlang_vm_statistics_bytes_received_total counter
# HELP erlang_vm_statistics_bytes_received_total Total number of bytes received through ports
erlang_vm_statistics_bytes_received_total 795724439
# TYPE erlang_vm_statistics_bytes_output_total counter
# HELP erlang_vm_statistics_bytes_output_total Total number of bytes output to ports
erlang_vm_statistics_bytes_output_total 604157373
# TYPE erlang_vm_statistics_reductions_total counter
# HELP erlang_vm_statistics_reductions_total Total reductions
erlang_vm_statistics_reductions_total 4946085703
# TYPE erlang_vm_statistics_run_queues_length_total gauge
# HELP erlang_vm_statistics_run_queues_length_total Total length of the run-queues
erlang_vm_statistics_run_queues_length_total 0
# TYPE erlang_vm_statistics_runtime_milliseconds counter
# HELP erlang_vm_statistics_runtime_milliseconds The sum of the runtime for all threads in the Erlang runtime system. Can be greater than wall clock time
erlang_vm_statistics_runtime_milliseconds 573750
# TYPE erlang_vm_statistics_wallclock_time_milliseconds counter
# HELP erlang_vm_statistics_wallclock_time_milliseconds Information about wall clock. Same as erlang_vm_statistics_runtime_milliseconds except that real time is measured
erlang_vm_statistics_wallclock_time_milliseconds 5495986
# TYPE erlang_vm_ets_limit gauge
# HELP erlang_vm_ets_limit The maximum number of ETS tables allowed.
erlang_vm_ets_limit 2053
# TYPE erlang_vm_logical_processors gauge
# HELP erlang_vm_logical_processors The detected number of logical processors configured in the system.
erlang_vm_logical_processors 4
# TYPE erlang_vm_logical_processors_available gauge
# HELP erlang_vm_logical_processors_available The detected number of logical processors available to the Erlang runtime system.
erlang_vm_logical_processors_available 4
# TYPE erlang_vm_logical_processors_online gauge
# HELP erlang_vm_logical_processors_online The detected number of logical processors online on the system.
erlang_vm_logical_processors_online 4
# TYPE erlang_vm_port_count gauge
# HELP erlang_vm_port_count The number of ports currently existing at the local node.
erlang_vm_port_count 61
# TYPE erlang_vm_port_limit gauge
# HELP erlang_vm_port_limit The maximum number of simultaneously existing ports at the local node.
erlang_vm_port_limit 65536
# TYPE erlang_vm_process_count gauge
# HELP erlang_vm_process_count The number of processes currently existing at the local node.
erlang_vm_process_count 395
# TYPE erlang_vm_process_limit gauge
# HELP erlang_vm_process_limit The maximum number of simultaneously existing processes at the local node.
erlang_vm_process_limit 262144
# TYPE erlang_vm_schedulers gauge
# HELP erlang_vm_schedulers The number of scheduler threads used by the emulator.
erlang_vm_schedulers 4
# TYPE erlang_vm_schedulers_online gauge
# HELP erlang_vm_schedulers_online The number of schedulers online.
erlang_vm_schedulers_online 4
# TYPE erlang_vm_smp_support untyped
# HELP erlang_vm_smp_support 1 if the emulator has been compiled with SMP support, otherwise 0.
erlang_vm_smp_support 1
# TYPE erlang_vm_threads untyped
# HELP erlang_vm_threads 1 if the emulator has been compiled with thread support, otherwise 0.
erlang_vm_threads 1
# TYPE erlang_vm_thread_pool_size gauge
# HELP erlang_vm_thread_pool_size The number of async threads in the async thread pool used for asynchronous driver calls.
erlang_vm_thread_pool_size 10
# TYPE erlang_vm_time_correction untyped
# HELP erlang_vm_time_correction 1 if time correction is enabled, otherwise 0.
erlang_vm_time_correction 1
# TYPE erchef_pooler_members_in_use gauge
# HELP erchef_pooler_members_in_use Number of pool members currently being used.
erchef_pooler_members_in_use{pool_name="sqerl"} 0
erchef_pooler_members_in_use{pool_name="oc_chef_authz_http"} 0
erchef_pooler_members_in_use{pool_name="chef_index_http"} 0
erchef_pooler_members_in_use{pool_name="chef_depsolver"} 0
# TYPE erchef_pooler_members_free gauge
# HELP erchef_pooler_members_free Number of pool members currently available.
erchef_pooler_members_free{pool_name="sqerl"} 20
erchef_pooler_members_free{pool_name="oc_chef_authz_http"} 25
erchef_pooler_members_free{pool_name="chef_index_http"} 25
erchef_pooler_members_free{pool_name="chef_depsolver"} 5
# TYPE erchef_pooler_members_max gauge
# HELP erchef_pooler_members_max Max number of pool members allowed in the pool.
erchef_pooler_members_max{pool_name="sqerl"} 20
erchef_pooler_members_max{pool_name="oc_chef_authz_http"} 100
erchef_pooler_members_max{pool_name="chef_index_http"} 100
erchef_pooler_members_max{pool_name="chef_depsolver"} 5
# TYPE erchef_pooler_queued_requestors gauge
# HELP erchef_pooler_queued_requestors Number of requestors blocking to take a pool member.
erchef_pooler_queued_requestors{pool_name="sqerl"} 0
erchef_pooler_queued_requestors{pool_name="oc_chef_authz_http"} 0
erchef_pooler_queued_requestors{pool_name="chef_index_http"} 0
erchef_pooler_queued_requestors{pool_name="chef_depsolver"} 0
# TYPE erchef_pooler_queued_requestors_max gauge
# HELP erchef_pooler_queued_requestors_max Max number of requestors allowed to block on taking pool member.
erchef_pooler_queued_requestors_max{pool_name="sqerl"} 20
erchef_pooler_queued_requestors_max{pool_name="oc_chef_authz_http"} 50
erchef_pooler_queued_requestors_max{pool_name="chef_index_http"} 50
erchef_pooler_queued_requestors_max{pool_name="chef_depsolver"} 50
# TYPE pg_stat_seq_scan counter
# HELP pg_stat_seq_scan Number of sequential scans initiated
pg_stat_seq_scan 5047095
# TYPE pg_stat_seq_tup_read counter
# HELP pg_stat_seq_tup_read Number of live rows fetched by sequential scans
pg_stat_seq_tup_read 415028001
# TYPE pg_stat_idx_scan counter
# HELP pg_stat_idx_scan Number of index scans initiated
pg_stat_idx_scan 23751732
# TYPE pg_stat_tup_fetch counter
# HELP pg_stat_tup_fetch Number of live rows fetched by index scans
pg_stat_tup_fetch 25953870
# TYPE pg_stat_n_tup_ins counter
# HELP pg_stat_n_tup_ins Number of rows inserted
pg_stat_n_tup_ins 52452
# TYPE pg_stat_n_tup_upd counter
# HELP pg_stat_n_tup_upd Number of rows updated
pg_stat_n_tup_upd 73461
# TYPE pg_stat_n_tup_del counter
# HELP pg_stat_n_tup_del Number of rows deleted
pg_stat_n_tup_del 31020
# TYPE pg_stat_n_live_tup gauge
# HELP pg_stat_n_live_tup Estimated number of live rows
pg_stat_n_live_tup 92355
# TYPE pg_stat_n_dead_tup gauge
# HELP pg_stat_n_dead_tup Estimated number of dead rows
pg_stat_n_dead_tup 29046
# TYPE pg_stat_heap_blocks_read counter
# HELP pg_stat_heap_blocks_read Number of disk blocks read
pg_stat_heap_blocks_read 69231
# TYPE pg_stat_heap_blocks_hit counter
# HELP pg_stat_heap_blocks_hit Number of buffer hits
pg_stat_heap_blocks_hit 38834925
# TYPE pg_stat_idx_blks_read counter
# HELP pg_stat_idx_blks_read Number of disk blocks read from all indexes
pg_stat_idx_blks_read 49632
# TYPE pg_stat_idx_blks_hit counter
# HELP pg_stat_idx_blks_hit Number of buffer hits in all indexes
pg_stat_idx_blks_hit 35615613
# TYPE pg_stat_toast_blks_read counter
# HELP pg_stat_toast_blks_read Number of disk blocks read from TOAST tables
pg_stat_toast_blks_read 1410
# TYPE pg_stat_toast_blks_hit counter
# HELP pg_stat_toast_blks_hit Number of buffer hits in TOAST tables
pg_stat_toast_blks_hit 2397
# TYPE pg_stat_tidx_blks_read counter
# HELP pg_stat_tidx_blks_read Number of disk blocks read from TOAST tables
pg_stat_tidx_blks_read 1410
# TYPE pg_stat_tidx_blks_hit counter
# HELP pg_stat_tidx_blks_hit Number of buffer hits in TOAST table indexes
pg_stat_tidx_blks_hit 3948
```

Same stats in JSON format:

```
[ { "metrics" : [ { "value" : "1" } ], "type" : "UNTYPED", "name" : "erlang_vm_time_correction", "help" : "1 if time correction is enabled, otherwise 0." 
}, { "metrics" : [ { "value" : "10" } ], "type" : "GAUGE", "help" : "The number of async threads in the async thread pool used for asynchronous driver calls.", "name" : "erlang_vm_thread_pool_size" }, { "type" : "UNTYPED", "metrics" : [ { "value" : "1" } ], "help" : "1 if the emulator has been compiled with thread support, otherwise 0.", "name" : "erlang_vm_threads" }, { "help" : "1 if the emulator has been compiled with SMP support, otherwise 0.", "name" : "erlang_vm_smp_support", "type" : "UNTYPED", "metrics" : [ { "value" : "1" } ] }, { "help" : "The number of schedulers online.", "name" : "erlang_vm_schedulers_online", "metrics" : [ { "value" : "4" } ], "type" : "GAUGE" }, { "type" : "GAUGE", "metrics" : [ { "value" : "4" } ], "name" : "erlang_vm_schedulers", "help" : "The number of scheduler threads used by the emulator." }, { "name" : "erlang_vm_process_limit", "help" : "The maximum number of simultaneously existing processes at the local node.", "type" : "GAUGE", "metrics" : [ { "value" : "262144" } ] }, { "metrics" : [ { "value" : "395" } ], "type" : "GAUGE", "help" : "The number of processes currently existing at the local node.", "name" : "erlang_vm_process_count" }, { "metrics" : [ { "value" : "65536" } ], "type" : "GAUGE", "name" : "erlang_vm_port_limit", "help" : "The maximum number of simultaneously existing ports at the local node." }, { "type" : "GAUGE", "metrics" : [ { "value" : "60" } ], "name" : "erlang_vm_port_count", "help" : "The number of ports currently existing at the local node." 
}, { "metrics" : [ { "value" : "4" } ], "type" : "GAUGE", "help" : "The detected number of logical processors online on the system.", "name" : "erlang_vm_logical_processors_online" }, { "type" : "GAUGE", "metrics" : [ { "value" : "4" } ], "help" : "The detected number of logical processors available to the Erlang runtime system.", "name" : "erlang_vm_logical_processors_available" }, { "help" : "The detected number of logical processors configured in the system.", "name" : "erlang_vm_logical_processors", "type" : "GAUGE", "metrics" : [ { "value" : "4" } ] }, { "metrics" : [ { "value" : "2053" } ], "type" : "GAUGE", "name" : "erlang_vm_ets_limit", "help" : "The maximum number of ETS tables allowed." }, { "metrics" : [ { "value" : "21716981" } ], "type" : "COUNTER", "help" : "Information about wall clock. Same as erlang_vm_statistics_runtime_milliseconds except that real time is measured", "name" : "erlang_vm_statistics_wallclock_time_milliseconds" }, { "type" : "COUNTER", "metrics" : [ { "value" : "2089510" } ], "name" : "erlang_vm_statistics_runtime_milliseconds", "help" : "The sum of the runtime for all threads in the Erlang runtime system. 
Can be greater than wall clock time" }, { "type" : "GAUGE", "metrics" : [ { "value" : "0" } ], "help" : "Total length of the run-queues", "name" : "erlang_vm_statistics_run_queues_length_total" }, { "type" : "COUNTER", "metrics" : [ { "value" : "16208617229" } ], "name" : "erlang_vm_statistics_reductions_total", "help" : "Total reductions" }, { "name" : "erlang_vm_statistics_bytes_output_total", "help" : "Total number of bytes output to ports", "metrics" : [ { "value" : "2374560576" } ], "type" : "COUNTER" }, { "metrics" : [ { "value" : "2375163718" } ], "type" : "COUNTER", "help" : "Total number of bytes received through ports", "name" : "erlang_vm_statistics_bytes_received_total" }, { "metrics" : [ { "value" : "359118308568" } ], "type" : "COUNTER", "help" : "Garbage collection: bytes reclaimed", "name" : "erlang_vm_statistics_garbage_collection_bytes_reclaimed" }, { "help" : "Garbage collection: words reclaimed", "name" : "erlang_vm_statistics_garbage_collection_words_reclaimed", "type" : "COUNTER", "metrics" : [ { "value" : "44889788571" } ] }, { "help" : "Garbage collection: number of GCs", "name" : "erlang_vm_statistics_garbage_collection_number_of_gcs", "type" : "COUNTER", "metrics" : [ { "value" : "5263551" } ] }, { "name" : "erlang_vm_statistics_context_switches", "help" : "Total number of context switches since the system started", "type" : "COUNTER", "metrics" : [ { "value" : "109981973" } ] }, { "help" : "The total amount of memory currently allocated for the emulator that is not directly related to any Erlang process. 
Memory presented as processes is not included in this memory.", "name" : "erlang_vm_memory_system_bytes_total", "type" : "GAUGE", "metrics" : [ { "value" : "1468545", "labels" : { "usage" : "atom" } }, { "labels" : { "usage" : "binary" }, "value" : "942728" }, { "value" : "33503265", "labels" : { "usage" : "code" } }, { "value" : "2496320", "labels" : { "usage" : "ets" } }, { "value" : "8025046", "labels" : { "usage" : "other" } } ] }, { "help" : "The total amount of memory currently allocated for the Erlang processes.", "name" : "erlang_vm_memory_processes_bytes_total", "type" : "GAUGE", "metrics" : [ { "value" : "24208976", "labels" : { "usage" : "used" } }, { "labels" : { "usage" : "free" }, "value" : "21304" } ] }, { "metrics" : [ { "value" : "66" } ], "type" : "GAUGE", "help" : "Erlang VM ETS Tables count", "name" : "erlang_vm_ets_tables" }, { "name" : "erlang_vm_dets_tables", "help" : "Erlang VM DETS Tables count", "type" : "GAUGE", "metrics" : [ { "value" : "0" } ] }, { "metrics" : [ { "labels" : { "kind" : "system" }, "value" : "46435904" }, { "value" : "24230280", "labels" : { "kind" : "processes" } } ], "type" : "GAUGE", "help" : "The total amount of memory currently allocated. This is the same as the sum of the memory size for processes and system.", "name" : "erlang_vm_memory_bytes_total" }, { "name" : "erlang_vm_memory_atom_bytes_total", "help" : "The total amount of memory currently allocated for atoms. 
This memory is part of the memory presented as system memory.", "type" : "GAUGE", "metrics" : [ { "labels" : { "usage" : "used" }, "value" : "1455537" }, { "value" : "13008", "labels" : { "usage" : "free" } } ] }, { "help" : "Max number of requestors allowed to block on taking pool member.", "name" : "erchef_pooler_queued_requestors_max", "type" : "GAUGE", "metrics" : [ { "value" : "20", "labels" : { "pool_name" : "sqerl" } }, { "value" : "50", "labels" : { "pool_name" : "oc_chef_authz_http" } }, { "labels" : { "pool_name" : "chef_index_http" }, "value" : "50" }, { "value" : "50", "labels" : { "pool_name" : "chef_depsolver" } } ] }, { "metrics" : [ { "value" : "0", "labels" : { "pool_name" : "sqerl" } }, { "value" : "0", "labels" : { "pool_name" : "oc_chef_authz_http" } }, { "labels" : { "pool_name" : "chef_index_http" }, "value" : "0" }, { "value" : "0", "labels" : { "pool_name" : "chef_depsolver" } } ], "type" : "GAUGE", "help" : "Number of requestors blocking to take a pool member.", "name" : "erchef_pooler_queued_requestors" }, { "name" : "erchef_pooler_members_max", "help" : "Max number of pool members allowed in the pool.", "type" : "GAUGE", "metrics" : [ { "labels" : { "pool_name" : "sqerl" }, "value" : "20" }, { "value" : "100", "labels" : { "pool_name" : "oc_chef_authz_http" } }, { "labels" : { "pool_name" : "chef_index_http" }, "value" : "100" }, { "labels" : { "pool_name" : "chef_depsolver" }, "value" : "5" } ] }, { "name" : "erchef_pooler_members_free", "help" : "Number of pool members currently available.", "type" : "GAUGE", "metrics" : [ { "value" : "20", "labels" : { "pool_name" : "sqerl" } }, { "value" : "25", "labels" : { "pool_name" : "oc_chef_authz_http" } }, { "value" : "25", "labels" : { "pool_name" : "chef_index_http" } }, { "labels" : { "pool_name" : "chef_depsolver" }, "value" : "5" } ] }, { "metrics" : [ { "value" : "0", "labels" : { "pool_name" : "sqerl" } }, { "value" : "0", "labels" : { "pool_name" : "oc_chef_authz_http" } }, { "value" : 
"0", "labels" : { "pool_name" : "chef_index_http" } }, { "value" : "0", "labels" : { "pool_name" : "chef_depsolver" } } ], "type" : "GAUGE", "name" : "erchef_pooler_members_in_use", "help" : "Number of pool members currently being used." }, { "name" : "pg_stat_tidx_blks_hit", "help" : "Number of buffer hits in TOAST table indexes", "type" : "COUNTER", "metrics" : [ { "value" : "6486" } ] }, { "help" : "Number of disk blocks read from TOAST tables", "name" : "pg_stat_tidx_blks_read", "type" : "COUNTER", "metrics" : [ { "value" : "1410" } ] }, { "help" : "Number of buffer hits in TOAST tables", "name" : "pg_stat_toast_blks_hit", "metrics" : [ { "value" : "3666" } ], "type" : "COUNTER" }, { "help" : "Number of disk blocks read from TOAST tables", "name" : "pg_stat_toast_blks_read", "type" : "COUNTER", "metrics" : [ { "value" : "1410" } ] }, { "name" : "pg_stat_idx_blks_hit", "help" : "Number of buffer hits in all indexes", "type" : "COUNTER", "metrics" : [ { "value" : "72120936" } ] }, { "metrics" : [ { "value" : "49632" } ], "type" : "COUNTER", "name" : "pg_stat_idx_blks_read", "help" : "Number of disk blocks read from all indexes" }, { "metrics" : [ { "value" : "85697544" } ], "type" : "COUNTER", "name" : "pg_stat_heap_blocks_hit", "help" : "Number of buffer hits" }, { "help" : "Number of disk blocks read", "name" : "pg_stat_heap_blocks_read", "type" : "COUNTER", "metrics" : [ { "value" : "69513" } ] }, { "metrics" : [ { "value" : "29610" } ], "type" : "GAUGE", "help" : "Estimated number of dead rows", "name" : "pg_stat_n_dead_tup" }, { "type" : "GAUGE", "metrics" : [ { "value" : "92355" } ], "help" : "Estimated number of live rows", "name" : "pg_stat_n_live_tup" }, { "type" : "COUNTER", "metrics" : [ { "value" : "31020" } ], "name" : "pg_stat_n_tup_del", "help" : "Number of rows deleted" }, { "help" : "Number of rows updated", "name" : "pg_stat_n_tup_upd", "metrics" : [ { "value" : "87702" } ], "type" : "COUNTER" }, { "metrics" : [ { "value" : "52452" } ], "type" : 
"COUNTER", "help" : "Number of rows inserted", "name" : "pg_stat_n_tup_ins" }, { "help" : "Number of live rows fetched by index scans", "name" : "pg_stat_tup_fetch", "type" : "COUNTER", "metrics" : [ { "value" : "44669364" } ] }, { "type" : "COUNTER", "metrics" : [ { "value" : "48594381" } ], "name" : "pg_stat_idx_scan", "help" : "Number of index scans initiated" }, { "name" : "pg_stat_seq_tup_read", "help" : "Number of live rows fetched by sequential scans", "type" : "COUNTER", "metrics" : [ { "value" : "1167619731" } ] }, { "help" : "Number of sequential scans initiated", "name" : "pg_stat_seq_scan", "metrics" : [ { "value" : "9793719" } ], "type" : "COUNTER" } ] ``` Signed-off-by: Kartik Null Cating-Subramanian <[email protected]>
9b8879f to 77ab628