-
Notifications
You must be signed in to change notification settings - Fork 813
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Supervisord check #1165
Supervisord check #1165
Conversation
That looks great! |
for proc in processes: | ||
proc_name = proc['name'] | ||
tags = ['supervisord', | ||
'server:%s' % server_name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried that this could grow unboundedly. What is going to prevent this?
Also could you add some tests for it ? |
def _extract_uptime(proc): | ||
desc = proc['description'] | ||
if proc['statename'] == 'RUNNING' and 'uptime' in desc: | ||
h, m, s = desc.split('uptime ')[1].split(':') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not account for days, etc. Needs a re-write.
42477df
to
fbf523f
Compare
'STOPPED': AgentCheck.CRITICAL, | ||
'STARTING': AgentCheck.OK, | ||
'RUNNING': AgentCheck.OK, | ||
'BACKOFF': AgentCheck.UNKNOWN, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the documentation BACKOFF is a state where:
The process entered the STARTING state but subsequently exited too quickly to move to the RUNNING state.
This should be critical.
# Report service checks and uptime for each process | ||
for proc in processes: | ||
proc_name = proc['name'] | ||
tags = ['supervisord', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the "supervisord" tags is not useful here and should probably be removed as it can't be used for grouping and is redundant with the metric name.
Looks almost good! Just a few comments to fix and it should be ready to merge. |
@isaacdd Any update on the comments made ? |
cfb2327
to
d65f203
Compare
@isaacdd any update on the travis test ? |
@remh have not had the chance to tackle it, and I don't think I will be able to tackle anytime soon. |
Ok changing the milestone then. |
if not server_name or not server_name.strip(): | ||
raise Exception("Supervisord server name not specified in yaml configuration.") | ||
|
||
supervisor = self._connect(instance) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this overwrite the 'supervisor' instantiated at import on line 9?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. It actually does, but it does not cause issues because supervisor (the library) is used only in the _connect method. But worth renaming it a different name to avoid future conflicts.
@isaacdd any chance to work on this to add a test ? |
e9f7f70
to
86f35c2
Compare
86f35c2
to
4b4970f
Compare
|
||
FORMAT_TIME = lambda x: time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(x)) | ||
|
||
@attr(requires='supervisord') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be part of the checks.d, but part of the test.
4b4970f
to
c130c99
Compare
This adds a new agent check for supervisord. The check reports whether supervisord is running or not and whether one or more processes are runnnig or not. It also reports prcesses' uptime. The check works over http inet server and works with sockets. See supervisord.yaml.example for more details.
c130c99
to
46db43d
Compare
CI passed ! Merging ! Thanks @isaacdd ! |
No description provided.