Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[service_discovery] Add a Zookeeper service discovery implementation. #3038

Merged
merged 1 commit into from
Dec 20, 2016

Conversation

manolama
Copy link
Contributor

Note: Please remember to review the Datadog Contribution Guidelines
if you have not yet done so.

What does this PR do?

Adds support for using Apache Zookeeper as the backend for Service Discovery features when running DataDog agents in a cluster.

Motivation

Zookeeper is widely deployed and used as the default for a DC/OS Mesos cluster. Instead of using resources for a Console or Etcd cluster just for DataDog configs, it makes sense to use the existing Zookeeper instance. And many other systems still use Zookeeper as a key value store.

Testing Guidelines

NA

Additional Notes

Works similar to the other stores, just has to iterate each time, unfortunately, to fetch all of the children. One way to bypass that would be to just change the "datadog/config_checks" mtime on each update but users would have to know to do that.

@hkaj hkaj self-assigned this Nov 21, 2016
@hkaj hkaj added this to the 5.11.0 milestone Nov 21, 2016
@hkaj
Copy link
Member

hkaj commented Nov 21, 2016

Hi @manolama
Thanks a lot for contributing! That's a feature we've been wanting for a while but couldn't find time to write, so cheers :)
I'll review it shortly.

@manolama
Copy link
Contributor Author

Happy to help @hkaj. It looks like it works the same with the Etcd plugin performs but there may be cases I missed. And I need to learn how to UT in Python ;)

Copy link
Member

@hkaj hkaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments, but it looks great overall!

children = self.client.get_children(path)
if children is not None and len(children) > 0:
for child in children:
if path[len(path) - 1] is "/":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

children = self.client.get_children(path)
if children is not None:
for child in children:
if path[len(path) - 1] is "/":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally you can use path.endswith('/') for this, but here it would be shorter to write new_path = '/'.join([path.rstrip('/'), child])


if data is not None:
str = data.decode("utf-8")
if str[0] == '[':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is not necessary. The json is loaded later and if it's invalid an error is logged, which is better than silently ignoring it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately Zookeeper won't let you write an empty znode with the ZK CLI (API maybe) so for the SD plugin to work, I create the znodes with "null" as the value. E.g. create datadog null then create datadog/check_configs null , create datadog/check_configs/custom_nginx null and the real values as create datadog/check_configs/custom-nginx/check_names [{"nginx"}]".

So I make sure to check for the bracket here to avoid polluting the map with keys like {datadog: null, datadog/check_configs:null}, etc.

One alternative would be to white list only the final path names check_names, init_configs and instances but if you need to modify the schema later on with more branches, you'd have to add them in the white-list. What do you recommend? Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manolama can you use an empty string instead? kazoo allows that it seems. This would allow us to do if node_as_string: there and avoid swallowing formatting errors (people forgetting that they need to use a list instead of a single dict, etc.).

Except for that your changes look good to me 🎉 We'll merge once a good solution is found for this.

try:
data, stat = self.client.get(path)

if data is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if data: would be better. data can be an empty string here, which doesn't qualify as None and will trigger an IndexError L59.

data, stat = self.client.get(path)

if data is not None:
str = data.decode("utf-8")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

str is a Python type, let's use another name


if data:
node_as_string = data.decode("utf-8")
if node_as_string[0] == '[':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manolama would an empty string work for you here instead of '['? This would allow us to do if not node_as_string (cleaner) and would avoid swallowing formatting errors (people forgetting that they need to use a list instead of a single dict, etc.). If that works for you I can add this change and merge your PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @hkaj, yeah that'd work. The ZK CLI won't let you create an empty node but the API does. I could write a little CLI to include with DD that would make it easier to create and configure the nodes. In the mean time, please make that tweak and merge this if you don't mind. Thanks!

@manolama
Copy link
Contributor Author

Hi @hkaj, I tweaked the code as per your last comment. Lemme know if it's ok. thanks!

@hkaj
Copy link
Member

hkaj commented Dec 20, 2016

This looks great. CI passes and i tested it manually. I can't wait to see it in 5.11!
Thanks a lot @manolama!

@hkaj hkaj merged commit 2a98280 into DataDog:master Dec 20, 2016
@masci masci modified the milestones: 5.11.0, 5.12.0 Jan 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants