-
Notifications
You must be signed in to change notification settings - Fork 814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Service Discovery] create a template cache to reduce calls to the KV store #3060
Conversation
9122643
to
6defada
Compare
6defada
to
e3094d3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good. Just a few minor comments/questions.
init_tpl = auto_conf.get('init_config') or {} | ||
instance_tpl = auto_conf.get('instances', []) | ||
if image not in auto_conf_images: | ||
auto_conf_images[image] = [[check_name], [init_tpl], [instance_tpl]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it end up being less error-prone to have the init
and instance
configs/templates in a dict with check_name
as the key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean like auto_conf_images[(image, check_name)] = [init_tpl, instance_tpl]
?
Problem with that is if users have several instances of the same check for the same image (and a customer actually asked how to do that yesterday so it happens). In this case we still need to wrap init_tpl and instance_tpl in lists and end up with the same logic as we have now.
But maybe I didn't understand correctly your suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I originally meant auto_conf_images[image] = {'checkname': {'init': init_tpl, 'instances: [intance_tpl_0, instance_tpl_1,...]}}
But then I read on, and I saw it wasn't necessarily a good idea given the current structure/logic. I felt it would make the code a little more maintainable (perhaps). We can maybe just keep in mind for the future (it'd be a pretty big refactor), there's nothing wrong with the current logic.
log.warning("Conflicting templates in auto_conf for image %s and check %s. " | ||
"Please check your template files." % (image, check_name)) | ||
continue | ||
self.auto_conf_templates[image][0].append(check_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise, I'm not sure if this wouldn't be better as self.auto_conf_templates[image][check_name] = (init_tpl, instance_tpl)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I get it, we stick to lists for "compatibility" across the board when dealing with the configs. I think we could maybe refactor in the future, but it's a legit choice.
@@ -50,7 +50,7 @@ def get_check_class(agentConfig, check_name): | |||
return None | |||
|
|||
|
|||
def get_auto_conf(agentConfig, check_name): | |||
def get_auto_conf(check_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like get_auto_conf()
and get_auto_config()
as names are a bit too close to each other...
# if _issue_read doesn't return anything we still create a key in the cache | ||
# so that subsequent reads don't trigger issue_read every time | ||
else: | ||
self.kv_templates[identifier] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is a NOOP but I noticed that we do include in the results templates[CONFIG_FROM_TEMPLATE] = self.kv_templates[identifier]
even when self.kv_templates[identifier] is None
if we have a hit, but we don't do that with a miss.
Just saying because it occured to me we could refactor this as:
if identifier not in self.kv_templates:
tpls = self._issue_read(identifier)
if tpls:
templates[CONFIG_FROM_TEMPLATE] = tpls
self.kv_templates[identifier] = tpls
else:
self.kv_templates[identifier] = None
templates[CONFIG_FROM_TEMPLATE] = self.kv_templates[identifier]
because we always populate after a hit or fetch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
man that snippet is on point! 👌
# lastly fallback to auto_conf | ||
return set(self.identifier_to_checks[self._get_image_ident(identifier)]) | ||
# lastly fallback to legacy name for auto-conf | ||
return self.template_cache.get_check_names(self._get_image_ident(identifier)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: can we have checks under the legacy name that should also be applied? So a case where identifier
and the legacy self._get_image_ident(identifier)
would both apply to an image?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! That's a use case we didn't support before (if user-defined templates were found we skipped auto config) but I added logic for that here and forgot to update this part. Will do it now.
@mock.patch.object(AbstractConfigStore, '_issue_read', side_effect=issue_read) | ||
def test_read_config_from_store(self, issue_read): | ||
@mock.patch.object(_TemplateCache, '_issue_read', side_effect=issue_read) | ||
def test_read_config_from_store(self, args): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Big 🙇 for all these awesome tests!
e3094d3
to
999f7bd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't see anything wrong with this!
7543301
to
2eaab69
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of nits, but I approve! ;)
@@ -245,7 +245,7 @@ def run(self, config=None): | |||
if self._agentConfig.get('service_discovery'): | |||
self.sd_backend = get_sd_backend(self._agentConfig) | |||
|
|||
if _is_affirmative(self._agentConfig.get('sd_jmx_enable')): | |||
if _is_affirmative(self._agentConfig.get('sd_jmx_enable', '')): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would probably be more readable as: _is_affirmative(self._agentConfig.get('sd_jmx_enable', False))
.... Am I missing something?
log.exception('Failed to retrieve a template for %s.' % identifier) | ||
# create a key in the cache even if _issue_read doesn't return a tpl | ||
# so that subsequent reads don't trigger issue_read | ||
self.kv_templates[identifier] = tpls or None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure you need the or None
? tpls
will already be something if self._issue_read(identifier)
returns either a list or None
, or the None
we've set it in the except
blocks.
This cache is invalidated when a template change is detected. It doesn't detect changes to the auto_conf folder or JMX checks, only to the KV store. Refreshing configs from files still requires a SIGHUP or a restart.
2eaab69
to
13d8a24
Compare
Thanks @truthbk ! I addressed your comments and squashed the commits. Will merge once the tests pass 🎉 |
Note: Please remember to review the Datadog Contribution Guidelines
if you have not yet done so.
What does this PR do?
Create a cache for configuration templates
SIGHUP
.Minor clean ups and test updates
Motivation
Using this cache avoids querying the config store at every container update.
Testing Guidelines
Tests were updated accordingly. A custom docker image using this build is also available at the
TODO
tag.