Ansible role is challenging to use for custom per-host configurations #366

Open
kbreit opened this issue Jun 3, 2021 · 10 comments

Comments

@kbreit

kbreit commented Jun 3, 2021

I am trying to develop an Ansible playbook which tailors the datadog-agent configuration based on the groups an endpoint is in within Ansible. For example, let's say there are the following two devices:

  • Ubuntu box with nginx
  • Red Hat box with redis

Using Ansible pre-tasks I need to structure the datadog_config and datadog_checks variables within Ansible on a per-host basis. Red Hat uses /var/log/messages and Ubuntu uses something else. For this example, let's assume Ubuntu uses /var/log/syslog. I want to report the contents of those files to Datadog. Plus, nginx and redis need to be configured on each system respectively. A rough playbook vars section may look like this:

    datadog_checks:
      nginx:
        init_config:
        instances:
          - ...
      logs:
        - type: file
          path: /var/log/syslog
          ...

However, this is specific to my Ubuntu configuration, and Red Hat with Redis would need a very different structure. My current thought is that I would need to build this structure during pre-tasks, using when statements that check os_family and inventory groups. However, Ansible's set_fact module does not allow for updating a data structure, and update_fact only allows modifying mutable objects, which a dict doesn't appear to be. I don't see a good way to accomplish what I'm looking for, and I think there are better ways the role could work, such as having more built-out variables that aren't a single data structure. For example, datadog_check_nginx and then nginx_init_config and whatnot.
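For reference, set_fact can still rebuild the whole dict from an expression, even though it cannot patch a single key. A minimal sketch using the combine filter, assuming datadog_checks already has a base value for the host; the debian_checks name and the check contents are placeholders made up for illustration:

```yaml
pre_tasks:
  # Assumption: datadog_checks is already defined (play vars, group_vars, ...).
  - name: Merge Debian-specific checks into datadog_checks
    ansible.builtin.set_fact:
      datadog_checks: "{{ datadog_checks | default({}) | combine(debian_checks, recursive=True) }}"
    vars:
      # Hypothetical patch dict; only the keys listed here are added or overridden.
      debian_checks:
        nginx:
          init_config:
          instances:
            - example: value
    when: ansible_facts.os_family == "Debian"
```

Because set_fact has higher precedence than play vars, the role would then see the merged value on Debian-family hosts and the original value everywhere else.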

@KSerrania
Contributor

KSerrania commented Jun 3, 2021

Hi @kbreit,

Thanks for the report!

If I understand correctly, you'd like to have per-host configuration of role variables. I believe this can be achieved using Ansible host variables or group variables. That way you can assign specific configuration to hosts or groups of hosts.

For instance, you can put all your Ubuntu hosts in a group named ubuntu, and for this group define the datadog_checks variable so that it contains the nginx check configuration and sets the logs path to /var/log/syslog. Then you can create a redhat group, for which datadog_checks contains the redis check configuration, and where the logs path is /var/log/messages.
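As a rough sketch of that layout, the two group variable files could look like the following. The file names follow the group names suggested above, and the check contents are placeholders rather than a real nginx or redis configuration:

```yaml
# group_vars/ubuntu.yml (applies to hosts in the "ubuntu" inventory group)
datadog_checks:
  nginx:
    init_config:
    instances:
      - example: value            # placeholder instance settings
  custom_logs:
    logs:
      - type: file
        path: /var/log/syslog
---
# group_vars/redhat.yml (applies to hosts in the "redhat" inventory group)
datadog_checks:
  redis:
    init_config:
    instances:
      - other_example: other_value   # placeholder instance settings
  custom_logs:
    logs:
      - type: file
        path: /var/log/messages
```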

Would that fit your needs?

@kbreit
Author

kbreit commented Jun 4, 2021

@KSerrania

Thank you for the fast response! I know I can use host and group variables and that's my plan. There are two issues with it though. First, I shouldn't have to put the Linux distribution in the inventory, since gather_facts collects it and I shouldn't have to maintain that list. Second, piecing the group vars together into a single datadog_checks data structure isn't trivial. I'm working through using the combine() Jinja2 filter, but it's messy.
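On the first point, one option that avoids listing distributions in the inventory is the group_by module, which creates groups at run time from gathered facts. A minimal sketch, assuming group variable files named group_vars/os_Debian.yml and group_vars/os_RedHat.yml (hypothetical names that must match the generated group names):

```yaml
# Play 1: gather facts and sort hosts into ad-hoc groups such as
# os_Debian or os_RedHat, based on ansible_facts.os_family.
- hosts: all
  tasks:
    - name: Group hosts by OS family
      ansible.builtin.group_by:
        key: "os_{{ ansible_facts.os_family }}"

# Play 2: run the role. Variables from group_vars/os_Debian.yml or
# group_vars/os_RedHat.yml (assumed to define datadog_checks) now apply.
- hosts: all
  roles:
    - { role: ansible-datadog, become: true }
```

This does not solve the second point, since each group file still has to carry a complete datadog_checks structure.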

@KSerrania
Contributor

Hi again @kbreit,

I tried some things around the initial problems you had with update_fact not updating dicts. I tried a setup with one Ubuntu and one CentOS host (using Ansible 2.10.7), with the following playbook file:

Playbook file
- hosts: all
  roles:
    - { role: ansible-datadog, become: true }
  vars:
    datadog_site: "datadoghq.com"
    datadog_api_key: "<api_key>"
    datadog_enabled: true
    datadog_checks:
      custom_logs:
        logs:
          - type: file
            path: /var/log/messages
  pre_tasks:
    - name: Print datadog_checks before
      debug:
        var: datadog_checks

    - name: Update datadog_checks on Ubuntu / Debian
      ansible.utils.update_fact:
        # Include here all updates that need to be made on Debian- or Ubuntu- based hosts
        updates:
          - path: datadog_checks.custom_logs.logs.0.path
            value: /var/log/syslog
          - path: datadog_checks.nginx
            value:
              init_config:
              instances:
                - example: value
      # update_fact doesn't modify in place, you need to register the result of the task
      # Here, new_datadog_checks.datadog_checks contains the updated value.
      register: new_datadog_checks
      when: ansible_facts.os_family == "Debian"

    - name: Set new value in original var on Ubuntu / Debian
      set_fact:
        datadog_checks: "{{ new_datadog_checks.datadog_checks }}"
      when: ansible_facts.os_family == "Debian"

    - name: Update datadog_checks on CentOS / RedHat
      ansible.utils.update_fact:
        # Include here all updates that need to be made on RedHat- or CentOS-based hosts
        updates:
          - path: datadog_checks.redis
            value:
              init_config:
              instances:
                - other_example: other_value
      # update_fact doesn't modify in place, you need to register the result of the task
      # Here, new_datadog_checks.datadog_checks contains the updated value.
      register: new_datadog_checks
      when: ansible_facts.os_family == "RedHat"

    - name: Set new value in original var on CentOS / RedHat
      set_fact:
        datadog_checks: "{{ new_datadog_checks.datadog_checks }}"
      when: ansible_facts.os_family == "RedHat"

    - name: Print datadog_checks after
      debug:
        var: datadog_checks

which gave me the following result:

Ansible playbook run
$ ansible-playbook ./playbook.yml

PLAY [all] ***********************************************************************************************************************************************************************

TASK [Gathering Facts] ***********************************************************************************************************************************************************
ok: [centos]
ok: [ubuntu]

TASK [Print datadog_checks before] ***********************************************************************************************************************************************
ok: [ubuntu] => {
    "datadog_checks": {
        "custom_logs": {
            "logs": [
                {
                    "path": "/var/log/messages",
                    "type": "file"
                }
            ]
        }
    }
}
ok: [centos] => {
    "datadog_checks": {
        "custom_logs": {
            "logs": [
                {
                    "path": "/var/log/messages",
                    "type": "file"
                }
            ]
        }
    }
}

TASK [Update datadog_checks on Ubuntu / Debian] **********************************************************************************************************************************
changed: [ubuntu]
skipping: [centos]

TASK [Set new value in original var on Ubuntu / Debian] **************************************************************************************************************************
skipping: [centos]
ok: [ubuntu]

TASK [Update datadog_checks on CentOS / RedHat] **********************************************************************************************************************************
skipping: [ubuntu]
changed: [centos]

TASK [Set new value in original var on CentOS / RedHat] **************************************************************************************************************************
skipping: [ubuntu]
ok: [centos]

TASK [Print datadog_checks after] ************************************************************************************************************************************************
ok: [ubuntu] => {
    "datadog_checks": {
        "custom_logs": {
            "logs": [
                {
                    "path": "/var/log/syslog",
                    "type": "file"
                }
            ]
        },
        "nginx": {
            "init_config": null,
            "instances": [
                {
                    "example": "value"
                }
            ]
        }
    }
}
ok: [centos] => {
    "datadog_checks": {
        "custom_logs": {
            "logs": [
                {
                    "path": "/var/log/messages",
                    "type": "file"
                }
            ]
        },
        "redis": {
            "init_config": null,
            "instances": [
                {
                    "other_example": "other_value"
                }
            ]
        }
    }
}

< rest of the ansible-datadog role run >

so I think doing OS-based (or other kinds of ansible fact-based) arbitrary modifications to datadog_checks should be doable with update_fact in pre-tasks. To make the playbook cleaner, the two Debian tasks above can be put in a separate file and be included in the pre-tasks when ansible_facts.os_family == "Debian", same for RedHat.
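A sketch of that split, assuming the two Debian tasks are moved to a file called datadog_checks_debian.yml and the two RedHat tasks to datadog_checks_redhat.yml (both file names are made up for illustration):

```yaml
pre_tasks:
  # Each included file holds the update_fact + set_fact pair for one OS family.
  - name: Apply Debian / Ubuntu updates to datadog_checks
    ansible.builtin.include_tasks: datadog_checks_debian.yml
    when: ansible_facts.os_family == "Debian"

  - name: Apply RedHat / CentOS updates to datadog_checks
    ansible.builtin.include_tasks: datadog_checks_redhat.yml
    when: ansible_facts.os_family == "RedHat"
```

With the condition on the include itself, the per-task when conditions inside the files should no longer be needed.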

Would doing something like that help with your issue?

@erikhjensen
Contributor

Hi @kbreit,

I have a solution which is working for our team and it was my Datadog customer success contact that clued me into it.

We're very early in our journey and I'm new to Ansible but what I have is Jinja-based.

In the repo there is a folder called templates that holds structured layouts of the checks YAML. These contain the static elements needed for the various configs, plus {{ replacement_tokens }} where necessary.

I have a datadog checks template as such:
# The indentation of each check inside datadog_checks must be part of the child template. This is a limitation of Jinja.
datadog_checks:
{% if iis_check|d(false)|bool %}
{% include 'datadog_checks.iis.yml.j2' %}
{% endif %}
{% if sqlserver_check|d(false)|bool %}
{% include 'datadog_checks.sqlserver.yml.j2' %}
{% endif %}
{% if loadrunner_check|d(false)|bool %}
{% include 'datadog_checks.loadrunner.yml.j2' %}
{% endif %}
{% if (ansible_facts.os_family == "Windows")|d(false)|bool %}
{% include 'datadog_checks.win32_event_log.yml.j2' %}
{% endif %}

Then I use an include_vars statement to load the rendered file into a var, and the role consumes it.
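To illustrate the indentation comment in the template above, a child template might look something like this. It is only a sketch: the file name datadog_checks.nginx.yml.j2, the nginx_example_value variable, and the check contents are assumptions, not taken from a real setup.

```yaml
# datadog_checks.nginx.yml.j2 (hypothetical child template)
# Every line below starts with two spaces so that, once included, the check
# nests under the top-level datadog_checks: key of the parent template.
  nginx:
    init_config:
    instances:
      - example: "{{ nginx_example_value }}"   # hypothetical replacement token
```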

@kbreit
Author

kbreit commented Jun 12, 2021

@erikhjensen - That's an interesting solution and one I'll try. It's cleaner than anything else I've tried or thought of.

I guess my thought still stands that I'd like to see the variables broken out a little more, so there's more flexibility and we don't need to resort to Jinja2 templating in a situation like mine. Admittedly, it would require a very significant change to the variable structure and would break backward compatibility, so I wouldn't expect it for another major release or two even if it were accepted.

@kbreit
Author

kbreit commented Jun 15, 2021

@erikhjensen I'm working on your setup and I think once I get it going it'll be the best option. Here are snippets of what I have, but I'm getting syntax errors. Anything obvious I'm missing?

  pre_tasks:
    - include_vars:
        file: templates/main.j2
        name: included_vars

...

datadog_checks:
{% if ansible_facts.os_family == "Debian"|d(false)|bool %}
{% include "logs-debian.yaml" %}
{% elif ansible_facts.os_family == "Red Hat"|d(false)|bool %}
{% include "logs-redhat.yaml" %}
{% endif %}
TASK [include_vars] *****************************************************************************************
fatal: [e7ubnt0ddgtest01.datalinklabs.local]: FAILED! => {"ansible_facts": {"included_vars": {}}, "ansible_included_var_files": [], "changed": false, "message": "We were unable to read either as JSON nor YAML, these are the errors we got from each:\nJSON: Expecting value: line 1 column 1 (char 0)\n\nSyntax Error while loading YAML.\n  found character that cannot start any token\n\nThe error appears to be in '/ansible/templates/main.j2': line 2, column 6, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\ndatadog_checks:\n    {% if ansible_facts.os_family == \"Debian\"|d(false)|bool %}\n     ^ here\n"}

@erikhjensen
Contributor

Did you template that out first? The pseudocode is:

Set a bunch of vars based on hostvars etc.
Call template: with the input being datadog_checks.yaml.j2 and the output being datadog_checks.yaml.
Then include_vars that output.

You can debug this intermediate result after the template step and before include_vars to check it. You're trying to replicate a YAML structure for checks that you could technically write by hand, so you should have a structure in mind and then work towards it.
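Those steps could be sketched roughly as follows. This assumes the template lives in a templates/ folder next to the playbook, that a rendered/ directory already exists on the control node, and that the rendered file has a top-level datadog_checks key; the rendered_checks_file variable name is made up for illustration.

```yaml
- hosts: all
  vars:
    # Hypothetical per-host output path on the control node.
    rendered_checks_file: "{{ playbook_dir }}/rendered/{{ inventory_hostname }}-datadog_checks.yaml"
  pre_tasks:
    - name: Render the checks template locally, using this host's facts and vars
      ansible.builtin.template:
        src: datadog_checks.yaml.j2
        dest: "{{ rendered_checks_file }}"
      delegate_to: localhost
      become: false

    - name: Print the intermediate result for debugging
      ansible.builtin.debug:
        msg: "{{ lookup('ansible.builtin.file', rendered_checks_file) }}"

    - name: Load the rendered YAML so datadog_checks is set for the role
      ansible.builtin.include_vars:
        file: "{{ rendered_checks_file }}"
  roles:
    - { role: ansible-datadog, become: true }
```

include_vars reads files on the control node and does not render Jinja control blocks itself, which is why the template step has to come first.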

@pookey

pookey commented Jul 13, 2021

I am also having these issues.

I want ALL hosts to have some basic checks; these could go in group_vars/all.

Then I might want nginx servers to have some additional checks, which might go in group_vars/nginx.yaml. However, because of how Ansible handles maps, the contents of nginx.yaml would override those in group_vars/all.

This gets even more complicated when a host might be in multiple groups.

I have ended up duplicating datadog_checks in many areas of my variable structure - but this means when I want to add a new check that applies to all hosts, I need to modify many var files.
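One way to reduce that duplication is to give each layer its own variable name and merge the layers in a single place. A minimal sketch, assuming the hypothetical names datadog_checks_all and datadog_checks_nginx and placeholder check contents:

```yaml
# group_vars/all.yml
datadog_checks_all:
  custom_logs:
    logs:
      - type: file
        path: /var/log/messages
---
# group_vars/nginx.yml
datadog_checks_nginx:
  nginx:
    init_config:
    instances:
      - example: value            # placeholder instance settings
---
# playbook.yml: merge the layers; a layer that is undefined for a host
# falls back to an empty dict and contributes nothing.
- hosts: all
  vars:
    datadog_checks: "{{ datadog_checks_all | default({}) | combine(datadog_checks_nginx | default({}), recursive=True) }}"
  roles:
    - { role: ansible-datadog, become: true }
```

Since the layer variables have different names, Ansible's group precedence no longer replaces one with the other, and a check that applies to all hosts only needs to be added in group_vars/all.yml.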

I asked about the core issue on Reddit here: https://www.reddit.com/r/ansible/comments/o6akfn/merging_variables_from_group_varsall_multiple/

This is less of an issue with this datadog role, but more of a problem with how Ansible works IMO.

@rockaut
Contributor

rockaut commented Nov 14, 2022

We basically also do it like @pookey here. We "categorize" the hosts before calling the datadog.datadog role and combine the vars together.

datadog_config: "{{ {} | combine(moded_datadog_config, recursive=True, list_merge='append_rp') }}"
datadog_checks: "{{ {} | combine(moded_datadog_global_checks, recursive=True, list_merge='append_rp') | combine(moded_datadog_os_checks, recursive=True, list_merge='append_rp') | combine(moded_datadog_host_checks, recursive=True, list_merge='append_rp') }}"
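To see what the recursive and list_merge options do here, a small standalone play can help. This is only a sketch: the variable names are borrowed from the lines above, the values are invented, and list_merge needs Ansible 2.10 or later.

```yaml
- hosts: localhost
  gather_facts: false
  vars:
    # Invented sample layers; both define the same nested logs list.
    moded_datadog_global_checks:
      custom_logs:
        logs:
          - type: file
            path: /var/log/messages
    moded_datadog_host_checks:
      custom_logs:
        logs:
          - type: file
            path: /var/log/nginx/access.log   # hypothetical extra log file
  tasks:
    - name: Show the merged result
      ansible.builtin.debug:
        msg: "{{ moded_datadog_global_checks | combine(moded_datadog_host_checks, recursive=True, list_merge='append_rp') }}"
```

With recursive=True the custom_logs dicts are merged key by key, and with list_merge='append_rp' the host layer's logs entries are appended to the global ones instead of replacing them, which is what the default list_merge='replace' would do.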

Additionally we also then have some additional roles afterwards to generate some new config files (not "managed" by the datadog role). It's also tricky as you can't "decouple" check configurations and agent configurations currently.

That said, we also think this is not an ideal solution. Especially in view of the upcoming collection, it would be greatly appreciated to have a separate check role which decouples the agent installation from the checks.

Another solution would be to add an additional layer to datadog_checks so that, instead of creating only one conf.yml per check, additional ones can be enabled, and the dicts could even be "altered" through the default Ansible/Jinja filters.

datadog_configs:
  postgres:
    instance_a: # should create conf.d/postgres.d/instance_a.yml
       init_config:
       instances:
         - name: ...
    instance_b: # should create conf.d/postgres.d/instance_b.yml
       init_config:
       instances:
         - name: ...

Though that would be a breaking change, so it might get tricky to sanitize that.

@rockaut
Contributor

rockaut commented Nov 14, 2022

Also, and please correct me if I am wrong, there is currently no way to "remove" a configuration which was previously configured without also using datadog_disable_untracked_checks: true and needing to list all tracked checks?
