add pulp services #34

Merged
merged 18 commits into from
Nov 15, 2024

Conversation

archanaserver
Collaborator

Implementation of pulp services (api, content and worker)

@archanaserver
Collaborator Author

archanaserver commented Oct 15, 2024

There are a few things I don't understand clearly yet. First, how can we ensure dependencies between services here, the way depends_on does in compose (https://github.com/pulp/pulp-oci-images/blob/latest/images/compose/compose.yml#L93)? I didn't find much info on this. Second, with multiple services that depend on each other, how should secrets be managed? What is the recommended way to configure secrets to ensure secure communication between these services?
Also, when I test the deployment manually with podman run, it works as expected. However, when using my Ansible configuration, I hit a timeout error while waiting for the pulp-api service to become accessible. @evgeni, could you please help here? 🥺

@ehelms
Member

ehelms commented Oct 15, 2024

  • Dependency handling

This could be handled via systemd Wants/Requires and the After option to provide some ordering. I do feel we should also make the container services smart enough to check for connections before performing initialization operations like migrating. You can see examples of this last part (https://github.com/pulp/pulp-oci-images/blob/latest/images/assets/pulp-api#L3-L4) (https://github.com/pulp/pulp-oci-images/blob/latest/images/assets/wait_on_postgres.py)

  • Secrets handling

We are using podman secrets right now, and I recently added to the repository a naming scheme for how to define the secrets (https://github.com/ehelms/foreman-quadlet?tab=readme-ov-file#naming-convention). For services that need the same secret, we can define the secret once with podman secrets and then it can be declared within each quadlet file. See https://github.com/ehelms/foreman-quadlet/blob/master/roles/candlepin/tasks/main.yml#L59-L71

  • Timeout issue

I am not sure about this. If you SSH into the VM and try to start it manually, what happens?
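
On the dependency handling point above: one way for a container entrypoint to check for connections before running initialization steps such as migrations is to poll the database port until it accepts connections. The following is a minimal Python sketch under that assumption. The function name wait_for_port and its parameters are hypothetical, and it only does a plain TCP check, which is a simplification compared to the linked wait_on_postgres.py.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; not part of pulp-oci-images.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError (e.g. ConnectionRefusedError)
            # while nothing is listening on the port yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

An entrypoint could call this before running migrations and exit with a non-zero status when it returns False, leaving the retry to systemd.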

@evgeni
Member

evgeni commented Oct 17, 2024

  • Dependency / Secrets -- I think Eric answered it well already.
  • Timeout - could it be that the DB doesn't get migrated (or not migrated in time) and thus services don't start? The compose file you linked has an explicit migration_service which runs pulpcore-manager migrate --noinput and which the other services depend on.

@archanaserver force-pushed the pulp branch 3 times, most recently from 62017a0 to 66de065 on October 28, 2024 11:36
Member

@ekohl left a comment

I think all Pulp containers depend on PostgreSQL. How do they know where to connect to?

@ehelms
Member

ehelms commented Oct 28, 2024

This fixed some issues for me:

diff --git a/roles/pulp/tasks/main.yaml b/roles/pulp/tasks/main.yaml
index c7b019c..87c5e23 100644
--- a/roles/pulp/tasks/main.yaml
+++ b/roles/pulp/tasks/main.yaml
@@ -31,20 +31,6 @@
     name: pulp-settings-py
     data: "{{ lookup('ansible.builtin.template', 'settings.py.j2') }}"
 
-- name: Deploy Pulp Container
-  containers.podman.podman_container:
-    name: "{{ pulp_container_name }}"
-    image: "{{ pulp_image }}"
-    state: quadlet
-    ports: "{{ pulp_ports }}"
-    volumes: "{{ pulp_volumes }}"
-    secrets:
-      - 'pulp-settings-py,type=mount,target=/etc/pulp/settings.py'
-    quadlet_options:
-      - |
-        [Install]
-        WantedBy=default.target
-  
 - name: Deploy Pulp API Container
   containers.podman.podman_container:
     name: "{{ pulp_api_container_name }}"
@@ -60,28 +46,6 @@
         [Install]
         WantedBy=default.target
 
-- name: Run daemon reload to make Quadlet create the service files
-  ansible.builtin.systemd:
-    daemon_reload: true
-
-- name: Start the Pulp Service
-  ansible.builtin.systemd:
-    name: pulp
-    enabled: true
-    state: restarted
-
-- name: Wait for Pulp service to be accessible
-  ansible.builtin.wait_for:
-    host: "{{ ansible_hostname }}"
-    port: 8080
-    timeout: 300
-
-- name: Wait for Pulp API service to be accessible
-  ansible.builtin.wait_for:
-    host: "{{ ansible_hostname }}"
-    port: 24817 
-    timeout: 600
-
 - name: Deploy Pulp Content Container
   containers.podman.podman_container:
     name: "{{ pulp_content_container_name }}"
@@ -133,6 +97,12 @@
     enabled: true
     state: started
 
+- name: Wait for Pulp API service to be accessible
+  ansible.builtin.wait_for:
+    host: "{{ ansible_hostname }}"
+    port: 24817
+    timeout: 600
+
 - name: Wait for Pulp Content service to be accessible
   ansible.builtin.wait_for:
     host: "{{ ansible_hostname }}"

The pulp-api service still fails to be ready but with this error that can be fixed:

Oct 28 13:54:51 quadlet.example.com pulp-api[74976]: django.core.exceptions.ImproperlyConfigured: Could not load DB_ENCRYPTION_KEY file '/etc/pulp/certs/database_fields.symmetric.key': [Errno 13] Permission denied: '/etc/pulp/certs/database_fields.symmetric.key'

@archanaserver
Collaborator Author

The pulp-api service still fails to be ready but with this error that can be fixed:

Oct 28 13:54:51 quadlet.example.com pulp-api[74976]: django.core.exceptions.ImproperlyConfigured: Could not load DB_ENCRYPTION_KEY file '/etc/pulp/certs/database_fields.symmetric.key': [Errno 13] Permission denied: '/etc/pulp/certs/database_fields.symmetric.key'

I made some config changes to resolve it, but I still see pulp-api.service come up and then fail with the same error. I'm not sure why this is happening. Can anyone point out why it still persists?

@archanaserver marked this pull request as ready for review on November 4, 2024 10:18
@archanaserver force-pushed the pulp branch 3 times, most recently from 8a0ec40 to 97ebc9d on November 4, 2024 11:08
@archanaserver marked this pull request as draft on November 4, 2024 11:09
Comment on lines 29 to 38
- name: Generate database symmetric key
  ansible.builtin.command: "bash -c 'openssl rand -base64 32 | tr \"+/\" \"-_\" > /var/lib/pulp/database_fields.symmetric.key'"
  args:
    creates: /var/lib/pulp/database_fields.symmetric.key

- name: Create database symmetric key secret
  containers.podman.podman_secret:
    state: present
    name: pulp-symmetric-key
    data: "{{ lookup('file', '/var/lib/pulp/database_fields.symmetric.key') }}"
Member

This stores the secret an additional time, which I'd prefer to avoid. Can't you change Generate database symmetric key to output something on stdout and store that here? Or would that not be idempotent?

Member

This would not be idempotent, yeah :(
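
To illustrate the idempotency concern: generating a fresh key on every run would change the secret each time, so the key has to be created once and then reused. A minimal Python sketch of that "create only if missing" pattern follows. It is not what the role does; the function name ensure_symmetric_key and the 0o600 file mode are assumptions, and it produces the same kind of URL-safe base64 value as the openssl rand -base64 32 | tr "+/" "-_" pipeline above (without the trailing newline).

```python
import base64
import os
from pathlib import Path


def ensure_symmetric_key(path: Path) -> bytes:
    """Create a URL-safe base64 key at path if it is missing; return the stored key.

    Hypothetical helper for illustration only.
    """
    try:
        # O_EXCL makes the creation atomic: it fails if the file already exists,
        # which is what keeps repeated runs from overwriting the key.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        return path.read_bytes()
    # 32 random bytes, encoded with "-" and "_" instead of "+" and "/".
    key = base64.urlsafe_b64encode(os.urandom(32))
    with os.fdopen(fd, "wb") as key_file:
        key_file.write(key)
    return key
```

Calling it twice with the same path returns the same bytes, which is the property the creates: argument gives the Ansible task.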

@@ -1,11 +1,11 @@
import json

Member

This empty line is common in Python. It separates the built-in modules (json) from third-party ones (pytest). isort will put it back.

@evgeni
Copy link
Member

evgeni commented Nov 4, 2024

The pulp-api service still fails to be ready but with this error that can be fixed:

Oct 28 13:54:51 quadlet.example.com pulp-api[74976]: django.core.exceptions.ImproperlyConfigured: Could not load DB_ENCRYPTION_KEY file '/etc/pulp/certs/database_fields.symmetric.key': [Errno 13] Permission denied: '/etc/pulp/certs/database_fields.symmetric.key'

I made some config changes to resolve it, but I still see pulp-api.service come up and then fail with the same error. I'm not sure why this is happening. Can anyone point out why it still persists?

I think I understand the issue now.

When you look at the whole output (via journalctl), the error is:

Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: Waiting on postgresql to start...
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: Traceback (most recent call last):
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/local/lib/python3.9/site-packages/asgiref/local.py", line 89, in _lock_storage
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     asyncio.get_running_loop()
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: RuntimeError: no running event loop
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: 
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: During handling of the above exception, another exception occurred:
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: 
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: Traceback (most recent call last):
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/local/lib/python3.9/site-packages/django/utils/connection.py", line 58, in __getitem__
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     return getattr(self._connections, alias)
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/local/lib/python3.9/site-packages/asgiref/local.py", line 118, in __getattr__
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     return getattr(storage, key)
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: AttributeError: '_thread._local' object has no attribute 'default'
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: 
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: During handling of the above exception, another exception occurred:
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: 
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: Traceback (most recent call last):
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/local/lib/python3.9/site-packages/pulpcore/app/settings.py", line 476, in <module>
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     with open(DB_ENCRYPTION_KEY, "rb") as key_file:
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: PermissionError: [Errno 13] Permission denied: '/etc/pulp/certs/database_fields.symmetric.key'
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: 
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: During handling of the above exception, another exception occurred:
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: 
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: Traceback (most recent call last):
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/bin/wait_on_postgres.py", line 12, in <module>
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     connection.ensure_connection()
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/local/lib/python3.9/site-packages/django/utils/connection.py", line 15, in __getattr__
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     return getattr(self._connections[self._alias], item)
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/local/lib/python3.9/site-packages/django/utils/connection.py", line 60, in __getitem__
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     if alias not in self.settings:
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/local/lib/python3.9/site-packages/django/utils/functional.py", line 57, in __get__
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     res = instance.__dict__[self.name] = self.func(instance)
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/local/lib/python3.9/site-packages/django/utils/connection.py", line 45, in settings
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     self._settings = self.configure_settings(self._settings)
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/local/lib/python3.9/site-packages/django/db/utils.py", line 148, in configure_settings
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     databases = super().configure_settings(databases)
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/local/lib/python3.9/site-packages/django/utils/connection.py", line 50, in configure_settings
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     settings = getattr(django_settings, self.settings_name)
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/local/lib/python3.9/site-packages/django/conf/__init__.py", line 102, in __getattr__
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     self._setup(name)
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/local/lib/python3.9/site-packages/django/conf/__init__.py", line 89, in _setup
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     self._wrapped = Settings(settings_module)
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/local/lib/python3.9/site-packages/django/conf/__init__.py", line 217, in __init__
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     mod = importlib.import_module(self.SETTINGS_MODULE)
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/lib64/python3.9/importlib/__init__.py", line 127, in import_module
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     return _bootstrap._gcd_import(name[level:], package, level)
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "<frozen importlib._bootstrap_external>", line 850, in exec_module
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:   File "/usr/local/lib/python3.9/site-packages/pulpcore/app/settings.py", line 479, in <module>
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]:     raise ImproperlyConfigured(
Nov 04 11:59:55 quadlet.example.com pulp-api[60981]: django.core.exceptions.ImproperlyConfigured: Could not load DB_ENCRYPTION_KEY file '/etc/pulp/certs/database_fields.symmetric.key': [Errno 13] Permission denied: '/etc/pulp/certs/database_fields.symmetric.key'

So it tries to reach the DB ("Waiting on postgresql to start..."), and while doing so it fails and also fails to load the secret we're mounting. The "permission denied" part is especially confusing, right?

After a bit of poking, I realized that the permission error occurs because we both mount a secret under /etc/pulp and also mount a volume on top of /etc/pulp:

- /var/lib/pulp/settings:/etc/pulp:Z

Once I drop this line and add network: host to the container definitions, it can actually reach the database and the pulp services start.
(The DB is still not getting migrated, and that's what keeps pulp from actually working, but at least things start.)

(and the lookup thing I mention inline)

@archanaserver marked this pull request as ready for review on November 4, 2024 19:52
@archanaserver marked this pull request as draft on November 4, 2024 20:26
Comment on lines 13 to 14
- /var/lib/pulp/pulp_storage:/var/lib/pulp
- /var/lib/pulp/containers:/var/lib/containers
Member

I don't think the containers mount is needed anymore with minimal containers and we only really need storage.

At that point, should we just expose /var/lib/pulp as /var/lib/pulp? And I still think you need the SELinux label.

Suggested change
-- /var/lib/pulp/pulp_storage:/var/lib/pulp
-- /var/lib/pulp/containers:/var/lib/containers
+- /var/lib/pulp:/var/lib/pulp:Z

Member

And I still think you need the SELinux label.

I wish. It didn't work with :Z in my tests :/
That's why we set label=disable on the containers.

Member

And yes, you're right, we don't need the containers mount, see https://discourse.pulpproject.org/t/what-is-var-lib-containers-used-for-when-deploing-oci/1782

@archanaserver marked this pull request as ready for review on November 5, 2024 23:18
@@ -1,7 +1,20 @@
-CONTENT_ORIGIN="http://{{ ansible_hostname }}:8080"
+CONTENT_ORIGIN="http://{{ ansible_hostname }}:24816"
Member

AFAIK this should be wherever Apache (httpd) presents the content, i.e. the address end-user clients talk to. They shouldn't talk to the internal endpoint.

Collaborator Author

I don't understand, would you mind expanding a little more?

Member

Quoting https://pulpproject.org/pulpcore/docs/admin/reference/settings/#content_origin

A required string containing the protocol, fqdn, and port where the content app is reachable by users. This is used by pulpcore and various plugins when referring users to the content app. For example if the API should refer users to content at using http to pulp.example.com on port 24816, (the content default port), you would set: https://pulp.example.com:24816.

In our deployment we have httpd in front of Pulp, which presents it at https://{{ ansible_hostname }}. In other words, this was always wrong, but I'm only noticing it now.

Collaborator Author

@archanaserver Nov 6, 2024

@ekohl so we basically have to set it without a port if we want users to be pointed at the right place, like this: CONTENT_ORIGIN="https://{{ ansible_hostname }}"? (Sorry for pinging on this again, I'm not good at this part yet.)

Member

That's what we do in puppet-pulpcore, yeah:
https://github.com/theforeman/puppet-pulpcore/blob/master/templates/settings.py.erb#L20

You could also argue that this sort of blurs the lines between the roles (plain HTTPS on port 443 is not available without the httpd role). The pulp role should then have a default content origin of http://{{ ansible_fqdn }}:port, which we override in deploy.yaml with https://{{ ansible_fqdn }}, as that's the value the overall deployment uses.

Member

So it should be a variable on the role that the playbook sets (because that knows both httpd and pulp)?

Member

s/variable/default/, but yeah, that's how I'd do it.
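
The default-plus-override idea discussed in this thread can be sketched as follows. This is only an illustration in Python, not the role's actual implementation: the function name content_origin and the override parameter are hypothetical, and 24816 is the content default port mentioned in the Pulp documentation quoted above.

```python
from typing import Optional


def content_origin(fqdn: str, port: int = 24816, override: Optional[str] = None) -> str:
    """Return the CONTENT_ORIGIN value for a host.

    Without an override this mimics a role default (plain HTTP on the content
    port). With an override it mimics a playbook that knows about httpd and
    sets the public HTTPS URL instead.
    """
    if override is not None:
        return override
    return f"http://{fqdn}:{port}"


# Role default, usable without the httpd role:
role_default = content_origin("pulp.example.com")
# Value set by the overall deployment, where httpd fronts Pulp:
deployment_value = content_origin("pulp.example.com", override="https://pulp.example.com")
```

Keeping the default inside the role means the role still works on its own, while the playbook stays the only place that needs to know about both httpd and pulp.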

@archanaserver force-pushed the pulp branch 2 times, most recently from ca29d61 to 7d3b249 on November 7, 2024 14:17
@archanaserver force-pushed the pulp branch 2 times, most recently from 7565d3c to f87525f on November 13, 2024 11:59
- "24817:80"
pulp_content_ports:
- "24816:80"
pulp_worker_count: 2
Member

This is also unused, but I think this could still be needed to tune the worker count.

'USER': 'pulp',
'PASSWORD': '{{ pulp_db_password }}',
'HOST': 'localhost',
'PORT': '5432',
Member

https://docs.djangoproject.com/en/5.1/ref/settings/#std-setting-PORT says an empty string (default) means the default port. IMHO there's no value in hardcoding the default port

Suggested change
-'PORT': '5432',

Member

I'm a bit unsure why my suggestion wasn't followed. IMHO the config should be as short as possible and not copy all the defaults.

Collaborator Author

No, I did update it in 3d1f220

Member

My suggestion was to remove the line.

Collaborator Author

Ah, I understand now! Sorry, I'll update it in another PR.
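
For reference, a sketch of what the database settings could look like with the PORT line removed, relying on the Django behaviour linked above that an empty (or absent) PORT means the default port. The ENGINE and NAME values are assumptions that are not shown in the excerpt under review; the PASSWORD value is the Jinja placeholder from the template, kept here as a plain string.

```python
# Sketch of a Django DATABASES setting without an explicit port.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",  # assumption: standard PostgreSQL backend
        "NAME": "pulp",  # assumption: database name is not shown in the excerpt
        "USER": "pulp",
        "PASSWORD": "{{ pulp_db_password }}",  # rendered by the Ansible template
        "HOST": "localhost",
        # No "PORT" key: Django treats a missing or empty PORT as the
        # database's default port.
    }
}
```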

Collaborator Author

These variables are no longer used in the playbook because of the host network configuration, which eliminates the need for explicit port mappings.
@ehelms merged commit fb87fb4 into theforeman:master on Nov 15, 2024
3 checks passed
@ehelms mentioned this pull request on Nov 15, 2024