Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: configure flavors in Nova automatically #499

Open
wants to merge 19 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .github/workflows/containers.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -116,6 +116,7 @@ jobs:
matrix:
container:
- name: ironic-nautobot-client
- name: nova-flavors

steps:
- name: setup docker buildx
Expand Down
5 changes: 5 additions & 0 deletions components/keystone/aio-values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,11 @@ bootstrap:
# give 'argoworkflow' 'admin' over the 'baremetal' project
openstack role add --user-domain infra --project-domain infra --user argoworkflow --project baremetal admin

# create 'flavorsync' user to allow synchronization of the flavors to nova
openstack user create --or-show --domain service --password abcd1234 flavorsync
openstack role create --or-show flavorsync
openstack role add --user flavorsync --user-domain service --domain default --inherited flavorsync

# create 'monitoring' user for monitoring usage
openstack user create --or-show --domain infra --password monitoring_demo monitoring
# give 'monitoring' the 'admin' over the 'baremetal' project
Expand Down
49 changes: 49 additions & 0 deletions containers/nova-flavors/Dockerfile.nova-flavors
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
FROM ghcr.io/rackerlabs/understack/argo-python3.12.2-alpine3.19 AS builder

# This section needs to be repeated in child images
ARG APP_PATH=/app
ARG APP_USER=appuser
ARG APP_GROUP=appgroup
ARG APP_USER_UID=1000
ARG APP_GROUP_GID=1000


Comment on lines +2 to +10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just feels really heavy in every container. Can we not just do:

        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsNonRoot: true

for the pod?

if we really wanted to be slim we can use the distroless. https://github.com/GoogleContainerTools/distroless/blob/main/examples/python3/Dockerfile

Copy link
Collaborator Author

@skrobul skrobul Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just feels really heavy in every container.

If the verbosity is the problem, we can remove some of these by hardcoding the values in the commands. However having them as arguments is no "Shotgun Surgery" when the UID needs to be adjusted or path changed.

Can we not just do:

          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsNonRoot: true

for the pod?

Configuring securityContext in Kubernetes and setting user/group IDs in a Dockerfile are two complementary, but not identical things.

If we don't declare USER in a Dockerfile, the container will default to uid=0 / root. Now, for example, if you configured a Pod or container securityContext with runAsNonRoot: true and used such an image, the kubelet will simply refuse to start it.

In theory, you can extend your securityContext with runAsUser: 1001 and the container will start, but that brings another set of problems:

  • Application files inside the container are still owned by root, because you have not configured the USER during the build. This can be especially problematic for files that need to be writable (pids, logs, caches, etc.).
  • Processes running inside the container run with the EUID of a user that does not exist in the system from their point of view.
  • Files owned by the specified UID will not have a corresponding username displayed, causing confusion during debugging or file inspection.
  • Commands like whoami and tools relying on user information might fail or return errors, leading to broken workflows or logging issues.
  • Logs may show only numeric UIDs instead of meaningful usernames, complicating troubleshooting and security audits.
  • If the same image is started in development on a developer's laptop, it will start as root, which could lead to a false sense of things working until they are pushed to Kubernetes.

if we really wanted to be slim we can use the distroless. https://github.com/GoogleContainerTools/distroless/blob/main/examples/python3/Dockerfile

The distroless are generally larger than Alpine images. python:3.12.2-alpine3.19 is 51.8MB and gcr.io/distroless/python3 is 52.8MB. The Alpine one includes at least some of the debugging tools and have a shell. Having said that, we can certainly look into this further and maybe use the distroless ones for production as theoretically they would provide better security.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think I'd be more okay with it if we pinned ourselves to something rather than making it variable and having to carry that around. Because then we'd do the security piece and the runAsUser: 1001.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to work on that change, but it will not be part of this PR as it requires changes to multiple Dockerfiles that are not related to this feature

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #515 - if it gets merged before this one, I'll adjust the code as necessary

RUN --mount=type=cache,target=/var/cache/apk apk add --virtual build-deps gcc python3-dev musl-dev linux-headers
RUN --mount=type=cache,target=/root/.cache/.pip pip install 'wheel==0.43.0'
RUN --mount=type=cache,target=/root/.cache/.pip \
python -m venv /opt/poetry && \
/opt/poetry/bin/pip install 'poetry==1.7.1' && \
/opt/poetry/bin/poetry self add 'poetry-dynamic-versioning[plugin]==1.3.0'

# copy in the code
COPY --chown=${APP_USER}:${APP_GROUP} operators/nova-flavors /app
COPY --chown=${APP_USER}:${APP_GROUP} python/understack-flavor-matcher /understack-flavor-matcher
# need watchdog and psutil built AS a wheel
RUN --mount=type=cache,target=/root/.cache/.pip pip wheel --wheel-dir /app/dist watchdog psutil
CMD ["nova-flavors-sync"]

WORKDIR /app
RUN cd /app && /opt/poetry/bin/poetry build -f wheel && /opt/poetry/bin/poetry export --without-hashes -f requirements.txt -o dist/requirements.txt

######################## PROD ########################
FROM ghcr.io/rackerlabs/understack/argo-python3.12.2-alpine3.19 AS prod

ARG APP_PATH=/app
ARG APP_USER=appuser
ARG APP_GROUP=appgroup
ARG APP_USER_UID=1000
ARG APP_GROUP_GID=1000

ENV FLAVORS_DIR="/flavors"
ENV NOVA_FLAVOR_MONITOR_LOGLEVEL="info"

LABEL org.opencontainers.image.description="Nova-Flavors synchronizer"

RUN mkdir -p /opt/venv/wheels/
COPY --from=builder /app/dist/*.whl /app/dist/requirements.txt /opt/venv/wheels/
COPY --chown=${APP_USER}:${APP_GROUP} python/understack-flavor-matcher /python/understack-flavor-matcher

RUN --mount=type=cache,target=/root/.cache/.pip cd /app && /opt/venv/bin/pip install --find-links /opt/venv/wheels/ --only-binary watchdog psutil -r /opt/venv/wheels/requirements.txt nova-flavors

USER $APP_USER
CMD ["nova-flavors-sync"]
8 changes: 8 additions & 0 deletions operators/nova-flavors/.env-example
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
OS_USERNAME=flavorsync
OS_PASSWORD=abcd1234
OS_AUTH_URL=https://keystone.environment.undercloud.rackspace.net/v3
OS_USER_DOMAIN_NAME=service
OS_PROJECT_NAME=admin
OS_PROJECT_DOMAIN_NAME=default
FLAVORS_DIR=/home/someuser/flavors/
FLAVORS_ENV=nonprod
Empty file.
Empty file.
101 changes: 101 additions & 0 deletions operators/nova-flavors/nova_flavors/flavor_synchronizer.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,101 @@
from functools import cached_property

from flavor_matcher.flavor_spec import FlavorSpec
from novaclient import client as novaclient

from nova_flavors.logger import setup_logger

logger = setup_logger(__name__)


class FlavorSynchronizer:
def __init__(
self,
username: str | None = "",
password: str = "",
project_name: str | None = "admin",
project_domain_name: str = "default",
user_domain_name="service",
auth_url: str | None = None,
) -> None:
self.username = username
self.password = password
self.project_name = str(project_name)
self.project_domain_name = str(project_domain_name)
self.user_domain_name = user_domain_name
self.auth_url = auth_url

@cached_property
def _nova(self):
return novaclient.Client(
"2",
username=self.username,
password=self.password,
project_name=self.project_name,
project_domain_name=self.project_domain_name,
user_domain_name=self.user_domain_name,
auth_url=self.auth_url,
)

def reconcile(self, desired_flavors: list[FlavorSpec]):
if len(desired_flavors) < 1:
raise Exception(f"Empty desired_flavors list.")

existing_flavors = self._nova.flavors.list()
for flavor in desired_flavors:
nova_flavor = next(
(flv for flv in existing_flavors if flv.name == flavor.stripped_name),
None,
)

update_needed = False
if nova_flavor:
logger.info(
f"Flavor: {flavor.stripped_name} already exists. Syncing values"
)
if nova_flavor.ram != flavor.memory_mib:
logger.info(
f"{flavor.name} RAM mismatch - {nova_flavor.ram=} {flavor.memory_mib=}"
)
update_needed = True

if nova_flavor.disk != max(flavor.drives):
logger.info(
f"{flavor.name} Disk mismatch - {nova_flavor.disk=} {flavor.drives=}"
)
update_needed = True

if nova_flavor.vcpus != flavor.cpu_cores:
logger.info(
f"{flavor.name} CPU mismatch - {nova_flavor.vcpus=} {flavor.cpu_cores=}"
)
update_needed = True

if update_needed:
logger.debug(
f"{flavor.name} is outdated. Deleting so it can be recreated."
)
nova_flavor.delete()

else:
update_needed = True

if update_needed:
logger.info(f"Creating {flavor.name}")
self._create(flavor)

def _create(self, flavor: FlavorSpec):
nova_flavor = self._nova.flavors.create(
flavor.stripped_name,
flavor.memory_mib,
flavor.cpu_cores,
min(flavor.drives),
)
nova_flavor.set_keys(
{
"resources:DISK_GB": 0,
"resources:MEMORY_MB": 0,
"resources:VCPU": 0,
flavor.baremetal_nova_resource_class: 1,
}
)
19 changes: 19 additions & 0 deletions operators/nova-flavors/nova_flavors/logger.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
import logging


def setup_logger(name: str | None = None, level: int = logging.DEBUG):
"""Standardize our logging.

Configures the root logger to prefix messages with a timestamp
and to output the log level we want to see by default.

params:
name: logger hierarchy or root logger
level: default log level (DEBUG)
"""
logging.basicConfig(
format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
datefmt="%Y-%m-%d %H:%M:%S %z",
level=level,
)
return logging.getLogger(name)
48 changes: 48 additions & 0 deletions operators/nova-flavors/nova_flavors/reconcile.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
import logging
import os
import time

from flavor_matcher.flavor_spec import FlavorSpec
from watchdog.observers import Observer

from nova_flavors.flavor_synchronizer import FlavorSynchronizer
from nova_flavors.logger import setup_logger
from nova_flavors.spec_changed_handler import SpecChangedHandler

loglevel = getattr(logging, os.getenv("NOVA_FLAVOR_MONITOR_LOGLEVEL", "info").upper())
logging.getLogger().setLevel(loglevel)
logger = setup_logger(__name__, level=loglevel)


def main():
# nonprod vs prod
flavors_dir = os.getenv("FLAVORS_DIR", "")
if not os.path.isdir(flavors_dir):
raise ValueError(f"flavors_dir '{flavors_dir}' is not a directory")
synchronizer = FlavorSynchronizer(
username=os.getenv("OS_USERNAME", ""),
password=os.getenv("OS_PASSWORD", ""),
project_name=os.getenv("OS_PROJECT_NAME", "admin"),
project_domain_name=os.getenv("OS_PROJECT_DOMAIN_NAME", "default"),
user_domain_name=os.getenv("OS_USER_DOMAIN_NAME", "service"),
auth_url=os.getenv("OS_AUTH_URL"),
)

handler = SpecChangedHandler(
synchronizer, lambda: FlavorSpec.from_directory(flavors_dir)
)
observer = Observer()
observer.schedule(handler, flavors_dir, recursive=True)
logger.info(f"Watching for changes in {flavors_dir}")
observer.start()

try:
while True:
time.sleep(1)
finally:
observer.stop()
observer.join()


if __name__ == "__main__":
main()
38 changes: 38 additions & 0 deletions operators/nova-flavors/nova_flavors/spec_changed_handler.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
import time
from typing import Callable

from watchdog.events import DirModifiedEvent
from watchdog.events import FileModifiedEvent
from watchdog.events import FileSystemEventHandler

from nova_flavors.flavor_synchronizer import FlavorSynchronizer
from nova_flavors.logger import setup_logger

logger = setup_logger(__name__)


class SpecChangedHandler(FileSystemEventHandler):
COOLDOWN_SECONDS = 30

def __init__(
self, synchronizer: FlavorSynchronizer, flavors_cback: Callable
) -> None:
self.last_call = None
self.synchronizer = synchronizer
self.flavors_cback = flavors_cback

def on_modified(self, event: DirModifiedEvent | FileModifiedEvent) -> None:
if isinstance(event, DirModifiedEvent):
self._run(event)

def _run(self, event):
now = time.time()
if not self.last_call:
self.last_call = now
else:
if self.last_call + self.COOLDOWN_SECONDS > now:
logger.debug("Cooldown period.")
return
self.last_call = now
logger.info(f"Flavors directory {event.src_path} has changed.")
self.synchronizer.reconcile(self.flavors_cback())
Loading