A solution for podman containers max log size #100

Closed
PavelKuzub opened this issue Jan 18, 2021 · 3 comments

@PavelKuzub
Contributor

PavelKuzub commented Jan 18, 2021

Problem description

Today I ran out of disk space on the UDM Pro's 12.2G storage area, which negatively impacted the UDM's original functionality.
Upon investigation, the disk was consumed by custom container logs (homebridge, hoobs) that reached 8GB+:

# Get container logs
ls -lah /mnt/data/podman/storage/overlay-containers/*/userdata/ctr.log
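To see at a glance which container is responsible, the same glob can be sorted by size. This is a minimal sketch: the function name list_container_logs is hypothetical, and the default path is simply the one used above.

```shell
# Hypothetical helper: print "<bytes> <path>" for every ctr.log under a
# podman storage root, largest first. Pass a different root as $1 if needed.
list_container_logs() {
    root="${1:-/mnt/data/podman/storage/overlay-containers}"
    for log in "$root"/*/userdata/ctr.log; do
        # Skip the unexpanded glob when no log files exist.
        [ -f "$log" ] || continue
        # wc -c reports the file size in bytes; tr strips padding some wc builds add.
        printf '%s %s\n' "$(wc -c < "$log" | tr -d ' ')" "$log"
    done | sort -rn
}
```

Calling list_container_logs with no argument checks the UDM path from this issue; the first line of output is the largest log.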

The log growth was caused by an error printed by one of the Homebridge add-ons, which went unnoticed for a while.

Our custom containers are set up without any log rotation, and by default the logs grow without limit.
It is only a matter of time before everyone using custom containers ends up with a full disk, unless a solution is added that limits the maximum log size a container can have and thus keeps disk usage stable.

Research

According to the podman documentation and this issue, the ability to configure a maximum log size per container was added in podman 2.2.0, whereas as of today the UDM Pro runs podman 1.6.1.

There was a reference to changing the setting for all containers via the containers.conf file. This documentation covers the log_size_max property.

However, podman 1.6.1 does not yet support the containers.conf file; it was first mentioned in the release notes of podman 1.9.0.

I investigated the source code of podman 1.6.1 and traced the default setting to the config file libpod.conf:

/etc/containers/libpod.conf

property:

max_log_size = -1
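The version split described above can be sketched as a small lookup. This is an illustration only: the function names version_ge and log_limit_setting are hypothetical, the 2.2.0 threshold is the one quoted in this issue, and the containers.conf path assumes the system-wide /etc/containers location.

```shell
# Succeeds (exit 0) if version $1 >= version $2, comparing
# major.minor.patch numerically.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# Prints "<config file> <property name>" for the given podman version.
# Assumption: 2.2.0 is the first version honouring log_size_max in containers.conf.
log_limit_setting() {
    if version_ge "$1" "2.2.0"; then
        echo "/etc/containers/containers.conf log_size_max"
    else
        echo "/etc/containers/libpod.conf max_log_size"
    fi
}
```

For the UDM Pro, log_limit_setting 1.6.1 points at libpod.conf and max_log_size, which matches what the source code investigation found.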

The content of this config file is reset on reboot, so I added an extra on_boot.d script that runs early and changes the default before the custom containers start.

# cat /mnt/data/on_boot.d/05-max_log_size.sh
#!/bin/sh
# Set a limit for container logs. 104857600 Bytes = 100 Megabytes
sed -i 's/max_log_size = -1/max_log_size = 104857600/g' /etc/containers/libpod.conf;
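The sed above only matches the exact default of -1. A slightly more defensive variant, sketched below, replaces whatever value is currently set and fails loudly if the key is missing. The function name set_max_log_size is hypothetical, and the sketch assumes a sed that supports -i (GNU or BusyBox).

```shell
# Hypothetical helper: set max_log_size in a libpod.conf-style file ($1)
# to the given number of bytes ($2), whatever the current value is.
set_max_log_size() {
    conf="$1"
    bytes="$2"
    [ -f "$conf" ] || { echo "missing config file: $conf" >&2; return 1; }
    # Only act on an active (uncommented) key; do not append, since a line
    # added at the end of the file could land inside an unrelated table.
    if ! grep -q '^[[:space:]]*max_log_size[[:space:]]*=' "$conf"; then
        echo "max_log_size not found in $conf" >&2
        return 2
    fi
    sed -i "s/^\([[:space:]]*max_log_size[[:space:]]*=\).*/\1 $bytes/" "$conf"
}
```

A boot script could then call set_max_log_size /etc/containers/libpod.conf "$((100 * 1024 * 1024))", where the arithmetic expands to 104857600 bytes, the same 100 megabytes used above.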

You can verify that the setting has been applied by looking at the conmon parameters:

ps -ef | grep conmon | grep log-size-max
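To print just the limit instead of the whole process line, the value can be pulled out of the arguments. A minimal sketch, assuming the flag appears either as "--log-size-max VALUE" or "--log-size-max=VALUE"; the function name extract_log_size_max is hypothetical.

```shell
# Hypothetical helper: read process lines on stdin and print the value
# passed to --log-size-max, one value per matching process.
extract_log_size_max() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i == "--log-size-max" && i < NF)
                print $(i + 1)
            else if ($i ~ /^--log-size-max=/)
                print substr($i, length("--log-size-max=") + 1)
        }
    }'
}
```

On a live system this could be used as ps -ef | grep '[c]onmon' | extract_log_size_max; the bracket in the pattern keeps grep from matching its own process.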

I verified that the log gets truncated by setting a limit of 10 kilobytes. Note that this is truncation, not rotation.
The unifi-os container does not write anything to the log, so UniFi itself does not suffer from the lack of a log size limit; only the additional custom containers do.

Proposal

Set a limit on the container log size.

Would the above be a strategic approach worth a Pull Request?
If so, what would be the right place in this repo to place a generic script like this that impacts all containers?

@boostchicken
Member

Hey,

Thanks, excellent write-up. I would put this in its own folder called container-common or something, like DNS common. Also, make sure to update the README.md to describe it, maybe even highlight it in red, and update the other READMEs if needed. If you can get the script in, I can help with that as well.

Thanks!

@PavelKuzub
Contributor Author

Hello John,

I have raised PR #102 with the proposed changes. Let me know if any additional commits are required or wanted. Thanks

PK

boostchicken pushed a commit that referenced this issue Jan 25, 2021
* Added container-common

Initial release of the container-common section, which sets a limit on the log size any container can have, to prevent filling up UDM storage with excessive logging.

* Update README.md

Clarified description of max log size

Co-authored-by: TRUPaC <[email protected]>
brandonsoto pushed a commit to brandonsoto/udm-utilities that referenced this issue Jan 27, 2021
@boostchicken
Member

Merged!

sf-project-io pushed a commit to softwarefactory-project/sf-infra that referenced this issue Jun 13, 2023
Sometimes the logs can take up over 20GB in a few weeks.
Setting log_size_max to 1GB should avoid the situation where we run
out of disk space a few times per week.
The feature was added to the podman containers.conf file in the
podman 2.2.0 release [1], but on CentOS 7 the version is below 2.2.0.
According to the libpod.conf man page [2], the option should also be
available in podman 1.6.4, but it is located in the libpod.conf file.
More info [3].

[1] https://github.com/containers/podman/releases/tag/v2.2.0
[2] https://manpages.debian.org/unstable/podman/libpod.conf.5.en.html
[3] unifi-utilities/unifios-utilities#100

Change-Id: Ic6d01e11606c9526d1880583876d76c4415250ac
sf-project-io pushed a commit to softwarefactory-project/sf-config that referenced this issue Jun 14, 2023
After a while, the service logs can become really huge.
This change limits the log file size to 1GB.
The feature was added to the podman containers.conf file in the
podman 2.2.0 release [1], but on CentOS 7 the version is below 2.2.0.
According to the libpod.conf man page [2], the option should also be
available in podman 1.6.4, but it is located in the libpod.conf file.
More info [3].

[1] https://github.com/containers/podman/releases/tag/v2.2.0
[2] https://manpages.debian.org/unstable/podman/libpod.conf.5.en.html
[3] unifi-utilities/unifios-utilities#100

Depends-On: https://softwarefactory-project.io/r/c/software-factory/sf-ci/+/28529

Change-Id: Ia6071e5214644bdd126cf696cd437c140fa95c94
sf-project-io pushed a commit to softwarefactory-project/sf-config that referenced this issue Jun 10, 2024
Depends-On: https://softwarefactory-project.io/r/c/software-factory/sf-ci/+/31675
Depends-On: https://softwarefactory-project.io/r/c/software-factory/sf-ci/+/31690

Here are the stashed commits from the common 3.8.3 tag.

git format-patch -N  20d7af3..origin/3.8
git am *.patch

There were some conflicts that have been fixed manually.

Remove Opensearch Dashboards autologin feature

After moving to Keycloak, such feature is not required.

Fixes - After d/s upgrade

- logprocessing clean of old components
- opensearch-dashboard and opensearch use CA chain ca-trust
- add sf_purgelogs_additional_params vars (mount additional volume)

Set host network binding for some services and containerized tools

Almost all containers that we are starting in Software Factory
are using host binding.

Render zuul_api_url as python list

The logscraper tool gets zuul_api_url parameter as a list and
there can be multiple values provided.

Change url path for Opensearch Dashboards

The new URL will not use autologin feature.

Add condition to verify that stdout item exists

The item might not exist when the infrastructure is updated
each time Software Factory is released.

sf-keycloak: quote passwords in parameters

Passwords may include special characters that break command lines.

Add option gerrit_use_truststore

Enable increase innodb_log_file_size and innodb_buffer_pool_size

After increasing parameters, some queries performed by Zuul are working
faster.
This change is mostly helpful for those Zuul deployments, where
some scripts are making a complicated query with many job_name variables
to Zuul web to receive latest build results and the SQL "inner join"
takes long time.

Ensure backup dir exists; change backup host

After changing service name from Kibana to Opensearch Dashboards,
when the arch.yaml file was not updated to new values, the backup
directory for opensearch-dashboards service might not be available
on the host.

Use new mysql container version

Depends-on: https://softwarefactory-project.io/r/c/containers/+/27429

Adding conditional for zuul-web check on grafana postconfig stage

Add debug flag for purgelogs; remove :Z flag for log dir in purgelogs

The log directory might have a lot of files, so restarting the purgelogs
script might take ages until the SELinux labeling is done.
Also added debug flag parameter into the purgelogs service to see
removal progress logs.

Logserver trailing slash fix

This change fixes the trailing slash problem raised by the OSP CI team.
The issue is that requests made to the logserver without a trailing
slash do not work.

Mount MariaDB cache dir

Without mounting the cache dir, the container delta overlay dir
might be very big.

Change retention policy in influxdb; increase buffer

This commit fixes various issues related to the telegraf and influxdb errors:

    Metric buffer overflow; 831 metrics have been dropped

Also changed retention policy to wipe data after 4 weeks.

Update purgelogs container image

The new purgelogs container image will provide log messages about its
progress.

config-repo: Pull centos image from quay rather than registry.centos.org

registry.centos.org seems down, investigation pending. This breaks
config-update jobs, which rebuild containers defined in the config repo.
In the meantime, switch to quay.io for pulling.

zuul-web: mount /var/lib/zuul/

When a connection requires a SSH key, it is stored in
/var/lib/zuul/.ssh - which isn't exposed to zuul-web, resulting in
errors when the configuration is loaded.

Use zuul-executor-ubi-sf38 to benefit from the latest managesf release

See https://softwarefactory-project.io/cgit/containers/commit/images-sf/3.8?id=87dea1ceae4719e48193e85a8bc7fdfd5553216f

Set log_size_max size for podman logs

After a while, the service logs can become really huge.
This change limits the log file size to 1GB.
The feature was added to the podman containers.conf file in the
podman 2.2.0 release [1], but on CentOS 7 the version is below 2.2.0.
According to the libpod.conf man page [2], the option should also be
available in podman 1.6.4, but it is located in the libpod.conf file.
More info [3].

[1] https://github.com/containers/podman/releases/tag/v2.2.0
[2] https://manpages.debian.org/unstable/podman/libpod.conf.5.en.html
[3] unifi-utilities/unifios-utilities#100

Depends-On: https://softwarefactory-project.io/r/c/software-factory/sf-ci/+/28529

Use the latest managesf-sf38 container image; drop encoding parameter in managesf

The "encoding" parameter is raising an error on starting managesf
service.

Ensure nodepool services are restarted when config files are updated

Nodepool services must be restarted when labels are added

zuul/nodepool: bump to the latest version (10.0.0)

This change sets the ansible_root zuul.conf variable to
avoid ansible installation on startup.

Also bump MariaDB version 10.5 because of the renaming index
feature (needed for Zuul DB Migration) not available in 10.3.

Depends-On: https://softwarefactory-project.io/r/c/containers/+/31361
Depends-On: https://softwarefactory-project.io/r/c/software-factory/sf-ci/+/31362
Depends-On: https://softwarefactory-project.io/r/c/containers/+/31412

Provided fixes to enable mariadb upgrade from 10.3 to 10.5

Running the sfconfig --upgrade is then required.

Depends-On: https://softwarefactory-project.io/r/c/software-factory/sf-ci/+/31390

arch allinone - add missing zuul-merger component

Update sf-gerrit to latest build

3.7.8-2 was built somewhat recently[1] and addresses a couple of CVEs.

[1] https://quay.io/repository/software-factory/gerrit-sf38?tab=tags

Add --golden-tests feature to validate generated playbooks

This change enables testing the deployment playbooks without
installing sf-config. Run with:

  PYTHONPATH=$(pwd) python3 ./sfconfig/cmd.py        \
    --golden-tests ./refarch-golden-tests/           \
    --arch ./refarch/softwarefactory-project.io.yaml \
    --config ./defaults/sfconfig.yaml --share $(pwd)

Remove unused host_public_url facts

This change removes a fact that is no longer used.

Sort the /etc/hosts alias to avoid random update

This change ensures that /etc/hosts is defined in a fixed order.

Combine zuul-executor and zuul-merger hosts in the generated deployment playbook

This change improves the deployment process by combining the common host into a
single target so that the roles can be applied in parallel

Setup user_namespaces before the restore tasks

When restoring a backup on a fresh instance, make sure that the
userns is configured to ensure the container can be created correctly.

Do not use the zuul_wrapper for restore tasks

When restoring a backup on a fresh instance, the zuul_wrapper command
does not exist.

Restore zookeeper lib ownership after a restore

This change ensures the zookeeper setup is correct after a restore.

Revert "Combine zuul-executor and zuul-merger hosts in the generated deployment playbook"

Change-Id: I1742905336af06de3d35814413932f7558317036