[gh-actions] make linux package installation tests resilient to container image failures #623
Comments
Makes sense to me, removing the |
Pinging code owners for testbed: @open-telemetry/collector-approvers. See Adding Labels via Comments if you do not have permissions to add labels yourself. |
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself. |
I think this issue should be moved into https://github.com/open-telemetry/opentelemtry-collector-releases since the package tests were moved there in #604.
@cwegener Hello! I'm here to help you with any questions or issues you have. Let's work together to solve this bug. To address the issue of silent failures in the Linux Package installation tests, you can implement a health check to ensure the container is fully operational before proceeding with further commands. Here are the steps to make the Linux Packaging script execution more resilient:
By integrating this health check, you can ensure that the container is fully operational before copying the |
I am working on a fix for this.
@cwegener the fix is ready for review :)
Component(s)
No response
Describe the issue you're reporting
Problem
In issue open-telemetry/opentelemetry-collector-contrib#16450 it has been discovered that the Linux Package installation tests can fail for non-obvious reasons due to the complexity of the test bed setup.
The critical part of the Linux Package testing that was failing in the referenced issue was:
1. Start the container that serves as the test bed
2. Copy the `.deb`/`.rpm` package to be tested onto the running container

In the issue, the execution was silently failing at step 1 without raising an error.
The silent failure occurs because the container image used for the installation testing has no health check configured when the image is run. Therefore, the only failure conditions that can be captured are the ones that `podman run`/`docker run` report themselves, which are limited to the documented exit statuses (source: https://docs.podman.io/en/latest/markdown/podman-run.1.html#exit-status).
Since `docker run`/`podman run` is used with the `--detach` switch in the Linux Packaging tests, a non-zero exit code from the contained command itself will never be returned by `podman run`/`docker run`.
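To illustrate the point about `--detach`: once the container runs in the background, the state of the contained command has to be queried separately. The following is a minimal sketch, not taken from the actual test scripts. The `RUNTIME` variable and the `container_status` helper are hypothetical names, and the sketch assumes the container was started with a known name.

```shell
# Assumption: RUNTIME is either podman or docker; both accept this inspect format.
RUNTIME="${RUNTIME:-podman}"

# Hypothetical helper: print the state and the exit code that the runtime
# has recorded for the main process of the named container.
container_status() {
    "$RUNTIME" inspect --format '{{.State.Status}} {{.State.ExitCode}}' "$1"
}
```

A test script could call `container_status <nameofcontainer>` right after the detached run and abort when the reported state is not `running`. This only catches a container that has already exited; it says nothing about whether the system inside is ready.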
Solution
To make the transition from step 1 to step 2 and beyond more resilient, it is sufficient to make the Linux Packaging script execution wait until the systemd manager inside of the container is confirmed to be up and running.
This can be achieved with the following:
- Run `systemctl is-system-running --wait` inside of the container in order to wait for the system inside of the container to be fully operational
- Run `systemctl --machine=<nameofcontainer> is-system-running --wait` directly in the Linux Packaging test script
- Extend the `podman run` command line to run the `systemctl is-system-running --wait` command as a Startup Health Check script and therefore make the execution of `podman run` wait until the container is fully operational.
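The first option above could look roughly like the sketch below. It is not the fix that was eventually submitted. `RUNTIME` and `wait_for_systemd` are hypothetical names, and the sketch assumes the systemd version in the container image supports `is-system-running --wait`.

```shell
# Assumption: RUNTIME is either podman or docker; both provide an `exec` subcommand.
RUNTIME="${RUNTIME:-podman}"

# Hypothetical helper: block until systemd inside the named container has
# finished booting, print the reported state, and succeed only if that
# state is "running".
wait_for_systemd() {
    local container="$1"
    local state

    # `--wait` makes systemctl block until boot has completed. Any other
    # final state (for example "degraded") yields a non-zero exit code,
    # so the failure is tolerated here and judged by the printed state.
    state="$("$RUNTIME" exec "$container" systemctl is-system-running --wait 2>/dev/null)" || true

    printf '%s\n' "$state"
    [ "$state" = "running" ]
}
```

In the test script, a call such as `wait_for_systemd <nameofcontainer> || exit 1` would sit between starting the container and copying the package. The sketch has no timeout of its own, so a container that never finishes booting would block the script. Whether a `degraded` state should also count as ready is a judgment call that depends on the image.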