[21.05] agent backport, part 3 #832
Merged
Scheduling requests that are already running or are in a state to be archived isn't useful and may cause unnecessary errors. PL-131813 (cherry picked from commit 91466ec)
PL-131813 (cherry picked from commit 528be4f)
… permission errors Output of nix commands in UpdateActivity.run() is now logged. A typical problem is a failing service on system activation, which we want to be logged to a separate file like for `fc-manage switch`. Main log file permission problems are now ignored. There is no need to abort when the main log file cannot be used because we still have journal logging and console output. PL-131813 (cherry picked from commit 4bea2c5)
Before, exceptions just bubbled up to Typer's excepthook, which pretty-prints them. This is nice for interactive use but not for agent tasks run by systemd units. They ended up in stdout while other log messages are sent to the journal directly. This means that exceptions had a different SYSLOG_IDENTIFIER than log messages, which is annoying for debugging. Also, we didn't log unhandled exceptions that occurred in interactive use at all. This change adds a new FCTyperApp which is used for fc-manage and fc-maintenance commands. The class extends Typer's app class and adds exception logging. It still passes exceptions to Typer's excepthook when used interactively. PL-131813 (cherry picked from commit 9db289d)
We now separate non-invasive and invasive code paths better. Move existing methods into a more logical order, grouping invasive and non-invasive methods. Some read-only commands for showing requests, metrics and the Sensu check can now be called by non-root users without the need for sudo. Clean up some uses of rm.scan(), which are now handled by __enter__, which must be called for all invasive methods. __enter__ also takes care of loading requests and creating any missing directory now. Add missing @require_lock decorators for invasive methods and give internal methods an underscore prefix. The internal methods don't have the decorator but it should be fairly obvious how to use them. PL-131813 (cherry picked from commit 12abf01)
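The split between lock-protected and read-only methods described in this commit could look roughly like the following. This is a minimal sketch, not the code from the commit: the `RequestManager` class, its `_locked` flag, and the `list_requests`/`delete` methods are hypothetical stand-ins, and a real implementation would take a file lock and load requests from disk in `__enter__`.

```python
import functools


def require_lock(func):
    """Refuse to run an invasive method unless the manager was entered."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if not self._locked:
            raise RuntimeError(f"{func.__name__} requires the lock")
        return func(self, *args, **kwargs)

    return wrapper


class RequestManager:
    """Hypothetical manager; names and attributes are assumptions."""

    def __init__(self, requests):
        self.requests = dict(requests)
        self._locked = False

    def __enter__(self):
        # Stand-in for taking a real lock, creating directories
        # and loading requests.
        self._locked = True
        return self

    def __exit__(self, exc_type, exc, tb):
        self._locked = False
        return False  # never swallow exceptions

    def list_requests(self):
        # Read-only: no lock needed, so usable without entering.
        return sorted(self.requests)

    @require_lock
    def delete(self, request_id):
        # Invasive: only allowed inside a `with` block.
        self.requests.pop(request_id, None)
```

With this shape, `RequestManager({...}).list_requests()` works directly, while `delete()` raises `RuntimeError` unless it is called inside `with manager:`.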
We need to use timezone-aware objects here. PL-131813 (cherry picked from commit 85859de)
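As a short illustration of why timezone-aware objects matter, Python refuses to order a naive and an aware `datetime`. The sketch below uses only the standard library; the `is_due` helper is hypothetical and not taken from the commit.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def utcnow() -> datetime:
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


def is_due(next_due: datetime, now: Optional[datetime] = None) -> bool:
    """Hypothetical helper: has the scheduled time been reached?"""
    if next_due.tzinfo is None:
        raise ValueError("next_due must be timezone-aware")
    return (now or utcnow()) >= next_due


aware = datetime(2023, 11, 20, 12, 0, tzinfo=timezone.utc)
naive = datetime(2023, 11, 20, 12, 0)

# Ordering comparisons between naive and aware datetimes raise TypeError.
try:
    aware < naive
    mixed_comparison_ok = True
except TypeError:
    mixed_comparison_ok = False
```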
Use the `stamina` library for automatic retries with integrated logging, exponential backoff and jitter. PL-131813 (cherry picked from commit f07eb19)
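To illustrate the retry behaviour named here (logging, exponential backoff, jitter), the following is a standard-library sketch of the general technique. It is not `stamina`'s implementation or API; the `retry` function, its parameters and their defaults are assumptions chosen for the example.

```python
import logging
import random
import time

log = logging.getLogger("retry-sketch")


def retry(func, attempts=5, base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Call func(), retrying on any exception with backoff and jitter."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:
            if attempt == attempts:
                raise  # out of attempts: let the last error propagate
            # The delay doubles on each attempt and is capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter avoids many clients retrying in lockstep.
            wait = delay + random.uniform(0, delay)
            log.warning("attempt %d failed (%s), retrying in %.2fs",
                        attempt, exc, wait)
            sleep(wait)
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.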
This has some advantages:
* Dmidecode as an external tool is called less often, only when really needed (init, just before reboot). Before, this happened on every agent run.
* Fewer debug log messages.
* Activity doesn't change just by loading it.
* Non-privileged users can show the activity now.

PL-131813 (cherry picked from commit ad0e262)
Exceptions from _enter_maintenance don't bubble up anymore but are logged and treated like temporary failures now. Output from enter commands is now shown directly on the trace log level and added to the exception if the command fails or logged when postpone/tempfail is requested. PL-131813 (cherry picked from commit e4593f7)
Now, when merging requests, the comment of the new request is only concatenated to the old one when it's not already contained in the old comment, to avoid repeating content. We saw this with RebootActivity, which is created on every fc-agent run when a reboot for the kernel is needed. When the kernel version changed again, the new comment was concatenated over and over again.
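The merge rule can be sketched as a small function. This is an illustration of the described behaviour only; the function name `merge_comments` and the single-space separator are assumptions, not taken from the commit.

```python
from typing import Optional


def merge_comments(old: Optional[str], new: Optional[str]) -> Optional[str]:
    """Append `new` to `old` unless `old` already contains it."""
    if not new:
        return old
    if not old:
        return new
    if new in old:
        # Already present: merging again must not repeat the text.
        return old
    return f"{old} {new}"  # separator is an assumption
```

Merging the same comment repeatedly leaves the result unchanged after the first merge, which is the property the fix is after.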
ctheune
approved these changes
Nov 20, 2023
LGTM
Backport of #818 to NixOS 21.05, with a small, additional fix for a request comment merging bug and a sudo rule to allow admins to request a scheduled reboot. In combination with
fc-maintenance run --run-all-now
it's possible to reboot a system immediately in a safe way. PL-131813
@flyingcircusio/release-managers
Release process
Impact:
Changelog: (internal)
Security implications