🆕⌚️ Automatic updates #247
Comments
If we do have hands-off upgrades that's going to drive a more immediate need for automated rollbacks. That's #177.
I am now thinking the default model for automatic updates should involve automatic downloading/queuing. Having to download just the rpmdb to display diffs sucks for multiple reasons. Among them it's going to be hard to support if we move to OCI images. Plus I'd like to support a "deltas only" ostree repo mode. Or a combination. Beyond that, for the majority of cases such as standalone desktop, enterprise desktop, enterprise server this is what I think is a good default. Enterprise particularly if we encourage local mirroring. One case where people may not want this is standalone embedded systems, but we can obviously support the status quo of typing |
So specifically for Cockpit, I'd like to move them away from the |
So.. I have an idea for this (probably not a very good one). See #558 where in the 2nd paragraph I say
That way we don't need the rpmdb to do a diff. Also we can choose to only use the rpm data from the commit message if the rpmdb doesn't exist locally, i.e. we only have metadata about the commit. WDYT? |
Yeah, I think we can put at least the NEVRAs in the commit header.
which isn't too bad. |
I think this also blocks on ostreedev/ostree#545.
Had a chat with @cgwalters about this today. Here are the notes from that: High-level expectations:
I feel like between all of these steps, at least for the desktop we need to think about having Implementation:
Other considerations:
Mock-up status outputs:
In that status output I think |
Definitely, we need to describe the state as well. Another interesting piece of information that would be worth displaying is the size of the download. Interestingly, this is something we can easily calculate for jigdo remotes. In the ostree remote case, we can only display that if there are static deltas. |
ok. one other thing I wonder if we're covering: |
Right, this is #177. I'm open to discuss whether to hide the |
cool
WIP in #1147.
OK so let's try to agree on what happens with the "first cut" of this. Are we thinking that we'll land this but it will just be disabled by default and people who want it can opt-in for now? I'm generally OK with that. But there are definitely issues in turning on even Now a good thing here is we're not triggering the updates out of PAM. But we still have the problem for example that a whole lot of people need to configure a proxy. What I'd like to see for example is adding the notion of "auto-cancellable transactions" or so. Basically if while the Further a whole big conceptual issue is the degree to which our systemd units are "special". We also need to support e.g. gnome-software, Cockpit, and also Ansible at least; @jlebon mentioned that in
I think in the "personal desktop" case it's pretty clear gnome-software could just frob the settings in the config file (do we own the polkit gateway for that? expose an API?) BTW down the line for the "CSB laptop" case I'd actually like to support a mode where if e.g. someone has their laptop suspended/turned off for a month while they go on vacation, when they boot up Internet access is disabled for everything except rpm-ostree upgrades until they get updated. I'm sure some people would despise this idea but if we make updates fast and painless we can get a lot closer to having both security and convenience. |
(Actually for the desktop case implementing that is probably a gnome-software thing given flatpaks need updating too)
Instead of a config file, I think it may be easier to have gnome-software drive the automatic updates over dbus -- it already has a session service specifically for that purpose. This way it could also make sure that base OS and flatpak updates are applied at the same time, reducing user interruptions etc. Would that make sense? |
That makes sense and is part of the design in #1147. Basically, gnome-software could just turn off the timer and call |
That sounds great! Let me see if I can quickly hack up gnome-software to make use of the new goodness and then report back on Monday or so.
@kalev, is there any sort of gnome-software CLI? gnome-software incorporates rpms, flatpaks, firmware, ostree??, and it would be really nice to have something like that on my Atomic Host (not workstation) system in CLI form to report potential updates and allow me to choose what to install. Related discussion in #405 (comment).
True. But that's something they'd have to set up to upgrade manually as well, right?
That's a really interesting idea. Let's split that out into its own issue once #1147 is merged?
It depends what you mean by "support Ansible" here but yeah, configuring automatic updates can be done solely through editing Cockpit could do something similar to the rpm-ostree client, querying the Basically, since we're just using systemd units and a config file, we're in pretty familiar territory for any application that wants to control this stuff. There is no stored state elsewhere that's exclusively managed by rpm-ostree. One example of this is that |
Hi, I gave the autoupdate a try. It works nicely for me =). I do have a few questions though (hopefully you won't mind =) ). Note: the test output might be long (but the content should not be that much). I also did not read many of the comments above, so if I happen to miss something, please let me know =P
1: With the auto-update patch applied, rpm-ostree status takes noticeably longer than before. Is that expected?
vs
2: It seems like I have to do an upgrade --preview in order to make rpm-ostree status show the available update, is that the expected behavior?
3: Last question, how do I generate a test output so that last run is no longer unknown in the status? Other than that, the functionality looks nice =). Sorry it took long, had to spend time understanding the testing procedure. And this is the complete test log if you are interested: |
Thanks @peterbaouoft for trying it out! :)
Ahh, you're probably hitting fedora-selinux/selinux-policy-contrib#45. You can either use the same hack we use in the testsuite, or just
Right. The 3: Last question, how do I generate a test output so that last run is no longer unknown in the status? That's due to the SELinux policy issue above. |
Yup, applying setenforce 0 does make it a lot faster, and seems to also solve the unknown status problem. 2 birds with one stone! =P
I see, makes sense. Thanks for the explanation! I am more and more excited about this new feature now (auto-update)! =D
Yeah, I think there's a lot of use cases where you don't want your updater to auto-download in the background. E.g. for FAW, I'd feel comfortable shipping with
I think it helps to reason out things explicitly to make sure we're going the right way!
To get back to this, I do see where you're coming from. I think in that case, I'd rather we not ship such a timer at all for now? My initial thoughts before were to add some of these "policy engine"-like settings to rpm-ostree itself, such as "auto reboot only for security erratas", or "auto reboot, but not for layered packages". I still think there's some value in doing this for the lone server/IoT case, because even though it's not very hard to implement manually, it makes things really easy to configure OOTB. But I guess that should be a separate discussion from whether to have a dumb reboot policy at all. So I'd vote for leaving this out for now until we gain more experience in the managed workflows like GNOME Software and cluster cases. |
I tend to think anything that will reboot the node needs to be handled outside of the daemon directly. The daemon itself (unless I'm mistaken) isn't aware of how many other nodes it lives with and can't initiate a restart without that possibility of downtime. Instead, having Edit: s/agent/daemon/g |
Yeah...it's tempting to take the This all ties back into the (just posted) https://pagure.io/atomic-wg/issue/453 |
I think having the |
One question is whether |
@jlebon isn't
This is a tricky subject though. My initial feeling is to put as much orchestration in the agent and as little in rpm-ostree. What keeps me from outright pushing for that is any agent that is used will likely be tied to a specific orchestration system or tool. If we try to make a generic agent then we are basically providing an interface and, to me, that would seem more at home in rpm-ostree anyway. |
Yeah, that's the core tension. I guess my core feeling is let's not delete anything that exists in rpm-ostreed today, but I would vote that the Kube agent initiates updates itself rather than relying on the timer. |
So, from discussions here, I think what we want is "both". I.e. we do want a "stage" mode that We could slice this further even into node agents that could still rely on rpm-ostree's "check" mode to know that a node has an update vs a more controlled environment where the "update available" signal comes directly to the agent OOB from some other metadata protocol (in which case, the rpm-ostree timer/policy is completely off). |
Note that |
To clarify, stage mode would download updates and have them ready for deployment (not actually deploy) correct?
👍
That makes sense. Assuming staging means downloading and being ready to deploy, I say let's get that in. Having a timer to deploy and reboot that's configurable is fine too as long as we can disable the timer deploy portion. Being able to disable the auto staging would be a nice to have but could be added at a later time.
Do we have a path forward on this?
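To make the split discussed above a bit more concrete, here is a rough sketch (not rpm-ostree code) of which steps each mode would leave to an external agent. `check` and `stage` are the modes named in this thread; the function name `agent_duties` and the mode name `none` (timer/policy completely off) are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of rpm-ostree: list the steps an external
# agent would still have to drive itself under each mode from this thread.
agent_duties() {
    case "$1" in
        # Policy off: the agent learns about updates out of band and does everything.
        none)  echo "check download reboot" ;;
        # rpm-ostree only reports that an update exists.
        check) echo "download reboot" ;;
        # rpm-ostree downloads and stages; rebooting is left to the agent.
        stage) echo "reboot" ;;
        *)     echo "unknown mode: $1" >&2; return 1 ;;
    esac
}
```

Under this reading, a cluster agent pairing with `stage` only has to decide when a reboot is safe.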
I think if we do this though I'd like to have something like:
And the daemon then tracks (somewhere) the name passed. The idea here is that then
So administrators understand what's going on. And we should probably explicitly throw an error if the built-in timer is enabled and anything else executes the auto-update policy. This "tracking the last agent" though only works after things have run at least once. But I think that's OK. |
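The "throw an error if the built-in timer is enabled and anything else executes the auto-update policy" idea could be sketched roughly as below. This is an illustration only, not existing rpm-ostree behavior: the function name `check_update_driver` and the agent label `rpm-ostree-automatic` for the built-in timer are assumptions, and the timer state is passed in as a plain string (for example, whatever `systemctl is-enabled rpm-ostree-automatic.timer` prints).

```shell
#!/bin/sh
# Hypothetical guard, not existing rpm-ostree behavior.
# $1: state of the built-in timer ("enabled", "disabled", ...)
# $2: name of the agent asking to run the auto-update policy
check_update_driver() {
    timer_state=$1
    agent=$2
    # Refuse when the built-in timer is on and someone else tries to drive updates.
    if [ "$timer_state" = "enabled" ] && [ "$agent" != "rpm-ostree-automatic" ]; then
        echo "error: built-in timer is enabled; refusing auto-update from $agent" >&2
        return 1
    fi
    echo "auto-update driven by $agent"
}
```

An agent would then need the timer disabled before it is allowed to drive updates, which keeps the "who did this" answer unambiguous for administrators.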
🤔 Though...today we probably could use |
The high level goal is to render in a better way what caused an update: #247 (comment) This gets us for Cockpit: `Initiated txn DownloadUpdateRpmDiff for client(dbus:1.28 unit:session-6.scope uid:0): /org/projectatomic/rpmostree1/fedora_atomic` which isn't as good as I'd hoped; I was thinking we'd get `cockpit.service` but actually Cockpit does invocations as a real login for good reason. We get a similar result from the CLI. Closes: #1368 Approved by: jlebon
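For anyone wanting to pull the caller out of a message like the one above, a small sketch follows. It assumes the message keeps exactly the `client(dbus:… unit:… uid:…)` shape quoted in the commit message; the helper name `txn_client_field` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print one field (dbus, unit or uid) from an
# "Initiated txn ... for client(...)" message of the shape quoted above.
# $1: field name, $2: the full message text
txn_client_field() {
    field=$1
    msg=$2
    # Match inside client(...), then capture up to the next space or ")".
    printf '%s\n' "$msg" | sed -n "s/.*client([^)]*$field:\([^ )]*\).*/\1/p"
}
```

With the Cockpit message from the commit, asking for `unit` gives `session-6.scope` and asking for `uid` gives `0`, which matches the point that Cockpit shows up as a login session and not as `cockpit.service`.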
In FCOS and RHCOS now, automatic updates are driven by higher-level software like Zincati and the MCO. It's likely there will be more work in the future to make integrating with automatic update drivers like these easier. But it's unlikely that we will switch to a model where rpm-ostree takes the full responsibility of the automatic update mechanism because it's highly context dependent.
EDIT 20181206:
Today with rpm-ostree if you want to enable automatic background updates, edit `/etc/rpm-ostreed.conf`, and ensure that the `Daemon` section looks like:

Next then, `systemctl enable rpm-ostree-automatic.timer`.

This won't automatically reboot though.
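A minimal sketch of the config edit, under stated assumptions: the key is taken to be `AutomaticUpdatePolicy` inside the `[Daemon]` section, with `stage` as the value (the "stage" mode discussed in this thread); check the `rpm-ostreed.conf` man page on your system for the exact key and values. The helper name `set_update_policy` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: set AutomaticUpdatePolicy (assumed key name) in the
# [Daemon] section of an rpm-ostreed.conf-style file.
# $1: path to the config file, $2: policy value (e.g. stage)
set_update_policy() {
    conf=$1
    policy=$2
    [ -f "$conf" ] || : > "$conf"
    tmp=$(mktemp)
    awk -v p="$policy" '
        # On any section header: if we are leaving [Daemon] without having
        # written the key yet, write it before the next section starts.
        /^\[/ {
            if (in_daemon && !written) { print "AutomaticUpdatePolicy=" p; written = 1 }
            in_daemon = ($0 == "[Daemon]")
        }
        # Replace an existing (possibly commented-out) key inside [Daemon].
        in_daemon && /^#?AutomaticUpdatePolicy=/ {
            if (!written) { print "AutomaticUpdatePolicy=" p; written = 1 }
            next
        }
        { print }
        # Nothing written yet: append the key, adding the section if needed.
        END {
            if (!written) {
                if (!in_daemon) print "[Daemon]"
                print "AutomaticUpdatePolicy=" p
            }
        }
    ' "$conf" > "$tmp" && cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

On a real host this would presumably be run as root as `set_update_policy /etc/rpm-ostreed.conf stage`, followed by the `systemctl enable rpm-ostree-automatic.timer` step from the description.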
This thread though contains a lot of background information/design around higher level issues.
Initial PR: #1147