reboot coordination: locksmith successor #3

Closed
cgwalters opened this issue Jul 10, 2018 · 22 comments

@cgwalters
Member

Container Linux (CL) encourages using locksmith + etcd by default as a "cluster". Do we want to do that out of the box, or focus on e.g. https://github.com/ashcrow/container-linux-update-operator/tree/spike ?

Another option is to document how to "roll your own" coordination with e.g. Ansible; we have APIs.

@bgilbert
Contributor

I think there are several interlocking questions here:

  • Should FCOS nodes be able to update themselves, or should an external operator be mandatory?
  • If FCOS nodes can self-update, should they have reboot coordination, or should self-update only be an option for non-clustered nodes?
  • If reboot coordination is desired, should it involve locksmith, some new tool, or instructions to roll your own? Is there some baseline functionality that should be provided by the node for external reboot coordination (including maybe a CLUO-like system) to hook into?
  • What coordination strategies (e.g. etcd) should be provided by default?

Would it make sense to break these out into separate issues?

@lucab
Contributor

lucab commented Jul 23, 2018

We had some out-of-band discussion on this, and here I'm summarizing the points we covered:

  • while we don't focus on non-clustered deployments, the lack of a means to auto-apply updates (without coordination) on a single node would be problematic. So we would likely keep providing something like locksmith for FCOS.
  • while locksmith's design is generally ok, its codebase is not particularly modern and could use some love. We will likely re-use most of the code, but having a dedicated project for a new iteration would be cleaner.
  • for the coordinated case, locksmith currently relies on etcd. If we don't ship (a specific version of) etcd as an on-host service, we may want to de-emphasize it in favor of something not tied to a specific DB.
  • to decouple from the hard dependency on the DB, we can provide a simpler external-permission mode where we move the higher-level coordination into an external service. The on-host agent would simply issue a request for permission to reboot and notify back when done (initial strawman: an http-permission mode using a pair of HTTPS GET/POST requests)
  • the above external-permission mode would provide the baseline functionality for any external reboot coordination service (including a k8s-based CLUO-like one). Such a service can be community-provided, as soon as we stabilize the request-notify protocol
  • out of scope for this component: a way to externally trigger arbitrary reboots at any point in time. The reboot initiator would still be a signal from the on-host update service (i.e. update-engine replacement)
  • default reboot mode would be a local immediate decision to reboot (i.e. similar to locksmith reboot mode)

This is not yet a final design, but if there are no controversies or radically different suggestions we can move forward with it.
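
For illustration only, the request-notify flow of the external-permission mode could be sketched as below. This is a minimal sketch, not the protocol that was later stabilized: the endpoint paths, the `node_id` parameter, the status codes, and the `FakeCoordinator` class are all assumptions made for the example. The transport is abstracted behind a `send` callable so the flow can be exercised without a network.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set

# Hypothetical endpoint paths; the thread only proposes
# "a pair of HTTPS GET/POST requests" as a strawman.
PERMISSION_PATH = "/v1/reboot-permission"
DONE_PATH = "/v1/reboot-done"

# (method, path, params) -> HTTP status code
Send = Callable[[str, str, Dict[str, str]], int]


def request_reboot_permission(send: Send, node_id: str) -> bool:
    """Ask the external service whether this node may reboot now."""
    return send("GET", PERMISSION_PATH, {"node_id": node_id}) == 200


def notify_reboot_done(send: Send, node_id: str) -> bool:
    """Tell the external service that this node finished rebooting."""
    return send("POST", DONE_PATH, {"node_id": node_id}) == 200


@dataclass
class FakeCoordinator:
    """In-memory stand-in for the external service (assumption for the demo).

    It allows at most `max_concurrent` nodes to be rebooting at once.
    """
    max_concurrent: int = 1
    rebooting: Set[str] = field(default_factory=set)

    def send(self, method: str, path: str, params: Dict[str, str]) -> int:
        node = params["node_id"]
        if method == "GET" and path == PERMISSION_PATH:
            # Re-asking while already permitted succeeds again.
            if node in self.rebooting or len(self.rebooting) < self.max_concurrent:
                self.rebooting.add(node)
                return 200
            return 409  # no free slot: try again later
        if method == "POST" and path == DONE_PATH:
            self.rebooting.discard(node)
            return 200
        return 404


if __name__ == "__main__":
    coord = FakeCoordinator()
    print(request_reboot_permission(coord.send, "node-a"))
    print(request_reboot_permission(coord.send, "node-b"))
```

In this sketch the on-host agent stays trivial (ask, reboot, report back), and all policy lives in whatever implements `send` on the other side, which is the split the list above argues for.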

@cgwalters
Member Author

> the lack of a means to auto-apply updates (without coordination) on a single node would be problematic.

Isn't that just systemctl enable rpm-ostreed-automatic.timer from the rpm-ostree auto-updates discussion?

@lucab
Contributor

lucab commented Jul 23, 2018

@cgwalters from my shallow understanding of rpm-ostree upgrade, that is likely the complement of this: the timer unit will trigger fetching/deploying updates, while this one just controls when the machine goes down for reboot.

@cgwalters
Member Author

Yeah, you're right. That said, I lament the lack of reboot management in rpm-ostree itself for the single-node case today; you can see that in the discussion threads. I'd like to support simple logic like "reboot if an update is ready and there are no active sessions" as a systemd timer unit that we can also render in rpm-ostree status.
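
As a rough illustration of that "simple logic", the decision itself can be written as a pure function. This is only a sketch: the `HostState` type and its field names are hypothetical, and how the two inputs would be collected on a real host (from rpm-ostree and the session manager) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostState:
    """Inputs to the reboot decision (hypothetical names for this sketch)."""
    update_staged: bool    # a new deployment is ready to boot into
    active_sessions: int   # number of currently active login sessions


def should_reboot(state: HostState) -> bool:
    """Reboot only if an update is ready and nobody is logged in."""
    return state.update_staged and state.active_sessions == 0


if __name__ == "__main__":
    print(should_reboot(HostState(update_staged=True, active_sessions=0)))
    print(should_reboot(HostState(update_staged=True, active_sessions=2)))
```

A periodic timer would evaluate this check on each tick, so a reboot deferred because of an active session is simply retried on the next run.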

@jlebon
Member

jlebon commented Jul 23, 2018

That makes sense to me. We could add a trivial reboot policy in rpm-ostree (and maybe we should), though supporting logic for maintenance windows and active session detection would better belong in something like locksmith/its successor.

@jlebon
Member

jlebon commented Jul 23, 2018

For reference, here is the rpm-ostreed.conf knob in question: https://github.com/projectatomic/rpm-ostree/blob/b66337e0cbd94024ce249c022482d03978db81c1/man/rpm-ostreed.conf.xml#L71. There is also the undocumented ex-stage, which is still experimental for now, though likely to be stabilized soon (at which point it will be added to that man page as stage).

@cgwalters
Member Author

> supporting logic for maintenance windows and active session detection would better belong in something like locksmith/its successor.

Maybe...we were discussing this in the rpm-ostree ticket too. There are also the "headless IoT" and "desktop" cases.

@bgilbert
Contributor

bgilbert commented Jul 24, 2018

I'm wary of having separate reboot flows for the reboot-coordinated and uncoordinated cases, not only because of the code complexity but because it'd be another point of confusion when configuring the system: do I enable rpm-ostree reboots or locksmith reboots? What if I enable both?

@cgwalters How would you feel about moving all reboot handling into whatever is replacing locksmith?

Edit: or at least disabling/hiding the rpm-ostree knob on FCOS.

dustymabe changed the title from "etcd/locksmith" to "reboot coordination: locksmith successor" on Sep 20, 2018

@cgwalters
Member Author

> @cgwalters How would you feel about moving all reboot handling into whatever is replacing locksmith?

Do we see "locksmith2" handling the degenerate case of a single node system? That's one possible approach; if you deploy a single node it skips using etcd and all of that and just talks directly to the local rpm-ostreed.

> Edit: or at least disabling/hiding the rpm-ostree knob on FCOS.

Upstream rpm-ostree will definitely continue to support being completely driven by an external agent using the DBus API. The current rpm-ostreed-automatic.service default "agent" will always be easy to disable. The thing that is important to me, though, is that the agent is visible to admins.

@cgwalters
Member Author

If we scope in more than reboot management and actually include "channel management" (see #22), then one approach is for this agent to point rpm-ostree at commit objects rather than refs. That UX is better now, and we're going to be using this for RHCOS, where host updates are always cluster-driven.

@lucab
Contributor

lucab commented Oct 11, 2018

I'd say that:

  • indeed the idea is to handle the single-node case too (as initially stated)
  • the external DBus agent approach fits well here. I didn't check the currently exposed API, but a level trigger for locksmith2 to consume would be enough (@jlebon later checked and suggested watching/comparing DefaultDeployment)
  • directly pointing to commits sounds fine too (from my shallow knowledge of rpm-ostree). I'm not completely sure about absorbing channel management logic here though (a dedicated component sitting between remote-cincinnati<->rpm-ostreed seems cleaner to me)
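
To make the "level trigger" idea concrete: rather than reacting to a one-off "update arrived" event, the agent can re-derive the answer from current state whenever it looks, by comparing the booted deployment with the default one. The following is a sketch under assumptions: deployments are modeled as plain mappings, and the `checksum` key used for comparison is chosen for illustration rather than taken from the actual DBus API.

```python
from typing import Mapping, Optional

Deployment = Mapping[str, str]


def update_pending(booted: Optional[Deployment],
                   default: Optional[Deployment]) -> bool:
    """Return True when the default deployment differs from the booted one.

    This is level-triggered: the result depends only on the current state,
    so it stays correct even if the agent restarts or misses a change signal.
    """
    if not booted or not default:
        return False  # not enough information to decide; do nothing
    return booted.get("checksum") != default.get("checksum")


if __name__ == "__main__":
    booted = {"checksum": "aaa111"}
    print(update_pending(booted, {"checksum": "aaa111"}))
    print(update_pending(booted, {"checksum": "bbb222"}))
```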

@lucab
Contributor

lucab commented Feb 21, 2019

Followup with some feedback after an initial exploratory experiment.

In #83 (comment) we discussed keeping the on-host logic to a minimum and moving the etcd semaphore management into a container reachable over HTTP. The latter would be the locksmith successor (locksmith2?), which I tried to explore at https://github.com/lucab/exp-locksmith2.

These are my experimental findings, starting from original locksmith code:

  • locksmith locking code relies on etcdv2-client and Compare-And-Swap primitives. Porting to etcdv3-client is a breaking change which would be better done at the time of this rewrite. CAS primitives don't exist in etcdv3-client and must be replaced with proper transactional logic.
  • the locksmith semaphore has a few fields that can be trimmed down (some are even unused in the original code). I think the underlying logic only requires the total number of slots and a list of current lock holders.
  • we should spec out a new protocol between the on-host agent and the semaphore manager (locksmith2). I currently modeled it as two HTTP POSTs for recursive locking/unlocking of the semaphore. In both cases, the client request can be a small JSON hashmap with a few well-known keys.
  • CLI for lock/unlock/status/resize can be provided in the same binary and run by exec-ing into the container (I didn't sketch those as I don't foresee blockers there).
  • The whole thing is stateless, except for the configuration of the service itself.
  • This service would be consumed by the on-host agent when configured with "update strategy: remote manager". The primitives it offers to the agent answer two questions:
    • "(on boot) can this node enter steady-state and look for further updates (UnlockIfHeld)?"
    • "(on available update) can this node proceed to finalize update and reboot (RecursiveLock)?"

I sketched an experimental on-host agent in parallel for double-checking, more followups on this later in #83. Huge thanks to @s-urbaniak for quick-pairing on historical locksmith code ❤️.
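
The trimmed-down semaphore and the two operations described above can be sketched as follows. This is an in-memory illustration only, not the code in the linked experiment: the JSON key `node_id`, the operation names passed to `handle_request`, and the status codes are assumptions, and a real service would keep the state in etcd and update it inside a transaction instead of mutating a local object.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class Semaphore:
    """Only the two fields the findings call essential."""
    total_slots: int
    holders: List[str] = field(default_factory=list)

    def recursive_lock(self, node_id: str) -> bool:
        """Take a slot; succeeds without change if node_id already holds one."""
        if node_id in self.holders:
            return True
        if len(self.holders) >= self.total_slots:
            return False
        self.holders.append(node_id)
        return True

    def unlock_if_held(self, node_id: str) -> None:
        """Release the slot if node_id holds one; otherwise do nothing."""
        if node_id in self.holders:
            self.holders.remove(node_id)


def handle_request(sem: Semaphore, operation: str, raw_body: str) -> int:
    """Map one POST (operation + small JSON body) to a status code."""
    body = json.loads(raw_body)
    node_id = body["node_id"]  # hypothetical well-known key
    if operation == "lock":
        return 200 if sem.recursive_lock(node_id) else 409
    if operation == "unlock":
        sem.unlock_if_held(node_id)
        return 200
    return 404


if __name__ == "__main__":
    sem = Semaphore(total_slots=1)
    print(handle_request(sem, "lock", '{"node_id": "a"}'))
    print(handle_request(sem, "lock", '{"node_id": "b"}'))
    print(sem.holders)
```

Making both operations idempotent is what lets the agent call "unlock" unconditionally on every boot and retry "lock" after a crash without leaking or double-counting slots.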

lucab added the "meeting" label (topics for meetings) on Apr 3, 2019

@lucab
Contributor

lucab commented Apr 3, 2019

Out-of-band request: this "containerized locksmith" replacement is going to manage fleet-wide reboot locks, but it is not carrying over all locksmith functionality. As such, it should have its own proper name which is not linked back to the original "locksmith". We recently went through a similar renaming exercise, so we could just pick a name from the list: coreos/afterburn#126

bgilbert removed the "meeting" label (topics for meetings) on Apr 3, 2019

@LorbusChris
Contributor

relock (short for reboot locker)

@LorbusChris
Contributor

airlock I like even more I think 🚀

@arithx
Contributor

arithx commented Apr 4, 2019

What about Houston? Keeps with a theme of both cities & space.

@ajeddeloh
Contributor

If we go with something at least partially descriptive, I like airlock. If we go with a city I'm actually a big fan of Houston over cities in MA.

@lucab
Contributor

lucab commented Apr 5, 2019

Chipping in with a +1 for airlock.

@lucab
Contributor

lucab commented Apr 10, 2019

Due diligence for naming: there are no airlock-related packages in Fedora/Debian/Arch. Googling for "airlock" and "linux"/"fedora"/"coreos" didn't show anything relevant. The only partial match is a company (and its proprietary IAM solution with the same name).

@bgilbert
Contributor

Created coreos/airlock.

@lucab
Contributor

lucab commented Jun 13, 2019

The components implementing the two ends of this discussion are up at https://github.com/coreos/zincati and https://github.com/coreos/airlock in a minimum-functionality form (additive, non-breaking changes will happen in each new iteration).

The only remaining piece of work is closing the loop in zincati with coreos/zincati#37. Closing in favor of that.


8 participants