-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update-agent: rework steady/polling state #233
Conversation
@@ -135,7 +145,7 @@ impl UpdateAgent { | |||
} | |||
|
|||
/// Initialize the update agent. | |||
fn initialize(&mut self) -> ResponseActFuture<Self, Result<(), ()>> { | |||
fn tick_initialize(&mut self) -> ResponseActFuture<Self, Result<(), ()>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the naming changes? Is it to represent that these functions are directly called by the tick handler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes (but mostly: a drive-by change to make code easier to skim/grep).
/// Node steady, agent allowed to check for updates. | ||
ReportedSteady, | ||
/// No further updates available yet. | ||
NoNewUpdate, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bikeshed: WDYT about naming the NoNewUpdate
one with the steady keyword since it's likely going to be the stable/steady state in which the agent will mostly sit in? And maybe instead of ReportedSteady
, something like ReadyToCheck
? Makes it a little more obvious too that it hasn't checked for updates yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately I'm a dummy so I have already coupled the meaning of "steady" with
Line 25 in 6bad305
static V1_STEADY_STATE: &str = "v1/steady-state"; |
So, to avoid confusion I'd rather stick with the definition of "steady" as "we released the lock, marking the current release green".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, is coreos/airlock#1 still open for change? Would it make sense to change the specs here so that we only actually release the lock once we've also been able to check for an update? It intuitively makes sense to me to do that, but I might be missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, is coreos/airlock#1 still open for change?
Not really anymore.
Would it make sense to change the specs here so that we only actually release the lock once we've also been able to check for an update?
I don't think so. We don't want to couple the "did the node reboot" question with the "is Cincinnati-service ok" question inside Zincati. Plus, users can enforce stronger pre-conditions via https://github.com/coreos/zincati/blob/master/dist/systemd/system/zincati.service#L9-L10
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh OK right. I'd like to chat about this some more, but I don't think it blocks this PR as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forwarded to #239.
src/update_agent/mod.rs
Outdated
#[allow(dead_code)] | ||
fn discriminant(&self) -> u8 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we actually need this function? The only place that was using it is now gone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, dropped. I was planning to use it for checking the "from" state on transition, but I realized I cannot do it that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self-note: one approach to explore here is to statically enforce allowed transitions via generics, like https://hoverbear.org/blog/rust-state-machine-pattern/#generically-sophistication.
Rationale overall makes sense to me otherwise! |
This moves metrics logic for the agent, unifying it into a single place.
This splits the "reported steady" and "checked, but no updates available" states, making it easier to track and report upgrade progress.
b91be90
to
878a1a1
Compare
Rebased. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This splits the "reported steady" and "checked, but no updates
available" states, making it easier to track and report upgrade
progress.
Closes: #229