chore(upgrader): introduce disaster recovery request operation #461
base: main
    .recovery_requests
    .is_empty()
{
    ic_cdk::trap("upgrader cannot be upgraded due to pending disaster recovery requests")
How do you recover from this state? The only place where `recovery_requests` is mutated is `evaluate_requests`, and it only saves the change if the recovery rule is met. It does call `retain` on the requests to clean up the expired ones, but it does not actually save the result unless the requests pass evaluation.
Also, at this point the incompatible stable memory has already been deserialized, no? So if the vec is not empty, this has already trapped, no?
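To illustrate the first point, here is a minimal sketch of an evaluation step that persists the pruned list whether or not the recovery rule is met, so expired requests cannot block upgrades indefinitely. `State`, `RecoveryRequest`, the `expires_at_ns` field, and the simple count-based quorum rule are all hypothetical stand-ins, not the upgrader's actual types or logic.

```rust
/// Hypothetical stand-in for a pending disaster recovery request.
#[derive(Clone, Debug, PartialEq)]
pub struct RecoveryRequest {
    pub id: u64,
    /// Assumed expiry timestamp in nanoseconds.
    pub expires_at_ns: u64,
}

/// Hypothetical stand-in for the upgrader's persisted state.
#[derive(Default, Debug)]
pub struct State {
    pub recovery_requests: Vec<RecoveryRequest>,
}

/// Prunes expired requests directly in `state`, so the cleanup is saved
/// even when the rule is not met, then reports whether the (assumed)
/// count-based quorum rule is satisfied by the remaining requests.
pub fn evaluate_requests(state: &mut State, now_ns: u64, quorum: usize) -> bool {
    // Mutating the stored vec in place persists the pruning unconditionally.
    state.recovery_requests.retain(|r| r.expires_at_ns > now_ns);
    state.recovery_requests.len() >= quorum
}
```

With this shape, once every pending request has expired, the next evaluation leaves the list empty and the upgrade guard would no longer trap.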
> How do you recover from this state?

The expectation is that the upgrader canister is not itself upgraded if there are pending disaster recoveries.

> Also at this point the incompatible stable memory is deserialized, no? So in case the vec is not empty, this has already trapped, no?

That's true, the error message indeed refers to failed deserialization.
> The expectation is that the upgrader canister is not itself upgraded if there are pending disaster recoveries.

But it can happen that a request is there for some reason, and then the upgrader can no longer be upgraded until the next recovery request passes.

> That's true, the error message indeed refers to failed deserialization.

What I mean is that maybe the `is_empty()` call already traps, so the subsequent trap is ineffective because execution never gets there. I don't know whether `is_empty` already deserializes something incompatible; that depends on the internals of the stable memory library. Maybe adopt the custom serde deserializer-based migration from the station? Then the requests can be migrated safely.
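As a rough sketch of what such a migration could look like: a tolerant decoder yields either the legacy or the current layout, and legacy requests are mapped onto the new operation-based type. Every name and field here (`LegacyRequest`, `StoredRequest`, `user_id`, `wasm_module`, `arg`) is hypothetical, and the serde wiring is left out to keep the example self-contained. In real code `StoredRequest` might be a `#[serde(untagged)]` enum or a custom `Deserialize` impl; this is not the station's actual implementation.

```rust
/// Hypothetical pre-refactoring layout: every request implied "install code".
#[derive(Debug, Clone, PartialEq)]
pub struct LegacyRequest {
    pub user_id: u64,
    pub wasm_module: Vec<u8>,
    pub arg: Vec<u8>,
}

/// Hypothetical post-refactoring operation type, open to more variants later.
#[derive(Debug, Clone, PartialEq)]
pub enum Operation {
    InstallCode { wasm_module: Vec<u8>, arg: Vec<u8> },
}

/// Hypothetical post-refactoring request layout.
#[derive(Debug, Clone, PartialEq)]
pub struct Request {
    pub user_id: u64,
    pub operation: Operation,
}

/// What a tolerant decoder would produce: either layout is accepted.
#[derive(Debug, Clone, PartialEq)]
pub enum StoredRequest {
    Legacy(LegacyRequest),
    Current(Request),
}

impl From<StoredRequest> for Request {
    fn from(stored: StoredRequest) -> Self {
        match stored {
            // Legacy requests become install-code operations.
            StoredRequest::Legacy(old) => Request {
                user_id: old.user_id,
                operation: Operation::InstallCode {
                    wasm_module: old.wasm_module,
                    arg: old.arg,
                },
            },
            // Requests already in the new layout pass through unchanged.
            StoredRequest::Current(req) => req,
        }
    }
}

/// Converts everything read from stable memory into the current layout.
pub fn migrate_requests(stored: Vec<StoredRequest>) -> Vec<Request> {
    stored.into_iter().map(Request::from).collect()
}
```

With a conversion like this in the post-upgrade path, pending requests would survive the type refactoring instead of forcing the upgrade to trap.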
This PR introduces a disaster recovery request operation type in the upgrader canister (to later support more operation types beyond the existing install code operation type).
Because this is a breaking change for the upgrader's stable memory layout, the post-upgrade hook prevents an upgrade if there are pending disaster recovery requests (that would fail to deserialize after refactoring their types in this PR):