Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add in-place upgrades #88

Merged
merged 15 commits into from
Mar 5, 2024
Merged

Add in-place upgrades #88

merged 15 commits into from
Mar 5, 2024

Conversation

carlcsaposs-canonical
Copy link
Contributor

Copy link
Contributor

@shayancanonical shayancanonical left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me at a high level (can confirm it looks sound without actually manually trying it out). Left a couple nits/questions - please respond as you see appropriate

src/abstract_charm.py Show resolved Hide resolved
if self._upgrade.in_progress:
logger.warning(
"Modifying relations during an upgrade is not supported. The charm may be in a broken, unrecoverable state. Re-deploy the charm"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we bubble this message up to the charm status?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a status that can be acted upon (unless we set blocked and stop the charm from executing)—so that's why I used a log

Also in most upgrades it shouldn't be an issue—I added a log so that if we get a bug report in one of the cases where it will be an issue that we don't waste time debugging a non-supported usage

src/upgrade.py Outdated Show resolved Hide resolved
Copy link
Contributor

@paulomach paulomach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

partial comments

# installation/setup
self._upgrade.set_versions_in_app_databag()

def reconcile(self, event=None) -> None: # noqa: C901
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you break this function (instead of noqa), it's hard to follow when you are not constantly looking at the code.
Maybe split into a function the first block that's evaluating the upgrade states
or move VM/k8s specific tests to the concrete class, where feasible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's easier to read in a single function (since all of this logic is fairly tightly coupled—so any splitting of it would hide the coupling and make explicit assumptions implicit)

I personally think the noqa is fine because ops is forcing this pattern—IMO the "reconcile" function should just be top-level code in charm.py (and a linter wouldn't mind that), but it has to go into a function because of ops—hence the noqa for just this function


def reconcile(self, event=None) -> None: # noqa: C901
"""Handle most events."""
if not self._upgrade:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why return on lacking self._upgrade but had to check implementation. Can you add a comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you want me to add a comment about? I think I'm missing what is unclear

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear that not self._upgrade is related to the peer upgrade relation not existing, as per the log message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo the log message is enough—no need to duplicate the log message in a comment

lmk if you disagree and what comment you want me to add

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

something like

Suggested change
if not self._upgrade:
if not self._upgrade:
# upgrade property is not set while peer upgrade relation not available


def reconcile(self, event=None) -> None: # noqa: C901
"""Handle most events."""
if not self._upgrade:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear that not self._upgrade is related to the peer upgrade relation not existing, as per the log message.

src/machine_upgrade.py Show resolved Hide resolved
carlcsaposs-canonical added a commit to canonical/mysql-router-k8s-operator that referenced this pull request Nov 2, 2023
@carlcsaposs-canonical
Copy link
Contributor Author

@carlcsaposs-canonical todo: sync upgrade.py with k8s (charm versioning with git hash compatibility check)

Copy link
Contributor

@taurus-forever taurus-forever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This stuck for the unpolite amount of time as we have been focused on PostgreSQL customers launches. It is a time to share some love with MySQL Router VM as well and move from here.

I have made several quick tests and was not able to reproduce issues I was in Autumn... I will continue testing in edge, but it is a time to merge this to unblock COS merges.

Carl, thank you for the great work here and my apologies to long queued merge here! Let's bring M-R-VM to K8s quality level!

if self._upgrade.unit_state == "restarting": # Kubernetes only
if not self._upgrade.is_compatible:
logger.info(
"Upgrade incompatible. If you accept potential *data loss* and *downtime*, you can continue with `resume-upgrade force=true`"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference in resume-upgrade force=true and force-upgrade action?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resume-upgrade force=true is k8s-only
force-upgrade is vm-only

this was done to make implementation more robust & significantly simpler (tried to keep consistent UX as much as possible, but this difference is a reflection of how the vm/k8s upgrade process is significantly different)

@carlcsaposs-canonical carlcsaposs-canonical merged commit 55c5e91 into main Mar 5, 2024
15 of 16 checks passed
@carlcsaposs-canonical carlcsaposs-canonical deleted the upgrade branch March 5, 2024 09:44
carlcsaposs-canonical added a commit to canonical/opensearch-operator that referenced this pull request Apr 15, 2024
carlcsaposs-canonical added a commit to canonical/opensearch-operator that referenced this pull request Apr 17, 2024
carlcsaposs-canonical added a commit to canonical/opensearch-operator that referenced this pull request Apr 18, 2024
carlcsaposs-canonical added a commit to canonical/opensearch-operator that referenced this pull request Apr 22, 2024
carlcsaposs-canonical added a commit to canonical/opensearch-operator that referenced this pull request Apr 23, 2024
carlcsaposs-canonical added a commit to canonical/opensearch-operator that referenced this pull request Apr 29, 2024
Ported from canonical/mysql-router-operator#88

upgrade.py, machine_upgrade.py and event handlers in charm.py are mostly
shared with upgrade.py, machine_upgrade.py and event handlers in
abstract_charm.py/machine_charm.py in MySQL Router VM charm

Non-upgrade event handler guards implemented per
https://docs.google.com/spreadsheets/d/1_qjINUoj5Mi7PlXhCdpwQ0170bIZSgpvNFMAnTOtuQw/edit#gid=0

---------

Co-authored-by: phvalguima <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants