Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(k8s): show Helm events and logs #6626

Merged
merged 1 commit into from
Nov 14, 2024
Merged

Conversation

eysi09
Copy link
Collaborator

@eysi09 eysi09 commented Nov 7, 2024

What this PR does / why we need it:

Previously we would use Garden's own resource monitoring to monitor the
health of a Helm install/upgrade and fail fast if one of the resources
was unhealthy as well as show K8s events and Pod logs.

In commit 7a68373 we replaced that with Helm's own --wait flag.

The reason for that change was because a user reported issues with
Garden returning early from the Helm command in the success case
(see #6078 and #6053 for more details).

The problem with that change is that since we weren't using our own
resource monitoring we stopped showing events and logs when Helm
installs/upgrades fail. Another problem is that Garden would now wait
for the Helm command to complete which in the case of unhealthy
resources means Helm will timeout, with a default of 300 seconds.

This commit fixes that and we try to go for the best of both worlds:

  • We always use the --wait flag but also monitor resources at the same time
  • If the resources are healthy we wait for the Helm command to
    complete (this was the intent with 7a68373)
  • If we detect unhealthy resources we fail fast (as we did before on the
    same major version)
  • We add a flag to overwrite the fail fast behaviour in case a user
    might prefer that

Which issue(s) this PR fixes:

Fixes #6563

Special notes for your reviewer:

@eysi09 eysi09 force-pushed the helm-show-events-and-logs branch 12 times, most recently from 53e6d26 to ac842ca Compare November 12, 2024 14:49
Previously we would use Garden's own resource monitoring to monitor the
health of a Helm install/upgrade and fail fast if one of the resources
was unhealthy as well as show K8s events and Pod logs.

In commit 7a68373 we replaced that with Helm's own `--wait` flag.

The reason for that change was because a user reported issues with
Garden returning early from the Helm command in the success case
(see #6078 and #6053 for more details).

The problem with that change is that since we weren't using our own
resource monitoring we stopped showing events and logs when Helm
installs/upgrades fail. Another problem is that Garden would now wait
for the Helm command to complete which in the case of unhealthy
resources means Helm will timeout, with a default of 300 seconds.

This commit fixes that and we try to go for the best of both worlds:

- We always use the `--wait` flag but also monitor resources at the same time
- If the resources are healthy we wait for the Helm command to
  complete (this was the intent with 7a68373)
- If we detect unhealthy resources we fail fast (as we did before on the
  same major version)
- We add a flag to overwrite the fail fast behaviour in case a user
  might prefer that
@eysi09 eysi09 force-pushed the helm-show-events-and-logs branch from ac842ca to 73836bb Compare November 12, 2024 15:27
@eysi09 eysi09 marked this pull request as ready for review November 12, 2024 15:55
@eysi09 eysi09 requested review from thsig and twelvemo November 12, 2024 15:55
Copy link
Collaborator

@twelvemo twelvemo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I agree now, it does combine the best of both worlds (in terms of waiting for resources deployed via helm).

Waiting for the timeout can take awhile so using the default value here is recommended unless you'd like to completely mimic Helm's behaviour and not rely on Garden's resource monitoring.

Note that setting \`atomic\` to \`true\` implies \`waitForUnhealthyResources\`.
`),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great explanation!

log.warn(errorMsg)
log.debug({ error: toGardenError(error) })
}
helmArgs = ["upgrade", releaseName, ...reference, "--install", ...commonArgs]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on always using upgrade --install

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we also did this before, so know change here.

@vvagaytsev vvagaytsev changed the title Helm show events and logs feat(helm): show Helm events and logs Nov 13, 2024
@vvagaytsev vvagaytsev changed the title feat(helm): show Helm events and logs fix(k8s): show Helm events and logs Nov 13, 2024
@vvagaytsev vvagaytsev changed the title fix(k8s): show Helm events and logs feat(k8s): show Helm events and logs Nov 13, 2024
@eysi09 eysi09 added this pull request to the merge queue Nov 14, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Nov 14, 2024
@vvagaytsev vvagaytsev added this pull request to the merge queue Nov 14, 2024
Merged via the queue into main with commit 0f7bf25 Nov 14, 2024
42 checks passed
@vvagaytsev vvagaytsev deleted the helm-show-events-and-logs branch November 14, 2024 10:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE]: Enhancement of Helm deployment status in Garden CLI
3 participants