feat: report status when stack is locked #807

Merged
rquitales merged 8 commits into master from rquitales/report-locked-status
Feb 8, 2025


@rquitales rquitales commented Feb 6, 2025

Proposed changes

  • Surfaces locked-stack errors:

    • When the Pulumi CLI returns a 409 (Conflict) error, the agent server now returns a structured response instead of an error.
    • Clients can identify well-known Pulumi CLI errors through the PulumiErrorInfo protobuf message.
  • Improves Stack CR status updates:

    • The error message in the Update CR's status.message is now correctly propagated to the Stack CR's status.lastUpdate.message.
    • This ensures that locked-stack errors are surfaced in the Stack CR's status subresource.
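
As a rough illustration of the first bullet group, the sketch below turns raw CLI error text into a small structured value. It is not the code from this PR: ErrorInfo is a hypothetical stand-in for the PulumiErrorInfo message (whose fields are not shown here), classifyCLIError is an invented helper, and matching on the "[409] Conflict" substring is an assumption about the CLI output.

```go
package main

import (
	"fmt"
	"strings"
)

// ErrorInfo is a hypothetical stand-in for the PulumiErrorInfo protobuf
// message; the real message's fields are not shown in this PR description.
type ErrorInfo struct {
	Reason  string // machine-readable reason, e.g. "StackLocked"
	Message string // human-readable text suitable for status.message
}

// classifyCLIError inspects CLI error text and reports whether it looks like
// a locked-stack (409 Conflict) failure. The matched substring is an
// assumption about the CLI output, not a documented contract.
func classifyCLIError(stderr string) (ErrorInfo, bool) {
	if strings.Contains(stderr, "[409] Conflict") {
		return ErrorInfo{
			Reason:  "StackLocked",
			Message: "Another update is currently in progress",
		}, true
	}
	return ErrorInfo{}, false
}

func main() {
	info, ok := classifyCLIError("error: [409] Conflict: Another update is currently in progress.")
	fmt.Println(ok, info.Reason, info.Message)
}
```

Returning a value plus a boolean, rather than an error, mirrors the idea of a structured response: the caller can tell a known condition apart from an unexpected failure without parsing strings itself.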

Example Stack CR Status Block

status:
    conditions:
    - lastTransitionTime: "2025-02-06T00:11:09Z"
      message: reconciliation is in progress
      reason: NotReadyInProgress
      status: "False"
      type: Ready
    - lastTransitionTime: "2025-02-06T00:11:09Z"
      message: 4 update failure(s)
      reason: RetryingAfterFailure
      status: "True"
      type: Reconciling
    lastUpdate:
      failures: 4
      generation: 4
      lastAttemptedCommit: sha256:f335a9e0bc445b0dbe3187371f56017bcdd66e23b68c6eda54910eeb48d5e3a0
      lastResyncTime: "2025-02-06T00:26:33Z"
      lastSuccessfulCommit: sha256:f335a9e0bc445b0dbe3187371f56017bcdd66e23b68c6eda54910eeb48d5e3a0
      message: Another update is currently in progress
      name: nginx-stack-194d8a6b139
      state: failed
      type: up
    observedGeneration: 4
    outputs:
      availableReplicas: 1

Example Update CR Status Block

status:
  conditions:
  - lastTransitionTime: "2025-02-06T00:26:33Z"
    message: ""
    observedGeneration: 1
    reason: Complete
    status: "False"
    type: Progressing
  - lastTransitionTime: "2025-02-06T00:26:33Z"
    message: Another update is currently in progress
    observedGeneration: 1
    reason: StackLocked
    status: "True"
    type: Failed
  - lastTransitionTime: "2025-02-06T00:26:33Z"
    message: ""
    observedGeneration: 1
    reason: Updated
    status: "True"
    type: Complete
  endTime: "1970-01-01T00:00:00Z"
  message: Another update is currently in progress
  observedGeneration: 1
  startTime: "1970-01-01T00:00:00Z"
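
A client reading a status block like the one above could check for the locked-stack case along these lines. This is a minimal sketch: Condition is a simplified stand-in for the Kubernetes condition type, and lockedMessage is a hypothetical helper, not part of the operator.

```go
package main

import "fmt"

// Condition mirrors the fields shown in the status block above. It is a
// simplified stand-in for metav1.Condition, not the operator's own type.
type Condition struct {
	Type, Status, Reason, Message string
}

// lockedMessage returns the failure message when a Failed condition is
// "True" with reason StackLocked, and false otherwise.
func lockedMessage(conds []Condition) (string, bool) {
	for _, c := range conds {
		if c.Type == "Failed" && c.Status == "True" && c.Reason == "StackLocked" {
			return c.Message, true
		}
	}
	return "", false
}

func main() {
	conds := []Condition{
		{Type: "Progressing", Status: "False", Reason: "Complete"},
		{Type: "Failed", Status: "True", Reason: "StackLocked", Message: "Another update is currently in progress"},
		{Type: "Complete", Status: "True", Reason: "Updated"},
	}
	if msg, locked := lockedMessage(conds); locked {
		fmt.Println("stack is locked:", msg)
	}
}
```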

Testing

  • Added envtests to validate that statuses are correctly surfaced.
  • Manually validated on a GKE cluster.

Related Issues

Fixes: #806
Fixes: #736

@rquitales force-pushed the rquitales/report-locked-status branch from 7eeb043 to 4e1abcd (February 6, 2025 00:48)
@rquitales force-pushed the rquitales/authz-discard-pod-fix branch from a169516 to 7bf4d36 (February 6, 2025 01:36)
@rquitales force-pushed the rquitales/report-locked-status branch from 4e1abcd to 4f7e4fc (February 6, 2025 01:36)
@rquitales force-pushed the rquitales/authz-discard-pod-fix branch from 7bf4d36 to b348133 (February 6, 2025 18:16)
@rquitales force-pushed the rquitales/report-locked-status branch from 4f7e4fc to b1248c5 (February 6, 2025 18:16)
@rquitales requested a review from EronWright (February 6, 2025 18:44)
@rquitales self-assigned this (Feb 6, 2025)
@rquitales marked this pull request as ready for review (February 6, 2025 18:44)
@rquitales force-pushed the rquitales/report-locked-status branch from b1248c5 to 5c17922 (February 6, 2025 18:49)
@rquitales force-pushed the rquitales/authz-discard-pod-fix branch from b348133 to 2284f6a (February 6, 2025 22:24)
@rquitales force-pushed the rquitales/report-locked-status branch from 5c17922 to ce6238a (February 6, 2025 22:24)
@rquitales force-pushed the rquitales/authz-discard-pod-fix branch from 2284f6a to 54c5d23 (February 6, 2025 23:10)
@rquitales force-pushed the rquitales/report-locked-status branch from ce6238a to 4353901 (February 6, 2025 23:11)
@rquitales force-pushed the rquitales/authz-discard-pod-fix branch from 54c5d23 to f0e65e8 (February 6, 2025 23:20)
@rquitales force-pushed the rquitales/report-locked-status branch from 4353901 to ee8b49d (February 6, 2025 23:20)
@rquitales force-pushed the rquitales/authz-discard-pod-fix branch from f0e65e8 to 165c95d (February 6, 2025 23:23)
@rquitales force-pushed the rquitales/report-locked-status branch from ee8b49d to 71a0984 (February 6, 2025 23:23)
@rquitales changed the base branch from rquitales/authz-discard-pod-fix to master (February 6, 2025 23:34)
@rquitales force-pushed the rquitales/report-locked-status branch from 71a0984 to 6b85dd2 (February 6, 2025 23:34)

codecov bot commented Feb 6, 2025

Codecov Report

Attention: Patch coverage is 54.79452% with 33 lines in your changes missing coverage. Please review.

Project coverage is 51.48%. Comparing base (e46b376) to head (a3788c0).
Report is 2 commits behind head on master.

Files with missing lines                                  Patch %   Lines
...ator/internal/controller/auto/update_controller.go    64.00%    17 Missing and 1 partial ⚠️
agent/pkg/server/pulumi_errors.go                         35.71%    8 Missing and 1 partial ⚠️
agent/pkg/server/server.go                                25.00%    6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #807      +/-   ##
==========================================
+ Coverage   51.38%   51.48%   +0.10%     
==========================================
  Files          30       31       +1     
  Lines        4231     4296      +65     
==========================================
+ Hits         2174     2212      +38     
- Misses       1870     1895      +25     
- Partials      187      189       +2     


@rquitales force-pushed the rquitales/report-locked-status branch from 6b85dd2 to c19278f (February 7, 2025 23:14)
@@ -612,6 +634,48 @@ func (s streamReader[T]) Result() (result, error) {
return res, fmt.Errorf("didn't receive a result")
}

// setStatusFromGRPCErr sets the Update object status blocks based on the result of a gRPC error.
Member Author

This is the core logic: it determines whether the gRPC error carries a structured error and, if so, uses it to surface information about the Pulumi error back to the user.
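
The shape of that logic can be sketched roughly as follows, assuming the structured details have already been decoded from the gRPC error. None of these names come from the PR: Condition and UpdateStatus are simplified stand-ins for the Kubernetes types, ErrorInfo is a hypothetical decoded form of the error details, applyError is an invented helper, and "UnknownError" is a placeholder reason rather than the operator's actual value.

```go
package main

import "fmt"

// Condition is a simplified stand-in for metav1.Condition.
type Condition struct {
	Type, Status, Reason, Message string
}

// UpdateStatus holds only the status fields this sketch touches.
type UpdateStatus struct {
	Message    string
	Conditions []Condition
}

// ErrorInfo is a hypothetical decoded form of the structured error details.
type ErrorInfo struct {
	Reason, Message string
}

// setCondition replaces the condition of the same type or appends a new one.
func setCondition(conds []Condition, c Condition) []Condition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

// applyError records a failed update. With structured details (info != nil)
// it uses their reason and message; otherwise it falls back to the raw error
// text and a placeholder reason.
func applyError(st *UpdateStatus, info *ErrorInfo, raw string) {
	reason, msg := "UnknownError", raw // placeholder reason, an assumption
	if info != nil {
		reason, msg = info.Reason, info.Message
	}
	st.Message = msg
	st.Conditions = setCondition(st.Conditions, Condition{Type: "Progressing", Status: "False", Reason: "Complete"})
	st.Conditions = setCondition(st.Conditions, Condition{Type: "Failed", Status: "True", Reason: reason, Message: msg})
	st.Conditions = setCondition(st.Conditions, Condition{Type: "Complete", Status: "True", Reason: "Updated"})
}

func main() {
	var st UpdateStatus
	applyError(&st, &ErrorInfo{Reason: "StackLocked", Message: "Another update is currently in progress"}, "rpc error")
	fmt.Printf("%s: %+v\n", st.Message, st.Conditions[1])
}
```

Keeping the fallback path in the same function means an unstructured error still produces a Failed condition and a status message, so the Stack CR always has something to propagate.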

Contributor

@EronWright EronWright left a comment

Looking really good

@rquitales force-pushed the rquitales/report-locked-status branch from c19278f to a3788c0 (February 8, 2025 00:24)
Contributor

@EronWright EronWright left a comment

LGTM

@rquitales merged commit 43cd638 into master (Feb 8, 2025)
11 checks passed
@rquitales deleted the rquitales/report-locked-status branch (February 8, 2025 00:35)
Successfully merging this pull request may close these issues.

  • Update CR failure message is not surfaced to Stack CR object
  • Report status when stack is locked