Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Catch signals when running terraform CLI commands #231

Merged
merged 2 commits into from
Jan 24, 2024

Conversation

bobh66
Copy link
Collaborator

@bobh66 bobh66 commented Jan 15, 2024

Description of your changes

This change executes the terraform CLI command inside a wrapper that will terminate the CLI command if the context terminates via timeout or signal before the command finishes.

This will help with handling provider shutdown cleanly but it may also leave the Workspace in an unknown state if the running command was apply or destroy.

I have:

  • Run make reviewable to ensure this PR is ready for review.

How has this code been tested

This code needs to soak in a test environment for a while but I wanted to get the changes reviewed in the mean time. I'll take the PR out of WIP when the changes have been tested locally.

Test procedure:

  • Create a Workspace that has a terraform time_sleep resource with create_timeout set to 5 minutes and uses the terraform kubernetes backend
  • While tailing the pod log apply the workspace and observe that the reconcile of the workspace starts
  • Retrieve the list of kubernetes lease resources and observe that there is a lease for the specified workspace
  • Terminate the provider pod and observe in the logs that the process gets terminated. Observe that the Lease for the workspace has been deleted.

Copy link
Contributor

@ulucinar ulucinar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @bobh66,
Thanks for the PR! Left some minor nits and I believe we are good to go when you complete your tests.

I also did some tests with a simple configuration on linux AMD64 & darwin ARM64 systems with the terraform CLI and the time_resource. My observation is that when the Terraform CLI receives the INT signal, it immediately terminates and cleans up the state lock:

terraform {
  required_providers {
    time = {
      source = "hashicorp/time"
      version = "0.10.0"
    }
  }
}

resource "time_sleep" "wait_30_seconds" {
  create_duration = "300s"
}

After we merge this PR, let's also consider putting the async mode of reconciliation for the provider into our roadmap.

internal/terraform/terraform.go Outdated Show resolved Hide resolved
internal/terraform/terraform.go Outdated Show resolved Hide resolved
internal/terraform/terraform.go Outdated Show resolved Hide resolved
Signed-off-by: Bob Haddleton <[email protected]>
@bobh66 bobh66 changed the title WIP: Catch signals when running terraform CLI commands Catch signals when running terraform CLI commands Jan 24, 2024
@bobh66
Copy link
Collaborator Author

bobh66 commented Jan 24, 2024

I ran the same tests and it does clean up the lock, however the crossplane.io/external-create-pending annotation is left on the Workspace so reconciliation will still be blocked until someone manually removes the annotation.

@bobh66 bobh66 merged commit 8ea3c88 into upbound:main Jan 24, 2024
9 checks passed
@bobh66
Copy link
Collaborator Author

bobh66 commented Jan 24, 2024

After we merge this PR, let's also consider putting the async mode of reconciliation for the provider into our roadmap.

How does async apply to running the CLI command? It seems like we would still be CPU-bound even if we are running the command async.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants