## Problem

There are many reasons why `qhub destroy` fails, including timeouts in Terraform, which are hard to set globally. Sometimes the Kubernetes cluster starts to be destroyed before Terraform has formally had a chance to destroy some of the software deployed within Kubernetes, resulting in an error that the cluster is not accessible to destroy that software.

Since the Keycloak provider needs to make API calls to Keycloak within the cluster, the `terraform` command can fail terminally if the provider is configured with a Keycloak URL that no longer exists.
## Other Solutions

A more robust definition system, as proposed in "Split infrastructure into components", might help, since it would allow the destroy to be done in stages, so we could be sure one stage is complete before advancing to the next. A move to Terraform CDK might also change this. But these are long-term ideas.
## Current Solution

In this PR I have changed `qhub destroy` to simply run the reverse of the targeted `qhub deploy` stages that we already have. One difficulty is in listing out all the items that need to be destroyed at each stage - since deploy's final stage is just 'everything else', we need to maintain a long list of all items. I don't believe my list is complete, but it doesn't really matter - the destruction of Kubernetes should destroy everything anyway, and removing most items from the cluster in a separate stage gives breathing room so there is less to remove during the cluster destroy.

The main thing is that we need to avoid telling Terraform to destroy e.g. "Kubernetes" and "keycloak-configuration" in the same stage, which of course happens when we just run a straight `terraform destroy` with no targeting.

An alternative approach to all this might have been to strengthen the `depends_on` tree, but this is split across multiple files and is still likely to run into the timeout problem.
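The staged approach can be sketched as iterating the deploy stages in reverse and passing explicit `-target` flags to each `terraform destroy`, so that e.g. "keycloak-configuration" is removed in an earlier stage than the cluster itself. This is only an illustrative sketch: the stage names, module addresses, and `destroy_commands` helper below are hypothetical, not the actual QHub lists or code.

```python
# Hypothetical deploy stages, listed in deploy order. Module addresses
# here are placeholders, not the real QHub Terraform modules.
DEPLOY_STAGES = [
    ("infrastructure", ["module.kubernetes"]),
    ("keycloak", ["module.keycloak"]),
    ("keycloak-configuration", ["module.keycloak-configuration"]),
    ("services", ["module.jupyterhub", "module.dask-gateway"]),
]

def destroy_commands(stages=DEPLOY_STAGES):
    """Yield one targeted `terraform destroy` command per stage, in the
    reverse of deploy order, so later stages are torn down first."""
    for name, targets in reversed(stages):
        cmd = ["terraform", "destroy", "-auto-approve"]
        cmd += [f"-target={t}" for t in targets]
        yield name, cmd
```

The key property is ordering: "keycloak-configuration" is destroyed while Keycloak and the cluster still exist, and the cluster is only destroyed in the final stage.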
Note that it is not possible to 'refresh state' once Keycloak is inaccessible. We should consider removing the `terraform refresh` command from the destroy procedure anyway.

It is possible to run the old style of `qhub destroy`, without targets, by using the flag `--full-only`.
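The `--full-only` flag name comes from this PR, but the code below is only a minimal sketch of how such a flag might switch between the new staged destroy and the old single-shot behaviour; the parser and helper are hypothetical, not the actual qhub CLI code.

```python
import argparse

def build_parser():
    # Illustrative parser; the real qhub CLI wires this up differently.
    parser = argparse.ArgumentParser(prog="qhub destroy")
    parser.add_argument(
        "--full-only",
        action="store_true",
        help="skip the staged, targeted destroy and run a single "
             "untargeted `terraform destroy` (old behaviour)",
    )
    return parser

def use_targeted_destroy(argv):
    """Return True if the staged (targeted) destroy should run."""
    args = build_parser().parse_args(argv)
    # Staged destroy is the new default; --full-only opts back out.
    return not args.full_only
```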
## Types of changes

What types of changes does your code introduce?

Put an `x` in the boxes that apply:

## Testing

Requires testing

In case you checked yes, did you write tests?