Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent duplicate clusters ending up in Terraform state #264

Merged
merged 3 commits into from
Apr 25, 2023

Conversation

onematchfox
Copy link
Collaborator

This PR aims to address the currently flaky invalid cluster tests in a less invasive manner than #263, which requires further thought with regard to the performance implications.

When looking at this in more detail, I identified 3 separate issues that resulted in these tests being flaky.

  • We shouldn't be testing whether attributes are set on tests that are designed to return errors. This is because the order in which the "duplicate" resources are applied is non-deterministic, and so sometimes resource 1 gets applied, and resource 2 causes the error, and other times it is the other way around. Thus we end up with errors such as:
resource_argocd_cluster_test.go:343: Step 1/1, expected an error with pattern, no match on: Check failed: Check 3/3 error: argocd_cluster.cluster_one: Attribute 'name' expected "90kh6rtvum", got "server"
  • Upsert was set to false when calling Create on the cluster client. This meant that we can end up in a situation whereby two Terraform resources are managing the same resource in ArgoCD. This leads to perpetual diffs on these resources and also results in errors like the following when the resources in the tests are subsequently destroyed.
--- FAIL: TestAccArgoCDCluster_invalidSameServer (7.94s)
    resource_argocd_cluster_test.go:277: Step 3/3, expected an error with pattern, no match on: After applying this test step, the plan was not empty.
        stdout:
        
        
        Terraform used the selected providers to generate the following execution
        plan. Resource actions are indicated with the following symbols:
          ~ update in-place
        
        Terraform will perform the following actions:
        
          # argocd_cluster.cluster_with_trailing_slash will be updated in-place
          ~ resource "argocd_cluster" "cluster_with_trailing_slash" {
                id     = "https://kubernetes.default.svc.cluster.local/server"
                name   = "server"
              ~ server = "https://kubernetes.default.svc.cluster.local" -> "https://kubernetes.default.svc.cluster.local/"
                # (1 unchanged attribute hidden)
        
                # (1 unchanged block hidden)
            }
        
        Plan: 0 to add, 1 to change, 0 to destroy.
    testing_new.go:84: Error running post-test destroy, there may be dangling resources: exit status 1
        
        Error: something went wrong during cluster deletion: rpc error: code = PermissionDenied desc = permission denied
        
        rpc error: code = PermissionDenied desc = permission denied
FAIL
  • Finally, the situation above (whereby two resources end up being created in our tests) is the result of a race condition due to the locking pattern in place on the Create method. If both resources are created simultaneously, then both can pass the "distinct cluster" checks covered by the RLock, allowing both resources to be created and end up in Terraform state. So, both the Get and Create calls need to be covered by a full RW lock as per the implementation in argocd_repository_certificate.go.

Checking on applied attributes causes issues since which resources that gets applied is non-deterministic. I.e. we have no way of knowing if `cluster_one` or `cluster_two` gets applied. This results in these test being flaky. For testing purposes it is sufficient to expect an error.
All other resources managed by this provider have `Upsert: false` when creating.  This is because we want an error to occur if a duplicate cluster is created otherwise we can end up with two different Terraform resources managed the same resource in ArgoCD. Existing clusters must be imported into Terraform state.

Change to set Upsert = true was introduced in argoproj-labs#75. PR is in general unrelated to this functionality.
@onematchfox onematchfox requested review from MrLuje and oboukili April 25, 2023 13:22
Copy link
Collaborator

@oboukili oboukili left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great finds, kudos!

@onematchfox onematchfox merged commit cb8528d into argoproj-labs:master Apr 25, 2023
@onematchfox onematchfox deleted the fix/cluster-creation branch April 25, 2023 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants