Regression: Data Source for mongodbatlas_cluster makes terraform hang indefinitely using version 1.0 #521

Closed
devon65 opened this issue Aug 19, 2021 · 11 comments

@devon65

devon65 commented Aug 19, 2021

Terraform CLI and Terraform MongoDB Atlas Provider Version

Terraform v1.0.4
on darwin_amd64
+ provider registry.terraform.io/mongodb/mongodbatlas v1.0.0

Terraform Configuration File

terraform {
  required_version = ">= 0.14.4"

  required_providers {
    mongodbatlas = {
      source  = "mongodb/mongodbatlas"
      version = "= 1.0"
    }
  }
}

provider "mongodbatlas" {}

data "mongodbatlas_cluster" "cluster" {
  project_id   = var.mongodb_project_id
  name         = var.cluster_name
}

output "cluster_name" {
  value = data.mongodbatlas_cluster.cluster.name
}

variable "mongodb_project_id" {}
variable "cluster_name" {}

Steps to Reproduce

  1. Create a MongoDB cluster
  2. Create an API key with access to your cluster
  3. Pass in your cluster's project id, name and API key info
  4. terraform init
  5. terraform plan or terraform apply
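
A minimal sketch of step 3, assuming the two variables are supplied through a terraform.tfvars file; the values are placeholders, not taken from this report. Because the provider block above is empty, the API key pair has to come from outside the configuration, for example from the provider's MONGODB_ATLAS_PUBLIC_KEY and MONGODB_ATLAS_PRIVATE_KEY environment variables.

```hcl
# terraform.tfvars (placeholder values; substitute your own)
mongodb_project_id = "<atlas-project-id>"
cluster_name       = "<existing-cluster-name>"
```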

Expected Behavior

Terraform should output my cluster's name in a timely manner (within 60 seconds at most).

Actual Behavior

Terraform hangs indefinitely, as far as I can tell. I've let it run for over 5 minutes and it still outputs nothing.

Crash Output

I have created a crash.log for this issue, but I couldn't guarantee that I had removed all sensitive data, so I only included an excerpt from after hitting Ctrl-C. I've reproduced this issue on three different computers (two Windows and one Mac), so it shouldn't be too difficult to get logs by reproducing it.
crash-abridged.log

Additional Context

I have tested this in versions 0.9.0 and 0.9.1, and the mongodbatlas_cluster data source works as expected in both versions, but not in 1.0.
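
As a possible stopgap, sketched under the assumption that nothing else in the configuration depends on 1.0 features, the provider constraint can be pinned back to the last version reported here as working:

```hcl
terraform {
  required_providers {
    mongodbatlas = {
      source = "mongodb/mongodbatlas"
      # Assumption: 0.9.1 is acceptable for the rest of the configuration.
      version = "= 0.9.1"
    }
  }
}
```

If a dependency lock file already records 1.0.0, terraform init -upgrade is needed for the new constraint to take effect.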

@nikhil-mongo
Collaborator

@devon65 Thank you for sharing the details and log. We will go through it and get back to you.

@themantissa
Collaborator

Internal ticket INTMDB-247

@themantissa
Collaborator

@devon65 can you provide some more details on what type of cluster and network connections you are creating? We believe this is due to a fix we made for issue #422. The cluster data source was returning without a connection string because it returned before the string was available. We added a timeout, but if you do not have a PrivateLink connection it should return more quickly. So it would be good to know what you are creating and waiting on.

@devon65
Author

devon65 commented Aug 24, 2021

When I run the Terraform posted in the description, the cluster has already been created under a separate Terraform plan. I'm just trying to retrieve data from the existing cluster. Here's the config I used to create the cluster (the cluster creation has already succeeded):

resource "mongodbatlas_cluster" "mongo_cluster" {
  project_id           = var.mongodb_project_id
  name                 = var.cluster_name
  cluster_type         = var.cluster_type
  provider_region_name = var.cluster_az_region

  auto_scaling_disk_gb_enabled = var.cluster_autoscale_disk_space_enabled
  mongo_db_major_version       = var.cluster_mongodb_major_version

  //Provider Settings "block"
  provider_name               = "AZURE"
  provider_disk_type_name     = var.cluster_disk_type
  provider_instance_size_name = var.cluster_instance_size
}

variable "mongodb_project_id" {
    type = string
}

variable "cluster_name" {
    type = string
}

variable "cluster_type" {
    type = string
    default = "REPLICASET"
}

variable "cluster_az_region" {
    type = string
    default = "US_WEST_2"
}

variable "cluster_autoscale_disk_space_enabled" {
    type = bool
    default = false
}

variable "cluster_mongodb_major_version" {
    type = string
    default = "5.0"
}

variable "cluster_disk_type" {
    type = string
    description = "Determines initial memory size of cluster"
    default = "P2"
}

variable "cluster_instance_size" {
    type = string
    description = "Tier size of the cluster instance"
    default = "M10"
}

@devon65
Author

devon65 commented Aug 24, 2021

Another thing I noticed (a separate bug that I still need to create an issue for) is that the container_id return value for the cluster data source is declared, but never set, in the data_source_mongodbatlas_cluster.go file. At least, that's what it looks like. I'm new to Go, so I could be wrong, but in the resource_mongodbatlas_cluster.go file, the container_id field is declared and set later.

If your wait command is waiting for all fields to get set in the cluster data source, it will wait indefinitely for the container_id field to be set. If it's just waiting for the connection string to return, then I'm completely wrong and you can ignore this comment 👌
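
One way to check this from the Terraform side, sketched under the assumption that the data source block from the original report is in scope, is to expose container_id as an output and see whether it comes back empty on a provider version where the data source returns:

```hcl
# Debugging aid only: surfaces whatever the data source reports for container_id.
output "cluster_container_id" {
  value = data.mongodbatlas_cluster.cluster.container_id
}
```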

@nicolas-nannoni

I can confirm I see the same behaviour since 1.0.0 with an identical setup. The cluster and PrivateLink endpoint were created in a Terraform module weeks ago. When that cluster is used as a data source in another Terraform module that creates a simple database user, the data source hangs for about 3 minutes before returning the expected response.

In the trace logs, I see my cluster's data being returned fast, and that contains all the connection strings I want (including the PrivateLink ones). I then see this line:

2021-08-24T13:37:42.063-0700 [INFO]  provider.terraform-provider-mongodbatlas_v1.0.0: 2021/08/24 13:37:42 [DEBUG] MongoDB Atlas API Response Details:
[cluster config]
2021-08-24T13:37:42.064-0700 [INFO]  provider.terraform-provider-mongodbatlas_v1.0.0: 2021/08/24 13:37:42 [DEBUG] Waiting for state to become: [PRIVATE_ENDPOINTS_EXISTS NORMAL]: timestamp=2021-08-24T13:37:42.064-0700

After that, the following lines keep being printed every 5 seconds until the plan is finally made:

2021-08-24T13:38:14.965-0700 [TRACE] dag/walk: vertex "module.atlas.provider[\"registry.terraform.io/mongodb/mongodbatlas\"] (close)" is waiting for "module.atlas.data.mongodbatlas_cluster.cluster (expand)"
2021-08-24T13:38:16.245-0700 [TRACE] dag/walk: vertex "module.atlas.aws_ssm_parameter.root_password (expand)" is waiting for "module.atlas.local.uri_with_creds (expand)"
2021-08-24T13:38:16.524-0700 [TRACE] dag/walk: vertex "module.atlas.local.uri (expand)" is waiting for "module.atlas.data.mongodbatlas_cluster.cluster (expand)"
2021-08-24T13:38:16.524-0700 [TRACE] dag/walk: vertex "module.atlas (close)" is waiting for "module.atlas.aws_ssm_parameter.root_password (expand)"
2021-08-24T13:38:16.524-0700 [TRACE] dag/walk: vertex "module.atlas.local.uri_with_creds (expand)" is waiting for "module.atlas.local.uri (expand)"
2021-08-24T13:38:16.525-0700 [TRACE] dag/walk: vertex "meta.count-boundary (EachMode fixup)" is waiting for "module.atlas (close)"
2021-08-24T13:38:16.525-0700 [TRACE] dag/walk: vertex "root" is waiting for "provider[\"registry.terraform.io/hashicorp/aws\"] (close)"
2021-08-24T13:38:16.525-0700 [TRACE] dag/walk: vertex "provider[\"registry.terraform.io/hashicorp/aws\"] (close)" is waiting for "module.atlas.aws_ssm_parameter.root_password (expand)"

(...)

2021-08-24T13:40:42.067-0700 [INFO]  provider.terraform-provider-mongodbatlas_v1.0.0: 2021/08/24 13:40:42 [DEBUG] MongoDB Atlas API Request Details:
2021-08-24T13:40:43.596-0700 [INFO]  provider.terraform-provider-mongodbatlas_v1.0.0: 2021/08/24 13:40:43 [DEBUG] MongoDB Atlas API Response Details:
[cluster config (identical to the one retrieved 3 minutes earlier)]

The same modules were working fine and were fast before 1.0.0.

@themantissa
Collaborator

themantissa commented Aug 24, 2021

@nicolas-nannoni thank you for the additional context and @devon65 as well. It feels like a regression as noted, along with the improvement. I'll have the team pursue further.

@devon65
Author

devon65 commented Aug 24, 2021

@nicolas-nannoni You mentioned that you tested it with a cluster that has PrivateLink connection strings. My cluster doesn't have any PrivateLink connection strings. By coincidence, I've been experimenting with PrivateLink today, and it turns out that the data source will return (after 3 minutes) when there are PrivateLinks present in the cluster, but it will hang indefinitely if the cluster has no PrivateLinks present.

I took a look at the fix that waits for the connection strings, and it looks like that code change is waiting for the PrivateLink connection strings, which is an empty list whether the cluster has PrivateLinks or not.

At least, that's how I understood the code from a quick glance. Once again, I'm new to Go, so feel free to correct me if I'm wrong.
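
To see what the data source actually reports for a given cluster, a small sketch (assuming the data source block from the original report is in scope, and a provider version on which the data source returns) is to output the whole connection_strings attribute and inspect the PrivateLink-related entries:

```hcl
# Debugging aid only: prints every connection string the data source returns,
# including any PrivateLink-related entries.
output "cluster_connection_strings" {
  value = data.mongodbatlas_cluster.cluster.connection_strings
}
```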

@devon65
Author

devon65 commented Aug 24, 2021

If the data source waits 3 minutes each time, it could be good to have an optional "wait_for_connection_strings" flag.
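
Purely as an illustration of that proposal, the data source block might look like the sketch below. The wait_for_connection_strings argument is hypothetical: it is not an existing provider argument, and a configuration containing it would fail validation today.

```hcl
data "mongodbatlas_cluster" "cluster" {
  project_id = var.mongodb_project_id
  name       = var.cluster_name

  # HYPOTHETICAL: proposed flag, not part of the provider schema.
  # false would mean "return immediately with whatever the API reports".
  wait_for_connection_strings = false
}
```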

@themantissa
Collaborator

We have a pre-release of 1.0.1 ready - we'll release the GA version tomorrow. If you have time to try it out before then it's here: https://github.com/mongodb/terraform-provider-mongodbatlas/releases/tag/v1.0.1-pre.1

@themantissa
Collaborator

Fixed in the recent 1.0.1 release.
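
A minimal sketch of picking up the fix, assuming the configuration from the original report: the exact "= 1.0" constraint would keep resolving to 1.0.0, so it needs to be loosened. The pessimistic constraint below is one option and allows only 1.0.x patch releases from 1.0.1 onward.

```hcl
terraform {
  required_providers {
    mongodbatlas = {
      source = "mongodb/mongodbatlas"
      # Accepts 1.0.1 and later 1.0.x patch releases, but not 1.1 or above.
      version = "~> 1.0.1"
    }
  }
}
```

Running terraform init -upgrade afterwards lets Terraform select the newer provider version.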
