
Cloud NAT configuration changes incorrectly force new resource #4081

Closed
dijitali opened this issue Jul 23, 2019 · 6 comments · Fixed by GoogleCloudPlatform/magic-modules#2103
Comments

@dijitali

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
  • If an issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to "hashibot", a community member has claimed the issue already.

Terraform Version

Terraform v0.12.3

Affected Resource(s)

  • provider.google ~> 2.9.1

Terraform Configuration Files

I have a module that creates the following resource and allows enabling/disabling Cloud NAT logging via a variable:

variable "enable_nat_logging" {
  description = "Whether to export logs"
  type        = bool
  default     = false
}

variable "nat_logging_filter" {
  description = "Specifies the desired filtering of logs on this NAT"
  type        = string
  default     = "ALL"
}

resource "google_compute_router_nat" "advanced-nat" {
  name                               = "nat-1"
  router                             = "${google_compute_router.router.name}"
  region                             = "us-central1"
  nat_ip_allocate_option             = "MANUAL_ONLY"
  nat_ips                            = ["${google_compute_address.address[*].self_link}"]
  source_subnetwork_ip_ranges_to_nat = "LIST_OF_SUBNETWORKS"
  subnetwork {
    name                    = "${google_compute_subnetwork.default.self_link}"
    source_ip_ranges_to_nat = ["ALL_IP_RANGES"]
  }
  log_config {
    filter = var.nat_logging_filter
    enable = var.enable_nat_logging
  }
}

Debug Output

Not relevant.

Panic Output

Not relevant.

Expected Behavior

If I change the value of enable_nat_logging, the Cloud NAT is updated in-place and logs are enabled without the Cloud NAT being recreated.

Actual Behavior

The change to items in the log_config block forces a deletion and recreation:

      + log_config { # forces replacement
          + enable = true # forces replacement
          + filter = "ALL" # forces replacement
        }

Steps to Reproduce

  1. terraform plan -var 'enable_nat_logging=false'
  2. terraform plan -var 'enable_nat_logging=true'

Important Factoids

Logging can be enabled and disabled via the gcloud CLI (and web console) without recreating the Cloud NAT resource:

gcloud compute routers nats update --enable-logging --router my-router --router-region europe-west1 my-nat
gcloud compute routers nats update --no-enable-logging --router my-router --router-region europe-west1 my-nat

References

@ghost ghost added the bug label Jul 23, 2019
@dijitali
Copy link
Author

This issue actually appears to affect more resource attributes than just the log_config block.

In testing, I used the gcloud CLI to enable logging on my Cloud NAT router, and it appears that doing so also changed some of the other timeout and port settings from the defaults that Terraform sets, such as icmp_idle_timeout_sec (see the plan output below).

Since I was able to modify these with the gcloud CLI without recreating the Cloud NAT resource, Terraform should be able to do the same: these changes should be possible in-place without forcing replacement. A workaround sketch follows the plan output.

-/+ resource "google_compute_router_nat" "vpc_nat" {
      - icmp_idle_timeout_sec              = 30 -> null # forces replacement
      ~ id                                 = "<OBSCURED FOR PRIVACY>" -> (known after apply)
      - min_ports_per_vm                   = 64 -> null # forces replacement
        name                               = "<OBSCURED FOR PRIVACY>"
        nat_ip_allocate_option             = "MANUAL_ONLY"
        nat_ips                            = [
            "https://www.googleapis.com/compute/v1/projects/<OBSCURED FOR PRIVACY>addresses/<OBSCURED FOR PRIVACY>",
        ]
        project                            = "<OBSCURED FOR PRIVACY>"
        region                             = "<OBSCURED FOR PRIVACY>"
        router                             = "<OBSCURED FOR PRIVACY>"
        source_subnetwork_ip_ranges_to_nat = "<OBSCURED FOR PRIVACY>"
      - tcp_established_idle_timeout_sec   = 1200 -> null # forces replacement
      - tcp_transitory_idle_timeout_sec    = 30 -> null # forces replacement
      - udp_idle_timeout_sec               = 30 -> null # forces replacement
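
As a stopgap until the provider handles this in-place, explicitly pinning those values in the configuration (a sketch, using the defaults reported in the plan output above; the resource body is abbreviated) keeps the spurious diff from appearing:

resource "google_compute_router_nat" "vpc_nat" {
  # ... name, router, region, nat_ips, etc. as before ...

  # Match the values that enabling logging via gcloud wrote back, so a plan
  # no longer sees them as removed attributes that force replacement.
  min_ports_per_vm                 = 64
  icmp_idle_timeout_sec            = 30
  udp_idle_timeout_sec             = 30
  tcp_established_idle_timeout_sec = 1200
  tcp_transitory_idle_timeout_sec  = 30
}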

@dijitali dijitali changed the title Cloud NAT Log Configuration incorrectly forces new resource Cloud NAT configuration changes incorrectly force new resource Jul 23, 2019
@emilymye
Contributor

This is unfortunately a bit of a misleading message. NATs are what we call a "fine-grained resource": a resource type that only exists in the provider and describes a group of subfields on a parent resource that is an actual GCP resource (in this case, a Cloud Router). When we "delete-recreate" a NAT, what actually happens is that we remove the NAT from the router's list of NATs and then add the new one; the underlying GCP router is never recreated.

We could technically make this a patch so it looks like an update in Terraform, but I don't think the actual behavior in GCP (patching the list with a new list) would be very different from removing an object from the list and then adding a new one.

@dijitali
Author

Thanks for the prompt reply @emilymye!

Something here doesn't feel quite right, though. If updating the Cloud NAT configuration (enabling logging, for example) does make the Cloud NAT (or Cloud Router) unavailable while the change is applied, then it sounds correct for Terraform to recreate the resource.

However, if that's the case, how and why is it possible to make the same change with the gcloud CLI or web console without any warning that it will cause a brief loss of service?

In our setup we have constant application traffic using the Cloud NAT functionality. If it can't be updated in-place, that probably needs to be raised separately so it gets documented in the gcloud CLI documentation and the Cloud NAT documentation.

Conversely, if it does not result in downtime, then Terraform should reflect that and update in-place. Even if the apply only takes a short time, the separate delete and create operations surely leave room for Terraform to crash or lose its network connection between the two, leaving the NAT unavailable for a longer period.

@emilymye
Contributor

We can certainly change it to be a patch, and I'm going to go ahead and do that. In the meantime, if you are worried about this, you can probably use

lifecycle {
  create_before_destroy = true
}

to avoid downtime.
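
For illustration, a minimal sketch of where that block would sit, reusing the resource from the original report (everything else unchanged):

resource "google_compute_router_nat" "advanced-nat" {
  # ... configuration as in the original report ...

  # Create the replacement NAT before destroying the old one.
  lifecycle {
    create_before_destroy = true
  }
}

Note that the replacement must briefly coexist with the original, so names may need to be made unique for this to work.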

@dijitali
Author

Thanks @emilymye - I tried create_before_destroy, but unfortunately it looks like that would quickly add complexity: I link the google_compute_router_nat resource to the google_compute_router resource and to multiple google_compute_address resources, so I'd need to add random suffixes to give all of these unique names during the operation (sketched below).
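
Roughly, the suffixing would look like this (a sketch; random_id comes from the hashicorp/random provider, and the names here are illustrative):

resource "random_id" "nat_suffix" {
  byte_length = 2

  # Rotate the suffix whenever the logging flag changes, so the replacement
  # NAT created by create_before_destroy gets a fresh, non-conflicting name.
  keepers = {
    enable_nat_logging = var.enable_nat_logging
  }
}

resource "google_compute_router_nat" "advanced-nat" {
  name = "nat-1-${random_id.nat_suffix.hex}"
  # ... remaining configuration as before ...

  lifecycle {
    create_before_destroy = true
  }
}

And similar suffixes would be needed for the router and address names, which is where the complexity creeps in.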

It's not a blocking issue for me right now: I can make this specific change manually and let terraform state sync correctly on the next apply. Happy to wait for the changes you mentioned in the meantime.

@ghost

ghost commented Sep 7, 2019

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Sep 7, 2019