Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: move app_engine sub-block to its own resource #2118

Closed
paddycarver opened this issue Sep 26, 2018 · 14 comments
Closed

Proposal: move app_engine sub-block to its own resource #2118

paddycarver opened this issue Sep 26, 2018 · 14 comments

Comments

@paddycarver
Copy link
Contributor

This is a proposal issue to gather feedback and discussion about our contentious app_engine design. Currently, App Engine applications are created by adding an app_engine sub-block to the google_project resource. This has a few benefits:

  • It clearly illustrates the tight coupling that App Engine applications have to GCP projects.
  • It allows you to delete App Engine applications.

But it comes with drawbacks, too:

  • Deleting App Engine applications involves destroying and recreating your entire project.
  • Project names aren't immediately available for reuse, so recreating your project probably won't work the way you intend it to, unless you plan for it ahead of time.
  • People don't think of App Engine apps as part of the project resource, which leads to surprises when they have an App Engine app and don't include it in their project in Terraform.

What we've heard from feedback is that people don't feel the pros outweigh the cons. As we look into version 2.0.0 of the provider, and have the opportunity to make some changes that we otherwise couldn't, we're reevaluating this decision and seeking input.

My current thought for a design is a google_app_engine_app resource, with the same fields as the current block in google_project. However, because there's no API to delete App Engine applications, any plan that would involve deleting the app--including terraform destroy--would throw an error. To actually delete the app, you'd need to set a special field--possibly called ack_noop_destroy, or some other name we'll decide on--on the google_app_engine_app resource, acknowledging that you understand that just because Terraform says it's deleted does not mean it's actually deleted. This would allow us to get around the problem of Terraform doing something other than what it says in plan, which is an important property of Terraform.

Does this design work for people? Do people have any feedback on this? We'd love any thoughts and experience reports that might influence the design.

Related: #1561 #1973 #1728 #638

@roman-cnd
Copy link

It makes sense to remove app engine from project resource.

App engine is a thing-in-itself. Its able to create an enormous mess with resources available to the application, from tasks to data store tables. No reason to keep given internals resources is pretty much unmanageable.

I'd also put a big warning for users: "resource can't be deleted, terraform is going to ignore destroy. Ask your friendly google account manager to have it implemented properly".

@morgante
Copy link

I've encountered a lot of issues with the current app_engine block implementation and support making it less error-prone/surprising.

How would we deal with people including multiple app_engine configs for a single project? That's something which should be caught before it hits the API IMO.

However, because there's no API to delete App Engine applications, any plan that would involve deleting the app--including terraform destroy--would throw an error.

Let's make sure this error is very descriptive and points to a clear doc explaining the tight coupling and how to get around it.

@paddycarver
Copy link
Contributor Author

paddycarver commented Sep 26, 2018

How would we deal with people including multiple app_engine configs for a single project? That's something which should be caught before it hits the API IMO.

I agree, but unfortunately, I don't think we have any way to accomplish this currently. I imagine it would work the same as any other singleton resource that's defined against an API, like google_project_iam_policy--if you have multiple, you're in for a bad time, sorry? I think the best we can do, unfortunately, is document it.

Let's make sure this error is very descriptive and points to a clear doc explaining the tight coupling and how to get around it.

Agreed. I'm imagining something like:

Error: App Engine applications can't be deleted without deleting your project. Please add ack_noop_delete = true to your google_app_engine_app resource to acknowledge this limitation, and this error will go away. For more information, see https://www.terraform.io/docs/providers/google/r/google_app_engine_app.html#noop-delete.

Wording could be changed and tweaked, but that was my general thought for what the error would be. Also, I'm imagining an error at plan time, not at apply time, which should be a bit less disruptive?

@morgante
Copy link

I agree, but unfortunately, I don't think we have any way to accomplish this currently.

At a minimum, could project be a required field for app_engine (instead of inferring it from the provider/context)? I think being explicit will at least give people somewhat less rope to hang themselves with.

Also, I presume the following will work (modulo race conditions), but want to confirm:

  1. I add two app_engine configs into my Terraform
  2. First one executes successfully and gets created
  3. API call to second one fails (since I already have app engine)
  4. I delete the second one from my config and Terraform doesn't see any changes (since it never successfully created in the first place)

Also, I'm imagining an error at plan time, not at apply time, which should be a bit less disruptive?

Yes, definitely should be a plan-time error.

@paddycarver
Copy link
Contributor Author

Also, I presume the following will work (modulo race conditions), but want to confirm:

That is my understanding. I can't promise that without actually implementing it, but that's the idea. At the very least, I know that if there's an error creating one, removing it from your config won't create an error, because it never wound up in state. I don't know if you'll actually get an error trying to create the second or not, unfortunately--I'll have to see what the API does.

At a minimum, could project be a required field for app_engine (instead of inferring it from the provider/context)?

I think that would require a bit more conversation about its impact on module authors/user experience, but I wouldn't say it's off the table entirely.

@paddycarver paddycarver modified the milestone: 2.0.0 Sep 30, 2018
paddycarver added a commit that referenced this issue Oct 2, 2018
Deprecate the app_engine sub-block of google_project, and create a
google_app_engine_application resource instead. Also, add some tests for
its behaviour, as well as some documentation for it.

Note that this is largely an implementation of the ideas discussed in
 #2118, except we're not using CustomizeDiff to reject deletions without
our special flag set, because CustomizeDiff apparently doesn't run on
Delete. Who knew? This leaves us rejecting the deletion at apply time,
which is less than ideal, but the only other option I see is to silently
not delete the resource, and that's... not ideal, either.

This also stops the app_engine sub-block on google_project from forcing
new when it's removed, and sets it to computed, so users can safely move
from using the sub-block to using the resource without state surgery or
deleting their entire project. This does mean it's impossible to delete
an App Engine application from a sub-block now, but seeing as that was
the same situation before, and we just papered over it by making the
project recreate itself in that situation, and people Were Not Fans of
that, I'm considering that an acceptable casualty.
@paddycarver paddycarver modified the milestones: 2.0.0, 1.19.0 Oct 2, 2018
@leighmhart
Copy link

Right now the current behaviour of wanting to delete an entire project (and sometimes silently without warning) just because App Engine was enabled via Console (and the API was turned on as a result) is, well, impractical at best.

I've just had a production GCP compute engine focussed project destroyed along with 50 servers by accident because a targeted apply of an unrelated DNS Made Easy provider resource using 0.10.7 and the latest provider went wild and did things outside the scope of the -target= specified and destroyed my project without warning.

Thankfully this is easily recovered (for the most part) in GCP, but it could have been a lot worse.

I'm now faced with not being able to upgrade the google provider in many projects because I don't know if people have enabled App Engine or not via console.

Just my $0.02 worth.

@leighmhart
Copy link

and yes, lesson learned, I'm now on 0.11.8 but shouldn't have to be on the bleeding edge to avoid top level resources as important as projects being deleted without warning.

@paddycarver
Copy link
Contributor Author

@allandrick Really sorry for that user experience. That shouldn't have happened, and I'm interested in understanding how it did happen to prevent future errors.

I worked on a PR for this: #2147. Unfortunately, in testing, we don't actually have the ability to override the diff you see when deleting a resource, so we can't force it to fail at plan time. We could force it to fail at apply time if a field isn't set, but my colleagues who maintain other providers have warned me that that has a bunch of hidden, unintuitive problems (like needing to apply before you can delete, and so on).

As a result, #2147 just has a log and a noop for delete on the standalone resource, and some documentation on the page right up front to call out the limitation. For the moment, it's the best we can do. I hear 0.12 is bringing us some new capabilities around this that will let us surface a warning to the end user, which would let us at least put the warning in front of them when they run the command, so I'm satisfied there's a future upgrade path here.

Thank you all for your feedback. If anyone strenuously objects to this path, I'm open to hearing alternatives, but our window of opportunity is closing rapidly. I'm going to close this issue out at the end of the workday today to signal that. I apologise for the little warning; the modification of the plan due to implementation limitations was unexpected, and we're in a time-sensitive period.

@leighmhart
Copy link

@paddycarver so the root cause of my experience was, as it turns out, as follows:

  • I had executed a terraform get -update and terraform plan and the changes included enabling the AppEngine API via the google_project_service resource.
  • AppEngine had previously (2 years ago) been initialised on the Console as a test, it appears.
  • The above-mentioned plan also included some DME provider dme_record entries which I wanted to apply in a targeted way first before applying the rest of the changes, so I executed terraform apply -target=dme_record.dme-ns (the resource name in question) using 0.10.7 and this is where things went awry - instead of just applying the dme resource, it applied all of the google_project_services resources, and ran out of API quota. I waited a moment and then re-ran the same command again, at which point the AppEngine API had already been enabled and that's when the google_project resource destroyed the project because of the app_engine sub-block mismatch.

So, ultimately, the thing I missed between two apply executions was another plan which would have revealed the impending doom. That, together with not using 0.11.x where terraform apply would have prompted for a yes/no confirmation, was the eye of the storm.

So the only likely bug in this case was the fact that the dme provider targeted apply didn't limit itself to the dme resource. Reading the https://www.terraform.io/docs/commands/plan.html#resource-targeting docs, -target is not recommended to use for routine operations. Mea culpa on two counts I guess!

We've updated to 0.11.8 ;-)

@ensonic
Copy link

ensonic commented Oct 22, 2018

How would I migrate?

I have:

resource "google_project" "project" {
   ...
  app_engine {
    location_id = "europe-west"
  }
}

and receive a warning:

Warning: google_project.project: "app_engine": [DEPRECATED] Use the google_app_engine_application resource instead.

If I remove the app_engine block in project and add this instead:

resource "google_app_engine_application" "app" {
  project = "${google_project.project.project_id}"
  location_id = "europe-west"
}

I get an error from terraform apply:

Error: Error applying plan:

1 error(s) occurred:

* google_app_engine_application.app: 1 error(s) occurred:

* google_app_engine_application.app: Error creating App Engine application: googleapi: Error 409: This application already exists and cannot be re-created., alreadyExists

@morgante
Copy link

@ensonic As per #2175, you will need to import your app with terraform import google_app_engine_application.app project-id.

@paddycarver This migration story still feels pretty poor - forcing manual state surgery for a provider upgrade is suboptimal. Isn't it possible to auto-import app engine if the we try to create and it already exists (like how APIs are handled)?

@paddycarver
Copy link
Contributor Author

It's possible, I'm just not sure it's wise. (I also don't really consider this "manual state surgery", because we usually reserve that for editing the JSON file directly or using terraform state mv, but that's really just semantics, and the point stands regardless that requiring import for a provider upgrade isn't ideal. Because it's only going to be required in a major version upgrade, I'm ok with the breaking change, but that doesn't mean I'm not open to doing better. I just want to think about the implications a bit.)

chrisst pushed a commit to chrisst/terraform-provider-google that referenced this issue Nov 9, 2018
Deprecate the app_engine sub-block of google_project, and create a
google_app_engine_application resource instead. Also, add some tests for
its behaviour, as well as some documentation for it.

Note that this is largely an implementation of the ideas discussed in
 hashicorp#2118, except we're not using CustomizeDiff to reject deletions without
our special flag set, because CustomizeDiff apparently doesn't run on
Delete. Who knew? This leaves us rejecting the deletion at apply time,
which is less than ideal, but the only other option I see is to silently
not delete the resource, and that's... not ideal, either.

This also stops the app_engine sub-block on google_project from forcing
new when it's removed, and sets it to computed, so users can safely move
from using the sub-block to using the resource without state surgery or
deleting their entire project. This does mean it's impossible to delete
an App Engine application from a sub-block now, but seeing as that was
the same situation before, and we just papered over it by making the
project recreate itself in that situation, and people Were Not Fans of
that, I'm considering that an acceptable casualty.
@ensonic
Copy link

ensonic commented Nov 15, 2018

I am now putting this into our dev-scripts:

(${TERRAFORM} 2>/dev/null import google_app_engine_application.app ${GCP_PROJECT_ID} || true)

It is a bit ugly.

@ghost
Copy link

ghost commented Nov 16, 2018

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Nov 16, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants