Crash during plan migrating to 0.12 #10135

Closed
ghost opened this issue Sep 17, 2019 · 9 comments · Fixed by #11107
Labels
bug Addresses a defect in current functionality. crash Results from or addresses a Terraform crash or kernel panic. service/iam Issues and PRs that pertain to the iam service.

@ghost

ghost commented Sep 17, 2019

This issue was originally opened by @Tirke as hashicorp/terraform#22813. It was migrated here as a result of the provider split. The original body of the issue is below.


Terraform Version

Terraform version: 0.12.8  

Terraform Configuration Files

Module outputting IAM policy

output "invoke-function" {
  value = jsonencode({
    Version   = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = [
          "lambda:InvokeFunction",
        ]
        Resource = [
          var.RESOURCES
        ],
      }
    ]
  })
}

variable "RESOURCES" {
  type    = list(string)
  default = []
}

Module consuming the policy module

resource "aws_iam_role" "iam-api-authorizer-backend-invoke" {
  name                  = "iam-api-authorizer-backend-invoke"
  force_detach_policies = true

  path               = "/"
  assume_role_policy = jsonencode({
    Version   = "2012-10-17"
    Statement = [
      {
        Effect    = "Allow"
        Action    = "sts:AssumeRole"
        Principal = {
          Service = "apigateway.amazonaws.com"
        }
      },
    ]
  })
}

module "lambda-policy-document" {
  source    = "../module-policy"
  RESOURCES = ["*"]
}

resource "aws_iam_role_policy" "authorizer-da-backend-credentials" {
  name   = "authorizer-da-backend-credentials"
  role   = aws_iam_role.iam-api-authorizer-backend-invoke.id
  policy = module.lambda-policy-document.invoke-function
}

Root consumer

terraform {
  backend "s3" {
    region = "eu-west-1"
    key    = "data-acquisition/da-backend.tfstate"
  }
}

provider "aws" {
  region = "eu-west-1"
}

module "da-external" {
  source = "../external"
}

module "da-authorizer" {
  source = "../authorizer"
}

Crash Output

Full crash output:
https://gist.github.com/Tirke/93c82c9a3ba02eb85196861c9119462a

Relevant bits:

2019-09-16T18:46:17.498+0200 [DEBUG] plugin.terraform-provider-aws_v2.28.1_x4: panic: interface conversion: interface {} is []interface {}, not string
2019-09-16T18:46:17.498+0200 [DEBUG] plugin.terraform-provider-aws_v2.28.1_x4: 
2019-09-16T18:46:17.498+0200 [DEBUG] plugin.terraform-provider-aws_v2.28.1_x4: goroutine 739 [running]:
2019-09-16T18:46:17.498+0200 [DEBUG] plugin.terraform-provider-aws_v2.28.1_x4: github.com/terraform-providers/terraform-provider-aws/vendor/github.com/jen20/awspolicyequivalence.newAWSStringSet(0x4b56420, 0xc0008ff840, 0xc000999630, 0x1, 0x1)
....
2019/09/16 18:46:17 [ERROR] module.da-devices: eval: *terraform.EvalReadState, err: rpc error: code = Unavailable desc = transport is closing
2019/09/16 18:46:17 [ERROR] module.da-authorizer: eval: *terraform.EvalDiff, err: rpc error: code = Unavailable desc = transport is closing
....

Context and expected behavior, fix ...

We are currently migrating our big project from Terraform 0.11 to Terraform 0.12.
The first few layers went quite well, but we hit a crash during the first 0.12 plan on our two biggest layers. I've made a minimal reproduction on our biggest tfstate (5 MB, mainly because it contains the Swagger definition for our API Gateway).
After discussing some problems we have with the refresh phase on this tfstate (see hashicorp/terraform#22617), we decided to rewrite all 900 of our data.aws_iam_policy_document data sources as simpler HCL objects passed to jsonencode. All our modules that previously contained data.aws_iam_policy_document now expose JSON-encoded outputs, as in the first snippet I provided.

This refactoring seems to be the exact source of our problem. Making a plan from an empty local tfstate works. Making the plan against our existing tfstate doesn't. From the crash log, I strongly suspect that something is going wrong between our old data.aws_iam_policy_document and the new JSON-encoded modules.

I've already found what seems to be a fix, but switching to it would cost us a lot of inefficient rewriting across many modules. The fix is quite simple: I just need to rewrite the second module with an inline policy:

Module consuming the policy module

resource "aws_iam_role" "iam-api-authorizer-backend-invoke" {
  name                  = "iam-api-authorizer-backend-invoke"
  force_detach_policies = true

  path               = "/"
  assume_role_policy = jsonencode({
    Version   = "2012-10-17"
    Statement = [
      {
        Effect    = "Allow"
        Action    = "sts:AssumeRole"
        Principal = {
          Service = "apigateway.amazonaws.com"
        }
      },
    ]
  })
}

resource "aws_iam_role_policy" "authorizer-da-backend-credentials" {
  name   = "authorizer-da-backend-credentials"
  role   = aws_iam_role.iam-api-authorizer-backend-invoke.id
  policy = jsonencode({
    Version   = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = [
          "lambda:InvokeFunction",
        ]
        Resource = [
          var.RESOURCES
        ],
      }
    ]
  })
}

I also suspect that some surgery on our tfstate could fix things, for example running terraform state rm on everything related to our old modules containing data.aws_iam_policy_document. I have not tested that yet because I'm hoping there is a fix for this issue.

@ghost ghost added service/iam Issues and PRs that pertain to the iam service. bug Addresses a defect in current functionality. crash Results from or addresses a Terraform crash or kernel panic. labels Sep 17, 2019
@github-actions github-actions bot added the needs-triage Waiting for first response or review from a maintainer. label Sep 17, 2019
@ewbankkit
Contributor

@Tirke I think your problem may be the nested list in the invoke-function output of the module.
You have

        Resource = [
          var.RESOURCES
        ],

and

RESOURCES = ["*"]

when using the module which will result in

        Resource = [
          ["*"]
        ],
        ],

or equivalent

  "Resource": [
    ["*"]
  ]

There is a problem at https://github.com/jen20/awspolicyequivalence/blob/9ebbf3c225b2b9da629263e13c3015a5de7965d1/aws_policy_equivalence.go#L386 in a dependency that causes the panic on terraform plan if the Resource attribute isn't a string or an array of strings (you have an array of arrays of strings).

Could you please try changing the module output to:

        Resource = var.RESOURCES

Thanks.
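
The nesting described above can be reproduced outside Terraform. This is a minimal Python sketch, assuming json.dumps stands in for Terraform's jsonencode and the resources variable stands in for var.RESOURCES. The policy helper is hypothetical and exists only for this illustration; it is not part of Terraform or the AWS provider.

```python
import json


def policy(resource_value):
    """Build a policy document shaped like the module output (illustrative only)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["lambda:InvokeFunction"],
                "Resource": resource_value,
            }
        ],
    }


resources = ["*"]  # stands in for var.RESOURCES, which is already a list

# Wrapping a list in brackets produces a list of lists.
nested = policy([resources])
# Passing the list through unchanged keeps it a flat list of strings.
flat = policy(resources)

print(json.dumps(nested["Statement"][0]["Resource"]))  # [["*"]]
print(json.dumps(flat["Statement"][0]["Resource"]))    # ["*"]
```

The second form corresponds to the suggested Resource = var.RESOURCES.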

@ewbankkit
Contributor

Submitted upstream PR jen20/awspolicyequivalence#9 to fix the panic, but this won't fix the problem with nested arrays, which will probably result in a runtime "MalformedPolicyDocument: Syntax errors in policy." error.
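
To illustrate the difference between crashing on an unexpected shape and rejecting it cleanly, here is a small Python model of the idea. It is not the Go code in awspolicyequivalence and not the patch in the upstream PR. The to_string_set helper is a hypothetical name that accepts what the earlier comment describes as valid (a string or a list of strings) and raises an ordinary error for anything else.

```python
def to_string_set(value):
    """Normalize a policy field to a set of strings.

    Accepts a single string or a list of strings. Raises ValueError for
    any other shape, such as a list that contains another list.
    """
    if isinstance(value, str):
        return {value}
    if isinstance(value, list) and all(isinstance(v, str) for v in value):
        return set(value)
    raise ValueError(f"expected a string or a list of strings, got {value!r}")


print(to_string_set("*"))    # {'*'}
print(to_string_set(["*"]))  # {'*'}

try:
    to_string_set([["*"]])  # the nested shape from this issue
except ValueError as err:
    print("rejected:", err)
```

Even with a check like this, the nested policy itself stays invalid, so the configuration still needs the flat Resource value.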

@Tirke

Tirke commented Oct 19, 2019

Thanks for the PR fixing the panic @ewbankkit. I did find the fix the day after submitting the issue and answered in my original issue. But with your fix, things should be easier for future people encountering that.

@ewbankkit
Contributor

@Tirke Yes, I should have read the original issue's comment: hashicorp/terraform#22813 (comment).
Could you please close this issue?
Thanks.

@Tirke

Tirke commented Oct 19, 2019

I'm not sure I can as it was opened by hashibot.

@jen20
Contributor

jen20 commented Dec 3, 2019

This should be fixed by upgrading to v1.1.0 of awspolicyequivalence.

@bflad bflad removed the needs-triage Waiting for first response or review from a maintainer. label Dec 5, 2019
@bflad bflad added this to the v2.42.0 milestone Dec 5, 2019
bflad pushed a commit that referenced this issue Dec 5, 2019
This brings in the fix from @ewbankkit for a panic in the case of an
incorrect type assertion against nil.

Fixes #10135.
Fixes #10528.
@bflad
Contributor

bflad commented Dec 5, 2019

The fix for this has been merged and will release with version 2.42.0 of the Terraform AWS Provider, likely next week. Thanks @ewbankkit and @jen20 for the implementation. 👍

@ghost
Author

ghost commented Dec 13, 2019

This has been released in version 2.42.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!

@ghost
Author

ghost commented Mar 28, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 28, 2020