Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aws_appautoscaling_policy resource produced new value for was present but now absent #10549

Closed
camlow325 opened this issue Oct 18, 2019 · 9 comments · Fixed by #11222
Closed
Labels
bug Addresses a defect in current functionality.
Milestone

Comments

@camlow325
Copy link
Contributor

camlow325 commented Oct 18, 2019

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

$ terraform -v
Terraform v0.12.3
+ provider.aws v2.26.0
+ provider.null v2.1.2
+ provider.random v2.2.0

Affected Resource(s)

  • aws_appautoscaling_policy

Terraform Configuration Files

locals {
  table_name = "test"

  hash_name = "h1"
  hash_type = "S"

  range_name = "r1"
  range_type = "S"

  rcu_min = 1
  rcu_max = 200

  wcu_min = 1
  wcu_max = 200

  scale_percentage = 75
  scale_cooldown = 60
}

resource "aws_dynamodb_table" "this" {
  name           = local.table_name
  hash_key       = local.hash_name
  range_key      = local.range_name
  read_capacity  = local.rcu_min
  write_capacity = local.wcu_min

  attribute {
    name = local.hash_name
    type = local.hash_type
  }

  attribute {
    name = local.range_name
    type = local.range_type
  }

  ttl {
    attribute_name = "ttl"
    enabled        = true
  }

  // Ignore changes to capacity variables that might be done by autoscaling.
  lifecycle {
    ignore_changes = [read_capacity, write_capacity]
  }
}

data "aws_iam_policy_document" "dynamodb_read" {
  statement {
    actions = [
      "dynamodb:BatchGetItem",
      "dynamodb:GetItem",
      "dynamodb:Query",
      "dynamodb:Scan",
    ]

    resources = [
      aws_dynamodb_table.this.arn,
    ]
    effect = "Allow"
  }
}

resource "aws_iam_policy" "dynamodb_read" {
  name   = "${aws_dynamodb_table.this.name}-dynamodb-read"
  policy = data.aws_iam_policy_document.dynamodb_read.json
}

data "aws_iam_policy_document" "dynamodb_write" {
  statement {
    actions = [
      "dynamodb:BatchWriteItem",
      "dynamodb:DeleteItem",
      "dynamodb:PutItem",
      "dynamodb:UpdateItem",
    ]

    resources = [
      aws_dynamodb_table.this.arn,
    ]
    effect = "Allow"
  }
}

resource "aws_iam_policy" "dynamodb_write" {
  name   = "${aws_dynamodb_table.this.name}-dynamodb-write"
  policy = data.aws_iam_policy_document.dynamodb_write.json
}

resource "aws_appautoscaling_target" "dynamodb_read" {
  max_capacity       = local.rcu_max
  min_capacity       = local.rcu_min
  resource_id        = "table/${aws_dynamodb_table.this.name}"
  scalable_dimension = "dynamodb:table:ReadCapacityUnits"
  service_namespace  = "dynamodb"

  role_arn = aws_iam_role.dynamodb_scaling.arn

  // AWS may attach its own service-linked role to the resource in place of the
  // scaling one defined above. Use `ignore_changes` to avoid reporting that
  // a change would be reapplied on each run.
  lifecycle {
    ignore_changes = [role_arn]
  }
}

resource "aws_appautoscaling_policy" "dynamodb_read" {
  name               = "${aws_dynamodb_table.this.name}-dynamodb-rcw"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.dynamodb_read.resource_id
  scalable_dimension = aws_appautoscaling_target.dynamodb_read.scalable_dimension
  service_namespace  = aws_appautoscaling_target.dynamodb_read.service_namespace

  target_tracking_scaling_policy_configuration {
    target_value       = local.scale_percentage
    scale_in_cooldown  = local.scale_cooldown
    scale_out_cooldown = local.scale_cooldown

    predefined_metric_specification {
      predefined_metric_type = "DynamoDBReadCapacityUtilization"
    }
  }
}

resource "aws_appautoscaling_target" "dynamodb_write" {
  max_capacity       = local.wcu_max
  min_capacity       = local.wcu_min
  resource_id        = "table/${aws_dynamodb_table.this.name}"
  scalable_dimension = "dynamodb:table:WriteCapacityUnits"
  service_namespace  = "dynamodb"

  role_arn = aws_iam_role.dynamodb_scaling.arn

  // AWS may attach its own service-linked role to the resource in place of the
  // scaling one defined above. Use `ignore_changes` to avoid reporting that
  // a change would be reapplied on each run.
  lifecycle {
    ignore_changes = [role_arn]
  }
}

resource "aws_appautoscaling_policy" "dynamodb_write" {
  name               = "${aws_dynamodb_table.this.name}-dynamodb-wcw"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.dynamodb_write.resource_id
  scalable_dimension = aws_appautoscaling_target.dynamodb_write.scalable_dimension
  service_namespace  = aws_appautoscaling_target.dynamodb_write.service_namespace

  target_tracking_scaling_policy_configuration {
    target_value       = local.scale_percentage
    scale_in_cooldown  = local.scale_cooldown
    scale_out_cooldown = local.scale_cooldown

    predefined_metric_specification {
      predefined_metric_type = "DynamoDBWriteCapacityUtilization"
    }
  }
}

resource "aws_iam_role" "dynamodb_scaling" {
  name               = "${aws_dynamodb_table.this.name}-dynamodb-scaling"
  path               = "/"
  assume_role_policy = data.aws_iam_policy_document.dynamodb_scaling_assume_role.json
}

data "aws_iam_policy_document" "dynamodb_scaling_assume_role" {
  statement {
    actions = [
      "sts:AssumeRole",
    ]

    principals {
      type = "Service"
      identifiers = [
        "application-autoscaling.amazonaws.com",
      ]
    }

    effect = "Allow"
  }
}

resource "aws_iam_role_policy" "dynamodb_scaling" {
  name   = aws_dynamodb_table.this.name
  role   = aws_iam_role.dynamodb_scaling.id
  policy = data.aws_iam_policy_document.dynamodb_scaling.json
}

data "aws_iam_policy_document" "dynamodb_scaling" {
  statement {
    actions = [
      "dynamodb:DescribeTable",
      "dynamodb:UpdateTable",
      "cloudwatch:PutMetricAlarm",
      "cloudwatch:DescribeAlarms",
      "cloudwatch:GetMetricStatistics",
      "cloudwatch:SetAlarmState",
      "cloudwatch:DeleteAlarms",
    ]

    resources = [
      "*",
    ]
    effect = "Allow"
  }
}

Debug Output

Unfortunately, we've only seen this problem 2 times (in automated CI runs) across hundreds of executions and didn't have debug on at the time. We also haven't been able to reproduce this on-demand yet and so don't have a way to get this output currently.

Here's a subset of the normal apply output that we saw:

aws_dynamodb_table.this: Creation complete after 7s [id=test]
data.aws_iam_policy_document.dynamodb_write: Refreshing state...
data.aws_iam_policy_document.dynamodb_read: Refreshing state...
aws_iam_role.dynamodb_scaling: Creating...
aws_iam_policy.dynamodb_read: Creating...
aws_iam_policy.dynamodb_write: Creating...
aws_iam_role.dynamodb_scaling: Creation complete after 1s [id=test-dynamodb-scaling]
aws_appautoscaling_target.dynamodb_read: Creating...
aws_appautoscaling_target.dynamodb_write: Creating...
aws_iam_role_policy.dynamodb_scaling: Creating...
aws_appautoscaling_target.dynamodb_write: Creation complete after 0s [id=table/test]
aws_appautoscaling_policy.dynamodb_write: Creating...
aws_appautoscaling_target.dynamodb_read: Creation complete after 0s [id=table/test]
aws_appautoscaling_policy.dynamodb_read: Creating...
aws_iam_policy.dynamodb_read: Creation complete after 1s [id=arn:aws:iam::XXXX:policy/test-dynamodb-read]
aws_iam_policy.dynamodb_write: Creation complete after 1s [id=arn:aws:iam::XXXX:policy/test-dynamodb-write]
aws_iam_role_policy.dynamodb_scaling: Creation complete after 1s [id=test-dynamodb-scaling:test]
aws_appautoscaling_policy.dynamodb_write: Creation complete after 1s [id=test-dynamodb-wcw]

...

Error: Provider produced inconsistent result after apply
 
When applying changes to aws_appautoscaling_policy.dynamodb_read,
provider "aws" produced an unexpected new value for was present, but now
absent.
 
This is a bug in the provider, which should be reported in the provider's own
issue tracker.

Panic Output

No panic.

Expected Behavior

Both of the aws_appautoscaling_policy resources in the config should be applied successfully.

Actual Behavior

On a very infrequent basis - only 2 times across hundreds of runs for us so far - the apply results in an error for either the aws_appautoscaling_policy.dynamodb_read or aws_appautoscaling_policy.dynamodb_write resource. Even when the error occurs, the policy resource is actually created properly in the AWS account. If another apply is attempted after the failure, the apply is successful.

Steps to Reproduce

  1. terraform apply
  2. If the apply fails, run aws application-autoscaling describe-scaling-policies --service-namespace dynamodb --policy-names <policy_name> and confirm that the policy is present.
  3. terraform apply again. Observe that the second apply is successful.

Important Factoids

As mentioned above, the problem is highly intermittent. We're running on a bit older version of Terraform and the AWS provider at present in CI but hope to update soon and will post back here if we continue to encounter the issue.

All of the config around the DynamoDB table is listed above just in case that helps provide more context for what might be necessary to reproduce the problem - sorry for the size. In addition to updating to newer versions of Terraform and the AWS provider, we're going to attempt to make the following config changes to see if these help eliminate the problem:

  1. Remove the custom IAM role attached to the aws_appautoscaling_target resources. We only have the ignore_changes block for role_arn argument because AWS appears to apply its own role to the resources anyway, as I believe is described in aws_appautoscaling_target role_arn is ignored or doesn't work #5023.
  2. Temporarily try to force a dependency relationship between the two aws_appautoscaling_policy resources - on the theory that having the resources applied in parallel might lead to the problem.
@ghost ghost added service/applicationautoscaling service/dynamodb Issues and PRs that pertain to the dynamodb service. service/iam Issues and PRs that pertain to the iam service. labels Oct 18, 2019
@camlow325 camlow325 changed the title app_autoscaling_policy resource produced new value for was present but now absent aws_autoscaling_policy resource produced new value for was present but now absent Oct 18, 2019
@github-actions github-actions bot added the needs-triage Waiting for first response or review from a maintainer. label Oct 18, 2019
@camlow325 camlow325 changed the title aws_autoscaling_policy resource produced new value for was present but now absent aws_appautoscaling_policy resource produced new value for was present but now absent Oct 20, 2019
@camlow325
Copy link
Contributor Author

Small update: we have since updated the Terraform AWS provider version to v2.33.0 - along with the other two changes above - removing the custom IAM role and forcing a dependency relationship between the two aws_appautoscaling_policy resources. Despite that, we have seen the problem again:

When applying changes to aws_appautoscaling_policy.dynamodb_write,
provider "aws" produced an unexpected new value for was present, but now
absent.

@camlow325
Copy link
Contributor Author

We were able to observe an instance of this failure with debug logging enabled during a terraform apply. Here are some snippets of that around the failure.

The PutScalingPolicy call to AWS appears to be successful:

2019-12-06T21:46:52.566Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 2019/12/06 21:46:52 [DEBUG] ApplicationAutoScaling PutScalingPolicy: {
2019-12-06T21:46:52.566Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:   PolicyName: "test-dynamodb-wcw",
2019-12-06T21:46:52.566Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:   PolicyType: "TargetTrackingScaling",
2019-12-06T21:46:52.566Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:   ResourceId: "table/test",
2019-12-06T21:46:52.566Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:   ScalableDimension: "dynamodb:table:WriteCapacityUnits",
2019-12-06T21:46:52.566Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:   ServiceNamespace: "dynamodb",
2019-12-06T21:46:52.566Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:   TargetTrackingScalingPolicyConfiguration: {
2019-12-06T21:46:52.566Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:     DisableScaleIn: false,
2019-12-06T21:46:52.566Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:     PredefinedMetricSpecification: {
2019-12-06T21:46:52.566Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:       PredefinedMetricType: "DynamoDBWriteCapacityUtilization"
2019-12-06T21:46:52.566Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:     },
2019-12-06T21:46:52.566Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:     ScaleInCooldown: 60,
2019-12-06T21:46:52.566Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:     ScaleOutCooldown: 60,
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:     TargetValue: 75
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:   }
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: }
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 2019/12/06 21:46:52 [DEBUG] Waiting for state to become: [success]
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 2019/12/06 21:46:52 [DEBUG] [aws-sdk-go] DEBUG: Request autoscaling/PutScalingPolicy Details:
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: ---[ REQUEST POST-SIGN ]-----------------------------
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: POST / HTTP/1.1
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Host: autoscaling.us-west-2.amazonaws.com
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: User-Agent: aws-sdk-go/1.25.36 (go1.13.3; linux; amd64) APN/1.0 HashiCorp/1.0 Terraform/0.12.3 (+https://www.terraform.io)
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Content-Length: 431
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Authorization: AWS4-HMAC-SHA256 Credential=XXX, SignedHeaders=content-length;content-type;host;x-amz-date;x-amz-target, Signature=XXX
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Content-Type: application/x-amz-json-1.1
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: X-Amz-Date: 20191206T214652Z
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: X-Amz-Target: AnyScaleFrontendService.PutScalingPolicy
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Accept-Encoding: gzip
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 
2019-12-06T21:46:52.567Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: {"PolicyName":"test-dynamodb-wcw","PolicyType":"TargetTrackingScaling","ResourceId":"table/test","ScalableDimension":"dynamodb:table:WriteCapacityUnits","ServiceNamespace":"dynamodb","TargetTrackingScalingPolicyConfiguration":{"DisableScaleIn":false,"PredefinedMetricSpecification":{"PredefinedMetricType":"DynamoDBWriteCapacityUtilization"},"ScaleInCooldown":60,"ScaleOutCooldown":60,"TargetValue":75}}
...
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 2019/12/06 21:46:52 [DEBUG] [aws-sdk-go] DEBUG: Response autoscaling/PutScalingPolicy Details:
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: ---[ RESPONSE ]--------------------------------------
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: HTTP/1.1 200 OK
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Connection: close
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Content-Length: 1262
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Content-Type: application/x-amz-json-1.1
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Date: Fri, 06 Dec 2019 21:46:52 GMT
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: X-Amzn-Requestid: XXX
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: -----------------------------------------------------
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 2019/12/06 21:46:52 [DEBUG] [aws-sdk-go] {"Alarms":[{"AlarmARN":"arn:aws:cloudwatch:us-west-2:XXX:alarm:TargetTracking-table/test-AlarmHigh-XXX","AlarmName":"TargetTracking-table/test-AlarmHigh-XXX"},{"AlarmARN":"arn:aws:cloudwatch:us-west-2:XXX:alarm:TargetTracking-table/test-AlarmLow-XXX","AlarmName":"TargetTracking-table/test-AlarmLow-XXX"},{"AlarmARN":"arn:aws:cloudwatch:us-west-2:XXX:alarm:TargetTracking-table/test-ProvisionedCapacityHigh-XXX","AlarmName":"TargetTracking-table/test-ProvisionedCapacityHigh-XXX"},{"AlarmARN":"arn:aws:cloudwatch:us-west-2:XXX:alarm:TargetTracking-table/test-ProvisionedCapacityLow-XXX","AlarmName":"TargetTracking-table/test-ProvisionedCapacityLow-XXX"}],"PolicyARN":"arn:aws:autoscaling:us-west-2:XXX:scalingPolicy:XXX:resource/dynamodb/table/test:policyName/test-dynamodb-wcw"}

A subsequent DescribeScalingPolicies call to AWS for the policy, however, returns a 200 OK response but with an empty list of policies, which appears to result in the apply failure:

2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 2019/12/06 21:46:52 [INFO] ApplicationAutoScaling scaling PolicyARN: arn:aws:autoscaling:us-west-2:XXX:scalingPolicy:0cbeff23-1214-4e31-8d34-5748427a5606:resource/dynamodb/table/test:policyName/test-dynamodb-wcw
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 2019/12/06 21:46:52 [DEBUG] Waiting for state to become: [success]
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 2019/12/06 21:46:52 [DEBUG] Application AutoScaling Policy Describe Params: {
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:   PolicyNames: ["test-dynamodb-wcw"],
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:   ResourceId: "table/test",
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:   ScalableDimension: "dynamodb:table:WriteCapacityUnits",
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4:   ServiceNamespace: "dynamodb"
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: }
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 2019/12/06 21:46:52 [DEBUG] [aws-sdk-go] DEBUG: Request autoscaling/DescribeScalingPolicies Details:
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: ---[ REQUEST POST-SIGN ]-----------------------------
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: POST / HTTP/1.1
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Host: autoscaling.us-west-2.amazonaws.com
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: User-Agent: aws-sdk-go/1.25.36 (go1.13.3; linux; amd64) APN/1.0 HashiCorp/1.0 Terraform/0.12.3 (+https://www.terraform.io)
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Content-Length: 177
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Authorization: AWS4-HMAC-SHA256 Credential=XXX, SignedHeaders=content-length;content-type;host;x-amz-date;x-amz-target, Signature=XXX
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Content-Type: application/x-amz-json-1.1
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: X-Amz-Date: 20191206T214652Z
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: X-Amz-Target: AnyScaleFrontendService.DescribeScalingPolicies
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Accept-Encoding: gzip
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: {"PolicyNames":["test-dynamodb-wcw"],"ResourceId":"table/test","ScalableDimension":"dynamodb:table:WriteCapacityUnits","ServiceNamespace":"dynamodb"}
2019-12-06T21:46:52.968Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: -----------------------------------------------------
2019-12-06T21:46:53.009Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 2019/12/06 21:46:53 [DEBUG] [aws-sdk-go] DEBUG: Response autoscaling/DescribeScalingPolicies Details:
2019-12-06T21:46:53.009Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: ---[ RESPONSE ]--------------------------------------
2019-12-06T21:46:53.009Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: HTTP/1.1 200 OK
2019-12-06T21:46:53.009Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Connection: close
2019-12-06T21:46:53.009Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Content-Length: 22
2019-12-06T21:46:53.009Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Content-Type: application/x-amz-json-1.1
2019-12-06T21:46:53.009Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Date: Fri, 06 Dec 2019 21:46:52 GMT
2019-12-06T21:46:53.009Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: X-Amzn-Requestid: XXX
2019-12-06T21:46:53.009Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 
2019-12-06T21:46:53.009Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 
2019-12-06T21:46:53.009Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: -----------------------------------------------------
2019-12-06T21:46:53.009Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 2019/12/06 21:46:53 [DEBUG] [aws-sdk-go] {"ScalingPolicies":[]}
2019-12-06T21:46:53.009Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 2019/12/06 21:46:53 [WARN] Application AutoScaling Policy (test-dynamodb-wcw) not found, removing from state
2019/12/06 21:46:53 [DEBUG] module.generic_data_pipeline.module.dynamodb_table.aws_appautoscaling_policy.dynamodb_write: apply errored, but we're indicating that via the Error pointer rather than returning it: Provider produced inconsistent result after apply: When applying changes to module.generic_data_pipeline.module.dynamodb_table.aws_appautoscaling_policy.dynamodb_write, provider "aws" produced an unexpected new value for was present, but now absent.

In other cases where the apply succeeds, the response to the DescribeScalingPolicies call includes a policy:

2019-12-07T03:35:46.852Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: ---[ REQUEST POST-SIGN ]-----------------------------
2019-12-07T03:35:46.852Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: POST / HTTP/1.1
2019-12-07T03:35:46.852Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Host: autoscaling.us-west-2.amazonaws.com
2019-12-07T03:35:46.852Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: User-Agent: aws-sdk-go/1.25.36 (go1.13.3; linux; amd64) APN/1.0 HashiCorp/1.0 Terraform/0.12.3 (+https://www.terraform.io)
2019-12-07T03:35:46.852Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Content-Length: 176
2019-12-07T03:35:46.852Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Authorization: AWS4-HMAC-SHA256 Credential=XXX, SignedHeaders=content-length;content-type;host;x-amz-date;x-amz-target, Signature=XXX
2019-12-07T03:35:46.852Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Content-Type: application/x-amz-json-1.1
2019-12-07T03:35:46.852Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: X-Amz-Date: 20191207T033546Z
2019-12-07T03:35:46.852Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: X-Amz-Target: AnyScaleFrontendService.DescribeScalingPolicies
2019-12-07T03:35:46.852Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Accept-Encoding: gzip
2019-12-07T03:35:46.852Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 
2019-12-07T03:35:46.852Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: {"PolicyNames":["test-dynamodb-rcw"],"ResourceId":"table/test","ScalableDimension":"dynamodb:table:ReadCapacityUnits","ServiceNamespace":"dynamodb"}
...
2019-12-07T03:35:46.994Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 2019/12/07 03:35:46 [DEBUG] [aws-sdk-go] DEBUG: Response autoscaling/DescribeScalingPolicies Details:
2019-12-07T03:35:46.994Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: ---[ RESPONSE ]--------------------------------------
2019-12-07T03:35:46.994Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: HTTP/1.1 200 OK
2019-12-07T03:35:46.994Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Connection: close
2019-12-07T03:35:46.994Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Content-Length: 1746
2019-12-07T03:35:46.994Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Content-Type: application/x-amz-json-1.1
2019-12-07T03:35:46.994Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: Date: Sat, 07 Dec 2019 03:35:46 GMT
2019-12-07T03:35:46.994Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: X-Amzn-Requestid: XXX
2019-12-07T03:35:46.994Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 
2019-12-07T03:35:46.994Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 
2019-12-07T03:35:46.994Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: -----------------------------------------------------
2019-12-07T03:35:46.994Z [DEBUG] plugin.terraform-provider-aws_v2.37.0_x4: 2019/12/07 03:35:46 [DEBUG] [aws-sdk-go] {"ScalingPolicies":[{"Alarms":[{"AlarmARN":"arn:aws:cloudwatch:us-west-2:XXX:alarm:TargetTracking-table/test-AlarmHigh-XXX","AlarmName":"TargetTracking-table/test-AlarmHigh-XXX"},{"AlarmARN":"arn:aws:cloudwatch:us-west-2:XXX:alarm:TargetTracking-table/test-AlarmLow-XXX","AlarmName":"TargetTracking-table/test-AlarmLow-XXX"},{"AlarmARN":"arn:aws:cloudwatch:us-west-2:XXX:alarm:TargetTracking-table/test-ProvisionedCapacityHigh-XXX","AlarmName":"TargetTracking-table/test-ProvisionedCapacityHigh-XXX"},{"AlarmARN":"arn:aws:cloudwatch:us-west-2:XXX:alarm:TargetTracking-table/test-ProvisionedCapacityLow-XXX","AlarmName":"TargetTracking-table/test-ProvisionedCapacityLow-XXX"}],"CreationTime":1.575689746374E9,"PolicyARN":"arn:aws:autoscaling:us-west-2:XXX:scalingPolicy:XXX:resource/dynamodb/table/test:policyName/test-dynamodb-rcw","PolicyName":"test-dynamodb-rcw","PolicyType":"TargetTrackingScaling","ResourceId":"table/test","ScalableDimension":"dynamodb:table:ReadCapacityUnits","ServiceNamespace":"dynamodb","TargetTrackingScalingPolicyConfiguration":{"DisableScaleIn":false,"PredefinedMetricSpecification":{"PredefinedMetricType":"DynamoDBReadCapacityUtilization"},"ScaleInCooldown":60,"ScaleOutCooldown":60,"TargetValue":75.0}}]}

This may suggest an eventual consistency issue on the AWS side. To compensate for this, maybe it would make sense for the DescribeScalingPolicies call to be repeated up to some number of times if the policy list in the response is empty?

@bflad bflad added bug Addresses a defect in current functionality. and removed needs-triage Waiting for first response or review from a maintainer. service/dynamodb Issues and PRs that pertain to the dynamodb service. service/iam Issues and PRs that pertain to the iam service. labels Dec 9, 2019
@bflad
Copy link
Contributor

bflad commented Dec 9, 2019

Hi @camlow325 👋 Thank you for the additional details, they will be super helpful for fixing this issue. As you are suggesting and looking at your log output, this is likely an eventual consistency issue within AWS.

The current flow of the aws_appautoscaling_policy resource is to call PutScalingPolicy during resource creation, then call the resource read function to ensure that Terraform created the resource as expected. The resource read function however does not account for this scenario of the API returning no policies (albeit temporarily) during resource creation and instead will dutifully remove the Terraform resource from the state triggering the Provider produced inconsistent result after apply error you see.

The quickest fix for this could be introducing additional retry logic in the resource read function to handle the special new resource scenario here:

https://github.com/terraform-providers/terraform-provider-aws/blob/2cd7beee14a0746d6e925da718c2121e96379da7/aws/resource_aws_appautoscaling_policy.go#L298-L320

e.g.

	err := resource.Retry(2*time.Minute, func() *resource.RetryError {
		var err error
		p, err = getAwsAppautoscalingPolicy(d, meta)
		if err != nil {
			if isAWSErr(err, applicationautoscaling.ErrCodeFailedResourceAccessException, "") {
				return resource.RetryableError(err)
			}
			return resource.NonRetryableError(err)
		}
		if d.IsNewResource() && p == nil {
			return resource.RetryableError(&resource.NotFoundError{})
		}
		return nil
	})

If you or someone else is willing to submit the above, the maintainers can review the potential fix. Thanks!

camlow325 added a commit to camlow325/terraform-provider-aws that referenced this issue Dec 10, 2019
References:
* hashicorp#10549

The AWS application-autoscaling service has eventual consistency
considerations. The `aws_appautoscaling_policy` resource immediately
tries to read a scaling policy after creation. If the policy is not
found, the application-autoscaling service returns a 200 OK with an
empty list of scaling policies. Since no scaling policy is present, the
`aws_appautoscaling_policy` resource removes the created resource from
state, leading to a "produced an unexpected new value for was present,
but now absent" error.

With the changes in this commit, the empty list of scaling policies in
the response for the newly created resource will result in a
NotFoundError being returned and a retry of the read request. A
subsequent retry should hopefully be successful, leading to the state
being preserved.

Output from acceptance testing:

```
> make testacc TEST=./aws TESTARGS='-run=TestAccAWSAppautoScalingPolicy_'
...
--- PASS: TestAccAWSAppautoScalingPolicy_multiplePoliciesSameName (34.27s)
--- PASS: TestAccAWSAppautoScalingPolicy_dynamoDb (36.59s)
--- PASS: TestAccAWSAppautoScalingPolicy_multiplePoliciesSameResource (37.73s)
--- PASS: TestAccAWSAppautoScalingPolicy_disappears (68.94s)
--- PASS: TestAccAWSAppautoScalingPolicy_basic (72.25s)
--- PASS: TestAccAWSAppautoScalingPolicy_scaleOutAndIn (74.31s)
--- PASS: TestAccAWSAppautoScalingPolicy_spotFleetRequest (77.85s)
--- PASS: TestAccAWSAppautoScalingPolicy_ResourceId_ForceNew (88.97s)
```
@camlow325
Copy link
Contributor Author

camlow325 commented Dec 10, 2019

@bflad Awesome - thanks much for taking a look at this. Your suggested code change makes a lot of sense to me. I went ahead and put up a PR with it.

bflad pushed a commit that referenced this issue Dec 20, 2019
…#11222)

References:
* #10549

The AWS application-autoscaling service has eventual consistency
considerations. The `aws_appautoscaling_policy` resource immediately
tries to read a scaling policy after creation. If the policy is not
found, the application-autoscaling service returns a 200 OK with an
empty list of scaling policies. Since no scaling policy is present, the
`aws_appautoscaling_policy` resource removes the created resource from
state, leading to a "produced an unexpected new value for was present,
but now absent" error.

With the changes in this commit, the empty list of scaling policies in
the response for the newly created resource will result in a
NotFoundError being returned and a retry of the read request. A
subsequent retry should hopefully be successful, leading to the state
being preserved.

Output from acceptance testing:

```
> make testacc TEST=./aws TESTARGS='-run=TestAccAWSAppautoScalingPolicy_'
...
--- PASS: TestAccAWSAppautoScalingPolicy_multiplePoliciesSameName (34.27s)
--- PASS: TestAccAWSAppautoScalingPolicy_dynamoDb (36.59s)
--- PASS: TestAccAWSAppautoScalingPolicy_multiplePoliciesSameResource (37.73s)
--- PASS: TestAccAWSAppautoScalingPolicy_disappears (68.94s)
--- PASS: TestAccAWSAppautoScalingPolicy_basic (72.25s)
--- PASS: TestAccAWSAppautoScalingPolicy_scaleOutAndIn (74.31s)
--- PASS: TestAccAWSAppautoScalingPolicy_spotFleetRequest (77.85s)
--- PASS: TestAccAWSAppautoScalingPolicy_ResourceId_ForceNew (88.97s)
```
@bflad bflad added this to the v2.44.0 milestone Dec 20, 2019
@bflad
Copy link
Contributor

bflad commented Dec 20, 2019

The fix for this has been merged and will release with version 2.44.0 of the Terraform AWS Provider, early next month. Thanks to @camlow325 for the implementation. 👍

@ghost
Copy link

ghost commented Jan 10, 2020

This has been released in version 2.44.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!

@Matty9191
Copy link

I'm still seeing this issue with the latest version of terraform and aws provider:

module.eks_node_group.aws_iam_role_policy_attachment.ClusterAutoScaler_polattach: Creating...

Error: Provider produced inconsistent result after apply

When applying changes to
module.eks_node_group.aws_iam_role_policy_attachment.ClusterAutoScaler_polattach,
provider "registry.terraform.io/-/aws" produced an unexpected new value for
was present, but now absent.

This is a bug in the provider, which should be reported in the provider's own
issue tracker.

$ terraform version
Terraform v0.12.19

  • provider.aws v2.46.0
  • provider.http v1.1.1
  • provider.local v1.4.0

I can reproduce it on demand and will be glad to provide additional debugging info

@camlow325
Copy link
Contributor Author

The fix for this which landed in the 2.44.0 provider release was only for the aws_appautoscaling_policy resource. We've since upgraded to using the newer AWS provider versions and have not seen any new occurrences of this issue.

It seems possible that the aws_iam_role_policy_attachment resource could be affected by the same problem. I don't see any retry logic for read failures in its resourceAwsIamRolePolicyAttachmentRead function. We've been encountering a similar issue for the aws_cloudwatch_log_stream resource as well and filed a separate issue, #11611, and potential PR to fix.

I'd suggest trying to reproduce the failure with TF_LOG enabled and see if the HTTP request pattern for the failure looks similar to the one I mentioned in this comment. I'd also suggest filing a separate issue for the aws_iam_role_policy_attachment failure and include any relevant debug log output.

@ghost
Copy link

ghost commented Jan 26, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Jan 26, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Addresses a defect in current functionality.
Projects
None yet
3 participants