[Bug]: r/aws_db_instance_automated_backups_replication: unexpected state 'Pending' #32597

Closed
ixl-cchew opened this issue Jul 19, 2023 · 25 comments · Fixed by #31600
Labels
bug Addresses a defect in current functionality. service/rds Issues and PRs that pertain to the rds service.
Milestone

Comments

@ixl-cchew

ixl-cchew commented Jul 19, 2023

Terraform Core Version

1.5.3

AWS Provider Version

5.8.0

Affected Resource(s)

aws_db_instance_automated_backups_replication

Expected Behavior

Creating the db_instance_automated_backups_replication resource should complete without any error

Actual Behavior

After creating a primary instance from this module, creating the aws_db_instance_automated_backups_replication resource errors out.

Relevant Error/Panic Output Snippet

Error: waiting for DB instance automated backup (arn:aws:rds:us-west-1:<accountnumber>:auto-backup:ab-<string>) create: unexpected state 'Pending', wanted target 'replicating'. last error: %!s(<nil>)

Terraform Configuration Files

provider "aws" {
  profile = "default"
  region  = "us-west-2"

  assume_role {
    role_arn     = "<arn of assumed role>"
    session_name = "TerraformAdminSession"
  }
}

provider "aws" {
  alias   = "cross_replica"
  profile = "default"
  region  = "us-west-1"

  assume_role {
    role_arn     = "<arn of assumed role>"
    session_name = "TerraformAdminSession"
  }
}

module "master" {
  source  = "terraform-aws-modules/rds/aws"
  version = "6.1.0"

  identifier = "test-master"

  engine         = "postgres"
  engine_version = "14.6"
  family         = "postgres14"
  instance_class = "db.m5.xlarge"

  allocated_storage     = "2000"
  max_allocated_storage = 3000
  db_name               = "replicaPostgresql"
  username              = "replica_postgresql"
  port                  = "5432"

  multi_az = false

  maintenance_window              = "Mon:00:00-Mon:03:00"
  backup_window                   = "03:00-06:00"
  enabled_cloudwatch_logs_exports = ["postgresql", "upgrade"]

  backup_retention_period = 1
  skip_final_snapshot     = true
  deletion_protection     = false
}

resource "aws_kms_key" "default" {
  description = "Encryption key for automated backups"

  provider = aws.cross_replica
}

resource "aws_db_instance_automated_backups_replication" "cross_region_replication" {
  depends_on             = [module.master]
  source_db_instance_arn = module.master.db_instance_arn
  kms_key_id             = aws_kms_key.default.arn

  provider = aws.cross_replica
}

Steps to Reproduce

  1. Make sure ~/.aws/credentials has a [default] profile whose access key is allowed to assume the role in the target account
  2. Run terraform init, then terraform apply

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

None

@ixl-cchew ixl-cchew added bug Addresses a defect in current functionality. needs-triage Waiting for first response or review from a maintainer. labels Jul 19, 2023
@github-actions

Community Note

Voting for Prioritization

  • Please vote on this issue by adding a 👍 reaction to the original post to help the community and maintainers prioritize this request.
  • Please see our prioritization guide for information on how we prioritize.
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.

Volunteering to Work on This Issue

  • If you are interested in working on this issue, please leave a comment.
  • If this would be your first contribution, please review the contribution guide.

@github-actions github-actions bot added service/kms Issues and PRs that pertain to the kms service. service/rds Issues and PRs that pertain to the rds service. labels Jul 19, 2023
@ixl-cchew
Author

Just want to note that the resource does get created: it starts in the Pending state and then switches over to the Replicating state. However, the error message is concerning to whoever is running Terraform.

@ixl-cchew
Author

There does seem to be a real problem, probably because the Terraform run did not complete: the next time I run apply, it says the resource is tainted and must be replaced. When it destroys the resource and recreates it, I get an error stating

│ Error: starting RDS instance automated backups replication: InvalidDBInstanceState: DB instance is already replicating backups to this region.
│       status code: 400, request id: 2670dc0e-e499-4cac-bd17-40c53c5daa06

but the replication has already been deleted.

@taylorbartley

taylorbartley commented Jul 19, 2023

I'm having the exact same issue, with a slightly different setup. My DB already exists in us-east-1, and I'm adding backup replication to us-west-1.

Terraform v1.5.3
on linux_amd64

  • provider registry.terraform.io/hashicorp/aws v5.8.0
    I've also tested with versions 4.9.0 and 4.23.0.

resource "aws_db_instance_automated_backups_replication" "dr_default" {
  source_db_instance_arn = var.source_db_instance_arn
  retention_period       = 1
  kms_key_id             = "arn:aws:kms:us-west-1:01234567890:key/abcd123456"
}

When I apply, it shows only the replication to be created.

  # module.replication.aws_db_instance_automated_backups_replication.dr_default will be created
  + resource "aws_db_instance_automated_backups_replication" "dr_default" {

But then I get the same error output and the resource is marked as tainted, even though the replication eventually succeeds.

I can untaint the replication resource, and everything seems fine.

A similar sequence occurs on import followed by update.
As a sanity check, I set up cross-region snapshot replication manually in the console and was then able to import it successfully (with key arn:aws:kms:us-west-1:01234567890:key/abcd123456). After that, apply showed no changes, so everything was fine.
However, if I then change the retention, which forces a recreate, the apply fails after destroying the resource and attempting to create it again.

 unexpected state 'Pending', wanted target 'replicating'.

Update to add one more behavior. After an untaint, if I remove the aws_db_instance_automated_backups_replication from configuration and apply, sometimes, but not every time, I'll have this error:

aws_db_instance_automated_backups_replication.dr_default: Destroying... [id=arn:aws:rds:us-west-1:01234567890:auto-backup:ab-abcd123456]
╷
│ Error: error stopping RDS instance automated backups replication (arn:aws:rds:us-west-1:01234567890:auto-backup:ab-abcd123456): 
  InvalidDBInstanceState: DB Instance arn:aws:rds:us-east-1:01234567890:db:drtest is not replicating to the current region.
│       status code: 400, request id: 988bf200-7e95-414c-9fea-da4f9d1c6de8

@taylorbartley

CloudTrail shows that the base response definitely contains "status": "Pending".

The example in the CLI docs references 'pending', even though the API reference defines the status values as active, retained, and creating.

https://awscli.amazonaws.com/v2/documentation/api/latest/reference/rds/start-db-instance-automated-backups-replication.html
https://docs.aws.amazon.com/AmazonRDS/latest/APIReference/API_DBInstanceAutomatedBackup.html

@al-nwokedike-imprivata

I am also experiencing a similar issue with the following error:
Error: waiting for DB instance automated backup (arn:aws:rds:??:??:auto-backup:ab-??) create: unexpected state 'Pending', wanted target 'replicating'. last error: %!s(<nil>)

@ixl-cchew
Author

ixl-cchew commented Jul 20, 2023

Thank you two for sharing your experience!

It looks like there needs to be a change over here? To not assert for the 'replicating' status?

@ZakariaHili

I'm facing the same issue with 5.6.2.

@al-nwokedike-imprivata

I was able to resolve temporarily by removing the state and importing it back. Since the resource is already replicating, plan shows no changes afterwards

@ixl-cchew
Author

ixl-cchew commented Jul 21, 2023

I was able to resolve temporarily by removing the state and importing it back. Since the resource is already replicating, plan shows no changes afterwards

So you did a terraform state rm [module].[resourcename] and then just apply it again?

@taylorbartley

I was able to resolve temporarily by removing the state and importing it back. Since the resource is already replicating, plan shows no changes afterwards

So you did a terraform state rm [module].[resourcename] and then just apply it again?

I got a similar result using untaint. Results can be mixed after that if you apply another change too quickly, before the DB has actually finished modifying.

@al-nwokedike-imprivata

I was able to resolve temporarily by removing the state and importing it back. Since the resource is already replicating, plan shows no changes afterwards

So you did a terraform state rm [module].[resourcename] and then just apply it again?

terraform state rm [module].[resourcename], then terraform import [module].[resourcename] [resource id]
You can get the id from the console since the resource has already been created

@vladelleus

I am facing the same issue with cross-region automated backups. The apply fails, but it still enables the cross-region feature on the target automated backup and starts creating replica snapshots in the other region.

@ixl-cchew
Author

I was able to resolve temporarily by removing the state and importing it back. Since the resource is already replicating, plan shows no changes afterwards

So you did a terraform state rm [module].[resourcename] and then just apply it again?

terraform state rm [module].[resourcename], then terraform import [module].[resourcename] [resource id] You can get the id from the console since the resource has already been created

This worked! Thank you! And also just for clarity sake, it seems like the [resource id] is simply the ARN of the backup.
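
For readers on Terraform 1.5 or later, the import half of this workaround can also be written as an import block instead of the terraform import command. This is only a sketch and was not posted in the thread: it assumes the tainted entry has already been removed with terraform state rm, it reuses the resource address from the configuration in the original report, and the ARN is the placeholder from the error message, to be replaced with the real automated backup ARN from the console.

```hcl
# Sketch only: the address and ARN below are placeholders taken from the
# original report, not verified values.
import {
  # Address of the resource block that should adopt the existing replication.
  to = aws_db_instance_automated_backups_replication.cross_region_replication

  # The import ID is the ARN of the replicated automated backup.
  id = "arn:aws:rds:us-west-1:<accountnumber>:auto-backup:ab-<string>"
}
```

If the import succeeds and the replication has already reached the replicating state, the next plan should show no changes for this resource, as described above. The import block can be deleted from the configuration afterwards.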

@vladelleus

vladelleus commented Jul 25, 2023

This worked! Thank you! And also just for clarity sake, it seems like the [resource id] is simply the ARN of the backup.

I also think it would be better for this resource to contain information about the snapshots it created, not simply the id of the automated backups :(

@ixl-cchew
Author

Although there is a workaround, we would still want a fix for this, please.

@justinretzolk justinretzolk removed service/kms Issues and PRs that pertain to the kms service. needs-triage Waiting for first response or review from a maintainer. labels Jul 26, 2023
@lilia-a-b

I'm also facing the same error and am looking forward to a fix for this issue as soon as possible.

@ewbankkit ewbankkit changed the title [Bug]: [Bug]: r/aws_db_instance_automated_backups_replication: unexpected state 'Pending' Jul 31, 2023
@github-actions github-actions bot added the service/kms Issues and PRs that pertain to the kms service. label Jul 31, 2023
@ewbankkit
Contributor

It looks like the AWS RDS API status codes change capitalization without any notification 😢.
We can reproduce this in our CI testing.

@taylorbartley

It looks like the AWS RDS API status codes change capitalization without any notification 😢. We can reproduce this in our CI testing.

That was my guess, since there were no issues and then suddenly several in one day. Very nice of them to do that, if so.

I just don't know enough Go to try to provide a change myself.

Thank you two for sharing your experience!

It looks like there needs to be a change over here? To not assert for the 'replicating' status?

But I do believe this is the correct location in code

@ewbankkit
Contributor

I am working on a fix...

@ewbankkit ewbankkit removed the service/kms Issues and PRs that pertain to the kms service. label Jul 31, 2023
@github-actions github-actions bot added this to the v5.11.0 milestone Jul 31, 2023
@sportymsk

@ewbankkit - will this fix be propagated to 4.x versions?

@toadjaune

I was able to resolve temporarily by removing the state and importing it back. Since the resource is already replicating, plan shows no changes afterwards

So you did a terraform state rm [module].[resourcename] and then just apply it again?

terraform state rm [module].[resourcename], then terraform import [module].[resourcename] [resource id] You can get the id from the console since the resource has already been created

The following also works as a workaround:

terraform untaint aws_db_instance_automated_backups_replication.your_resource_name

@lilia-a-b

How can I get this update?

@github-actions

github-actions bot commented Aug 3, 2023

This functionality has been released in v5.11.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.
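
As a rough illustration of picking up the release, a configuration could require a provider version at or above v5.11.0. The constraint below is an assumption about how a given project pins its providers; adjust the operator and bounds to match your own versioning policy.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Assumption: accept any release from the one containing the fix onward.
      version = ">= 5.11.0"
    }
  }
}
```

After changing the constraint, running terraform init -upgrade lets Terraform select a newer provider version than the one recorded in the dependency lock file.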

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@github-actions

github-actions bot commented Sep 3, 2023

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 3, 2023

10 participants