enos: poweroff and terminate instances when shutting them down (#28316)
Previously our `shutdown_nodes` modules would halt the machine. While
this is useful for simulating a failure, it makes cleaning up the halted
machines very slow in AWS.

Instead, we now power off the machines and utilize EC2's
instance-initiated shutdown behavior to immediately terminate the
instances.

I've tested both scenarios locally with the change and both still work
as expected. I also timed before and after, and this change saves 5
MINUTES in total runtime (~40%) for the PR replication scenario. I assume
it yields similar results for autopilot.

Signed-off-by: Ryan Cragun <[email protected]>
ryancragun authored Sep 9, 2024
1 parent 899c18b commit 0764d7d
Showing 3 changed files with 11 additions and 8 deletions.
2 changes: 1 addition & 1 deletion enos/modules/shutdown_multiple_nodes/main.tf
@@ -19,7 +19,7 @@ variable "old_hosts" {

 resource "enos_remote_exec" "shutdown_multiple_nodes" {
   for_each = var.old_hosts
-  inline = ["sudo shutdown -H --no-wall; exit 0"]
+  inline = ["sudo shutdown -P --no-wall; exit 0"]

   transport = {
     ssh = {
2 changes: 1 addition & 1 deletion enos/modules/shutdown_node/main.tf
@@ -19,7 +19,7 @@ variable "host" {
 }

 resource "enos_remote_exec" "shutdown_node" {
-  inline = ["sudo shutdown -H --no-wall; exit 0"]
+  inline = ["sudo shutdown -P --no-wall; exit 0"]

   transport = {
     ssh = {
15 changes: 9 additions & 6 deletions enos/modules/target_ec2_instances/main.tf
@@ -186,12 +186,15 @@ resource "aws_security_group" "target" {
 resource "aws_instance" "targets" {
   for_each = local.instances

-  ami = var.ami_id
-  iam_instance_profile = aws_iam_instance_profile.target.name
-  instance_type = local.instance_type
-  key_name = var.ssh_keypair
-  subnet_id = data.aws_subnets.vpc.ids[tonumber(each.key) % length(data.aws_subnets.vpc.ids)]
-  vpc_security_group_ids = [aws_security_group.target.id]
+  ami = var.ami_id
+  iam_instance_profile = aws_iam_instance_profile.target.name
+  // Some scenarios (autopilot, pr_replication) shutdown instances to simulate failure. In those
+  // cases we should terminate the instance entirely rather than get stuck in stopped limbo.
+  instance_initiated_shutdown_behavior = "terminate"
+  instance_type = local.instance_type
+  key_name = var.ssh_keypair
+  subnet_id = data.aws_subnets.vpc.ids[tonumber(each.key) % length(data.aws_subnets.vpc.ids)]
+  vpc_security_group_ids = [aws_security_group.target.id]

   tags = merge(
     var.common_tags,
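
The two halves of this change only work together: EC2 applies an instance's shutdown behavior when the guest OS actually powers off (`shutdown -P`), whereas a halted guest (`shutdown -H`) stays in the running state as far as EC2 is concerned. A minimal sketch of the instance side follows. The resource name `example`, the variable `ami_id`, and the instance type are placeholders for illustration, not the module's real values.

```hcl
# Hypothetical, trimmed-down illustration; not the contents of
# enos/modules/target_ec2_instances/main.tf.
variable "ami_id" {
  type        = string
  description = "AMI to launch (assumed input for this sketch)"
}

resource "aws_instance" "example" {
  ami           = var.ami_id
  instance_type = "t3.micro" # assumed instance type

  # Valid values are "stop" (the EC2 default) and "terminate".
  # With "terminate", a poweroff issued from inside the guest, such as
  # `sudo shutdown -P --no-wall`, terminates the instance instead of
  # leaving it in the stopped state.
  instance_initiated_shutdown_behavior = "terminate"
}
```

Note that with this setting any OS-level poweroff destroys the instance, so it is presumably only appropriate for disposable test targets like the ones these scenarios create.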
