Configurable local exec command for waiting until cluster is healthy #701
@maganuk @barryib @dpiddockcmp @RothAndrew, would you be kind enough to look at this and share some insight on how I've approached it? I am open to suggestions and discussions that drive toward a meaningful solution. I am fairly new to Terraform, so your reviews would be greatly appreciated.
Hey, I don't know what's changed in v8, but v7 used to work just fine without waiting for the endpoint to be active. Maybe that's because I was doing my own AWS auth, and there is an option to skip AWS auth. For such cases, can we make this check optional? You could add a count and a variable like Wait_For_Endpoint_Ready, which can default to true. Your approach seems fine. Something similar was done in one of the previous versions of this module.
It looks like aws_auth.tf had a local-exec to update the kube config map if manage_aws_auth was set. This local-exec was removed in revision 9363662; @stijndehaes can elaborate a little on that. The check for a healthy cluster was added by @shaunc as part of #639. They seem to be related, though. @stijndehaes @shaunc, could you shed some light on this? Thanks in advance.
* Configurable local exec command for waiting until cluster is healthy
* readme
* line feeds
* format
* fix readme
* fix readme
* change log
* changelog
* changelog
We did try a Terraform-specific approach (such as using an http resource for the health check) but couldn't get it to work. I think the maintainers decided that Windows probably had other things broken as well, but a PR that allows customizing the command certainly seems reasonable to me (I am not a maintainer). I started using the module after the switch from local_exec to kubernetes, so I can't speak to that, except that the maintainers do want to shift to Terraform-supported mechanisms where possible. (We just couldn't figure out how to wait for the cluster using only Terraform.)
Since we’re only using this for manage_aws_auth, can we make the local-exec conditional on the value of manage_aws_auth?
You'll have to get a maintainer to chime in. IMO Terraform-built resources are generally supposed to be "ready to use" when apply completes, so they can be used as part of larger builds. From that perspective, waiting for the cluster to be up should be the "normal case", irrespective of what it is used for internally. But having some option to change (or possibly skip) the health check doesn't seem unreasonable.
This approach with the
provisioner "local-exec" {
  command = <<EOT
${var.wait_for_cluster_cmd == "" ? "until curl -k -s $${ENDPOINT}/healthz >/dev/null; do sleep 4; done" : var.wait_for_cluster_cmd}
EOT
}
That's also a good idea. EDIT: maybe don't even need the ugly |
@max-rocket-internet let me explore your advice. :) Thanks for chiming in.
On further reading, it looks like count can be used on the resource that holds the local-exec provisioner (it is a resource meta-argument; I will try that).
cluster.tf (outdated)
    command = <<EOT
    until curl -k -s ${aws_eks_cluster.this[0].endpoint}/healthz >/dev/null; do sleep 4; done
    EOT
    command = var.manage_aws_auth ? var.wait_for_cluster_cmd : ""
@max-rocket-internet would this perform a no-op when manage_aws_auth is disabled? I may have to use count, but I was unsure how to wire the dependency on the kubernetes_config_map resource.
I like it, but have you tested this in both situations? Are there no errors or issues with having a command of ""?
I get this error if manage_aws_auth = false:
Error: local-exec provisioner command must be a non-empty string
So I think you need to do this:
resource "null_resource" "wait_for_cluster" {
  count = var.manage_aws_auth ? 1 : 0

  depends_on = [
    aws_eks_cluster.this[0]
  ]

  provisioner "local-exec" {
    command = var.wait_for_cluster_cmd
    environment = {
      ENDPOINT = aws_eks_cluster.this[0].endpoint
    }
  }
}
This is what I wanted to do in my initial attempt, but I was not sure how it would be included as a dependency in the aws_auth module. I will try this and test without aws_auth. Thanks again.
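One way the dependency question above could be wired is sketched below. It assumes the null_resource.wait_for_cluster resource suggested in this thread; the kubernetes_config_map body is trimmed for illustration, and var.map_roles is a hypothetical variable name rather than something confirmed by this PR.

```hcl
# Sketch only: assumes null_resource.wait_for_cluster is declared as suggested above.
resource "kubernetes_config_map" "aws_auth" {
  # Same condition as the wait resource, so both exist or neither does.
  count = var.manage_aws_auth ? 1 : 0

  # Referencing the whole resource (no index) stays valid whether its count is 0 or 1.
  depends_on = [null_resource.wait_for_cluster]

  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    # var.map_roles is a hypothetical input used only for illustration.
    mapRoles = yamlencode(var.map_roles)
  }
}
```

With this shape, the config map is only attempted after the wait command has returned, and nothing runs at all when manage_aws_auth is false.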
    EOT
    command = var.manage_aws_auth ? var.wait_for_cluster_cmd : ""
    environment = {
      ENDPOINT = aws_eks_cluster.this[0].endpoint
I believe this is close to what you were expressing? Not sure :) but it does seem to work, thanks.
Yes. This is much cleaner IMO 😃
@max-rocket-internet I believe I have addressed all your concerns. I tested the changes without AWS-managed authentication too. I am not sure what is happening with the docs linter.
Does anyone have any further requests for me? I believe I have addressed the changes requested so far. Can someone help me with the docs linter issue? It seems to be failing for everyone.
I tested with basic example with |
Thanks @sanjeevgiri, well done!
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
PR o'clock
Description
Currently, running this module on Windows fails because of the local-exec command that waits until the cluster is healthy. The command being used
until curl -k -s %CLUSTER_HEALTH_ENDPOINT% >/dev/null; do sleep 4; done
only works on *nix systems. We could keep this command as the default; however, it would be great if non-*nix users had the option to specify a custom command that achieves the same result. (Even better would be a Terraform-specific resource that would allow us to wait until an HTTP URL becomes available :)). This PR attempts to add the ability to define OS-specific commands to wait for a healthy cluster to be available.
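As a rough sketch of what this could look like, the module might expose the command as a variable whose default is the existing *nix loop. The variable name wait_for_cluster_cmd and the ENDPOINT environment variable come from the review discussion; the description text, the trimmed module call, and the wait-for-cluster.ps1 script name are illustrative assumptions, not the merged implementation.

```hcl
# Module side: the default keeps today's behaviour for *nix users.
variable "wait_for_cluster_cmd" {
  description = "Command run by local-exec to wait for a healthy cluster. The cluster endpoint is passed in the ENDPOINT environment variable."
  type        = string
  default     = "until curl -k -s $ENDPOINT/healthz >/dev/null; do sleep 4; done"
}
```

```hcl
# Caller side: other required module arguments are omitted for brevity.
module "eks" {
  source = "terraform-aws-modules/eks/aws"

  # Hypothetical override for a non-*nix machine; wait-for-cluster.ps1 is an
  # assumed user-supplied script that polls the health endpoint itself.
  wait_for_cluster_cmd = "powershell -File wait-for-cluster.ps1"
}
```

Keeping the *nix loop as the default means existing users see no change, while anyone on another platform can swap in a command that suits their shell.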
Checklist
#680