Error: Post "http://localhost/api/v1/namespaces/kube-system/configmaps": dial tcp 127.0.0.1:80: connect: connection refused #911
Comments
If you have |
I think this is a problem with the k8s provider itself... I have it configured correctly and it randomly fails to connect: hashicorp/terraform#4149 I found a comment in the provider's Golang code that explains the problem: |
I get this as well, but when I tried to disable the cluster after creation (i.e. destroy it), the plan fails.
|
I encountered a similar issue: |
@dpiddockcmp It would help if you were just a tad bit more specific. What precisely in the README's example configuration solves this problem? Is the version number that's called out explicitly in the README the important part, meaning we can't take the latest version for some reason? The concat functions in use? I tried a copy-paste of that README and I get:
|
As a workaround... I was able to use the AWS CLI to write my kubeconfig after my first deployment was only partially successful...
Putting this ^ before the apply step (obviously it only works after a successful creation of the cluster, when only the update to the aws-auth ConfigMap failed) worked for me... But if we ever burn the infra to the ground, we need to do this multi-step process again. |
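A rough sketch of such a pre-apply step, using the standard aws eks update-kubeconfig command (cluster name, region, profile, and kubeconfig path are placeholders for whatever your pipeline uses):
aws eks update-kubeconfig --name my-eks-cluster --region eu-central-1 --profile my-profile --kubeconfig ./.kube_config.yaml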
Ran into this as well... given the relative instability around the lifecycle of EKS using this module I'm probably going to consider separating it from other infra in the vpc. |
You need to copy the two data sources and the kubernetes provider block from the usage example. Assuming your module definition was called "eks":
data "aws_eks_cluster" "cluster" {
name = module.eks.cluster_id
}
data "aws_eks_cluster_auth" "cluster" {
name = module.eks.cluster_id
}
provider "kubernetes" {
host = data.aws_eks_cluster.cluster.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
token = data.aws_eks_cluster_auth.cluster.token
load_config_file = false
version = "~> 1.12"
} |
@dpiddockcmp Yep, I get that; the problem is encountered if you set the create_eks flag to false to destroy the cluster and then set it back to true. I think I hit some other funkiness where the state file even showed the correct host and CA cert but the provider was using localhost and missing the CA entirely. Will see if I can get a more specific set of repro steps. |
I had the same problem. You have to delete the cluster manually because Terraform just says that it already exists instead of deleting and then recreating it. Once you delete the cluster you get this error. I resolved the problem by running terraform state rm module.saasoptics.module.eks.kubernetes_config_map.aws_auth[0] |
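The same workaround in general form (the exact module path depends on how the EKS module is nested in your configuration; module.saasoptics above is specific to that setup):
terraform state rm 'module.<your_eks_module>.kubernetes_config_map.aws_auth[0]'
terraform destroy   # or plan/apply, whichever was failing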
Yeah, looked into this a bit. It's because Terraform tries to refresh the config map resource before deleting it -- however, the cluster has already been destroyed. This module essentially needs to ensure that destruction of the config map happens before cluster destruction, if that's possible. Otherwise, manual removal of the configmap from the state seems like the best solution here. An alternative workaround to cleanly remove the cluster (you must not have gotten yourself into the state where you have the localhost error for this to work):
|
This fixed it for me. Thanks, @cidesaasoptics |
I got the same error after the following sequence:
I was able to complete the creation by setting |
Same issue here; in my pipeline the kubeconfig is not present during apply, as it's a new CI run in a fresh CI container. EDIT: The workaround mentioned worked for me too: first apply the pipeline with |
If you are deleting and know the config map is gone, creating a listener and giving the terraform client a
I don't see any real difference in the net effect between this and the
|
That's it. In my case first I ran |
I guess the root cause of the problem is either the insufficient validity timeout of the cluster token, which is only 15 minutes, or the very long cluster creation times of EKS, which sometimes take more than 15 minutes, whichever way you look at it. If internal data resources such as aws_eks_cluster_auth are resolved at the very beginning of the run,
the token is already invalidated by the time it is used and has to be refreshed; but within the same Terraform run this is not possible, because data refresh occurs at the very beginning of the TF process. My final workaround to guarantee "Unauthorized"-error-free updates: I'm using Terragrunt to overcome many of Terraform's weaknesses (dynamic dependencies, counts, and other values that should be calculated before TF is applied). I changed the K8s provider authorization configuration as follows, using a kubeconfig file instead of the internal token mechanism:
data "aws_eks_cluster" "cluster" {
count = var.cluster_id != "" ? 1 : 0
name = var.cluster_id
}
data "aws_eks_cluster_auth" "cluster" {
count = var.cluster_id != "" ? 1 : 0
name = var.cluster_id
}
provider "kubernetes" {
config_path = "${path.module}/.kubeconfig"
config_context = element(concat(data.aws_eks_cluster.cluster[*].arn, list("")), 0)
// host = element(concat(data.aws_eks_cluster.cluster[*].endpoint, list("")), 0)
// cluster_ca_certificate = base64decode(element(concat(data.aws_eks_cluster.cluster[*].certificate_authority.0.data, list("")), 0))
// token = element(concat(data.aws_eks_cluster_auth.cluster[*].token, list("")), 0)
// load_config_file = false
}
Notice that the data resources are now used only to supply the context name. For every request that is sent to the Kubernetes API, the kubeconfig is refreshed first with:
aws --profile $profile eks update-kubeconfig --kubeconfig .kubeconfig --name $cluster > /dev/null 2>&1 || true
(where $profile and $cluster are your AWS profile and cluster name). This solution worked well for my deployments, which are held separately outside of the cluster module.
POSSIBLE SOLUTION: disable the aws-auth config map in this module, then apply a new TF file after cluster creation that manages it separately:
resource "kubernetes_config_map" "aws_auth" {
#count = var.create_eks ? 1 : 0
metadata {
name = "aws-auth"
namespace = "kube-system"
labels = merge(
{
"app.kubernetes.io/managed-by" = "Terraform"
"terraform.io/module" = "terraform-aws-modules.eks.aws"
},
var.aws_auth_config.additional_labels
)
}
data = var.aws_auth_config.data
}
variable "aws_auth_config" {
description = "aws_auth_config data"
type = any
}
This should be added as extra_outputs.tf in your original TG environment which creates the cluster:
output "aws_auth_config" {
value = {
additional_labels = var.aws_auth_additional_labels
data = {
mapRoles = yamlencode(
distinct(concat(
local.configmap_roles,
var.map_roles,
))
)
mapUsers = yamlencode(var.map_users)
mapAccounts = yamlencode(var.map_accounts)
}
}
}
terragrunt.hcl for applying aws_auth separately:
locals {
profile="<my_aws_profile>"
cluster_name="<my-cluster-name>"
}
terraform {
source = "${get_parent_terragrunt_dir()}/modules/terraform-aws-eks-auth"
before_hook "refresh_kube_token" {
commands = ["apply", "plan","destroy","apply-all","plan-all","destroy-all","init", "init-all"]
execute = ["aws", "--profile", local.profile, "eks", "update-kubeconfig", "--kubeconfig", ".kubeconfig", "--name", local.cluster_name]
}
}
# Inputs passed to the TF module
inputs = {
aws_auth_config = dependency.cluster.outputs.aws_auth_config
}
dependency "cluster" {
config_path = "../cluster"
}
Output:
[terragrunt] 2021/01/18 13:16:08 Executing hook: refresh_eks_token
[terragrunt] 2021/01/18 13:16:08 Running command: aws --profile my-profile eks update-kubeconfig --kubeconfig .kubeconfig --name my-cluster
Updated context arn:aws:eks:eu-central-1:767*****7216:cluster/my-cluster in /Users/***/dev/repo/live/infrastructure/.terragrunt-cache/767****216/dynamic/eu-central-1/shared/k8s/auth/zlnrmfl5SmuqAD673E7b9AUFDew/NYFQi6hkBT1xdxho53XJtCNnpYs/.kubeconfig
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# kubernetes_config_map.aws_auth will be created
+ resource "kubernetes_config_map" "aws_auth" {
+ data = {
+ "mapAccounts" = jsonencode([])
+ "mapRoles" = <<-EOT
- "groups":
- "system:bootstrappers"
- "system:nodes"
"rolearn": "arn:aws:iam::760******216:role/my-cluster20210118081306354500000009"
"username": "system:node:{{EC2PrivateDNSName}}"
- "groups":
- "system:masters"
"rolearn": "arn:aws:iam::292******551:role/MyTestRole"
"username": "MyTestRole"
EOT
+ "mapUsers" = jsonencode([])
}
+ id = (known after apply)
+ metadata {
+ generation = (known after apply)
+ labels = {
+ "app.kubernetes.io/managed-by" = "Terraform"
+ "terraform.io/module" = "terraform-aws-modules.eks.aws"
}
+ name = "aws-auth"
+ namespace = "kube-system"
+ resource_version = (known after apply)
+ self_link = (known after apply)
+ uid = (known after apply)
}
}
Plan: 1 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes
kubernetes_config_map.aws_auth: Creating...
kubernetes_config_map.aws_auth: Creation complete after 1s [id=kube-system/aws-auth]
Conclusion: Happy coding! |
This also happens when the EKS cluster is deleted out from under Terraform, since it is trying to talk to the K8s API endpoint which no longer exists. I have seen this in some dev workflows. The command from above |
I can confirm that creation, destroys, and everything else work successfully using the new v2 of the Kubernetes provider. There is a dedicated example of how to use it with EKS, which I just copy/pasted 🙂 |
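One common way to wire the v2 provider to an EKS cluster is exec-based authentication, which fetches a fresh token on every call instead of relying on a token resolved once at plan time. A minimal sketch (assumes the AWS CLI is available where Terraform runs; the api_version may differ depending on your client versions):
provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
  exec {
    # a short-lived token is generated by the AWS CLI at apply time
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
  }
}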
Unfortunately I've just hit the same issue with module version v14.0.0 and terraform 0.14.5. Still trying to find a fix. Deleting kubernetes_config_map from the state doesn't work. |
I figured out how to pin the version of the Kube provider in Terraform 14:
|
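In Terraform 0.14 the usual way to pin a provider is a required_providers block; a sketch along these lines (the exact version constraint is illustrative, matching the pre-2.0 provider versions discussed in this thread):
terraform {
  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes"
      # stay on the 1.x line; adjust the constraint to whatever you have tested
      version = "~> 1.13"
    }
  }
}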
It happened to me: after I destroyed the cluster successfully, only the configmap resource was still there; then, when I tried to run the
The cluster doesn't exist anymore, so the destroy will never succeed. I manually removed it from the state (https://www.terraform.io/docs/cli/commands/state/rm.html)
|
Same thing here and manual remove worked. I wonder if a depends_on is missing. |
Same here, fixed by manually removing state |
So does anyone know the root cause of this? I've seen this issue on POST and DELETE for the configmap |
It looks to me like the problem comes when you have multiple clusters defined in ~/.kube/config. It seems this module ignores the current-context and then fails to read the configuration properly. I have used this module for quite a long time and upgraded it many times, but I never had a kubernetes provider block, and it should continue working without one. It should just read ~/.kube/config and respect the current context, or something like that. I also renamed a context in my kube config, which may be another reason, but still, if this module reads the current context it should have correct data. This may be a general Terraform problem, though, maybe not this module. |
@acim Interesting, as long as you configure the kubernetes provider to use the context that corresponds to the Terraform config files, not just the current context of kubectl (which could lead to modifying Kubernetes resources in the wrong Kubernetes cluster). E.g. (sketch below):
|
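A sketch of what pinning the provider to an explicit context can look like (the context name and account ID are placeholders; config_path and config_context are standard arguments of the kubernetes provider):
provider "kubernetes" {
  # use a specific context rather than whatever kubectl last selected
  config_path    = "~/.kube/config"
  config_context = "arn:aws:eks:eu-central-1:111111111111:cluster/my-cluster"
}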
This makes sense, thank you :) |
Thank you @matthewmrichter. This solved my problems too. |
Thanks, will consider changing to that, although it's surprising that the simple
|
The issue with this approach is that you must set your cluster/context before running Terraform, so most probably your scenario is:
so between step 2 and 3 you changed the kubernetes context/cluster, so the terraform provider relying on
which doesn't reflect the EKS cluster you are trying to modify |
I don't think the problem is a mismatch between the current-context set in kubeconfig and the "config_context" in terraform. What would be the point of specifying config_context otherwise? I also just tried setting different contexts in kubeconfig and terraform in a multi-cluster configuration and could not reproduce the issue. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
/remove_stale |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This issue was automatically closed because of stale in 10 days |
Any ideas why I'm getting this error when trying to destroy the cluster? |
Okay, so I'm using Terratest and I get this issue sometimes; I've now fixed it for the second time, so I'll share my solution here since I have not seen a solution for people using Terratest. For those unfamiliar, Terratest is a testing framework that automates testing of IaC, focusing on Terraform and container orchestration services. A common testing practice is to add a unique identifier to the Terratest scripts to make every EKS cluster created unique. This ensures that if multiple tests are run, they run in isolation from each other, creating new EKS clusters with unique names every time a test is run. They call this namespacing in the Terratest docs. Anyway, if something goes wrong with the cleanup, then you probably haven't been able to destroy the EKS cluster and tried to do a
What you need to do is go to the EKS dashboard in the AWS console, copy the name of the cluster, and paste it into your Terraform HCL/template files in the
TL;DR: Once the EKS cluster name on the AWS dashboard is the same as the one in your |
I also found that eks-k8s-role-mapping doesn't work. My fix is to wait for it to fail (not ideal), then create a ~/.kube/config using the AWS CLI. Then adding the following (per a suggestion above) seems to solve it. But of course this isn't possible until the cluster is built, so not really ideal. |
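A minimal version of that kubeconfig-based provider block, as suggested earlier in the thread (the path is a placeholder for wherever aws eks update-kubeconfig wrote the file):
provider "kubernetes" {
  # read credentials from the kubeconfig generated after the cluster exists
  config_path = "~/.kube/config"
}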
If it helps anybody, the root cause of my issue was a difference between the context name (alias) and the cluster name. In particular:
Shouldn't there be a clear failure when the provider references a non-existent context or cluster? This would mirror
|
This:
is still not working even with
It errors with:
I can see the issue was just left to expire due to inactivity; wasn't this a bug worth attention? UPDATE: Some observations: I don't have any issues upon initial
and if I change the
in order to trigger a change I get the above error. Until then everything is fine. For the record, using the
makes no difference; I get the same error in both cases. |
Same bug here: when I tried to change the subnets for the EKS cluster, terraform plan wanted to replace the cluster, but it finally threw this error.
I'm testing with the latest version currently available, "18.26.6", and can reproduce the error using the example in the repo:
Changing subnet IDs on the cluster is a destructive operation. This is not controlled by the module but by the AWS EKS API |
I don't know why this issue is closed, because the error is still here. I've got
I'm creating the cluster under an assumed-role AWS provider. I'm very curious: why does it try to connect to localhost? |
@a0s it's not a module issue, it's a mix of provider + user configuration issues |
@bryantbiggs thanks for still looking at this and trying to help. Does this mean we need to raise this issue with the AWS EKS team? Because from Terraform's and this module's perspective, even in the case of a destructive action like destroying and re-creating the cluster, we as users expect |
@bryantbiggs please ignore my previous message; I should have read your last message before replying, sorry about that :-/ For the others following on, using
P.S. This of course is hardly an acceptable solution (more like a workaround), since there are for sure many other modules in people's projects, and running the plan with |
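For reference, targeting only the EKS module on a plan/apply is done with Terraform's -target flag, for example (the module address is a placeholder for your EKS module):
terraform apply -target=module.eks   # create or refresh the cluster first
terraform apply                      # then apply everything else, including the aws-auth config map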
No worries |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
I started getting this issue:
All my code was working fine, but as I upgraded my Terraform and provider versions I started getting the above issue.
Versions on which everything was working:
Providers:
aws: 2.49
kubernetes: 1.10.0
helm: 0.10.4
eks: 4.0.2
Others:
terraform: 0.11.13
kubectl: 1.11.7
aws-iam-authenticator: 0.4.0-alpha.1
Now my versions:
terraform: 0.12.26
kubectl: 1.16.8
aws-iam-authenticator: 0.5.0
eks.yaml
In the above code, write_kubeconfig = "false" and a local kubeconfig file is created; I am using this file in the helm and kubernetes providers.
provider.yaml
`provider "aws" {
region = var.region
version = "~> 2.65.0"
assume_role {
role_arn = "arn:aws:iam::${var.target_account_id}:role/terraform"
}
}
provider "kubernetes" {
config_path = "./.kube_config.yaml"
version = "~> 1.11.3"
}
provider "helm" {
version = "~> 1.2.2"
kubernetes {
config_path = "./.kube_config.yaml"
}
}
On terraform apply, the script is not able to create module.eks.kubernetes_config_map.aws_auth[0]:
I tried some of the suggestions mentioned here but they didn't work for me:
#817