Upgrade from 1.5.9 to 1.6.0 breaks the EFS #1111
Comments
This is similar to an issue we are seeing, so I'll add some additional context. We have a Cilium network policy that allows the controller egress access to AWS but not to IMDS. The CSI node pods do not have egress access to anything. The controller is using IRSA. The controller logs indicate that it will use Kubernetes for metadata, but when trying to provision or delete a PV it reaches out to IMDS and times out.
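For context, a minimal sketch of the kind of policy being described here (the policy name, namespace, and pod labels are assumptions, not the actual policy from this cluster): it allows the controller egress to anywhere except the IMDS address.

```sh
# Hypothetical sketch of the policy shape described above; names and labels are invented.
kubectl apply -f - <<'EOF'
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: efs-csi-controller-egress       # hypothetical name
  namespace: kube-system
spec:
  endpointSelector:
    matchLabels:
      app: efs-csi-controller           # assumed controller pod label
  egress:
    - toCIDRSet:
        - cidr: 0.0.0.0/0
          except:
            - 169.254.169.254/32        # exclude the IMDS endpoint
EOF
```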
@david-a-morgan and others experiencing the issue: how are you installing the driver? Through the Helm chart?

This recent commit removed hostNetwork from the manifests. A while ago, this commit was merged, which allows us to pull EC2 info from Kubernetes instead of IMDS when IMDS is unavailable. However, it requires something that is not currently set up by default, so I'll open a PR to add it in. This also brings up two additional points; I'll open issues on the project to track those two items.
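For anyone verifying that the Kubernetes-sourced metadata has what the driver needs, the region and zone are visible on the Node object itself (the node name below is a placeholder):

```sh
# The provider ID encodes the AZ and instance ID, e.g. aws:///us-east-1a/i-0123456789abcdef0
kubectl get node <node-name> -o jsonpath='{.spec.providerID}{"\n"}'

# Region and zone are also exposed as the well-known topology labels
kubectl get nodes -L topology.kubernetes.io/region -L topology.kubernetes.io/zone
```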
We install the driver using Helm, and I did notice the new changes in the chart. Here are more details as to what we are experiencing: […]

In both cases, Hubble shows repeated attempts to egress to IMDS even when the controller is using Kubernetes metadata.
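For reference, a Hubble query along these lines can confirm those IMDS attempts (the pod selector is an assumption about how the controller pod is named):

```sh
# Follow flows from the controller pod that are destined for the IMDS address
hubble observe --follow --to-ip 169.254.169.254 --pod kube-system/efs-csi-controller
```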
/reopen

@RyanStan: You can't reopen an issue/PR unless you authored it or you are a collaborator. In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

/reopen

@Ashley-wenyizha: Reopened this issue. In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
OK, I think I figured out the issue. The CSI driver uses the region it pulls from Kubernetes metadata to build a client for the EFS API. This is working as expected. However, the utility that the CSI driver uses under the hood to perform mounts to EFS, efs-utils, requires IMDS to find the region, which it then uses to construct the DNS name of the mount target. The reason I didn't run into this issue when initially trying to recreate it is that my efs-utils configuration file had been hardcoded with the correct region, so IMDS was not needed.

The immediate solution here is to add the region to the efs-utils configuration file. As for the long-term solution: […] We will also need to update our testing infra to test against an IMDS-disabled cluster.
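As a stop-gap along those lines, the region can be hard-coded in the efs-utils configuration inside each node pod. This is only a sketch: the pod and container names come from the default manifests, the config path and the `[mount]` section key are assumed from efs-utils defaults (verify against your installed version), and the change does not survive a pod restart.

```sh
# Check whether a region is already set in the efs-utils config used by a node pod
kubectl -n kube-system exec <efs-csi-node-pod> -c efs-plugin -- \
  grep -n region /etc/amazon/efs/efs-utils.conf

# Hard-code the region under [mount] so mount.efs can build the
# fs-XXXXXXXX.efs.<region>.amazonaws.com mount-target DNS name without IMDS
kubectl -n kube-system exec <efs-csi-node-pod> -c efs-plugin -- \
  sh -c 'grep -q "^region" /etc/amazon/efs/efs-utils.conf || \
         sed -i "/^\[mount\]/a region = us-east-1" /etc/amazon/efs/efs-utils.conf'
```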
@david-a-morgan and others who experienced this issue: were you performing a cross-region mount? Also, were you using IRSA with your node DaemonSet pods (e.g. annotating them with an IAM role)? I assume the answer to this second question is no, because our current documentation doesn't list this as a requirement, but this will need to change. I was looking into this a bit more, and I found that the watchdog process should overwrite the region in the efs-utils configuration with the […]
It seems that the Region value is not wired into the config template.
I am still facing this issue; the details are below. I am using Helm to install the driver: […]

Error: Output: Error retrieving region. Please set the "region" parameter in the efs-utils configuration file.
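Until a release with the fix is available, one low-risk workaround (assuming the driver was installed as a Helm release named aws-efs-csi-driver in kube-system) is rolling back to the revision that ran 1.5.9:

```sh
# Find the last Helm revision that deployed 1.5.9, then roll back to it
helm history aws-efs-csi-driver -n kube-system
helm rollback aws-efs-csi-driver <REVISION> -n kube-system
```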
/kind bug
What happened?
After upgrading from 1.5.9 to 1.6.0, we started getting errors:
Output: Error retrieving region. Please set the "region" parameter in the efs-utils configuration file.
What you expected to happen?
EFS should get mounted
How to reproduce it (as minimally and precisely as possible)?
Upgrade the EFS CSI driver from 1.5.9 to 1.6.0.
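For example, assuming the driver was installed from the upstream chart into kube-system (release and repo names are assumptions), the upgrade and the resulting mount failure can be reproduced roughly like this:

```sh
# Upgrade the existing release to the chart that ships driver 1.6.0
helm repo update
helm upgrade aws-efs-csi-driver aws-efs-csi-driver/aws-efs-csi-driver -n kube-system

# Any pod mounting an EFS-backed PVC should then surface the region error in its events
kubectl describe pod <pod-using-efs-pvc> | grep -A3 FailedMount
```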
Anything else we need to know?:
I did verify the IAM policy; it does include "ec2:DescribeAvailabilityZones".
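One way to double-check that the permission is actually effective for the role the driver uses (the role ARN below is a placeholder):

```sh
# Simulate the role the driver assumes to confirm the permission evaluates to "allowed"
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::123456789012:role/efs-csi-controller-role \
  --action-names ec2:DescribeAvailabilityZones \
  --query 'EvaluationResults[].{action:EvalActionName,decision:EvalDecision}'
```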
On a side note, we use Cilium, and we did see that hostNetwork was removed in the 1.6.0 Helm chart deployment.
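A quick way to see whether the node pods lost host networking after the upgrade (the label selector is an assumption about the chart's default labels):

```sh
# Show whether each node daemonset pod is running with host networking
kubectl get pods -n kube-system -l app=efs-csi-node \
  -o custom-columns=NAME:.metadata.name,HOSTNETWORK:.spec.hostNetwork
```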
Environment
Kubernetes version (use kubectl version): 1.24