Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CARM ack-role-account-map ConfigMap updates are not propagated to ACK controllers #2088

Closed
itaiatu opened this issue Jun 13, 2024 · 13 comments
Closed
Assignees
Labels
area/runtime Issues or PRs as related to controller runtime, common reconciliation logic, etc kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.

Comments

@itaiatu
Copy link

itaiatu commented Jun 13, 2024

Describe the bug
CARM ack-role-account-map ConfigMap updates are not propagated to ACK controllers.
Maybe related to #2011.

Steps to reproduce
I performed the following scenario:

  • CARM ack-role-account-map ConfigMap
apiVersion: v1
data:
  "00000000000": arn:aws:iam::00000000000:role/X-ack-iam-controller
  "11111111111": arn:aws:iam::11111111111:role/X-ack-iam-controller-assumedrole
kind: ConfigMap
metadata:
  name: ack-role-account-map
  namespace: ack-iam-controller
  • Changed the value of 11111111111 in the data field in the to
"11111111111": arn:aws:iam::11111111111:role/X-ack-iam-controller-assumedrole2
  • IAM controller detected the change
{"level":"debug","ts":"2024-06-13T09:56:32.974Z","logger":"cache.account","msg":"updated account config map","name":"ack-role-account-map"}
  • After changing the value of "11111111111" AWS subscription id to a dummy value (adding the 2 at the end of the value), I still managed to create a Role and a Policy in the ns-ack-test namespace (see below the Namespace spec). This is not the desired behaviour, since that role doesn't exist (with the 2 character at the end of the string).

  • After I restarted the IAM controller's pod, got 403s, the expected behaviour, since that role is not a valid one

{"level":"debug","ts":"2024-06-13T10:54:38.873Z","logger":"ackrt","msg":"<<< rm.sdkFind","kind":"Policy","namespace":"ns-ack-test","name":"ack-test-policy","account":"11111111111","role":"arn:aws:iam::11111111111:role/X-ack-iam-controller-assumedrolee","region":"us-west-2","is_adopted":false,"generation":1,"error":"AccessDenied: User: arn:aws:sts::00000000000:assumed-role/X-ack-iam-controller/1718276077280567570 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::11111111111:role/X-ack-iam-controller-assumedrolee\n\tstatus code: 403, request id: ad4f99b2-5b84-403e-ac16-ee1e56293ef0"}
  • Vice-versa, if I start with a CARM wrong configuration (wrong assumedrole names), I get 403s as expected, and after I fix the role name in ack-role-account-map to match the correct assumedrole, I still receive 403s when trying to create resources.

  • This makes me think that although the log messages say that the runtime cache is updated, the change is not propagated to ACK controllers.

Expected outcome
When ack-role-account-map is edited, ACK controllers will use the updated values from the data field.

Environment

apiVersion: v1
kind: Namespace
metadata:
  annotations:
    services.k8s.aws/owner-account-id: "11111111111"
  name: ns-ack-test
spec:
  finalizers:
    - kubernetes
  • Kubernetes version: 1.28
  • Using EKS (yes/no), if so version? 1.28
  • AWS service targeted (S3, RDS, etc.): Multiple (at least IAM, EC2)
@a-hilaly a-hilaly added kind/bug Categorizes issue or PR as related to a bug. area/runtime Issues or PRs as related to controller runtime, common reconciliation logic, etc priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Jun 17, 2024
@a-hilaly a-hilaly self-assigned this Aug 6, 2024
@a-hilaly
Copy link
Member

a-hilaly commented Aug 6, 2024

@itaiatu thank you for opening this issue. I was able to reproduce the problem. It appears that the code we generate to operate "ResourceManagers" uses caching to avoid re-initializing the SDK client at every reconciliation loop - the cache uses $accountID/$roleARN as the key/index, causing the controller to always reuse the same cache for every of accountID and region combination. Will send a fix to introduce a better mechanism here.

/cc @adriananeci @mumlawski @mattzesh @victorvarza

@mumlawski
Copy link

We see this problem fixed now in IAM controller v1.3.10.

@itaiatu
Copy link
Author

itaiatu commented Aug 20, 2024

I tested again with EC2 (v1.2.17) and IAM (v1.3.10) and the problem is still there.
cc: @a-hilaly

@swapnachagam
Copy link

Is there an update on this solution please?

@itaiatu
Copy link
Author

itaiatu commented Sep 4, 2024

Tested it now with the EC2 controller, version 1.2.19 and it works.

@itaiatu
Copy link
Author

itaiatu commented Sep 4, 2024

I am going to keep this issue opened for a while to see if the problem is solved for all controllers, but I assume everything should be ok, because the change was done in the runtime package, so it will be contained in all ACK controller repos.

@a-hilaly
Copy link
Member

a-hilaly commented Sep 4, 2024

@itaiatu Indeed the fix made it to the core runtime and all of the controllers have been patched.
cc @swapnachagam @mumlawski

@swapnachagam
Copy link

swapnachagam commented Sep 5, 2024

Hi @a-hilaly @itaiatu I tested for IAM Controller using version 1.3.12. I have updated the configmap to a different role than the one originally which is required for cross account mapping but controller is still running with the previous role only. I don't see anhy reconcilation happening in the controller end
As soon as i restart, it is picking the new role though. Let me know what logs i should share here

I see that logs being updated for s3 Controller but i don't see the same for IAM/CLOUDWATCHLOGS/LAMBDA CONTROLLERS

{"level":"debug","ts":"2024-09-05T15:16:02.761Z","logger":"cache.account","msg":"updated account config map","name":"ack-role-account-map"}

@a-hilaly
Copy link
Member

a-hilaly commented Sep 5, 2024

@swapnachagam Updating the role in CARM, doesn't necessarily trigger a reconciliation. You'll have to manually trigger the reconciliation of wait for the periodic reconcile.

@swapnachagam
Copy link

Thank you.. It is working by triggering a reconciliation

@swapnachagam
Copy link

Is there a way instead of us manually triggering a reconcilation, can ack-controllers have a second watch for these configmap updates and reconcile itself in a low priority queue?
We have many controllers and also lot of AWS accounts(~100) which have to be onboarded

@a-hilaly
Copy link
Member

@swapnachagam I'm not sure whether we need to implement this, I imagine controllers managing a lot of resource would reconcile everything even for a simple CARM configMap modification. But i'm not opposed to it. Can you please open a Github issue describing this feature? we'll discuss it in our next community meeting https://github.com/aws-controllers-k8s/community/?tab=readme-ov-file#community-meeting

For now i'll close this as it's resolved

@swapnachagam
Copy link

Thank you. I will raise a separate requst for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/runtime Issues or PRs as related to controller runtime, common reconciliation logic, etc kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.
Projects
None yet
Development

No branches or pull requests

4 participants