-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add/hpc7g node arm support #6743
Conversation
@@ -57,6 +57,7 @@ spec: | |||
- g4dn.metal | |||
- g5.48xlarge | |||
- hpc6a.48xlarge | |||
- hpc7g.16xlarge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
V - lets update with all hpc7g types here since they all have EFA. Add hpc7g.8xlarge, hpc7g.4xlarge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can do! I didn’t want to add them before being able to test. Any updates on if it might be possible to have an ARM build for the plugin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! And per our discussion, I can offer to do a PR to add these images here https://github.com/aws-samples/efa-device-plugin-helm/blob/main/aws-efa-k8s-device-plugin after we finish up here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay opened, albeit we should not merge it! aws-samples/efa-device-plugin-helm#1
This is working now and can be reviewed. I changed nothing, and I have no idea why it's working. I think it might be related to the aws metadata (describe-instances) that sets conditions for the efa / network devices. If it wasn't completed/ready on the first tries, maybe that could lead to this outcome? |
994a2d2
to
2adbe41
Compare
Why was this closed? |
@vsoch, apologies again, it was closed by the stale bot. I have reopened it now and applied a label that should prevent it from being automatically closed. Please give us some time to review it, the team is occupied with other deliverables. |
Thank you! |
Please re-open again, thank you! |
2adbe41
to
85112d8
Compare
Hi @cPu1 could you please give feedback to the CI errors? I'm seeing them show up in other PRs and it looks to be that an incorrect function signature is being used, for example: GetOutpostInstanceTypes(context.backgroundCtx,*outposts.GetOutpostInstanceTypesInput,func(*outposts.Options))
0: context.backgroundCtx{emptyCtx:context.emptyCtx{}}
1: &outposts.GetOutpostInstanceTypesInput{OutpostId:(*string)(0xc0000542c0), MaxResults:(*int32)(nil), NextToken:(*string)(nil), noSmithyDocumentSerde:document.NoSerde{}}
2: (func(*outposts.Options))(0x7ec320)
The closest call I have is:
GetOutpostInstanceTypes(string,string)
0: "mock.Anything"
1: "mock.Anything" and notably we don't touch the relevant code here. Is there another PR that is fixing these CI issues we should watch? Thanks! |
@vsoch could you please rebase with |
Signed-off-by: sochat1 <[email protected]>
Signed-off-by: sochat1 <[email protected]>
Signed-off-by: vsoch <[email protected]>
85112d8
to
f95076b
Compare
All set! |
Description
Problem: The new hpc7g images use the graviton2 processor (arm) but are not detected as such by eskctl. In addition, as we have been discussing in #6222, the daemon set for the efa device driver does not work with
runAsNonRoot
set to true. I believe this tweak is close, however the final step (I think) needs to also be to provide an ARM build for the driver itself. I believe this is proprietary code, so I wanted to ask here first about that. I was able to figure out the container entrypoint and output, in case that helps:Note that online examples for efa (e.g., this repository) is not exactly what we want - the Dockerfile will build a container that can use EFA but not one that has that particular executable.
Let me know how you would like to proceed!