[BUG] IAM roles are not passed on to MPI Launcher and worker pods #3043

Closed · bug (Something isn't working)
bimtauer opened this issue Nov 1, 2022 · 6 comments

bimtauer (Contributor) commented Nov 1, 2022

Describe the bug

An IAM role defined in the launch dialog, in a launch plan, or in the k8s defaults will not be passed on to an MPI Job's launcher and worker pods, which results in S3 errors if IAM role authentication is used.

This also concerns the default IAM role, if declared, which to my understanding is usually applied together with other defaults as part of AddObjectMetadata.

The good news is that all annotations are properly set on the top-level MPIJob manifest; however, they are not set on the PodTemplates for the worker and launcher pods within that generated manifest.

Expected behavior

A default IAM role (defined as a default annotation) or an IAM role provided in the launch dialog should be passed down to an MPI Job's worker and launcher pods.

Additional context to reproduce

Run a workflow with an MPI task and specify a custom IAM role upon execution.

Screenshots

(screenshot attached)

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@bimtauer added the bug (Something isn't working) and untriaged (not yet looked at by the maintainers) labels on Nov 1, 2022
@eapolinario removed the untriaged label on Nov 4, 2022
hamersaw (Contributor) commented Nov 8, 2022

@bimtauer thanks for filing this issue! It looks like this is not just an MPI job issue; the fix should probably be propagated to the pytorch and tensorflow plugins as well.

So the AddObjectMetadata you mentioned is only meant to be executed on the top-level k8s resource, as you observed. Do you expect it to add labels, annotations, etc. to the underlying worker pods as well?

As far as adding a service account goes, this should be a very easy fix. We are already doing this in the ray and spark tasks. You should be able to call the flytek8s.GetServiceAccountNameFromTaskExecutionMetadata function to retrieve the correct account.
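Roughly something like this; a sketch only, assuming the current pluginmachinery import paths and that the helper takes the task execution metadata and returns the account name:

```go
// Sketch only: wiring the resolved service account onto a replica pod spec,
// mirroring what the ray and spark plugins already do. Function name and
// package are illustrative, not the actual plugin code.
package mpi

import (
	v1 "k8s.io/api/core/v1"

	pluginsCore "github.com/flyteorg/flyteplugins/go/tasks/pluginmachinery/core"
	"github.com/flyteorg/flyteplugins/go/tasks/pluginmachinery/flytek8s"
)

// withServiceAccount sets the service account resolved from the task execution
// metadata on a replica pod spec; an empty result leaves the spec unchanged.
func withServiceAccount(podSpec v1.PodSpec, taskCtx pluginsCore.TaskExecutionContext) v1.PodSpec {
	if sa := flytek8s.GetServiceAccountNameFromTaskExecutionMetadata(taskCtx.TaskExecutionMetadata()); sa != "" {
		podSpec.ServiceAccountName = sa
	}
	return podSpec
}
```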

Is this something you might be able to contribute? I can offer a speedy review :)

bimtauer (Contributor, author) commented Nov 9, 2022

@hamersaw thanks for looking into this. And yes, we expect to run into the same issue with Spark jobs or any other complex manifest rendered by Flyte that itself declares its own PodTemplates.

My first thought went back to AddObjectMetadata, since I once submitted a proposal for implementing some templating logic there; I think you have since created more powerful templating logic here. But I guess both of those paths are "top level" in the sense that they currently only modify k8s pods directly described by Flyte.

Service accounts are definitely needed as well; we use those to authenticate pods with Vault. But AFAIK kube2iam relies solely on a matching annotation, so for AWS access we'd need the annotations to be propagated down as well.

I suppose this is where the MPI spec is first put together. Maybe you can use your flytek8s.BuildPodWithSpec(podTemplate, podSpec, primaryContainerName) logic directly there (and in other plugins) to create the pod templates inside those plugin-specific manifests? A rough sketch of what I mean is below.
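Just a sketch, assuming BuildPodWithSpec keeps its (podTemplate, podSpec, primaryContainerName) signature and returns a Pod whose metadata and spec can be copied into the replica template:

```go
// Sketch only: reuse the default PodTemplate machinery to build the replica
// pod, then copy its metadata and spec into the MPIJob replica template so
// defaults like the IAM role annotation propagate down. Names are illustrative.
package mpi

import (
	v1 "k8s.io/api/core/v1"

	"github.com/flyteorg/flyteplugins/go/tasks/pluginmachinery/flytek8s"
)

func buildReplicaPodTemplate(podTemplate *v1.PodTemplate, podSpec *v1.PodSpec, primaryContainerName string) (v1.PodTemplateSpec, error) {
	pod, err := flytek8s.BuildPodWithSpec(podTemplate, podSpec, primaryContainerName)
	if err != nil {
		return v1.PodTemplateSpec{}, err
	}
	return v1.PodTemplateSpec{
		ObjectMeta: pod.ObjectMeta,
		Spec:       pod.Spec,
	}, nil
}
```

The same helper could then be called for both the launcher and worker replica specs, and reused by the other k8s operator plugins.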

hamersaw (Contributor) commented

@bimtauer ah OK, sorry, for some reason I read "IAM role" and replaced it with "ServiceAccount". I think I better understand this now. Also, we recently improved the configuration of the spark plugin, so this is something the community expects as we go deeper, and the proposed fix here seems relatively similar. I think we could get the IAM role annotations included relatively easily, but a better fix is a lot more work. Just to make sure I fully understand the scope, we are proposing to do a few things:

  1. Use the DefaultPodTemplate work as a base for the launcher and worker MPI job pods. It looks like this should be as easy as calling the BuildPodWithSpec function on the launcher and worker PodSpecs. Then, when constructing the PodTemplate, we could either use the ObjectMeta and PodSpec from the resulting Pod, or change BuildPodWithSpec to not build the Pod and just return the ObjectMeta and PodSpec.
  2. Apply the correct configuration from the k8s plugin configuration and task context to update things like annotations, labels, etc. This will be very similar to AddObjectMetadata in FlytePropeller, but will NOT set things like Name, inject finalizers, or assign owner references, because Flyte is not the managing entity for these Pods (a rough sketch of this is at the end of this comment).
  3. Apply this to all of the k8s operators, e.g. pytorch and tensorflow.

I think all of this is certainly reasonable! Let me know if I'm understanding this correctly.
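To make point 2 above a bit more concrete, a minimal sketch of the kind of helper I'm imagining; illustrative only, assuming the K8sPluginConfig still exposes DefaultAnnotations/DefaultLabels and that task-level values come from the task execution metadata:

```go
// Sketch of point 2: merge default and task-level metadata onto a replica's
// pod template. Deliberately does NOT set Name, finalizers, or owner
// references, since the MPI operator (not Flyte) manages these pods.
package mpi

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	pluginsCore "github.com/flyteorg/flyteplugins/go/tasks/pluginmachinery/core"
	"github.com/flyteorg/flyteplugins/go/tasks/pluginmachinery/flytek8s/config"
)

func addReplicaObjectMetadata(meta *metav1.ObjectMeta, taskCtx pluginsCore.TaskExecutionContext, cfg *config.K8sPluginConfig) {
	if meta.Annotations == nil {
		meta.Annotations = map[string]string{}
	}
	if meta.Labels == nil {
		meta.Labels = map[string]string{}
	}
	// Defaults first, then task-level values (e.g. the IAM role annotation
	// chosen in the launch dialog) so the task can override the defaults.
	for k, v := range cfg.DefaultAnnotations {
		meta.Annotations[k] = v
	}
	for k, v := range taskCtx.TaskExecutionMetadata().GetAnnotations() {
		meta.Annotations[k] = v
	}
	for k, v := range cfg.DefaultLabels {
		meta.Labels[k] = v
	}
	for k, v := range taskCtx.TaskExecutionMetadata().GetLabels() {
		meta.Labels[k] = v
	}
}
```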

bimtauer (Contributor, author) commented

@hamersaw Thanks for this summary, this sounds very much like what we need.

One small point that still confuses me is when you would recommend using PodTemplates vs. the k8s plugin config fields.

My understanding is that I can use PodTemplates to do anything I could have otherwise done in the k8s plugin config.yaml (i.e. default annotations, labels, tolerations, etc.) and much more: I can create namespace-specific PodTemplates, which means I can create different defaults for each project and domain.

If the above is right, then I would prefer only using PodTemplates in the future, so no need for point 2, at least from our side.

hamersaw (Contributor) commented

@bimtauer the default PodTemplate work was done because we believe it is a much better mechanism than exposing configuration through the k8s plugin. So you are 100% correct: we are hoping users slowly transition from the k8s plugin config fields to using PodTemplates entirely.

hamersaw (Contributor) commented

Fixed by flyteorg/flyteplugins#297
