feat(bff): use 'kubeflow-userid' header to authorize BFF endpoints #599
Conversation
Signed-off-by: Eder Ignatowicz <[email protected]>
@Griffin-Sullivan @alexcreasy or @lucferbux, one thing that I didn't do is test with the real model registry (not the mocked one). Can you please do this when you try this PR? I've tested with Kubernetes (not our env test mock) but not inside the cluster (waiting for @Griffin-Sullivan's PR)
Signed-off-by: Eder Ignatowicz <[email protected]>
This is governed by the @kubeflow/wg-manifests-leads KF/Platform team, and ought to be cross-checked with them, imho. Edit: Slack channel if preferred: https://cloud-native.slack.com/archives/C073W572LA2
Indeed, because Model Registry is meant as a platform-level service.
Happy to help; maybe a thread in Slack could help sort some things out faster. But I could also do a review here, if it helps.
"model registries access are not restricted by namespace" maybe I misunderstood something, but we definitely need a way to limit access to models. Namespaces can be from different departments that are not allowed to access other departments models. For me that is a core criteria for graduation. |
It is my understanding that if you deploy into a (user) namespace, it is considered a tenant; hence you'd have access to the registry and metadata of that instance only if you indeed have access to that namespace. It is also my understanding that if you consider Model Registry a platform service, you'd make said instance available to the whole Kubeflow scope.
Yes, available to all kubeflow namespaces, but with different permissions per namespace.
I'm not sure I fully follow @juliusvonkohout 100%; in that case, pardon me. The main example we have, also in the manifests, is a common Model Registry for the whole KF platform scope. This is helpful as MR is touted as a single pane of glass and a shared platform service. I think this was accepted a while back, as a platform service, and I think this is a sensible default example given the scope of MR. Of course, downstream admins/distros may opt for a different setup. In summary, I don't see a reason to apply a different tenancy design than what is architected in the linked doc already. Hope this clarifies?
Also, to substantiate (in the hope it helps) with a picture what I've summarized in #599 (comment), which is my understanding of how the tenancy design from the docs is respected with respect to the Model Registry backend deployment(s). Nothing stops admins/distros from having additional instances, for instance serving a different scope of their orgs at the "platform level". As mentioned, we comply with the tenancy design from the docs; admins/distros can opt to have one or more instances in their designated KF ("user") namespaces. In this example diagram, since both personas are represented as having access to both t_A and t_B, they can accordingly access the MR REST service, i.e. both can also access registries X and Y. This is with respect to the Model Registry backend. As mentioned, I encourage the discussion, as the deployment model of the UI may differ.
@tarilabs so as far as I understand, you plan to have no user isolation at all and no multi-tenancy. Everything is just shared and accessible/abusable from each namespace/department/user. That is strictly against what we have so far in Kubeflow, where we have a shared instance with proper multi-tenancy support (KFP, Notebooks, KServe, Volumes, Tensorboards, Katib, ...). This way we can have scalable zero-overhead namespaces. It also sounds a bit contradictory to what is stated in the first post of this PR.
I'm not sure where you get this understanding?
Per-namespace overhead is exactly what we want to prevent, that is, not having a shared instance.
I'm not following. When I have a KServe workload (an Isvc that, in the Raw case, translates directly to the Deployment), that is deployed to the designated namespace, unless I recall this incorrectly (but I'm taking just one example) 🤔
The InferenceService itself, yes. Also the pipelines for KFP. But not the KServe or Pipelines controllers/shared deployments, only the actual workload. In the end you will probably have a menu entry in the Central Dashboard sidebar, and based on the namespace you are in, you can see different things. You should not be able to see things from other people's namespaces if you are not a member. One shared global "default" space, as for example shared Kubeflow Pipelines, is also fine. So based on what I have read now, we just had a misunderstanding about the shared-instance part and the per-namespace parts.
That is my impression too; I was a bit caught by surprise based on a previous comment 😅. We've complied with the direction of the general design of KF, but we've also taken the feedback based on the specific projects used/reused (MLMD, but also the MySQL point per the other thread you noticed, but I'm digressing...). Definitely there are (as always) details for improvement too, which can be discussed/planned accordingly, imho.
Created this Slack thread to chat about it: https://cloud-native.slack.com/archives/C073W572LA2/p1733158941696939
Hi there, I'm leaving the outcome of the conversation here too. We'll follow the logic we currently have in midstream, tweaking it a little bit for Kubeflow and the ability to have multi-tenancy. This is a temporary solution, as we can do some follow-ups, but so far, for the 1.10 release, the plan would be:
@lucferbux thanks! I'll implement it this way.
Some nits; we can wait till we have the other enhancements, or just merge this and do a follow-up.
```
# GET /v1/healthcheck
curl -i localhost:4000/api/v1/healthcheck
curl -i -H "kubeflow-userid: [email protected]" localhost:4000/api/v1/healthcheck
```
We plan to get the swagger automatically generated, but can you tweak the swagger spec to add that info (meaning that we need that header)? Try securitySchemes.
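For reference, a minimal sketch of how that could look in an OpenAPI 3 spec, assuming an apiKey-in-header scheme and a scheme name of `kubeflowUserId` (both are illustrative choices, not the actual spec file):

```yaml
# Hypothetical snippet for the swagger/OpenAPI spec; the scheme name is an assumption.
components:
  securitySchemes:
    kubeflowUserId:
      type: apiKey
      in: header
      name: kubeflow-userid   # the header the BFF expects
# Applying it globally documents the required header on every endpoint.
security:
  - kubeflowUserId: []
```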
@@ -32,6 +32,8 @@ npm run build

This is the default context for running a local UI. Make sure you build the project using the instructions above prior to running the command below.

You will need to inject your requests with a kubeflow-userid header for authorization purposes. For example, you can use the [Header Editor](https://chromewebstore.google.com/detail/eningockdidmgiojffjmkdblpjocbhgh) extension in Chrome to set the kubeflow-userid header to [email protected].
For the standalone app there's an enhancement I'm thinking of: mocking a login so it can be injected in the header. I can refine that later, but this is just fine.
@@ -0,0 +1,21 @@
apiVersion: rbac.authorization.k8s.io/v1
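The rest of the new manifest is not shown in the hunk above. A minimal sketch of a ClusterRole plus ClusterRoleBinding granting the get/list access on services described in this PR might look like the following (all names and the bound user are assumptions, not the actual file contents):

```yaml
# Hypothetical sketch of an admin RBAC manifest; names are illustrative only.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: model-registry-ui-admin    # assumed name
rules:
  - apiGroups: [""]                # core API group
    resources: ["services"]
    verbs: ["get", "list"]         # the verbs the BFF checks via SubjectAccessReview
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: model-registry-ui-admin    # assumed name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: model-registry-ui-admin
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: [email protected]            # the identity carried by the kubeflow-userid header
```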
A couple of things for this:
- `admin-rbac.yaml` should be a base file now that RBAC by serviceaccount is mandatory.
- So `user-rbac` should only have `kubeflo-dashboard-rbac.yaml` (I would rename it to something like `default-user-role.yaml` or something like that).
- If you wanna be pro-kustomize, just change `kustomization.yaml`, adding `- ../base` as a resource, so this is a proper kustomize overlay (see the sketch below).
- If so, change the deploy script to only deploy the overlay and everything will be deployed.
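To illustrate the overlay suggestion, here is a minimal sketch of what the overlay's kustomization.yaml could look like, assuming the base lives in ../base and the user role file is named default-user-role.yaml (both are assumptions taken from the comment above, not the repository layout):

```yaml
# Hypothetical overlay kustomization.yaml; paths and file names are assumptions.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../base                 # pull in the shared base (e.g. admin-rbac.yaml)
  - default-user-role.yaml  # the renamed per-user RBAC file
```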
- admin-rbac.yaml
Just what I commented in the other review.
# Step 5: Port-forward the service
echo "Port-fowarding Model Registry UI..."
echo "Port-forwarding Model Registry UI..."
Wow I thought I fixed it, thanks.
/approve
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: lucferbux. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
We can fix those nits in a follow-up PR.
Description
In Kubeflow's Central Dashboard, the kubeflow-userid header carries the user's email address for authentication purposes. When a user accesses applications through the dashboard, each request includes this header, allowing the application to identify the user.
On the BFF, we are using the kubeflow-userid header to do authorization via SubjectAccessReview on all endpoints. For all endpoints, we only authorize requests from users that can "get" and "list" services via a ClusterRole.
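To make the check concrete, here is a sketch of the kind of SubjectAccessReview the BFF effectively performs for each request (the user and verb values are illustrative; the real check is made through the Kubernetes client, not kubectl):

```yaml
# Hypothetical SubjectAccessReview equivalent of the BFF's authorization check.
# You could try it manually with: kubectl create -f sar.yaml -o yaml
apiVersion: authorization.k8s.io/v1
kind: SubjectAccessReview
spec:
  user: [email protected]       # identity taken from the kubeflow-userid header
  resourceAttributes:
    verb: list                   # the BFF checks "get" and "list"
    resource: services           # services in the core API group
```

The response's status.allowed field is what decides whether the request is authorized.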
One thing that we need to double-check with other people from the community (@tarilabs please chime in) is whether, instead of a ClusterRole, we should do the SubjectAccessReview for a given namespace. I've implemented it with 'cluster level' access because my understanding is that model registry access is not restricted by namespace.
So, in short, the BFF authorizes every user that has get/list authorization for services cluster-wide.
Another important aspect is that for ALL Kubernetes API calls (including SubjectAccessReview), the BFF uses the service account token (after the SubjectAccessReview check).
As Kubeflow doesn't have any bearerToken information, I've removed this piece from the codebase. I'm always happy to reintroduce it as soon as we have this need, i.e. if any distribution needs to do authentication differently.
In this PR:
How Has This Been Tested?
Using the Header Editor Chrome Extension:
With wrong user name:

With right user name:

Merge criteria:
- DCO check
- If you have UI changes