Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose generic OpenShift routes for KServe InferenceServices #84

Merged
merged 8 commits into from
Oct 6, 2023

Conversation

vaibhavjainwiz
Copy link
Member

@vaibhavjainwiz vaibhavjainwiz commented Sep 18, 2023

Description

With #59 we had implemented a hot fix by exposing predictor URL directly to user but to provide proper fix we need to implement a generic openshift route for kserve inference service.

How Has This Been Tested?

apiVersion: dscinitialization.opendatahub.io/v1
kind: DSCInitialization
metadata:
  name: example
spec:
  devFlags:
    manifestsUri: https://github.com/vaibhavjainwiz/odh-manifests/tarball/generic_route

If everything work as mentioned in above step then we are good to go :)

Expected behavior

A new route should be created in istio-system namespace with name same as InferenceService. If new route is created then we are good to go :)

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

@Jooho
Copy link
Contributor

Jooho commented Sep 18, 2023

I am testing this PR but I have a couple of questions.

  1. what images should be used for this test?
    when I followed your test steps, it deployed quay.io/opendatahub/odh-model-controller:stable
    I don't think it is your test image for this PR.
  2. What is the expected result from these test steps?
    Where should I check? Route?

Copy link
Contributor

@israel-hdez israel-hdez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, add unit tests.

controllers/kserve_inferenceservice_controller.go Outdated Show resolved Hide resolved
controllers/reconcilers/kisvc_route_reconciler.go Outdated Show resolved Hide resolved
controllers/kserve_inferenceservice_controller.go Outdated Show resolved Hide resolved
@vaibhavjainwiz
Copy link
Member Author

vaibhavjainwiz commented Sep 19, 2023

  1. what images should be used for this test?
    when I followed your test steps, it deployed quay.io/opendatahub/odh-model-controller:stable

To test this PR you need to use odh-model-controller image created by openshift-ci for this corresponding PR.
quay.io/opendatahub/odh-model-controller:pr-84
I am not sure how we could provide overriden value of odh-model-controller image through v2 operator. @Jooho Could you please with this configuration.

  1. What is the expected result from these test steps?
    Where should I check? Route?

With this PR, new more generic route would be created in istio-system namespace which should be use to query model.

Signed-off-by: Vaibhav Jain <[email protected]>
@Jooho
Copy link
Contributor

Jooho commented Sep 20, 2023

To test this PR you need to use odh-model-controller image created by openshift-ci for this corresponding PR.
quay.io/opendatahub/odh-model-controller:pr-84
I am not sure how we could provide overriden value of odh-model-controller image through v2 operator. @Jooho Could you please with this configuration.

I'm not entirely certain about the testing approach you ultimately used, but if testers or reviewers are expected to rely on the PR image, it would greatly help if you could provide clear and accurate test steps in the description. This would make it much more straightforward to conduct testing using the provided information.

With this PR, new more generic route would be created in istio-system namespace which should be use to query model.

As testers, it's essential that the PR description provides clear guidance on what needs to be verified using the 'Expected result' or 'Expected behavior' sections. Testing can be a time-consuming process for reviewers and testers, so it's crucial for PR senders to provide detailed clarification in this regard.

@vaibhavjainwiz
Copy link
Member Author

vaibhavjainwiz commented Sep 20, 2023

On my local, I follow following step for testing :

  • Install the Kserve controller using odh operator
  • Remove ODH operator
  • Set deployment replicas of odh-model-controller to 0
  • Run odh-model-controller from local in debug mode.

To support testing for reviewer, I had created a image of odh-model-controller from local and push it in quay.io.
I had also updated the image reference in odh-manifest repo so if deploy their cluster with https://github.com/vaibhavjainwiz/odh-manifests/tarball/generic_route in DSCInitialization then everything should work.

@heyselbi heyselbi linked an issue Sep 20, 2023 that may be closed by this pull request
@vaibhavjainwiz
Copy link
Member Author

@Jooho
I had added Expected behavior section in Issue Description.

@vaibhavjainwiz
Copy link
Member Author

/test

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 20, 2023

@vaibhavjainwiz: The /test command needs one or more targets.
The following commands are available to trigger required jobs:

  • /test fvt
  • /test images
  • /test pr-image-mirror
  • /test unit

Use /test all to run all jobs.

In response to this:

/test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vaibhavjainwiz
Copy link
Member Author

/test all

@vaibhavjainwiz
Copy link
Member Author

@israel-hdez I had the unit testcases. Could you please review.

Copy link
Contributor

@israel-hdez israel-hdez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vaibhavjainwiz I'm happy with the code.

I still want to test before approving. I didn't have time today.

If you want an approval sooner, maybe work with @rpancham and test it together?
Otherwise, I can try on my Tuesday.

controllers/inferenceservice_controller.go Outdated Show resolved Hide resolved
controllers/kserve_inferenceservice_controller.go Outdated Show resolved Hide resolved
@israel-hdez israel-hdez dismissed their stale review September 25, 2023 23:21

Code looks OK.

Signed-off-by: Vaibhav Jain <[email protected]>
@israel-hdez
Copy link
Contributor

/retest

Copy link
Contributor

@israel-hdez israel-hdez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested. So far, looks good, with some issues:

  • It seems RBAC is missing in this PR. After installing and creating an ISVC I got spec.host: Forbidden: you do not have permission to set the host field of the route, which I fixed by modifying the Role to add something like this: Route's new admission routes/custom-host doesn't work for legacy API openshift/origin#14904 (comment).
    • I remember this RBAC would need to go in some kubebuilder comments, to let it generate them for you. I don't know how to do this.
  • Similar permission denied error when trying to delete the ISVC and, thus, trying to delete the route.
  • Name conflict. Having two ISVCs with same name on different namespaces, will lead to a conflict in the route.
  • Schema mismatch: The ISVC can state that protocol is plain HTTP, while the route will be created for HTTPS.
    • Perhaps for our demo/sample setup, this is error config on KServe side (i.e. is config error on my testing). Yet, I think we want to honor the schema in the ISVC, just in case some user wants to use plain HTTP.

@vaibhavjainwiz
Copy link
Member Author

vaibhavjainwiz commented Sep 28, 2023

I tested. So far, looks good, with some issues:

  • It seems RBAC is missing in this PR. After installing and creating an ISVC I got spec.host: Forbidden: you do not have permission to set the host field of the route, which I fixed by modifying the Role to add something like this: Route's new admission routes/custom-host doesn't work for legacy API openshift/origin#14904 (comment).

    • I remember this RBAC would need to go in some kubebuilder comments, to let it generate them for you. I don't know how to do this.
  • Similar permission denied error when trying to delete the ISVC and, thus, trying to delete the route.

Strange these RBAC are already present in code. You should not see this error. I will debug it more.

// +kubebuilder:rbac:groups=route.openshift.io,resources=routes,verbs=get;list;watch;create;update;patch

  • Name conflict. Having two ISVCs with same name on different namespaces, will lead to a conflict in the route.

I updated route name to include isvc.Name + isvc.Namespace so that It would not cause.

  • Schema mismatch: The ISVC can state that protocol is plain HTTP, while the route will be created for HTTPS.

    • Perhaps for our demo/sample setup, this is error config on KServe side (i.e. is config error on my testing). Yet, I think we want to honor the schema in the ISVC, just in case some user wants to use plain HTTP.

Are you talking about below config?
https://github.com/opendatahub-io/kserve/blob/1529b712ae536286fe939e7b6ad426305bf168f1/config/configmap/inferenceservice.yaml#L452

@israel-hdez
Copy link
Contributor

Strange these RBAC are already present in code. You should not see this error. I will debug it more.

// +kubebuilder:rbac:groups=route.openshift.io,resources=routes,verbs=get;list;watch;create;update;patch

Yes, some RBAC is there. Although, it is missing delete privileges.
It is also missing the privilege to set custom hosts, which is a different resource.

  • Schema mismatch: The ISVC can state that protocol is plain HTTP, while the route will be created for HTTPS.

    • Perhaps for our demo/sample setup, this is error config on KServe side (i.e. is config error on my testing). Yet, I think we want to honor the schema in the ISVC, just in case some user wants to use plain HTTP.

Are you talking about below config? https://github.com/opendatahub-io/kserve/blob/1529b712ae536286fe939e7b6ad426305bf168f1/config/configmap/inferenceservice.yaml#L452

Yes. That configuration is reflected in the ISVC. So, you don't need to look at the ConfigMap. Just check the schema set in the ISVC.

Signed-off-by: Vaibhav Jain <[email protected]>
@vaibhavjainwiz
Copy link
Member Author

@israel-hdez I had incoporated all review comments. Could you please review again.
I had made the change in odh-model-controller to honor user specified url scheme but some change are also required in kserve as well because url in InferenceService status is set by KServe controller itself. I will create a seperate ticket in KServe to fix it.

@israel-hdez
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Oct 6, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 6, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: israel-hdez, vaibhavjainwiz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Oct 6, 2023
@openshift-ci openshift-ci bot merged commit f076018 into opendatahub-io:main Oct 6, 2023
1 check passed
@Jooho Jooho added the odh-1.11 label Oct 16, 2023
israel-hdez added a commit to israel-hdez/kserve that referenced this pull request Oct 24, 2023
Partial revert of
opendatahub-io/odh-manifests#916, because
opendatahub-io/odh-model-controller#84 has been
completed.

Signed-off-by: Edgar Hernández <[email protected]>
@vaibhavjainwiz vaibhavjainwiz deleted the generic_route branch December 1, 2023 09:44
dtrifiro pushed a commit that referenced this pull request Aug 8, 2024
Add the vllm image and add the missing @ caikit-standalone-image
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Expose generic OpenShift routes for KServe InferenceServices
3 participants