Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable performing inferences for gRPC (grpcurl) #183

Merged
merged 41 commits into from
Dec 21, 2023

Conversation

ymoatti
Copy link
Contributor

@ymoatti ymoatti commented Dec 11, 2023

The documentation and the scripts have multiple bugs which prevent using gRPC (grpcurl) for performing inferences and make difficult the usage of HTTP (curl).
This PR has for goal to fix the documentation, the scripts and related yaml files.

Description

Current documentation and scripts need to be fixed for HTTP (curl) as we have several problems such as:
a. Generic “storageUri” value in caikit-tgis-isvc.yaml instead of actual value
b. Error in how KSVC_HOSTNAME is computed in deploy-remove.md (specifies "caikit-example-isvc-predictor" instead of "caikit-tgis-example-isvc-predictor")
c. Missing serviceAccountName field in caikit-tgis-isvc.yaml

In addition, since Knative has no support for multiple ports managed by Knative Serving, we could not create ISVCs for more than a single protocol, thus rendering the usage of gRPC impossible with current scripts and documentation.

Our solution, beyond simple fixes of previously mentioned problems was to build a ServingRuntime specific for each protocol chosen with the adequate ports and details.
The user is required to specify the targeted protocol, either by mentioning its value as a variable (for step-to-step) or by passing it as an argument to the deploy_model script.
Similarly, the inference-call which replaces both http-call and grpc-call is passed the targeted protocol and will perform the inference accordingly.
We have chosen to well separate the HTTP versus gRPC implementations by having separate namespaces (resp. kserve-demo-http and kserve-demo-grpc). This permits to create/delete models for these protocols independently each of the other.
Finally, the launch of the metric service/monitor which appeared in the deploy-model.sh script but not in the step-by-step instructions have been commented out in deploy-model.sh. If judged necessary, they can be reactivated in which case the step-by-step documentation should also be modified.

How Has This Been Tested?

We first applied the script-based installation of Kserve and dependencies which was not changed.
On that basis we performed many possible combinations of:
manual deployments of models (http or grpc)
scripted deployments models (http or grpc)
manual invocations of inferences (http or grpc)
scripted invocations of inferences (http or grpc)
scripted deletion of models (http or grpc)

As said, we tried various combinations of the basic possible actions as mentioned above and checked the expected outcome (either success of inference invocation or disappearance of targeted namespace / artifacts for model deletion).

The main platform on which we tested was AWS ROSA.

Other areas of the code should not be affected since we only change documentation and scripts.

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious). The rational is that testing can be done following the instructions in the documentation.
  • The developer has manually tested the changes and verified that the changes work

Copy link
Contributor

openshift-ci bot commented Dec 11, 2023

Hi @ymoatti. Thanks for your PR.

I'm waiting for a opendatahub-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

demo/kserve/deploy-remove.md Outdated Show resolved Hide resolved
demo/kserve/scripts/test/deploy-model.sh Outdated Show resolved Hide resolved
demo/kserve/scripts/test/inference-call.sh Outdated Show resolved Hide resolved
@Jooho
Copy link
Contributor

Jooho commented Dec 11, 2023

Thanks for your PR.

This issue happens because we are moving forward :)

Please see the inline comments:

a. Generic “storageUri” value in caikit-tgis-isvc.yaml instead of actual value

This script is not for various situations. It is kind of a quick start like a demo so I don't see any reason to use generic value.

b. Error in how KSVC_HOSTNAME is computed in deploy-remove.md (specifies "caikit-example-isvc-predictor" instead of "caikit-tgis-example-isvc-predictor")

This should be a bug.

c. Missing serviceAccountName field in caikit-tgis-isvc.yaml

There are 2 ways to set storage-config using annotation or json style. Originally, Kserve uses annotation way so upstream documentation has the information only but for modelmeh, KServe accepts json style way.OpenDataHub will support modelmesh and kserve both so we are trying to move to json style from the annotation way.
so we don't use serviceAccountName for isvc.

Plus, I will review your PR.

@erezh16
Copy link

erezh16 commented Dec 12, 2023

Thanks for your PR.

This issue happens because we are moving forward :)

Please see the inline comments:

a. Generic “storageUri” value in caikit-tgis-isvc.yaml instead of actual value

This script is not for various situations. It is kind of a quick start like a demo so I don't see any reason to use generic value.

b. Error in how KSVC_HOSTNAME is computed in deploy-remove.md (specifies "caikit-example-isvc-predictor" instead of "caikit-tgis-example-isvc-predictor")

This should be a bug.

c. Missing serviceAccountName field in caikit-tgis-isvc.yaml

There are 2 ways to set storage-config using annotation or json style. Originally, Kserve uses annotation way so upstream documentation has the information only but for modelmeh, KServe accepts json style way.OpenDataHub will support modelmesh and kserve both so we are trying to move to json style from the annotation way. so we don't use serviceAccountName for isvc.

Plus, I will review your PR.

For item a, I think we can keep the original generic YAML but we should add the concrete Minio as a working example, especially given that the original scenario setup begins with an explicit example of setting up Minio for object storage.

Item b - bug, fixed by the PR, and we further replace KSVC-based setup with ISVC-based setup Ok.

Item c - bug, fixed by the PR. Agreed.

Only point remaining - not sure what is JSON-style adding of storage-config. @Jooho Can you give an example?

@ymoatti ymoatti requested a review from Xaenalt December 13, 2023 15:31
@ymoatti ymoatti requested review from dtrifiro and Jooho December 13, 2023 17:01
@erezh16
Copy link

erezh16 commented Dec 17, 2023

Folks, we made some progress with the JSON configuration but it's still not working and the year is ending, so this PR might have to delay quite a bit. Given that JSON config was not part of the original issues for this PR, can we agree to merge this PR using the SA-associated secret (as was the original code before the PR) just so that people can use the deploy-remove scenario without issues? This is already solved and can be reviewed and merged within a day's work.

The new stuff, such as JSON config and using ISVC instead of KSVC to define the virtual host, can be deferred to a later PR.

@ymoatti ymoatti requested review from Xaenalt and dtrifiro December 17, 2023 16:58
@dtrifiro
Copy link
Contributor

added a few nits, I'll give the last word to @Xaenalt

@openshift-ci openshift-ci bot removed the lgtm label Dec 19, 2023
@openshift-ci openshift-ci bot added the lgtm label Dec 21, 2023
Copy link
Contributor

openshift-ci bot commented Dec 21, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dtrifiro, Xaenalt, ymoatti

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 594c854 into opendatahub-io:main Dec 21, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants