api: initial skeleton of LLMRoute and LLMBackend #20
Conversation
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
cc @arkodg
cc @sanjeewa-malalgoda (I would appreciate it if you could ping other folks from your org)
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
@mathetake We reviewed this and it looks good to us.
Reviewed myself🏃
// APISchema specifies the API schema of the input that the target Gateway(s) will receive.
// Based on this schema, the ai-gateway will perform the necessary transformation to the
// output schema specified in the selected LLMBackend during the routing process.
APISchema LLMAPISchema `json:"inputSchema"`
Suggested change:
- APISchema LLMAPISchema `json:"inputSchema"`
+ APISchema LLMAPISchema `json:"apiSchema"`
Are we going to allow a different APISchema for each LLM route if we define it at the route level?
If my understanding is correct, this indicates which vendor and model the route belongs to. Therefore, a single LLMRoute can correspond to only one vendor and model.
> Are we going to allow a different APISchema for each LLM route if we define it at the route level?

I think the answer would be no at the moment, since I am not sure why a user would want to access the API like that - clients would need to use different sets of API clients and switch between them depending on the path.
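To make that concrete, here is a rough Go sketch of how a single route-level input schema could relate to a per-backend output schema. Only the APISchema field and LLMAPISchema type appear in this PR; everything else (field names and JSON tags on the backend side) is an assumption for illustration, not the final API:

```go
package v1alpha1

// LLMRouteSpec carries the client-facing (input) schema for the whole route,
// so a single LLMRoute corresponds to exactly one input API schema.
type LLMRouteSpec struct {
	// APISchema specifies the API schema of the input that the target
	// Gateway(s) will receive, e.g. "OpenAI".
	APISchema LLMAPISchema `json:"inputSchema"`
}

// LLMBackendSpec carries the provider-facing (output) schema; the ai-gateway
// transforms requests from the route's input schema into this schema.
type LLMBackendSpec struct {
	// APISchema specifies the API schema the backend natively accepts.
	APISchema LLMAPISchema `json:"outputSchema"`
}

// LLMAPISchema names a concrete LLM API schema such as "OpenAI".
type LLMAPISchema string
```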
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
On second thought, I kept the field name.
Signed-off-by: Takeshi Yoneda <[email protected]>
drive-by notes
// Based on this schema, the ai-gateway will perform the necessary transformation to the
// output schema specified in the selected LLMBackend during the routing process.
//
// Currently, the only supported schema is OpenAI as the input schema.
Would it be less problematic to remove this text constraint and introduce Bedrock later once it is supported?
The constraint will be enforced by a CEL validation rule at the k8s API server - will do the follow-up soon.
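For illustration, a minimal sketch of how such a rule could look as a kubebuilder marker; the exact rule, and the assumption that LLMAPISchema is a string-like type, are mine rather than part of this PR:

```go
// Hypothetical sketch only: enforcing the "OpenAI only" input schema constraint
// at the Kubernetes API server via a CEL validation rule. The actual follow-up
// may use a different rule or an enum marker instead.
type LLMRouteSpec struct {
	// +kubebuilder:validation:XValidation:rule="self == 'OpenAI'",message="currently only OpenAI is supported as the input schema"
	APISchema LLMAPISchema `json:"inputSchema"`
}
```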
Good comment as usual, Adrian.
Co-authored-by: Adrian Cole <[email protected]> Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
This commit is a follow-up on #20. Basically, it makes LLMRoute a pure "addition" to the existing standardized HTTPRoute. This makes it possible to configure something like

```
kind: LLMRoute
metadata:
  name: llm-route
spec:
  inputSchema: OpenAI
  httpRouteRef:
    name: my-llm-route
---
kind: HTTPRoute
metadata:
  name: my-llm-route
spec:
  matches:
    - headers:
        key: x-envoy-ai-gateway-llm-model
        value: llama3-70b
  backendRefs:
    - kserve:
        weight: 20
    - aws-bedrock:
        weight: 80
```

where LLMRoute purely references the HTTPRoute, and users can configure whatever routing conditions they want in a standardized way via HTTPRoute while leveraging the LLM-specific information, in this case the x-envoy-ai-gateway-llm-model header.

In the implementation, though it's not merged yet, we have to do the routing calculation in the extproc by actually analyzing the referenced HTTPRoute and emulating its behavior in order to do the transformation. The reason is that the routing decision is generally made at the very end of the filter chain, and by the time we invoke extproc, we don't have that information. Furthermore, `x-envoy-ai-gateway-llm-model` is not available before extproc.

As a bonus, we no longer need TargetRef at the LLMRoute level since that lives within the HTTPRoute resources. This will really simplify the PoC implementation.

---------

Signed-off-by: Takeshi Yoneda <[email protected]>
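For reference, a rough Go-side sketch of the "pure addition" shape described in that commit message; the httpRouteRef field name comes from the YAML above, while the Go type names are assumptions for illustration:

```go
// Sketch based on the commit message above; the Go type names are assumptions.
type LLMRouteSpec struct {
	// APISchema specifies the client-facing input schema, e.g. "OpenAI".
	APISchema LLMAPISchema `json:"inputSchema"`
	// HTTPRouteRef points at a standard Gateway API HTTPRoute in the same
	// namespace that carries all routing rules (header matches such as
	// x-envoy-ai-gateway-llm-model, backendRefs, and weights). LLMRoute only
	// layers LLM-specific information on top of it.
	HTTPRouteRef LocalHTTPRouteRef `json:"httpRouteRef"`
}

// LocalHTTPRouteRef is a hypothetical same-namespace reference to an HTTPRoute.
type LocalHTTPRouteRef struct {
	Name string `json:"name"`
}
```

With this shape, LLMRoute no longer needs a TargetRef of its own, because the attachment point already lives on the referenced HTTPRoute.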
This adds the skeleton API of LLMRoute and LLMBackend.
These two resources will be the foundation for future
iterations, such as authn/z, token-based rate limiting,
schema transformation, and more advanced features like #10.
Note: we might (and will) break APIs as the need arises until
the initial release.
part of #13
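As a visual aid, a minimal sketch of what the two skeleton resources might look like as Kubernetes objects; the CRD boilerplate and spec wiring are assumptions, and only the resource names and their intent come from this PR:

```go
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// LLMRoute is the LLM-specific routing resource; future iterations are expected
// to build authn/z, token-based rate limiting, and schema transformation on it.
type LLMRoute struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	// Spec is the LLMRouteSpec sketched in the review thread above.
	Spec LLMRouteSpec `json:"spec,omitempty"`
}

// LLMBackend describes a single LLM backend that LLMRoutes can send traffic to.
type LLMBackend struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	// Spec is the LLMBackendSpec sketched in the review thread above.
	Spec LLMBackendSpec `json:"spec,omitempty"`
}
```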
cc @yuzisun