Is there a way to retrieve key-value cache values using onnxruntime-genai? #416
Tabrizian started this conversation in New features / APIs
Replies: 2 comments
-
Hi @Tabrizian, this is an API that we are looking at for another customer. I will move it into the Discussions forum where we can continue the conversation. If you have specific use cases, please feel free to add them there!
-
Thanks @natke. My use case is manually controlling which requests are included in each batch. Being able to obtain the key-value cache values would let us avoid redundant computation when swapping sequences in and out of the request batch.
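To make the use case concrete, here is a rough sketch in plain Python of the scheduling pattern described above. It does not use onnxruntime-genai at all: `SequenceState`, `BatchScheduler`, and their methods are hypothetical names invented for illustration, and the "key/value projection" is a stand-in arithmetic expression. The point is only that a sequence swapped out of the batch keeps its cache, so swapping it back in costs no recomputation.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceState:
    """Hypothetical per-sequence state: token ids plus cached K/V entries."""
    tokens: list
    # kv[layer] holds one (key, value) pair per already-processed token.
    kv: list = field(default_factory=list)


class BatchScheduler:
    """Toy scheduler that retains the KV cache of swapped-out sequences."""

    def __init__(self, num_layers):
        self.num_layers = num_layers
        self.active = {}          # seq_id -> state currently in the batch
        self.parked = {}          # seq_id -> state swapped out, cache kept
        self.computed_tokens = 0  # token positions whose K/V were computed

    def _extend_cache(self, state):
        # Compute K/V only for token positions that are not cached yet.
        if not state.kv:
            state.kv = [[] for _ in range(self.num_layers)]
        start = len(state.kv[0])
        for pos in range(start, len(state.tokens)):
            tok = state.tokens[pos]
            for layer in range(self.num_layers):
                # Stand-in for a real key/value projection (assumption).
                state.kv[layer].append((tok * (layer + 1), tok + layer))
            self.computed_tokens += 1

    def swap_in(self, seq_id, tokens=None):
        # Reuse the parked state if there is one; otherwise start fresh.
        state = self.parked.pop(seq_id, None)
        if state is None:
            if tokens is None:
                raise KeyError(f"unknown sequence {seq_id!r} and no tokens given")
            state = SequenceState(tokens=list(tokens))
        self._extend_cache(state)
        self.active[seq_id] = state

    def append_token(self, seq_id, token):
        state = self.active[seq_id]
        state.tokens.append(token)
        self._extend_cache(state)

    def swap_out(self, seq_id):
        # Move the sequence out of the batch without discarding its cache.
        self.parked[seq_id] = self.active.pop(seq_id)


sched = BatchScheduler(num_layers=2)
sched.swap_in("a", tokens=[1, 2, 3])  # computes 3 positions
sched.append_token("a", 4)            # computes 1 more
sched.swap_out("a")
sched.swap_in("b", tokens=[5, 6])     # computes 2 positions
sched.swap_out("b")
sched.swap_in("a")                    # cache reused, nothing recomputed
print(sched.computed_tokens)
```

Without the parked cache, re-admitting sequence `"a"` would require recomputing all four of its positions. That is the redundant work an API for reading and restoring the cache would let a caller avoid.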
-
Can we retrieve the key-value cache values using this package for re-use across different generations?