Support non-Llama models #965
Comments
Should include updates to the Philosophy section here as well: https://llama-stack.readthedocs.io/en/latest/introduction/index.html#our-philosophy
Maybe enabling the usage of non-Llama models is generally as simple as removing the check in llama-stack/llama_stack/providers/utils/inference/model_registry.py (lines 74 to 77 in df864ee) that throws an error for an unrecognized model.provider_resource_id, and instead assuming the user knows enough to register a model id that matches something valid in their backend provider.
That would let us still take advantage of the aliases for all well-known Llama models, but let users attempt to use other models that their providers support without throwing an error.
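To make the idea concrete, here is a minimal sketch of the proposed fallback. This is not the actual model_registry.py code; the class shape, field names, and alias map are assumptions for illustration only.

```python
# Rough sketch of the proposal: resolve well-known Llama aliases, but pass
# unrecognized model ids through instead of raising an error.
from dataclasses import dataclass


@dataclass
class Model:
    identifier: str
    provider_resource_id: str


class ModelRegistryHelper:
    def __init__(self, alias_to_provider_id_map: dict[str, str]):
        # Aliases for well-known Llama models -> provider-specific model ids.
        self.alias_to_provider_id_map = alias_to_provider_id_map

    async def register_model(self, model: Model) -> Model:
        provider_id = self.alias_to_provider_id_map.get(model.provider_resource_id)
        if provider_id is not None:
            # Known Llama model: keep resolving through the alias table.
            model.provider_resource_id = provider_id
        # Otherwise, instead of raising on an unknown id, keep it unchanged and
        # trust that the user registered an id their backend provider serves.
        return model
```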
Non-Llama models are supported now in several inference providers (including remote vLLM) after merging #1200.
🚀 Describe the new functionality needed
We currently have a workaround to support non-Llama models through the remote vLLM provider, but it would be great to support this officially.
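For readers unfamiliar with the workaround, the flow roughly amounts to registering a model whose provider model id matches whatever the vLLM backend actually serves, then using it for inference. The sketch below assumes the llama-stack-client Python SDK; the exact parameter names, provider id, and port are assumptions and may differ in your deployment.

```python
# Hedged example: registering and querying a non-Llama model served by a
# remote vLLM instance through Llama Stack. Details below are assumptions.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # port is an example

# Register a model id that matches what the vLLM backend is serving.
client.models.register(
    model_id="mistralai/Mistral-7B-Instruct-v0.3",
    provider_id="vllm",  # assumed provider id from the distribution's run config
    provider_model_id="mistralai/Mistral-7B-Instruct-v0.3",
)

response = client.inference.chat_completion(
    model_id="mistralai/Mistral-7B-Instruct-v0.3",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.completion_message.content)
```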
For the inline vLLM provider, this is a work in progress: #880
Let's use this issue to discuss any proposals and technical considerations.