Support non-Llama models #965
Comments
Should include updates to the Philosophy section here as well: https://llama-stack.readthedocs.io/en/latest/introduction/index.html#our-philosophy
Maybe enabling the usage of non-Llama models is generally as simple as removing the check in llama-stack/llama_stack/providers/utils/inference/model_registry.py (lines 74 to 77 in df864ee) that throws an error for an unrecognized model.provider_resource_id, and instead assuming the user knows enough to register a model id that matches something valid in their backend provider.
That would let us still take advantage of the aliases for all well-known Llama models, but let users attempt to use other models that their providers support without throwing an error.
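To make the idea concrete, here is a minimal sketch of the proposed fallback. This is not the actual model_registry.py code; the class shape, field names, and alias map are assumptions for illustration only.

```python
# Rough sketch of the proposal: resolve well-known Llama aliases, but pass
# unrecognized model ids through instead of raising an error.
from dataclasses import dataclass


@dataclass
class Model:
    identifier: str
    provider_resource_id: str


class ModelRegistryHelper:
    def __init__(self, alias_to_provider_id_map: dict[str, str]):
        # Aliases for well-known Llama models -> provider-specific model ids.
        self.alias_to_provider_id_map = alias_to_provider_id_map

    async def register_model(self, model: Model) -> Model:
        provider_id = self.alias_to_provider_id_map.get(model.provider_resource_id)
        if provider_id is not None:
            # Known Llama model: keep resolving through the alias table.
            model.provider_resource_id = provider_id
        # Otherwise, instead of raising on an unknown id, keep it unchanged and
        # trust that the user registered an id their backend provider serves.
        return model
```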
Non-Llama models are supported now in several inference providers (including remote vLLM) after merging #1200.
🚀 Describe the new functionality needed
We currently have a workaround to support non-Llama models through the remote vLLM provider, but it would be great to support this officially.
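For readers unfamiliar with the workaround, the flow roughly amounts to registering a model whose provider model id matches whatever the vLLM backend actually serves, then using it for inference. The sketch below assumes the llama-stack-client Python SDK; the exact parameter names, provider id, and port are assumptions and may differ in your deployment.

```python
# Hedged example: registering and querying a non-Llama model served by a
# remote vLLM instance through Llama Stack. Details below are assumptions.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # port is an example

# Register a model id that matches what the vLLM backend is serving.
client.models.register(
    model_id="mistralai/Mistral-7B-Instruct-v0.3",
    provider_id="vllm",  # assumed provider id from the distribution's run config
    provider_model_id="mistralai/Mistral-7B-Instruct-v0.3",
)

response = client.inference.chat_completion(
    model_id="mistralai/Mistral-7B-Instruct-v0.3",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.completion_message.content)
```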
For the inline vLLM provider, this is a work in progress: #880
Let's use this issue to discuss any proposals and technical considerations.