Support DeepSeek-VL2 models with MoE and MLA in ExecuTorch #8132
Labels: module: llm, triaged
🚀 The feature, motivation and pitch
DeepSeek recently released DeepSeek-VL2, their family of Mixture-of-Experts vision-language models.
The Tiny and Small variants are suitable for on-device usage. The unique features of this model family include:

- A Mixture-of-Experts (MoE) language backbone: only a few experts are activated per token, so inference compute scales with activated rather than total parameters.
- Multi-head Latent Attention (MLA): the KV cache is compressed into compact latent vectors, cutting inference-time memory (see the sketch below).
- A dynamic tiling vision encoding strategy for high-resolution images with varying aspect ratios.
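A minimal PyTorch sketch of the MLA caching idea (dimensions, layer names, and the omission of masking and RoPE are simplifications for illustration, not DeepSeek-VL2's actual implementation): instead of full per-head K/V tensors, the cache holds one small latent vector per token.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentKVAttention(nn.Module):
    """Sketch of MLA-style KV compression: cache a small latent, not full K/V."""

    def __init__(self, d_model=2048, n_heads=16, d_latent=256):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.kv_down = nn.Linear(d_model, d_latent)  # down-project: this output is cached
        self.k_up = nn.Linear(d_latent, d_model)     # reconstruct keys from the latent
        self.v_up = nn.Linear(d_latent, d_model)     # reconstruct values from the latent
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x, latent_cache=None):
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        latent = self.kv_down(x)                     # (b, t, d_latent) -- small
        if latent_cache is not None:
            latent = torch.cat([latent_cache, latent], dim=1)
        k = self.k_up(latent).view(b, -1, self.n_heads, self.d_head).transpose(1, 2)
        v = self.v_up(latent).view(b, -1, self.n_heads, self.d_head).transpose(1, 2)
        out = F.scaled_dot_product_attention(q, k, v)  # masking omitted for brevity
        out = out.transpose(1, 2).reshape(b, t, -1)
        return self.out_proj(out), latent            # caller caches `latent`
```

At these illustrative sizes each cached token costs d_latent = 256 floats instead of 2 × d_model = 4096 for full K/V, a 16× reduction, which is exactly the property that matters for memory-constrained devices.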
Alternatives
There are distilled reasoning models for DeepSeek-R1 (mentioned in #7981). However, those models reuse the architectures of their target models (Llama and Qwen), so they lack the three unique features listed above. MoE and MLA in particular look promising for on-device inference efficiency; a rough sketch of the routing idea follows.
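To make the efficiency argument concrete, here is a hedged sketch of top-k expert routing (the sizes and the per-expert Python loop are illustrative; production kernels batch tokens by expert): each token runs through only k of n expert FFNs.

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Sketch of top-k routing: each token runs only k of n expert FFNs."""

    def __init__(self, d_model=2048, d_ff=4096, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                          # x: (n_tokens, d_model)
        weights, idx = self.router(x).topk(self.k, dim=-1)
        weights = weights.softmax(dim=-1)          # renormalize over chosen experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):  # each expert sees only its tokens
            for slot in range(self.k):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out
```

With n_experts=8 and k=2, each token touches only a quarter of the FFN weights, which is why the activated parameter count stays far below the total.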
Additional context
No response
RFC (Optional)
Suggested process:
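As a starting point, a minimal sketch of the generic ExecuTorch lowering flow that enablement work would presumably build on (`DeepSeekVL2Backbone`, the vocab size, and the example inputs are hypothetical placeholders; the MoE routing and MLA cache would likely need export-friendly rewrites first):

```python
import torch
from executorch.exir import to_edge

# Hypothetical placeholder for an eager-mode DeepSeek-VL2 language backbone.
model = DeepSeekVL2Backbone().eval()
example_inputs = (torch.randint(0, 32000, (1, 64)),)  # (batch, seq) token ids

# Standard ExecuTorch flow: torch.export -> edge dialect -> serialized .pte program.
exported = torch.export.export(model, example_inputs)
et_program = to_edge(exported).to_executorch()

with open("deepseek_vl2.pte", "wb") as f:
    f.write(et_program.buffer)
```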
cc @mergennachin @cccclai @helunwencser @dvorjackz