
Support DeepSeek-VL2 models with MoE and MLA in ExecuTorch #8132

Open
iseeyuan opened this issue Feb 2, 2025 · 0 comments
Labels
module: llm - Issues related to LLM examples and apps, and to the extensions/llm/ code
triaged - This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

iseeyuan (Contributor) commented Feb 2, 2025

🚀 The feature, motivation and pitch

DeepSeek recently released DeepSeek-VL2, a family of Mixture-of-Experts Vision-Language Models: DeepSeek-VL2-Tiny, DeepSeek-VL2-Small, and DeepSeek-VL2, with 1.0B, 2.8B, and 4.5B activated parameters, respectively.

The Tiny and Small versions are suitable for on-device usage. The unique features of this model family include:

  • Mixture-of-Experts (MoE)
  • Multi-head Latent Attention (MLA) mechanism for KV-cache efficiency (a rough sketch follows this list)
  • Multimodal image understanding
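
For context on the MLA point: instead of caching full per-head keys and values, MLA caches one low-rank latent per token and re-expands it at attention time. A minimal PyTorch sketch of that caching idea (all dimensions here are illustrative assumptions, not DeepSeek-VL2's actual configuration):

```python
import torch
import torch.nn as nn

class LatentKVCache(nn.Module):
    """Sketch of MLA-style KV caching: store one shared latent per token,
    re-expand to per-head K/V on the fly. Dimensions are illustrative."""

    def __init__(self, d_model=2048, n_heads=16, d_head=128, d_latent=512):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_head
        # Down-projection to the compressed KV latent; this is what gets cached.
        self.w_down_kv = nn.Linear(d_model, d_latent, bias=False)
        # Up-projections back to per-head keys and values, applied at attention time.
        self.w_up_k = nn.Linear(d_latent, n_heads * d_head, bias=False)
        self.w_up_v = nn.Linear(d_latent, n_heads * d_head, bias=False)

    def forward(self, hidden, cache):
        # hidden: [batch, 1, d_model] for a single decode step
        # cache:  [batch, seq_so_far, d_latent], e.g. torch.empty(batch, 0, 512) initially
        latent = self.w_down_kv(hidden)            # [batch, 1, d_latent]
        cache = torch.cat([cache, latent], dim=1)  # grows by d_latent floats per token,
                                                   # vs 2 * n_heads * d_head for plain MHA
        k = self.w_up_k(cache).unflatten(-1, (self.n_heads, self.d_head))
        v = self.w_up_v(cache).unflatten(-1, (self.n_heads, self.d_head))
        return k, v, cache                         # k, v: [batch, seq, n_heads, d_head]
```

With these toy numbers the cache holds 512 floats per token instead of 2 × 16 × 128 = 4096, an 8× reduction, which is the property that matters for on-device memory.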

Alternatives

There are distilled reasoning models for DeepSeek R1 (mentioned in #7981). However, those models use the same architectures as the models they were distilled into (Llama and Qwen), so they have none of the three features above. MoE and MLA in particular look promising for on-device inference efficiency; a rough sketch of the routing idea follows.
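
To make the "activated parameters" point concrete, here is a hedged sketch of token-level top-k expert routing; the router, expert shapes, and counts are assumptions for illustration, not DeepSeek-VL2's actual architecture:

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Sketch of a generic top-k MoE feed-forward layer (illustrative only)."""

    def __init__(self, d_model=1024, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: [tokens, d_model]
        scores = self.router(x)                     # [tokens, n_experts]
        weights, idx = scores.topk(self.k, dim=-1)  # pick k experts per token
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run, so activated parameters per token
        # are roughly k/n_experts of the total expert parameters.
        for e, expert in enumerate(self.experts):
            mask = (idx == e).any(dim=-1)           # tokens routed to expert e
            if mask.any():
                w = weights[mask][idx[mask] == e].unsqueeze(-1)
                out[mask] += w * expert(x[mask])
        return out
```

Each token runs only k of the n_experts feed-forward blocks, so per-token compute scales with the activated parameters even though all experts must still fit in memory; that is the sense in which the Tiny and Small variants count 1.0B and 2.8B activated parameters.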

Additional context

No response

RFC (Optional)

Suggested process:
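
For reference, the generic ExecuTorch lowering flow looks roughly like the minimal sketch below on a toy module; treat it as an assumption about the eventual process, since the DeepSeek-VL2 specifics (export-friendly MoE routing, the MLA cache as mutable state) are exactly what this issue would need to work out:

```python
import torch
from executorch.exir import to_edge

class TinyModel(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.relu(x)

model = TinyModel().eval()
example_inputs = (torch.randn(1, 8),)

# 1. Capture a full graph with torch.export.
exported = torch.export.export(model, example_inputs)

# 2. Lower the exported program to the ExecuTorch Edge dialect.
edge = to_edge(exported)

# 3. Serialize to a .pte program for the on-device runtime.
et_program = edge.to_executorch()
with open("tiny_model.pte", "wb") as f:
    f.write(et_program.buffer)
```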

cc @mergennachin @cccclai @helunwencser @dvorjackz

iseeyuan added the module: llm label Feb 2, 2025
digantdesai added the triaged label Feb 3, 2025
Projects
None yet
Development

No branches or pull requests

2 participants