Feature request

transformers/src/transformers/models/mixtral/modeling_mixtral.py
Line 284 in 816f442

`head_dim` in the Mixtral model is forced to the value of `hidden_size // num_heads` (see the permalink above). However, this is not the case in the Llama model, or even in the Mistral model. So it would be a nice minor feature to support a manual `head_dim` setting for the Mixtral model as well!
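For illustration, here is a minimal sketch of the Llama-style fallback this request asks for. The class `MixtralAttentionSketch` and the plain-namespace configs are hypothetical stand-ins, not the actual transformers source:

```python
# Minimal sketch, assuming a Llama-style fallback; the names below are
# illustrative, not copied from transformers.
from types import SimpleNamespace


class MixtralAttentionSketch:
    def __init__(self, config):
        # Current Mixtral behavior: head_dim is always derived from
        # hidden_size, so an explicit value in the config is ignored:
        #   self.head_dim = config.hidden_size // config.num_attention_heads
        #
        # Proposed behavior: honor an explicit config.head_dim when present,
        # and fall back to the derived value otherwise, as Llama does.
        self.head_dim = getattr(
            config, "head_dim", config.hidden_size // config.num_attention_heads
        )


manual = SimpleNamespace(hidden_size=4096, num_attention_heads=32, head_dim=256)
derived = SimpleNamespace(hidden_size=4096, num_attention_heads=32)
print(MixtralAttentionSketch(manual).head_dim)   # 256: the manual value wins
print(MixtralAttentionSketch(derived).head_dim)  # 128: derived as before, so existing configs keep working
```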
Motivation

Manual `head_dim` is already supported in the Llama and Mistral models.
Your contribution
I can open a PR.