Conversation
Signed-off-by: intellinjun <[email protected]>
for more information, see https://pre-commit.ci
MOE only applies to llama now?
LGTM
MoE applies only to Mixtral; compared to Llama, Mixtral differs only in the FFN.
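To make the FFN difference concrete, here is a minimal pure-Python sketch of a Mixtral-style sparse MoE feed-forward layer. It is an illustration, not the code from this PR: the helper names (`matvec`, `expert_ffn`, `moe_ffn`) are hypothetical, and it assumes top-2 routing with a SiLU-gated expert FFN and an output width equal to the input width.

```python
import math

def silu(x):
    # SiLU (swish) activation: x * sigmoid(x)
    return x / (1.0 + math.exp(-x))

def matvec(w, x):
    # w is a list of rows; returns w @ x
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def expert_ffn(x, w1, w2, w3):
    # Gated FFN for one expert: w2 @ (silu(w1 @ x) * (w3 @ x))
    gate = [silu(v) for v in matvec(w1, x)]
    up = matvec(w3, x)
    return matvec(w2, [g * u for g, u in zip(gate, up)])

def moe_ffn(x, router, experts, top_k=2):
    # Score every expert, keep the top_k, and softmax over the kept logits.
    logits = matvec(router, x)
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    m = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - m) for i in top]
    total = sum(exps)
    # Assumption: each expert maps back to the input width.
    out = [0.0] * len(x)
    for e, idx in zip(exps, top):
        y = expert_ffn(x, *experts[idx])  # experts[idx] is (w1, w2, w3)
        out = [o + (e / total) * v for o, v in zip(out, y)]
    return out
```

A dense Llama-style FFN corresponds to calling `expert_ffn` once per token; the MoE variant only adds the router and the weighted sum over the selected experts.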
Could you post some performance data for this new model here?
I will add it to CI and use CI to test performance.
If the GGUF version is updated in the future, please add the GGUF format for Mixtral if possible.
use mul_mat_id_silu fusion
It's SiLU that slows down the first-token inference, right?
SiLU slows down next-token inference; the first-token slowdown is due to a problem with the thread setup in mul_mat_id.
Nice!
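For readers unfamiliar with operator fusion, the general idea behind combining an expert-indexed matrix multiply with SiLU can be sketched as below. This is an assumption-laden illustration: the function names are hypothetical, and it does not reproduce the actual `mul_mat_id_silu` kernel or its threading. It only shows that the fused form skips the intermediate buffer and the second pass while producing the same result.

```python
import math

def silu(x):
    # SiLU activation: x * sigmoid(x)
    return x / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def unfused_expert_matmul_silu(expert_weights, expert_id, x):
    # Pass 1: select the expert's matrix by id and multiply,
    # writing an intermediate result.
    w = expert_weights[expert_id]
    y = [dot(row, x) for row in w]
    # Pass 2: re-read the intermediate result to apply the activation.
    return [silu(v) for v in y]

def fused_expert_matmul_silu(expert_weights, expert_id, x):
    # Single pass: apply SiLU as each output element is produced,
    # so no intermediate buffer is written and re-read.
    w = expert_weights[expert_id]
    return [silu(dot(row, x)) for row in w]
```

In a real kernel the saving comes from one fewer pass over memory and one fewer operator dispatch per expert, which matters most for next-token inference, where each matrix multiply is small.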
Type of Change
feature or bug fix or documentation or others
Model enabling
Description
Enable Mixtral-8x7B