2x7b model gives error about MoE quant and context memory pool #1291
Comments
Can you try requantizing that model? The tools can be found here: https://kcpptools.concedo.workers.dev/
I gave it a pass through re-quantize but still got the same error as before.
If it helps, this is the command I ran for the re-quantize pass (assuming I did it right, but if not let me know):
The output file will have a different name - make sure you pick the right one. By default it'll probably be named ggml-model or something. It does not overwrite the original model.
OK, yup, I went back and ran it again to be sure, using the output ggml-model-Q6_K.gguf that it created after re-quantizing. Same error message.
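For readers following along: the tool behind that requantize pass is the same quantize utility that ships with llama.cpp. A typical invocation looks like the following; the filenames here are hypothetical (the thread's actual command was not captured), and note that the tool writes a new file rather than overwriting the input, which is why picking the right output file matters.

```shell
# Hypothetical filenames for illustration only.
# llama.cpp's quantize tool takes: <input.gguf> <output.gguf> <quant type>
./llama-quantize Blue-Orchid-2x7b.gguf ggml-model-Q6_K.gguf Q6_K
```

After it finishes, load the newly written ggml-model-Q6_K.gguf, not the original.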
FWIW, I just tested https://huggingface.co/tensorblock/Blue-Orchid-2x7b-GGUF Q3_K_S on 1.81, and it worked fine, with no warning.
Interesting. Yup, I downloaded 1.81 and tried the one you linked; it worked just fine. Looking in LM Studio, the one I was downloading came from LoneStriker, so it makes me wonder if there's something different about how that one was put together that causes the MoE error. I also ran the Q6_K quant on the benchmark and got great results on Vulkan (RX 7900 XTX).
Describe the Issue
Any time I try to run a 2x7b model, koboldcpp errors out with the following:
```
!!!!!! WARNING: Using extremely outdated MoE quant. Please update it! Attempting to apply hacky kcpp fallback, using last ctx:0x1fe24dc94a0
ggml_new_object: not enough space in the context's memory pool (needed 209520, available 209088)
ggml/src/ggml.c:1600: GGML_ASSERT(obj_new) failed
```
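For context on the assert itself: ggml allocates graph objects out of a fixed-size context memory pool, and `GGML_ASSERT(obj_new)` fires when the next object no longer fits. The figures in the log show how narrow the miss is (a quick arithmetic check on the logged numbers, not ggml code):

```shell
# Numbers taken from the error message above: the next object needs
# more bytes than remain in the context's memory pool.
needed=209520
available=209088
echo "short by $((needed - available)) bytes"   # prints: short by 432 bytes
```

The shortfall is only a few hundred bytes, which fits the theory that this particular quant's layout makes the context size estimate come up slightly short.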
Running 7b, 13b, or 20b models doesn't give any issue at all. I tried it with 4k, 6k, 8k, and 12k context; same error no matter what with 2x7b models.
Additional Information:
Windows 11
AMD Ryzen 7 5700G
RX 7900 XTX (24GB) (Vulkan API)
KoboldCPP 1.80
Tested with InfinitiKuno-2x7B and Blue-Orchid-2x7B; both throw the same error
KCPPS attached:
defaults3_win_amd_2x7B.zip
Of note: I tested loading the model directly in LM Studio and it works just fine, loading with no problem. I even tried manually setting KoboldCPP to MoE Experts 2, and it still errors out.