
2x7b model gives error about MoE quant and context memory pool #1291

Open · cmdrscotty opened this issue Dec 29, 2024 · 6 comments
@cmdrscotty commented Dec 29, 2024

Describe the Issue
Any time I try to run a 2x7b model, KoboldCpp errors out with the following:

!!!!!! WARNING: Using extremely outdated MoE quant. Please update it! Attempting to apply hacky kcpp fallback, using last ctx:0x1fe24dc94a0
ggml_new_object: not enough space in the context's memory pool (needed 209520, available 209088)
ggml/src/ggml.c:1600: GGML_ASSERT(obj_new) failed

7b, 13b, and 20b models don't give any issue at all. Tried 4k, 6k, 8k, and 12k context; the 2x7b models hit the same error no matter what.

Additional Information:
Windows 11
AMD Ryzen 7 5700G
RX 7900 XTX (24GB) (Vulkan API)

KoboldCPP 1.80

Tested with InfinitiKuno-2x7B and Blue-Orchid-2x7B; both throw the same error.

KCPPS attached:
defaults3_win_amd_2x7B.zip

Of note: loading the model directly in LM Studio works just fine, no problem at all. I even tried manually setting KoboldCpp to MoE Experts 2, and it still errors out.

@LostRuins (Owner)

Can you try requantizing that model? The tools can be found here: https://kcpptools.concedo.workers.dev/

@cmdrscotty (Author)

Gave it a pass through the re-quantize tool, but still got the same error as before:

!!!!!! WARNING: Using extremely outdated MoE quant. Please update it! Attempting to apply hacky kcpp fallback, using last ctx:0x17532644e90
ggml_new_object: not enough space in the context's memory pool (needed 209520, available 209088)
ggml/src/ggml.c:1600: GGML_ASSERT(obj_new) failed

If it helps, this is the command I ran for the re-quantize pass (assuming I did it right; if not, let me know):

.\quantize_gguf.exe --allow-requantize .\Blue-Orchid-2x7b-Q6_K.gguf 18 8
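For reference, assuming `quantize_gguf.exe` follows llama.cpp's quantize argument order (input file, optional output file, quantization type, thread count — where ftype 18 corresponds to Q6_K), an invocation that names the output explicitly would look something like the sketch below; the output filename here is hypothetical, not from the thread:

```shell
# Hypothetical invocation, assuming llama.cpp's quantize argument order:
#   quantize [--allow-requantize] <input.gguf> [output.gguf] <ftype> [nthreads]
# ftype 18 = Q6_K; the output path is an illustrative name.
.\quantize_gguf.exe --allow-requantize .\Blue-Orchid-2x7b-Q6_K.gguf .\Blue-Orchid-2x7b-requant-Q6_K.gguf 18 8
```

Naming the output avoids any ambiguity about which file the tool produced.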

@LostRuins (Owner)

The output file will have a different name - make sure you pick the right one. By default it'll probably be named ggml-model or something. It does not overwrite the original model.

@cmdrscotty (Author)

OK, yup, went back and ran it again to be sure. Used the output ggml-model-Q6_K.gguf that it created after re-quantizing; same error message.

@wbruna commented Jan 4, 2025

FWIW, I just tested https://huggingface.co/tensorblock/Blue-Orchid-2x7b-GGUF Q3_K_S on 1.81, and it worked fine, with no warning.

@cmdrscotty (Author) commented Jan 5, 2025

Interesting. Yup, downloaded 1.81 and tried the one you linked; it worked just fine.

Looking in LM Studio, it seems the one I was downloading came from LoneStriker, which makes me wonder if something about how that quant was put together causes the MoE error.

But I ran the Q6_K quant through the benchmark, and it came back with great results on Vulkan (RX 7900 XTX):

Benchmark Completed - v1.81 Results:
======
Flags: NoAVX2=False Threads=15 HighPriority=True Cublas_Args=None Tensor_Split=None BlasThreads=15 BlasBatchSize=512 FlashAttention=False KvCache=0
Timestamp: 2025-01-05 01:36:57.140774+00:00
Backend: koboldcpp_vulkan.dll
Layers: 35
Model: Blue-Orchid-2x7b-Q6_K
MaxCtx: 12288
GenAmount: 100
-----
ProcessingTime: 23.252s
ProcessingSpeed: 524.17T/s
GenerationTime: 4.061s
GenerationSpeed: 24.62T/s
TotalTime: 27.313s
Output:  1 1 1 1
-----
===
Press ENTER key to exit.
