finetune crashes on assert_shape_2d for Mistral based models #3404
Comments
Not to put this bug report off-track, but it may be a more general problem. Regular inference using server.cpp (compiled with and using CLBlast on an RX 580 GPU) also crashes on this set of asserts with the Mistral models. commit bc39553
It seems the shape of [...] so I just commented out these lines, and maybe it works: llama.cpp/examples/finetune/finetune.cpp, lines 335 to 336 in 40e07a6
trying to fine-tune a model on cpu:
The shapes for the init model of GQA models were wrong.
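To illustrate why grouped-query attention (GQA) changes the expected shapes, here is a minimal sketch. It assumes the K and V projection width is the per-head dimension times the number of KV heads. The struct and function names are hypothetical and are not taken from finetune.cpp. The Mistral 7B values in the note below (n_embd 4096, 32 heads, 8 KV heads) are the published model configuration.

```cpp
#include <cstdint>

// Hypothetical hyperparameter bundle; field names are illustrative only.
struct HParams {
    int64_t n_embd;    // embedding width
    int64_t n_head;    // number of query heads
    int64_t n_head_kv; // number of key/value heads (equals n_head without GQA)
};

// Width of the K and V projections under grouped-query attention.
// Without GQA (n_head_kv == n_head) this is simply n_embd.
inline int64_t n_embd_gqa(const HParams & hp) {
    const int64_t head_dim = hp.n_embd / hp.n_head;
    return head_dim * hp.n_head_kv;
}
```

With `HParams{4096, 32, 8}` this gives 1024, so a check that expects the K or V weight to be 4096 wide would fail on such a model, while a model with `n_head_kv == n_head` would pass.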
…example * 'master' of github.com:ggerganov/llama.cpp: (24 commits) convert : fix Baichuan2 models by using vocab size in config.json (ggerganov#3299) readme : add project status link ggml : fix build after ggerganov#3329 llm : add Refact model (ggerganov#3329) sync : ggml (conv 1d + 2d updates, UB fixes) (ggerganov#3468) finetune : readme fix typo (ggerganov#3465) ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (ggerganov#3453) main : consistent prefix/suffix coloring (ggerganov#3425) llama : fix session saving/loading (ggerganov#3400) llama : expose model's rope_freq_scale in the API (ggerganov#3418) metal : alibi for arbitrary number of heads (ggerganov#3426) cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (ggerganov#3273) Work on the BPE tokenizer (ggerganov#3252) convert : fix vocab size when not defined in hparams (ggerganov#3421) cmake : increase minimum version for add_link_options (ggerganov#3444) CLBlast: Add broadcast support for matrix multiplication (ggerganov#3402) gguf : add BERT, MPT, and GPT-J arch info (ggerganov#3408) gguf : general usability improvements (ggerganov#3409) cmake : make CUDA flags more similar to the Makefile (ggerganov#3420) finetune : fix ggerganov#3404 (ggerganov#3437) ...
I'm getting this, and it crashes out:
main: init model
Has the fix for this been pushed, or am I just getting the same issue? Trying to finetune Synthia 1.3 (Mistral model version).
@Drael64 that looks like a different issue, please open a new issue and include instructions to reproduce it.
Okay, did that. Might be a mess as I have no idea what I am doing.
commit 40e07a6
GGML_ASSERT: common/train.cpp:192: tensor->ne[1] == ne1
This crash comes from assert_shape_2d function.
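For readers unfamiliar with this kind of check, the sketch below shows the general idea: compare a tensor's extents against the expected ones. It is a simplified stand-in under stated assumptions, not the code from common/train.cpp. `Tensor2D` and `shape_is_2d` are hypothetical names, and it returns a bool where the real helper asserts and aborts.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for a tensor: only the extents are modeled.
struct Tensor2D {
    std::array<int64_t, 4> ne; // ne[0], ne[1] are the 2D extents; unused dims are 1
};

// Returns true when the tensor is effectively 2D with exactly the expected extents.
inline bool shape_is_2d(const Tensor2D & t, int64_t ne0, int64_t ne1) {
    return t.ne[0] == ne0 && t.ne[1] == ne1 && t.ne[2] == 1 && t.ne[3] == 1;
}
```

A tensor with extents `{4096, 1024, 1, 1}` passes `shape_is_2d(t, 4096, 1024)` but fails `shape_is_2d(t, 4096, 4096)`, which mirrors the `tensor->ne[1] == ne1` failure reported above.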
I tried synthia-7b-v1.3.Q8_0.gguf from TheBloke and the original Mistral model on two Intel-based MacBooks; both hit the same crash.
% uname -v
Darwin Kernel Version 21.6.0: Fri Sep 15 16:17:23 PDT 2023; root:xnu-8020.240.18.703.5~1/RELEASE_X86_64
MacBook Pro Mid 2015 (Intel)
Python 3.11.4
GNU Make 3.81
Logs: